Lambda Architecture
来源:互联网 发布:mac 强制重启finder 编辑:程序博客网 时间:2024/05/21 20:11
Making Sense of it All
Building a well-designed, reliable and functional big data application that caters to a variety of end-user latency requirements can be an extremely challenging proposition. It can be daunting enough to just keep up with the rapid pace of technology innovation happening in this space, let alone building applications that work for the problem at hand. “Start slow and build one application at a time” is perhaps the most common advice given to beginners today. However, there are certain high-level architectural constructs that can help you mentally visualize how different types of applications fit into the big data architecture and how some of these technologies are transforming the existing enterprise software landscape.
Lambda Architecture
Lambda Architecture is a useful framework to think about designing big data applications.Nathan Marz came up with the term Lambda Architecture (LA) for a generic, scalable and fault-tolerant data processing architecture, based on his experience working on distributed data processing systems at Backtype and Twitter.
The LA aims to satisfy the needs for a robust system that is fault-tolerant, both against hardware failures and human mistakes, being able to serve a wide range of workloads and use cases, and in which low-latency reads and updates are required. The resulting system should be linearly scalable, and it should scale out rather than up.
Here’s how it looks like, from a high-level perspective:
The Lambda Architecture as seen in the picture has three major components.
Batch layer that provides the following functionality
- managing the master dataset, an immutable, append-only set of raw data
- pre-computing arbitrary query functions, called batch views.
Serving layer—This layer indexes the batch views so that they can be queried in ad hoc with low latency.
Speed layer—This layer accommodates all requests that are subject to low latency requirements. Using fast and incremental algorithms, the speed layer deals with recent data only.
- All data entering the system is dispatched to both the batch layer and the speed layer for processing.
- The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) to pre-compute the batch views.
- The serving layer indexes the batch views so that they can be queried in low-latency, ad-hoc way.
- The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
- Any incoming query can be answered by merging results from batch views and real-time views.
In the case of failure or orderly exit of a topology, the new version of the bolt can read this log and reconstruct the necessary state of the bolt very quickly. Once the log is read, tuples coming from the spout can be processed as if nothing ever happened. Since all tuples that arrived after the last record in the log have not been acknowledged, the spout will replay them so the bolt will get a complete set of tuples.
- Lambda Architecture
- Lambda Architecture
- Big Data Lambda Architecture 翻译
- Lambda Architecture-大数据处理系统经典架构解析
- 数据系统架构——Lambda architecture(Lambda架构)
- Architecture
- ARCHITECTURE
- Lambda
- lambda
- lambda
- lambda
- Lambda
- lambda
- lambda
- Lambda
- Lambda
- lambda
- lambda
- 增加Mybatis-generator生成的Mapper类和Mapper.xml里的方法
- Interesting drink
- MySQL数据库函数大全
- Java日期计算之Joda-Time
- PAT 甲级 1011. World Cup Betting (20)
- Lambda Architecture
- Activity生命周期
- HDU-2844-Coins(多重背包)
- 图像处理网络资源
- UDP套接字编程
- java基础 字符串处理
- kaoshi(imageload接口)
- Java循环
- ubuntu16.04下安装codebocks