Distributed RPC - Storm


Distributed RPC

The idea behind distributed RPC (DRPC) is to parallelize the computation of really intense functions on the fly using Storm. The Storm topology takes in as input a stream of function arguments, and it emits an output stream of the results for each of those function calls.

DRPC is not so much a feature of Storm as it is a pattern expressed from Storm’s primitives of streams, spouts, bolts, and topologies. DRPC could have been packaged as a separate library from Storm, but it’s so useful that it’s bundled with Storm.

High level overview

Distributed RPC is coordinated by a “DRPC server” (Storm comes packaged with an implementation of this). The DRPC server coordinates receiving an RPC request, sending the request to the Storm topology, receiving the results from the Storm topology, and sending the results back to the waiting client. From a client’s perspective, a distributed RPC call looks just like a regular RPC call. For example, here’s how a client would compute the results for the “reach” function with the argument “http://twitter.com”:

DRPCClient client = new DRPCClient("drpc-host", 3772);
String result = client.execute("reach", "http://twitter.com");

The distributed RPC workflow looks like this:

A client sends the DRPC server the name of the function to execute and the arguments to that function. The topology implementing that function uses a DRPCSpout to receive a function invocation stream from the DRPC server. Each function invocation is tagged with a unique id by the DRPC server. The topology then computes the result and at the end of the topology a bolt called ReturnResults connects to the DRPC server and gives it the result for the function invocation id. The DRPC server then uses the id to match up that result with which client is waiting, unblocks the waiting client, and sends it the result.
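The id-based matching in the last step can be sketched in plain Java, with no Storm dependency. This is an illustrative sketch, not Storm's actual implementation; the class and method names here are hypothetical. A table of pending requests maps each request id to a single-slot queue that the client blocks on until the topology's result arrives:

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch (not Storm's actual implementation) of how a DRPC
// server can match a result to the client waiting on that request id.
class PendingRequests {
    private final Map<String, BlockingQueue<String>> pending = new ConcurrentHashMap<>();

    // One single-slot queue per request id, created by whichever side arrives first.
    private BlockingQueue<String> slotFor(String requestId) {
        return pending.computeIfAbsent(requestId, id -> new ArrayBlockingQueue<>(1));
    }

    // Client side: block until the topology delivers a result for this id.
    public String awaitResult(String requestId) {
        try {
            String result = slotFor(requestId).take();
            pending.remove(requestId);
            return result;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    // Topology side: the final bolt reported [id, result]; unblock the client.
    public void deliver(String requestId, String result) {
        slotFor(requestId).offer(result);
    }
}
```

The unique id is what lets one DRPC server multiplex many concurrent requests over the same topology.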

LinearDRPCTopologyBuilder

Storm comes with a topology builder called LinearDRPCTopologyBuilder that automates almost all the steps involved for doing DRPC. These include:

  1. Setting up the spout
  2. Returning the results to the DRPC server
  3. Providing functionality to bolts for doing finite aggregations over groups of tuples

Let’s look at a simple example. Here’s the implementation of a DRPC topology that returns its input argument with a “!” appended:

public static class ExclaimBolt extends BaseBasicBolt {
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String input = tuple.getString(1);
        collector.emit(new Values(tuple.getValue(0), input + "!"));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("id", "result"));
    }
}

public static void main(String[] args) throws Exception {
    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclamation");
    builder.addBolt(new ExclaimBolt(), 3);
    // ...
}

As you can see, there’s very little to it. When creating the LinearDRPCTopologyBuilder, you tell it the name of the DRPC function for the topology. A single DRPC server can coordinate many functions, and the function name distinguishes the functions from one another. The first bolt you declare will take in as input 2-tuples, where the first field is the request id and the second field is the arguments for that request. LinearDRPCTopologyBuilder expects the last bolt to emit an output stream containing 2-tuples of the form [id, result]. Finally, all intermediate tuples must contain the request id as the first field.

In this example, ExclaimBolt simply appends a “!” to the second field of the tuple. LinearDRPCTopologyBuilder handles the rest of the coordination of connecting to the DRPC server and sending results back.

Local mode DRPC

DRPC can be run in local mode. Here’s how to run the above example in local mode:

LocalDRPC drpc = new LocalDRPC();
LocalCluster cluster = new LocalCluster();

cluster.submitTopology("drpc-demo", conf, builder.createLocalTopology(drpc));

System.out.println("Results for 'hello':" + drpc.execute("exclamation", "hello"));

cluster.shutdown();
drpc.shutdown();

First you create a LocalDRPC object. This object simulates a DRPC server in process, just like how LocalCluster simulates a Storm cluster in process. Then you create the LocalCluster to run the topology in local mode. LinearDRPCTopologyBuilder has separate methods for creating local topologies and remote topologies. In local mode the LocalDRPC object does not bind to any ports so the topology needs to know about the object to communicate with it. This is why createLocalTopology takes in the LocalDRPC object as input.

After launching the topology, you can do DRPC invocations using the execute method on LocalDRPC.

Remote mode DRPC

Using DRPC on an actual cluster is also straightforward. There are three steps:

  1. Launch DRPC server(s)
  2. Configure the locations of the DRPC servers
  3. Submit DRPC topologies to Storm cluster

Launching a DRPC server can be done with the storm script and is just like launching Nimbus or the UI:

bin/storm drpc

Next, you need to configure your Storm cluster to know the locations of the DRPC server(s). This is how DRPCSpout knows from where to read function invocations. This can be done through the storm.yaml file or the topology configurations. Configuring this through the storm.yaml looks something like this:

drpc.servers:
  - "drpc1.foo.com"
  - "drpc2.foo.com"
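The same setting can be supplied per topology instead of in storm.yaml. A minimal sketch, assuming the Storm 0.x Config class (the hostnames are the placeholder values from above):

```java
// Per-topology equivalent of the storm.yaml entry: Config.DRPC_SERVERS
// corresponds to the "drpc.servers" key.
Config conf = new Config();
conf.put(Config.DRPC_SERVERS, Arrays.asList("drpc1.foo.com", "drpc2.foo.com"));
```

Pass this conf to StormSubmitter.submitTopology so the DRPCSpout in that topology knows where to poll for invocations.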

Finally, you launch DRPC topologies using StormSubmitter just like you launch any other topology. To run the above example in remote mode, you do something like this:

StormSubmitter.submitTopology("exclamation-drpc", conf, builder.createRemoteTopology());

createRemoteTopology is used to create topologies suitable for Storm clusters.

A more complex example

Note: the section below is largely superseded; since Storm 0.8, complex DRPC queries can be implemented with the Trident API (see the Trident wiki).

The exclamation DRPC example was a toy example for illustrating the concepts of DRPC. Let’s look at a more complex example which really needs the parallelism a Storm cluster provides for computing the DRPC function. The example we’ll look at is computing the reach of a URL on Twitter.

The reach of a URL is the number of unique people exposed to a URL on Twitter. To compute reach, you need to:

  1. Get all the people who tweeted the URL
  2. Get all the followers of all those people
  3. Unique the set of followers
  4. Count the unique set of followers
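Ignoring parallelism for a moment, the four steps can be sketched as sequential plain Java. The two lookup maps are hypothetical stand-ins for the database/Twitter API calls mentioned below:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sequential sketch of the reach computation over stub data.
// The maps stand in for database / Twitter API lookups.
class SequentialReach {
    public static int reach(String url,
                            Map<String, List<String>> tweetersByUrl,
                            Map<String, List<String>> followersByUser) {
        Set<String> uniqueFollowers = new HashSet<>();
        // Steps 1-3: tweeters of the URL -> their followers -> deduplicated set.
        for (String tweeter : tweetersByUrl.getOrDefault(url, Collections.emptyList())) {
            uniqueFollowers.addAll(
                followersByUser.getOrDefault(tweeter, Collections.emptyList()));
        }
        // Step 4: count the unique followers.
        return uniqueFollowers.size();
    }
}
```

The Storm topology below performs exactly this computation, but with each step fanned out across many tasks.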

A single reach computation can involve thousands of database calls and tens of millions of follower records during the computation. It’s a really, really intense computation. As you’re about to see, implementing this function on top of Storm is dead simple. On a single machine, reach can take minutes to compute; on a Storm cluster, you can compute reach for even the hardest URLs in a couple seconds.

A sample reach topology is defined in storm-starter here. Here’s how you define the reach topology:

LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("reach");
builder.addBolt(new GetTweeters(), 3);
builder.addBolt(new GetFollowers(), 12)
        .shuffleGrouping();
builder.addBolt(new PartialUniquer(), 6)
        .fieldsGrouping(new Fields("id", "follower"));
builder.addBolt(new CountAggregator(), 2)
        .fieldsGrouping(new Fields("id"));

The topology executes as four steps:

  1. GetTweeters gets the users who tweeted the URL. It transforms an input stream of [id, url] into an output stream of [id, tweeter]. Each url tuple will map to many tweeter tuples.
  2. GetFollowers gets the followers for the tweeters. It transforms an input stream of [id, tweeter] into an output stream of [id, follower]. Across all the tasks, there may of course be duplication of follower tuples when someone follows multiple people who tweeted the same URL.
  3. PartialUniquer groups the followers stream by the follower id. This has the effect of the same follower going to the same task. So each task of PartialUniquer will receive mutually independent sets of followers. Once PartialUniquer receives all the follower tuples directed at it for the request id, it emits the unique count of its subset of followers.
  4. Finally, CountAggregator receives the partial counts from each of the PartialUniquer tasks and sums them up to complete the reach computation.
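Why summing the partial counts is safe can be seen in a small plain-Java sketch of the fieldsGrouping semantics (the hash-based task assignment here is illustrative, not Storm's exact routing): each distinct follower is routed to exactly one task, so the per-task unique sets are disjoint and their sizes sum to the global unique count.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of steps 3-4: fieldsGrouping on "follower" routes each distinct
// follower value to exactly one task, so per-task HashSet sizes can be summed.
class PartialUniqueCount {
    public static int countUnique(List<String> followers, int numTasks) {
        // One "PartialUniquer task" per slot, each holding its own subset.
        List<Set<String>> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new HashSet<>());
        }
        // fieldsGrouping: route by a hash of the follower field.
        for (String follower : followers) {
            int task = Math.floorMod(follower.hashCode(), numTasks);
            tasks.get(task).add(follower);
        }
        // "CountAggregator": sum the disjoint partial counts.
        int total = 0;
        for (Set<String> subset : tasks) {
            total += subset.size();
        }
        return total;
    }
}
```

Note that this only works because the grouping is on the follower field; with shuffleGrouping, the same follower could land in several tasks and the sum would overcount.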

Let’s take a look at the PartialUniquer bolt:

public class PartialUniquer extends BaseBatchBolt {
    BatchOutputCollector _collector;
    Object _id;
    Set<String> _followers = new HashSet<String>();

    @Override
    public void prepare(Map conf, TopologyContext context, BatchOutputCollector collector, Object id) {
        _collector = collector;
        _id = id;
    }

    @Override
    public void execute(Tuple tuple) {
        _followers.add(tuple.getString(1));
    }

    @Override
    public void finishBatch() {
        _collector.emit(new Values(_id, _followers.size()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("id", "partial-count"));
    }
}

PartialUniquer implements IBatchBolt by extending BaseBatchBolt. A batch bolt provides a first class API to processing a batch of tuples as a concrete unit. A new instance of the batch bolt is created for each request id, and Storm takes care of cleaning up the instances when appropriate.

When PartialUniquer receives a follower tuple in the execute method, it adds it to the set for the request id in an internal HashSet.

Batch bolts provide the finishBatch method which is called after all the tuples for this batch targeted at this task have been processed. In the callback, PartialUniquer emits a single tuple containing the unique count for its subset of follower ids.

Under the hood, CoordinatedBolt is used to detect when a given bolt has received all of the tuples for any given request id. CoordinatedBolt makes use of direct streams to manage this coordination.

The rest of the topology should be self-explanatory. As you can see, every single step of the reach computation is done in parallel, and defining the DRPC topology was extremely simple.

Non-linear DRPC topologies

LinearDRPCTopologyBuilder only handles “linear” DRPC topologies, where the computation is expressed as a sequence of steps (like reach). It’s not hard to imagine functions that would require a more complicated topology with branching and merging of the bolts. For now, to do this you’ll need to drop down into using CoordinatedBolt directly. Be sure to talk about your use case for non-linear DRPC topologies on the mailing list to inform the construction of more general abstractions for DRPC topologies.

How LinearDRPCTopologyBuilder works

  • DRPCSpout emits [args, return-info]. return-info is the host and port of the DRPC server as well as the id generated by the DRPC server
  • constructs a topology comprising:
    • DRPCSpout
    • PrepareRequest (generates a request id and creates a stream for the return info and a stream for the args)
    • CoordinatedBolt wrappers and direct groupings
    • JoinResult (joins the result with the return info)
    • ReturnResults (connects to the DRPC server and returns the result)
  • LinearDRPCTopologyBuilder is a good example of a higher level abstraction built on top of Storm’s primitives

Advanced

  • KeyedFairBolt for weaving the processing of multiple requests at the same time
  • How to use CoordinatedBolt directly
Source: http://www.coderzhang.com/blog/%e5%88%86%e5%b8%83%e5%bc%8frpc/