Distributed Systems with ZeroMQ


08.15.2012

Departing a bit from my current series on gevent and Python, today I want to take a look at a different networking technology that's been gaining traction: ZeroMQ. So without further ado, let's jump right in...

ZeroMQ design principles

First, ZeroMQ is not a message broker. People sometimes mistake it for one because of its name. Actually, ZeroMQ is a library that supports certain network communication patterns using sockets. The "MQ" part comes in because ZeroMQ uses queues internally to buffer messages so that you don't block your application when sending data. When you say socket.send(...), ZeroMQ actually enqueues a message to be sent later by a dedicated communication thread. (This communication thread and its state are encapsulated in the ZeroMQ Context object used below; most programs will have a single Context.)
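
To make the queueing concrete, here is a minimal sketch of my own (it is not part of the original examples). It assumes pyzmq is installed and that nothing is listening on the made-up port 5599; the helper name send_without_peer is hypothetical. The socket connects to an endpoint with no peer behind it and sends in non-blocking mode, so a normal return means the message was accepted into ZeroMQ's internal queue rather than written to the network.

```python
import zmq


def send_without_peer(endpoint='tcp://127.0.0.1:5599'):
    """Queue one message on a socket whose peer does not exist (yet)."""
    context = zmq.Context.instance()
    sock = context.socket(zmq.PUSH)
    # Discard unsent messages on close so the demo cannot hang at exit.
    sock.setsockopt(zmq.LINGER, 0)
    sock.connect(endpoint)
    try:
        # DONTWAIT makes send() raise zmq.Again instead of blocking, so a
        # normal return means the message was taken into the send queue.
        sock.send(b'queued for later', zmq.DONTWAIT)
        return True
    except zmq.Again:
        return False
    finally:
        sock.close()


if __name__ == '__main__':
    print('accepted into the queue:', send_without_peer())
```

The LINGER setting is there only because this demo never gets a peer; in a real program you usually want queued messages to be flushed before the socket goes away.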

ZeroMQ binding/connecting versus "normal" sockets

Next, keep in mind that ZeroMQ separates the notion of clients and servers from the underlying communication pattern. For instance, you may be used to creating a socket for receiving requests with a pattern similar to the following:

    from socket import socket

    sock = socket()
    sock.bind(('', 8080))
    sock.listen(256)
    while True:
        # accept() returns a (connection, address) pair
        cli, addr = sock.accept()
        # The following code would probably be handled in a 'worker' thread or
        # greenlet. It's included here only for example purposes.
        message = cli.recv(...)
        response = handle_message(message)
        cli.send(response)

The client would then connect() to the server and send a request:

    from socket import socket

    sock = socket()
    sock.connect(('localhost', 8080))
    sock.send(message)
    response = sock.recv(...)


In ZeroMQ, either end of the request/response pattern can bind, and either end can connect. For instance, using the pyzmq library, you can have your "server" (the one who handles requests) connect to the "client" (the one who sends requests). The "server" code then looks like this:

    import zmq
    context = zmq.Context.instance()

    sock = context.socket(zmq.REP)
    sock.connect('tcp://localhost:8080')

    while True:
        message = sock.recv()
        response = handle_message(message)
        sock.send(response)


The "client" code would look like this:

    import zmq
    context = zmq.Context.instance()

    sock = context.socket(zmq.REQ)
    sock.bind('tcp://*:8080')

    sock.send(message)
    response = sock.recv()


A couple of things deserve attention here. First, as noted above, the "server" is doing the connecting, and the "client" is doing the binding. Another thing to note is the address being used. Rather than passing a hostname/port, we pass a URI.

ZeroMQ transport types

ZeroMQ supports several different styles of URIs for its transport layer, each of which supports the full gamut of ZeroMQ functionality (with the exception noted for the multicast transports in the last item):

  • tcp://hostname:port sockets let us do "regular" TCP networking
  • inproc://name sockets let us do in-process networking (inter-thread/greenlet) with the same code we'd use for TCP networking
  • ipc:///tmp/filename sockets use UNIX domain sockets for inter-process communication
  • pgm://interface:address:port and epgm://interface:address:port use the OpenPGM library to support multicast over IP (pgm) and over UDP (epgm). Due to the nature of multicast, the pgm and epgm transports can only be used with PUB/SUB socket types (more on this below).
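
As a rough illustration of "the same code for every transport," here is a small sketch of my own, assuming pyzmq on a reasonably recent libzmq. The function name round_trip and the endpoint names are hypothetical. It runs one request/reply exchange over whatever address it is given, using the zmq.LAST_ENDPOINT socket option to find out which port a wildcard TCP bind actually picked.

```python
import zmq


def round_trip(bind_addr, payload=b'ping'):
    """Run one request/reply exchange over the transport named by bind_addr."""
    context = zmq.Context.instance()
    rep = context.socket(zmq.REP)
    req = context.socket(zmq.REQ)
    try:
        rep.bind(bind_addr)
        # Ask the socket where it really bound; this resolves the '*'
        # wildcard port used in the TCP address below.
        endpoint = rep.getsockopt(zmq.LAST_ENDPOINT).decode('ascii')
        req.connect(endpoint)
        req.send(payload)
        request = rep.recv()
        rep.send(b'echo: ' + request)
        return req.recv()
    finally:
        req.close(linger=0)
        rep.close(linger=0)


if __name__ == '__main__':
    # Only the address changes between the two runs.
    for addr in ('inproc://transport-demo', 'tcp://127.0.0.1:*'):
        print(addr, round_trip(addr))
```

Both sockets live in the same Context here because inproc endpoints are only visible within a single context.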

ZeroMQ disconnected operation

One feature that sometimes catches programmers new to ZeroMQ off guard is that it supports disconnected operation. In the code above, for instance, we could have started the server first and the client later. With plain TCP sockets, this wouldn't work because the server tries to connect() to a client that isn't listening yet. In ZeroMQ, the connect() will go through "optimistically," assuming that someone's going to bind to that port later.

What's more, you can have a client start up, bind to port 8080, perform a transaction with the server, and then shut down. Another client can then start up, bind to port 8080, and perform another transaction. The server just keeps handling requests, happily "connected" to whatever happens to bind to port 8080.
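
Here is a single-process sketch of that behavior, assuming pyzmq; it is my own illustration rather than code from the examples above. The helper names (free_tcp_port, serve_one_request, connect_before_bind) are hypothetical, and the port is whatever the OS reports as free at that moment. The "server" thread connects first, sits there with nothing on the other end, and only later does the "client" bind and send.

```python
import socket as pysocket
import threading
import time

import zmq


def free_tcp_port():
    """Ask the OS for a TCP port that is unused right now."""
    probe = pysocket.socket()
    probe.bind(('127.0.0.1', 0))
    port = probe.getsockname()[1]
    probe.close()
    return port


def serve_one_request(endpoint, connected):
    """The 'server': connect first, then handle a single request."""
    sock = zmq.Context.instance().socket(zmq.REP)
    sock.connect(endpoint)      # succeeds even though nobody has bound yet
    connected.set()
    message = sock.recv()       # waits until some client binds and sends
    sock.send(b'handled: ' + message)
    sock.close()


def connect_before_bind():
    endpoint = 'tcp://127.0.0.1:%d' % free_tcp_port()
    connected = threading.Event()
    server = threading.Thread(target=serve_one_request,
                              args=(endpoint, connected))
    server.start()
    connected.wait()
    time.sleep(0.2)             # the server is "connected" to nothing

    client = zmq.Context.instance().socket(zmq.REQ)
    client.bind(endpoint)       # only now does the endpoint exist
    client.send(b'hello')
    reply = client.recv()
    client.close()
    server.join()
    return reply


if __name__ == '__main__':
    print(connect_before_bind())
```

The short sleep is only there to make the ordering obvious; ZeroMQ keeps retrying the connection in the background until something binds.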

ZeroMQ message encapsulation

One final aspect of ZeroMQ is that it encapsulates communication into messages that may be composed of multiple parts. Rather than asking ZeroMQ to receive a certain number of bytes from the socket, you ask ZeroMQ to receive a single message. You can also send and receive multipart messages using the zmq.SNDMORE and zmq.RCVMORE options. To send a multipart message, just use zmq.SNDMORE as a second argument to each part's send() except the last:

    sock.send(part1, zmq.SNDMORE)
    sock.send(part2, zmq.SNDMORE)
    sock.send(part3, zmq.SNDMORE)
    sock.send(final)

 
The client can then ask if there's more to receive:

    more = True
    parts = []
    while more:
        parts.append(sock.recv())
        more = sock.getsockopt(zmq.RCVMORE)
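
For a runnable version of the two snippets above, here is a small sketch of mine that assumes pyzmq and uses a made-up inproc endpoint name. It sends one three-part message by hand with zmq.SNDMORE, reads it back with the zmq.RCVMORE loop, and then repeats the exchange with pyzmq's send_multipart() and recv_multipart() convenience methods, which wrap the same mechanism.

```python
import zmq


def multipart_demo():
    context = zmq.Context.instance()
    sender = context.socket(zmq.PAIR)
    receiver = context.socket(zmq.PAIR)
    sender.bind('inproc://multipart-demo')
    receiver.connect('inproc://multipart-demo')
    try:
        # Manual framing: every part except the last carries SNDMORE.
        sender.send(b'header', zmq.SNDMORE)
        sender.send(b'body', zmq.SNDMORE)
        sender.send(b'footer')

        # Manual receive: keep reading while RCVMORE reports more parts.
        parts = [receiver.recv()]
        while receiver.getsockopt(zmq.RCVMORE):
            parts.append(receiver.recv())

        # The pyzmq helpers do the same thing in one call each.
        sender.send_multipart([b'header', b'body', b'footer'])
        same_parts = receiver.recv_multipart()
        return parts, same_parts
    finally:
        sender.close(linger=0)
        receiver.close(linger=0)


if __name__ == '__main__':
    print(multipart_demo())
```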


ZeroMQ communication patterns

A core concept of ZeroMQ that I've alluded to above but not made explicit is the communication patterns supported by ZeroMQ. Because of some of the whiz-bang features such as asynchronous communication and disconnected operation, it's necessary to apply higher-level patterns than just shoving bytes from one endpoint to another. ZeroMQ implements this by making you specify a socket_type when you call zmq.Context.socket(). Each socket type has a set of "compatible" socket types with which it can communicate, and you should not expect messages to flow between incompatible sockets. Here, I'll describe some of the basic patterns:

ZeroMQ request/reply pattern

This pattern is fairly classic; one end (with socket_type=zmq.REQ) sends a request and receives a response. The other end (with socket_type=zmq.REP) receives a request and sends a response. A simple echo server might use this pattern. The server would be the following:

    import sys
    import zmq

    context = zmq.Context()
    sock = context.socket(zmq.REP)
    sock.bind(sys.argv[1])

    while True:
        message = sock.recv()
        # recv() returns bytes, so the prefix is a bytes literal too
        sock.send(b'Echoing: ' + message)


Your client then looks like this:

    import sys
    import zmq
    context = zmq.Context()

    sock = context.socket(zmq.REQ)
    sock.connect(sys.argv[1])
    # Messages are sent as bytes, so encode the text first
    sock.send(' '.join(sys.argv[2:]).encode())
    print(sock.recv().decode())


Note that in this pattern the zmq.REQ socket must communicate with a series of send(), recv() pairs, and the zmq.REP socket must communicate with a series of recv(), send() pairs. If you try to send or recv two messages in a row, ZeroMQ will raise an exception. This can cause problems if you have a server that crashes, for instance, because you'd leave your client in a "dangling send" state. To recover, you need some other mechanism for timing out requests, closing the socket, and retrying with a new, fresh zmq.REQ socket.
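
One possible shape for that recovery mechanism is sketched below. This is my own illustration under stated assumptions, not a canonical recipe: it assumes pyzmq, the names request_with_retry and demo are hypothetical, and the timeout and retry counts are arbitrary. Each attempt uses a fresh zmq.REQ socket, polls for a reply with a timeout, and throws the socket away if nothing comes back.

```python
import threading

import zmq


def request_with_retry(endpoint, payload, timeout_ms=500, retries=3):
    """Return the reply, or None if no attempt gets one in time."""
    context = zmq.Context.instance()
    for _ in range(retries):
        sock = context.socket(zmq.REQ)
        # Drop any unsent request when the socket is closed.
        sock.setsockopt(zmq.LINGER, 0)
        sock.connect(endpoint)
        try:
            sock.send(payload)
            # poll() returns a non-zero event mask once a reply is readable.
            if sock.poll(timeout_ms, zmq.POLLIN):
                return sock.recv()
        finally:
            # A REQ socket that never got its reply cannot send again,
            # so discard it and start over with a fresh one.
            sock.close()
    return None


def demo():
    context = zmq.Context.instance()
    server = context.socket(zmq.REP)
    port = server.bind_to_random_port('tcp://127.0.0.1')
    endpoint = 'tcp://127.0.0.1:%d' % port

    def echo_once():
        server.send(b'reply to ' + server.recv())

    worker = threading.Thread(target=echo_once)
    worker.start()
    answered = request_with_retry(endpoint, b'first')
    worker.join()
    # Nobody services the server socket any more, so this one times out.
    unanswered = request_with_retry(endpoint, b'second',
                                    timeout_ms=100, retries=2)
    server.close(linger=0)
    return answered, unanswered


if __name__ == '__main__':
    print(demo())
```

Note that a retried request may reach the server more than once, so this approach fits best when requests are safe to repeat.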

ZeroMQ publish/subscribe pattern

In the publish/subscribe pattern, you have a single socket of type zmq.PUB and zero or more connected zmq.SUB sockets. The zmq.PUB socket broadcasts messages using send() that the zmq.SUB sockets recv(). Each subscriber must explicitly say what messages it's interested in using the setsockopt method. A subscription is a string specifying a prefix of messages the subscriber is interested in. Thus to subscribe to all messages, the subscriber would pass an empty prefix: sub_sock.setsockopt(zmq.SUBSCRIBE, b''). Subscribers can also explicitly unsubscribe from a topic using setsockopt(zmq.UNSUBSCRIBE, ...).
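
To show the prefix matching at work, here is a small sketch of my own, assuming pyzmq; the topic names, the inproc endpoint, and the function name prefix_filter_demo are all made up. The subscriber asks only for messages starting with "weather." while the publisher sends two kinds of messages. Because a subscription takes a moment to reach the publisher, the publisher simply repeats itself until something arrives.

```python
import zmq


def prefix_filter_demo():
    context = zmq.Context.instance()
    pub = context.socket(zmq.PUB)
    sub = context.socket(zmq.SUB)
    pub.bind('inproc://prefix-demo')
    sub.connect('inproc://prefix-demo')
    # Only messages whose bytes start with this prefix are delivered.
    sub.setsockopt(zmq.SUBSCRIBE, b'weather.')
    received = []
    try:
        # Messages published before the subscription is known to the
        # publisher are dropped, so keep publishing until one arrives.
        for _ in range(100):
            pub.send(b'sports.score 3-1')
            pub.send(b'weather.temp 21C')
            if sub.poll(10):        # timeout in milliseconds
                break
        # Drain everything that made it through the filter.
        while sub.poll(10):
            received.append(sub.recv())
        return received
    finally:
        pub.close(linger=0)
        sub.close(linger=0)


if __name__ == '__main__':
    print(prefix_filter_demo())
```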

One interesting aspect of the zmq.SUB sockets is that they can connect to multiple endpoints, so that they receive messages from all the publishers. For example, suppose you have a server periodically sending messages:

    import sys
    import time
    import zmq

    context = zmq.Context()
    sock = context.socket(zmq.PUB)
    sock.bind(sys.argv[1])

    while True:
        time.sleep(1)
        # Messages are sent as bytes, so encode the text first
        sock.send((sys.argv[1] + ':' + time.ctime()).encode())


You could have a client connect to multiple servers with the following code:

    import sys
    import zmq

    context = zmq.Context()
    sock = context.socket(zmq.SUB)
    # An empty prefix subscribes to every message
    sock.setsockopt(zmq.SUBSCRIBE, b'')

    for arg in sys.argv[1:]:
        sock.connect(arg)

    while True:
        message = sock.recv()
        print(message.decode())


To see the multi-subscribe in action, you can start these programs as follows:

    $ python publisher.py 'tcp://*:8080' & python publisher.py 'tcp://*:8081' &
    $ python subscriber.py tcp://localhost:8080 tcp://localhost:8081
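
If you'd rather try the multi-publisher setup without juggling terminal windows, the following sketch does roughly the same thing inside one process. It is my own illustration, assuming pyzmq; the inproc endpoint names and the function name multi_publisher_demo are hypothetical. One zmq.SUB socket connects to two zmq.PUB sockets and records which publishers it has heard from.

```python
import zmq


def multi_publisher_demo():
    context = zmq.Context.instance()
    endpoints = ['inproc://pub-a', 'inproc://pub-b']
    publishers = []
    for endpoint in endpoints:
        pub = context.socket(zmq.PUB)
        pub.bind(endpoint)
        publishers.append(pub)

    sub = context.socket(zmq.SUB)
    sub.setsockopt(zmq.SUBSCRIBE, b'')   # empty prefix: take everything
    for endpoint in endpoints:
        sub.connect(endpoint)            # one socket, several publishers

    seen = set()
    try:
        for _ in range(100):
            # Each publisher announces its own endpoint name.
            for endpoint, pub in zip(endpoints, publishers):
                pub.send(endpoint.encode())
            while sub.poll(10):          # timeout in milliseconds
                seen.add(sub.recv().decode())
            if len(seen) == len(endpoints):
                break
        return seen
    finally:
        sub.close(linger=0)
        for pub in publishers:
            pub.close(linger=0)


if __name__ == '__main__':
    print(sorted(multi_publisher_demo()))
```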


ZeroMQ push/pull pattern

Similar to the pub/sub pattern, in the push/pull pattern you have one side (the zmq.PUSH socket) that's doing all the sending, and the other side (zmq.PULL) does all the receiving. The difference between push/pull and pub/sub is that in push/pull each message is routed to a single zmq.PULL socket, whereas in pub/sub each message is broadcast to all the zmq.SUB sockets. The push/pull pattern is useful for pipelined workloads where a worker process performs some operations and then sends results along for further processing. It's also useful for implementing traditional message queues.

We can see the routing of messages by connecting multiple clients to a single server. For this example, we can just change our socket type in the publisher code to be of type zmq.PUSH:

    import sys
    import time
    import zmq

    context = zmq.Context()
    sock = context.socket(zmq.PUSH)
    sock.bind(sys.argv[1])

    while True:
        time.sleep(1)
        # Messages are sent as bytes, so encode the text first
        sock.send((sys.argv[1] + ':' + time.ctime()).encode())


Our client is likewise similar to the subscriber code:

    import sys
    import zmq

    context = zmq.Context()
    sock = context.socket(zmq.PULL)

    for arg in sys.argv[1:]:
        sock.connect(arg)

    while True:
        message = sock.recv()
        print(message.decode())


(Note that we can do the same multi-connect trick we did with the pub/sub, as well.) Now to see the multi-push, multi-pull, we can start two "pushers" and two "pullers":

    $ # Start the pushers in one window
    $ python pusher.py 'tcp://*:8080' & python pusher.py 'tcp://*:8081' &
    $ # Start a puller in another window
    $ python puller.py tcp://localhost:8080 tcp://localhost:8081
    $ # Start another puller in a third window
    $ python puller.py tcp://localhost:8080 tcp://localhost:8081
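
The "each message goes to exactly one puller" behavior can also be checked inside a single process. The sketch below is my own, assuming pyzmq; the job names, the inproc endpoint, and the function name push_pull_demo are hypothetical. One zmq.PUSH socket feeds two zmq.PULL sockets, and the function reports what each puller got.

```python
import zmq


def push_pull_demo(n_messages=6):
    context = zmq.Context.instance()
    push = context.socket(zmq.PUSH)
    push.bind('inproc://push-demo')
    pullers = []
    for _ in range(2):
        pull = context.socket(zmq.PULL)
        pull.connect('inproc://push-demo')
        pullers.append(pull)
    try:
        for i in range(n_messages):
            push.send(('job-%d' % i).encode())

        # Collect whatever each puller was handed.
        received = []
        for pull in pullers:
            got = []
            while pull.poll(50):         # timeout in milliseconds
                got.append(pull.recv())
            received.append(got)
        return received
    finally:
        push.close(linger=0)
        for pull in pullers:
            pull.close(linger=0)


if __name__ == '__main__':
    for number, got in enumerate(push_pull_demo()):
        print('puller', number, got)
```

Every job should show up in exactly one of the two lists. ZeroMQ documents the PUSH distribution strategy as round-robin among connected peers, so with both pullers attached before the first send I would expect a fairly even split.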


Conclusion

ZeroMQ provides a handy abstraction for several network communication patterns that we can use quite easily from Python. If you're thinking of building a high-performance distributed system, it's certainly worth checking out ZeroMQ as a possible transport layer. Here, I've barely scratched the surface of what's possible with ZeroMQ in Python. In future posts, I'll go a bit deeper, covering topics including:

  • flow control with ZeroMQ
  • advanced communication patterns and devices
  • using ZeroMQ with gevent

I'd love to hear how you're using (or are thinking of using) ZeroMQ for building Python applications. In particular, are there any questions you have about ZeroMQ that I might be able to answer in successive posts? Are you using ZeroMQ already, and if so, have you run into any issues? Tell me about it in the comments below!

Published at DZone with permission of its author, Rick Copeland. (source)
