Multiprocess Programming in Python (3)

Source: Internet  Editor: 程序博客网  Time: 2024/05/06 06:43
  • Without further ado, here is the next part of my translation of the multiprocessing section of the Python 2.7.10 documentation.

16.6.2.2 Pipes and Queues

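When using multiple processes, one generally uses message passing for communication between processes and avoids having to use any synchronization primitives such as locks.
For passing messages one can use Pipe() (for a connection between two processes) or a queue (which allows multiple producers and consumers).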
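The Queue, multiprocessing.queues.SimpleQueue and JoinableQueue types are multi-producer, multi-consumer FIFO queues modelled on the Queue.Queue class in the standard library. They differ in that Queue lacks the task_done() and join() methods that were added to Queue.Queue in Python 2.5.
If you use JoinableQueue, you must call JoinableQueue.task_done() for each task removed from the queue; otherwise the semaphore used to count the number of unfinished tasks may eventually overflow, raising an exception.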
Note: a shared queue can also be created by using a manager object (translator's note: managers were introduced in an earlier part).
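A minimal sketch of the manager-based alternative. The helper names produce and run_with_manager_queue are made up for illustration; multiprocessing.Manager() and its Queue() method are the library calls. The pattern should apply to the Python 2.7 API described here as well as to Python 3.

```python
import multiprocessing


def produce(q, items):
    """Put every item on the queue, then a None sentinel to mark the end."""
    for item in items:
        q.put(item)
    q.put(None)


def run_with_manager_queue(items):
    """Run produce() in a child process, using a queue proxy from a manager."""
    manager = multiprocessing.Manager()   # starts a separate server process
    try:
        q = manager.Queue()               # proxy to a queue held by the manager
        child = multiprocessing.Process(target=produce, args=(q, items))
        child.start()
        child.join()                      # the data is held by the manager process
        received = []
        while True:
            item = q.get()
            if item is None:
                break
            received.append(item)
        return received
    finally:
        manager.shutdown()


if __name__ == "__main__":
    print(run_with_manager_queue(["a", "b", "c"]))
```

Every put() and get() on the proxy is a round trip to the manager process, so this is usually slower than a plain multiprocessing.Queue.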
Note: multiprocessing uses the usual Queue.Empty and Queue.Full exceptions to signal a timeout. They are not available in the multiprocessing namespace, so you need to import them from Queue.
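The import can be sketched as follows. The try/except around the import is an addition so that the snippet also runs on Python 3, where the module is called queue; on Python 2.7 the second branch is taken. The helper name get_or_default is hypothetical.

```python
import multiprocessing

try:
    import queue as std_queue      # Python 3 module name
except ImportError:
    import Queue as std_queue      # Python 2 module name, as used in this text


def get_or_default(q, default=None, timeout=0.1):
    """Return the next item, or default if none arrives within timeout seconds."""
    try:
        return q.get(True, timeout)
    except std_queue.Empty:        # the exception comes from the standard library
        return default


if __name__ == "__main__":
    q = multiprocessing.Queue()
    print(get_or_default(q, default="nothing yet"))
```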
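Note: when an object is put on a queue, the object is pickled, and a background thread later flushes the pickled data to an underlying pipe. This has some consequences that are a little surprising but should not cause any practical difficulties; if they really bother you, you can use a queue created with a manager instead.
1. After putting an object on an empty queue there may be an infinitesimal delay before the queue's empty() method returns False and get_nowait() can return without raising Queue.Empty. (Translator's note: the delay is extremely small; my test code could not detect it at a resolution of 0.01 seconds. The word used in the original is "infinitesimal".)
2. If multiple processes are enqueuing objects, the objects may be received at the other end out of order. However, objects enqueued by the same process will always be in the expected order with respect to each other.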
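One way to stay clear of the first point is to read with a blocking get() and a timeout instead of checking empty() or calling get_nowait() right after a put(). The sketch below does that in a single process; put_then_read_back is a hypothetical helper name.

```python
import multiprocessing


def put_then_read_back(items, timeout=5):
    """Put items on a fresh queue and read them back with a blocking get().

    A blocking get() waits for the background thread to flush the data, so the
    short delay that can affect empty() and get_nowait() does not matter here.
    """
    q = multiprocessing.Queue()
    for item in items:
        q.put(item)
    # Items enqueued by a single process come out in the order they were put.
    return [q.get(True, timeout) for _ in items]


if __name__ == "__main__":
    print(put_then_read_back([1, 2, 3]))
```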

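Warning: if a process is killed using Process.terminate() or os.kill() while it is trying to use a Queue, the data in the queue is likely to become corrupted. This may cause any other process to get an exception when it tries to use the queue later on.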
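Warning: as mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe.
This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic, the parent process may hang on exit when it tries to join all its non-daemonic children.
Note: a queue created using a manager does not have this issue.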
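A minimal sketch of the safe ordering: consume everything the child has put on the queue first, and join the child only afterwards. The names produce_squares and collect_then_join are invented for this example, and it assumes the parent knows how many items to expect.

```python
import multiprocessing


def produce_squares(q, count):
    """Child: put count squares on the queue."""
    for n in range(count):
        q.put(n * n)


def collect_then_join(count):
    """Start a producer, consume everything it puts, and only then join it."""
    q = multiprocessing.Queue()
    child = multiprocessing.Process(target=produce_squares, args=(q, count))
    child.start()
    results = [q.get() for _ in range(count)]   # drain the queue first ...
    child.join()                                # ... then join the child
    return results


if __name__ == "__main__":
    print(collect_then_join(5))
```

Swapping the last two steps (join before get) is the pattern the warning describes: with enough buffered data the child cannot exit and the parent waits forever.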

multiprocessing.Pipe([duplex])
Returns a pair (conn1, conn2) of Connection objects representing the ends of a pipe.
If duplex is True (the default) then the pipe is bidirectional. If duplex is False then the pipe is unidirectional: conn1 can only be used for receiving messages and conn2 can only be used for sending messages.
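The two modes can be sketched as follows. The function names echo_upper, round_trip and one_way_demo are hypothetical; only Pipe(), Process and the send()/recv()/close() methods of the connection objects come from the library.

```python
import multiprocessing


def echo_upper(conn):
    """Child side: receive one message, reply with it upper-cased, then close."""
    message = conn.recv()
    conn.send(message.upper())
    conn.close()


def round_trip(text):
    """Send text to a child process over a duplex pipe and return its reply."""
    parent_conn, child_conn = multiprocessing.Pipe()   # duplex is True by default
    child = multiprocessing.Process(target=echo_upper, args=(child_conn,))
    child.start()
    parent_conn.send(text)
    reply = parent_conn.recv()
    child.join()
    return reply


def one_way_demo(message):
    """Pass a small message through a one-way pipe inside a single process."""
    receiver, sender = multiprocessing.Pipe(False)     # conn1 receives, conn2 sends
    sender.send(message)
    return receiver.recv()


if __name__ == "__main__":
    print(round_trip("hello"))
    print(one_way_demo("ping"))
```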

class multiprocessing.Queue([maxsize])
Returns a process shared queue implemented using a pipe and a few locks/semaphores. When a process first puts an item on the queue a feeder thread is started which transfers objects from a buffer into the pipe.
The usual Queue.Empty and Queue.Full exceptions from the standard library’s Queue module are raised to signal timeouts.
Queue implements all the methods of Queue.Queue except for task_done() and join().

qsize()
Return the approximate size of the queue. Because of multithreading/multiprocessing semantics, this number is not reliable.
Note that this may raise NotImplementedError on Unix platforms like Mac OS X where sem_getvalue() is not implemented.
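A small sketch of guarding against the NotImplementedError mentioned above. The helper name approximate_size is made up for illustration, and returning None as the "unknown" marker is an arbitrary choice.

```python
import multiprocessing


def approximate_size(q):
    """Return q.qsize(), or None on platforms where it is not implemented."""
    try:
        return q.qsize()
    except NotImplementedError:
        return None


if __name__ == "__main__":
    q = multiprocessing.Queue()
    q.put("item")
    print(approximate_size(q))
```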

empty()
Return True if the queue is empty, False otherwise. Because of multithreading/multiprocessing semantics, this is not reliable.

full()
Return True if the queue is full, False otherwise. Because of multithreading/multiprocessing semantics, this is not reliable.

put(obj[, block[, timeout]])
Put obj into the queue. If the optional argument block is True (the default) and timeout is None (the default), block if necessary until a free slot is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Queue.Full exception if no free slot was available within that time. Otherwise (block is False), put an item on the queue if a free slot is immediately available, else raise the Queue.Full exception (timeout is ignored in that case).
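The blocking and non-blocking variants of put() can be sketched with a small wrapper. offer is a hypothetical helper; the import fallback is there only so the snippet runs on both Python 2.7 (module Queue) and Python 3 (module queue).

```python
import multiprocessing

try:
    import queue as std_queue      # Python 3 module name
except ImportError:
    import Queue as std_queue      # Python 2 module name, as used in this text


def offer(q, item, timeout=None):
    """Try to enqueue item; return True on success, False if the queue is full.

    With timeout=None the call does not block at all; otherwise it waits up to
    timeout seconds for a free slot.
    """
    try:
        if timeout is None:
            q.put(item, False)           # same as put_nowait(item)
        else:
            q.put(item, True, timeout)
        return True
    except std_queue.Full:
        return False


if __name__ == "__main__":
    q = multiprocessing.Queue(1)         # room for a single item
    print(offer(q, "first"))
    print(offer(q, "second", timeout=0.1))
```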

put_nowait(obj)
Equivalent to put(obj, False).

get([block[, timeout]])
Remove and return an item from the queue. If optional args block is True (the default) and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Queue.Empty exception if no item was available within that time. Otherwise (block is False), return an item if one is immediately available, else raise the Queue.Empty exception (timeout is ignored in that case).
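A common use of the timeout form is a loop that collects items until the queue stays quiet for a while. This is only a sketch: drain is a hypothetical helper, and a quiet period does not prove that the producers have finished.

```python
import multiprocessing

try:
    import queue as std_queue      # Python 3 module name
except ImportError:
    import Queue as std_queue      # Python 2 module name, as used in this text


def drain(q, idle_timeout=0.2):
    """Collect items until none arrives for idle_timeout seconds."""
    items = []
    while True:
        try:
            items.append(q.get(True, idle_timeout))
        except std_queue.Empty:
            return items


if __name__ == "__main__":
    q = multiprocessing.Queue()
    for n in range(3):
        q.put(n)
    print(drain(q))
```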

get_nowait()
Equivalent to get(False).

Queue has a few additional methods not found in Queue.Queue. These methods are usually unnecessary for most code:

close()
Indicate that no more data will be put on this queue by the current process. The background thread will quit once it has flushed all buffered data to the pipe. This is called automatically when the queue is garbage collected.

join_thread()
Join the background thread. This can only be used after close() has been called. It blocks until the background thread exits, ensuring that all data in the buffer has been flushed to the pipe.
By default if a process is not the creator of the queue then on exit it will attempt to join the queue’s background thread. The process can call cancel_join_thread() to make join_thread() do nothing.
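A sketch of calling close() and join_thread() explicitly in a producer, so that it knows its data has reached the pipe before it carries on. The names put_all_and_flush and receive_from_child are hypothetical. The example uses small items; with a large amount of data and no consumer, join_thread() could block.

```python
import multiprocessing


def put_all_and_flush(q, items):
    """Put items, then close the queue and wait for the background thread."""
    for item in items:
        q.put(item)
    q.close()         # this process will put no more data on q
    q.join_thread()   # blocks until the buffered data has been flushed to the pipe


def receive_from_child(items):
    """Run put_all_and_flush() in a child and read the items in the parent."""
    q = multiprocessing.Queue()
    child = multiprocessing.Process(target=put_all_and_flush, args=(q, items))
    child.start()
    received = [q.get() for _ in items]   # consume before joining the child
    child.join()
    return received


if __name__ == "__main__":
    print(receive_from_child(["x", "y", "z"]))
```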

cancel_join_thread()
Prevent join_thread() from blocking. In particular, this prevents the background thread from being joined automatically when the process exits – see join_thread().
A better name for this method might be allow_exit_without_flush(). It is likely to cause enqueued data to be lost, and you almost certainly will not need to use it. It is really only there if you need the current process to exit immediately without waiting to flush enqueued data to the underlying pipe, and you don’t care about lost data.

Note
This class’s functionality requires a functioning shared semaphore implementation on the host operating system. Without one, the functionality in this class will be disabled, and attempts to instantiate a Queue will result in an ImportError. See issue 3770 for additional information. The same holds true for any of the specialized queue types listed below.

class multiprocessing.queues.SimpleQueue
It is a simplified Queue type, very close to a locked Pipe.

empty()
Return True if the queue is empty, False otherwise.

get()
Remove and return an item from the queue.

put(item)
Put item into the queue.
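A short single-process sketch of the three methods. The helper make_simple_queue is hypothetical and exists only because the class is reached differently across versions: as far as I know, Python 3 offers multiprocessing.SimpleQueue(), while Python 2.7 only has multiprocessing.queues.SimpleQueue as documented here.

```python
import multiprocessing
import multiprocessing.queues


def make_simple_queue():
    """Create a SimpleQueue on Python 3 or, failing that, the Python 2.7 way."""
    factory = getattr(multiprocessing, "SimpleQueue", None)
    if factory is None:
        factory = multiprocessing.queues.SimpleQueue   # Python 2.7 location
    return factory()


if __name__ == "__main__":
    q = make_simple_queue()
    print(q.empty())
    q.put("item")
    print(q.empty())
    print(q.get())
```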

class multiprocessing.JoinableQueue([maxsize])
JoinableQueue, a Queue subclass, is a queue which additionally has task_done() and join() methods.

task_done()
Indicate that a formerly enqueued task is complete. Used by queue consumer threads. For each get() used to fetch a task, a subsequent call to task_done() tells the queue that the processing on the task is complete.
If a join() is currently blocking, it will resume when all items have been processed (meaning that a task_done() call was received for every item that had been put() into the queue).
Raises a ValueError if called more times than there were items placed in the queue.

join()
Block until all items in the queue have been gotten and processed.
The count of unfinished tasks goes up whenever an item is added to the queue. The count goes down whenever a consumer thread calls task_done() to indicate that the item was retrieved and all work on it is complete. When the count of unfinished tasks drops to zero, join() unblocks.
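The interplay of put(), task_done() and join() can be sketched with a few worker processes. The names square_worker and square_all, and the use of None as a stop sentinel, are choices made for this example and not part of the library.

```python
import multiprocessing


def square_worker(tasks, results):
    """Square numbers taken from tasks until a None sentinel arrives."""
    while True:
        n = tasks.get()
        if n is None:
            tasks.task_done()      # the sentinel was put(), so it counts too
            break
        results.put(n * n)
        tasks.task_done()          # exactly one task_done() per get()


def square_all(numbers, worker_count=2):
    """Square numbers in worker processes and return the sorted results."""
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=square_worker, args=(tasks, results))
               for _ in range(worker_count)]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)            # one sentinel per worker
    tasks.join()                   # unblocks once every item has been task_done()
    squares = sorted(results.get() for _ in numbers)
    for w in workers:
        w.join()                   # results were consumed first, so this is safe
    return squares


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

The results are sorted because, with several workers, the order in which they arrive is not guaranteed.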
