Making ThreadPoolExecutor's submit() block automatically when the workQueue is full

Source: Internet; Editor: 程序博客网; Time: 2024/06/14 11:58

Java's ThreadPoolExecutor lets you run tasks concurrently. Its basic usage is:

(1) Create a ThreadPoolExecutor object

ThreadPoolExecutor executor = new ThreadPoolExecutor(
    corePoolSize,
    maximumPoolSize,
    keepAliveTime,
    TimeUnit.MILLISECONDS,
    new LinkedBlockingQueue<Runnable>(workQueueSize));

The constructor used here is:

/**
 * Creates a new <tt>ThreadPoolExecutor</tt> with the given initial
 * parameters and default thread factory and rejected execution handler.
 * It may be more convenient to use one of the {@link Executors} factory
 * methods instead of this general purpose constructor.
 *
 * @param corePoolSize the number of threads to keep in the
 * pool, even if they are idle.
 * @param maximumPoolSize the maximum number of threads to allow in the
 * pool.
 * @param keepAliveTime when the number of threads is greater than
 * the core, this is the maximum time that excess idle threads
 * will wait for new tasks before terminating.
 * @param unit the time unit for the keepAliveTime
 * argument.
 * @param workQueue the queue to use for holding tasks before they
 * are executed. This queue will hold only the <tt>Runnable</tt>
 * tasks submitted by the <tt>execute</tt> method.
 */
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), defaultHandler);
}

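A brief explanation:

The first parameter, corePoolSize, is the number of threads kept in the pool; they are kept even when they are idle.

The second parameter, maximumPoolSize, is the maximum number of threads allowed in the pool. With a bounded workQueue, threads beyond corePoolSize are created only after the queue has filled up.

The last parameter, workQueue, holds tasks before they are executed: tasks are continually taken out of the workQueue and run by the threads of the pool.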
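To make the interplay of corePoolSize, workQueue and maximumPoolSize concrete, here is a minimal sketch. The class name PoolGrowthDemo and the sizes (1 core thread, at most 2 threads, a queue of capacity 2) are assumptions chosen for illustration; they do not come from the original article.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all sizes below are illustrative assumptions.
class PoolGrowthDemo {

    /** Submits taskCount tasks that all block, then returns {poolSize, queuedTasks}. */
    static int[] observe(int taskCount) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 2, 1000, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(2));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    try {
                        release.await(); // keep the worker busy while we measure
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] {executor.getPoolSize(), executor.getQueue().size()};
        } finally {
            release.countDown(); // let the blocked tasks finish
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            int[] r = observe(n);
            System.out.println(n + " tasks -> threads=" + r[0] + ", queued=" + r[1]);
        }
    }
}
```

With these assumed sizes, the second and third tasks wait in the queue, and a second thread is started only for the fourth task, once the queue is already full.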

Article source: http://www.codelast.com/

(2) Use the ThreadPoolExecutor object to submit tasks wherever they are needed

executor.submit(new MyTask());

Here MyTask is a class you define yourself, for example:

public class MyTask implements Runnable {
  @Override
  public void run() {
    //TODO:
  }
}

The code that performs the actual work goes into the run() method.

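Because the workQueue has a limited size, submitting tasks too quickly (that is, the workQueue is completely full and yet another task is being submitted) means the new task can no longer be placed in the workQueue. With the default rejection handler (AbortPolicy), once the pool has also reached maximumPoolSize, submit() throws a RejectedExecutionException, which amounts to saying "I can't keep up, please hand out tasks more slowly".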
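The rejection described above can be reproduced with a small sketch. The class name RejectionDemo and the sizes (a single thread and a queue of capacity 2) are illustrative assumptions; the tasks block on a latch so that the pool is guaranteed to stay saturated.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; sizes are illustrative assumptions.
class RejectionDemo {

    /** Returns how many blocking tasks were accepted before submit() threw. */
    static int acceptedBeforeRejection() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(2));
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < 10; i++) {
                executor.submit(() -> {
                    try {
                        release.await(); // never finishes until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // Default AbortPolicy: the thread is busy and the queue is full.
            System.out.println("Rejected after " + accepted + " accepted tasks");
        } finally {
            release.countDown();
            executor.shutdown();
        }
        return accepted;
    }
}
```

Under these assumptions one task occupies the thread, two wait in the queue, and the fourth submit() is the one that throws.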

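In some scenarios this is fine. In a web crawler, for example, submitting a page-fetching task takes only an instant, while fetching the page is usually the bottleneck. If the system is already under very heavy load, it is better to give up on some URLs than to let an endless backlog of URLs pile up until the system collapses.

In other scenarios it is unacceptable: we cannot throw away tasks that absolutely must run just because the system is slow. These tasks are important; it is fine to take longer and let the system work through them slowly, but not a single task may be lost. In short, we can afford to wait, but we cannot afford to lose anything.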

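So the question is: when ThreadPoolExecutor's workQueue is completely full, how can we make submit() block until resources are available again, and only then go on submitting the task?

There are several ways to achieve a similar effect: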

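「1」Give ThreadPoolExecutor a custom RejectedExecutionHandler that puts the task into the workQueue with a blocking call

This is the method offered by many tutorials online.

So let's first look at what RejectedExecutionHandler actually is and how it relates to ThreadPoolExecutor.

ThreadPoolExecutor can be given a "rejection policy", i.e. the action taken when a task is refused by the pool, for example: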

executor.setRejectedExecutionHandler(new ThreadPoolExecutor.DiscardPolicy());

With this setting, when a task is refused by the pool, ThreadPoolExecutor applies the "discard" policy: the task is simply dropped.
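A minimal sketch of what "dropped" means in practice is given below. The class name DiscardPolicyDemo and the sizes (one thread, a queue of capacity 1) are assumptions for illustration only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; sizes are illustrative assumptions.
class DiscardPolicyDemo {

    /** Submits 3 tasks to a pool that can hold only 2; returns how many ran. */
    static int executedCount() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(1));
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.DiscardPolicy());
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger executed = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            // The third submit() returns normally, but its task is never run.
            executor.submit(() -> {
                try {
                    release.await();
                    executed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        release.countDown();
        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executed.get();
    }
}
```

Note that no exception is raised for the discarded task, and the Future returned by submit() for it never completes, so calling get() on it without a timeout would wait forever.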

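So how can a RejectedExecutionHandler be used to make submit() block?

The first thing to know is that submit() (through execute()) inserts the task by calling the workQueue's offer() method, and offer() is non-blocking: when the workQueue is full, offer() returns false immediately instead of waiting for a slot to free up. To make submit() block, the key is therefore to change how tasks are added to the workQueue. One way to do that is: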

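executor.setRejectedExecutionHandler(new RejectedExecutionHandler() {
  @Override
  public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
    if (!executor.isShutdown()) {
      try {
        // put() blocks until the queue has room for the task
        executor.getQueue().put(r);
      } catch (InterruptedException e) {
        // restore the interrupt flag and report the failure
        // instead of silently losing the task
        Thread.currentThread().interrupt();
        throw new RejectedExecutionException("Interrupted while waiting for queue space", e);
      }
    }
  }
});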

Here the overridden rejectedExecution() method calls getQueue() to obtain the workQueue and then calls its put() method to place the task in the workQueue. This put() method is blocking:

/**
 * Inserts the specified element into this queue, waiting if necessary
 * for space to become available.
 */
void put(E e) throws InterruptedException;

This achieves the desired effect: when the workQueue is full, submitting a task invokes our custom RejectedExecutionHandler, which then keeps trying to place the task in the workQueue through the blocking put().
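The following sketch exercises this kind of handler end to end. The class name BlockingHandlerDemo, the pool sizes and the 5 ms task duration are assumptions for illustration; the handler is written as a lambda and, as an extra assumption, reports an interrupt as a RejectedExecutionException rather than ignoring it.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; sizes and timings are illustrative assumptions.
class BlockingHandlerDemo {

    /** Submits taskCount short tasks through a tiny pool; returns how many completed. */
    static int runAll(int taskCount) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(1));
        executor.setRejectedExecutionHandler((task, pool) -> {
            if (!pool.isShutdown()) {
                try {
                    pool.getQueue().put(task); // blocks the submitting thread
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RejectedExecutionException("interrupted while waiting", e);
                }
            }
        });
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            executor.submit(() -> {
                try {
                    Thread.sleep(5); // simulate a little work
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown(); // queued tasks still run after shutdown()
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

Even though the pool can hold only two tasks at a time here, the submitting loop simply slows down to the worker's pace and no task is rejected.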

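Although this method is very simple, using it is a bad idea, for reasons including but not limited to the following:

[1] The ThreadPoolExecutor API advises against it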

/**
 * Returns the task queue used by this executor. Access to the
 * task queue is intended primarily for debugging and monitoring.
 * This queue may be in active use.  Retrieving the task queue
 * does not prevent queued tasks from executing.
 *
 * @return the task queue
 */
public BlockingQueue<Runnable> getQueue() {
    return workQueue;
}

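As you can see, the API states that getQueue() is intended primarily for debugging and monitoring.

[2] It may lead to deadlocks and similar problems... (I have not looked into this carefully)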

「2」Use CallerRunsPolicy

executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());

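With this rejection policy, if the workQueue is full, ThreadPoolExecutor runs the task being submitted on the caller's thread (i.e. the producer thread). The producer thread ends up doing work that "isn't its job" on behalf of the consumer threads.

With this approach the producer thread cannot submit any new task until the one it is running has finished. Apart from that there seems to be no real problem, other than it feeling slightly "odd", since the producer thread plays two roles at once.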
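A short sketch can confirm on which thread the overflowing task runs. The class name CallerRunsDemo and the sizes (one thread, a queue of capacity 1) are illustrative assumptions; two blocking tasks saturate the pool so that the third one must be handled by the rejection policy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; sizes are illustrative assumptions.
class CallerRunsDemo {

    /** Returns true if the task that did not fit ran on the submitting thread. */
    static boolean overflowRanOnCaller() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(1));
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        try {
            executor.submit(blocker); // occupies the only worker thread
            executor.submit(blocker); // fills the queue
            // No room left: CallerRunsPolicy runs this one right here, synchronously.
            executor.submit(() -> ranOn.set(Thread.currentThread()));
            return ranOn.get() == Thread.currentThread();
        } finally {
            release.countDown();
            executor.shutdown();
        }
    }
}
```

Because the third submit() does not return until its task has finished, the producer is throttled naturally, which is exactly the back-pressure effect this approach relies on.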

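「3」Use a BlockingQueue whose offer() method you have overridden

Since submit() adds tasks by calling the workQueue's offer() method, and offer() is non-blocking, we can implement our own BlockingQueue whose offer() blocks. Used together with ThreadPoolExecutor, it makes submit() block when the workQueue is full (from StackOverflow):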

public class LimitedQueue<E> extends LinkedBlockingQueue<E> {
  public LimitedQueue(int maxSize) {
    super(maxSize);
  }

  @Override
  public boolean offer(E e) {
    // turn offer() and add() into blocking calls (unless interrupted)
    try {
      put(e);
      return true;
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    return false;
  }
}

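Then, when constructing the ThreadPoolExecutor, simply pass a LimitedQueue object as the last argument, workQueue. One side effect to be aware of: because offer() now waits instead of returning false, the executor never sees a full queue, so it does not create threads beyond corePoolSize (unless the submitting thread is interrupted) and maximumPoolSize has practically no effect.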
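As a usage sketch, the queue can be wired into a pool as follows. LimitedQueue is repeated so that the example is self-contained; the class name LimitedQueueDemo, the pool sizes (1 core thread, 4 maximum, queue capacity 1) and the 5 ms task duration are assumptions for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Same idea as the queue shown above: offer() delegates to the blocking put().
class LimitedQueue<E> extends LinkedBlockingQueue<E> {
    LimitedQueue(int maxSize) {
        super(maxSize);
    }

    @Override
    public boolean offer(E e) {
        try {
            put(e);
            return true;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return false;
    }
}

// Hypothetical demo class; sizes and timings are illustrative assumptions.
class LimitedQueueDemo {

    /** Returns {completedTasks, largestPoolSize} after running taskCount tasks. */
    static int[] runAll(int taskCount) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 4, 1000, TimeUnit.MILLISECONDS, new LimitedQueue<Runnable>(1));
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // Blocks inside offer() while the queue is full; nothing is rejected.
            executor.submit(() -> {
                try {
                    Thread.sleep(5); // simulate a little work
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {completed.get(), executor.getLargestPoolSize()};
    }
}
```

In this sketch all tasks complete, and getLargestPoolSize() stays at 1 even though maximumPoolSize is 4, since a blocking offer() never signals a full queue to the executor. If more parallelism is wanted with this queue, it seems safer to set corePoolSize to the desired thread count.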

2 0