Design and Implementation of a JavaScript Task Pool

Source: Internet. Editor: 程序博客网. Date: 2024/06/09 14:56

First, a brief word on why this task pool exists; a design discussed apart from real requirements is just hand-waving.

I need a Task Pool that meets the following requirements:

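1. The maximum number of concurrently running tasks can be specified; tasks beyond that limit wait in a queue.

2. Each task is flexible: the pool does not constrain what a task carries.

3. After a task finishes, further processing can be done based on its result.

4. JavaScript is full of asynchronous tasks and callbacks, so callback ordering and dependency problems come up often (for example, waiting for several asynchronous requests to finish before taking the next step); the pool has to handle these.

5. Business code that uses the pool stays simple, ideally a single line.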

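Taking the concrete case of putting ajax requests under task-pool management, let me introduce the layering before explaining the design and implementation further. This layering was my guiding principle throughout the design: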

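1. Generic task pool abstraction: MergeableTaskPool (business-agnostic; handles task scheduling, concurrency, and queuing).

2. Generic batch handling: BatchTaskPool (business-agnostic; handles waiting for multiple tasks to finish, where those tasks may run in parallel).

3. Business-specific task manager: PooledAjaxManager (business-specific; defines the concrete business operation. Because it sits close to the business code, the wrapper must be lightweight enough to spare you repetitive grunt work).

4. Business-layer interface: AjaxUtils (the interface closest to the user/developer, so ease of use matters most).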

Before looking at the task pool's implementation, here is how it is used:

AjaxUtils.batchPost(
  postTasks,
  function(task, result) { console.log('post complete: ' + task.id); },
  function(batchId) { console.log('batch complete: ' + batchId); }
);

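A utility class wraps the ajax calls here, and every ajax request made through it comes under the Task Pool's management. Requests beyond the concurrency limit wait in a queue and start only once earlier requests have finished. Within that limit, since this is a Task Pool, requests run concurrently.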

As this shows, the business interface is very simple to use (the business-layer API has to be friendly, otherwise the encapsulation is pointless):

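1. Pack the information for each ajax operation (url, data, etc.) into a Task object.

2. Task objects can be enqueued one at a time or pushed in as a batch.

3. Each task has a callback, and the result of the business processing is passed to that callback.

4. For batch processing, in addition to the per-task callbacks, a separate batch-complete callback fires once the whole batch has finished.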
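To make step 1 concrete, here is a minimal sketch of how plain records might be packed into Task objects. The field names id, url, and data follow how the article's code reads a task; the helper toPostTasks, the id scheme, and the '/api/items' URL are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: turn plain records into Task objects.
// Field names (id, url, data) match what the article's code reads from a
// task; the helper and its id scheme are assumptions, not the author's code.
function toPostTasks(url, records) {
  return records.map(function (record, index) {
    return {
      id: url + '#' + index, // should be unique: the pool keys tasks by id
      url: url,
      data: record
    };
  });
}

// '/api/items' is a placeholder URL.
const postTasks = toPostTasks('/api/items', [{ name: 'a' }, { name: 'b' }]);
```

An array built this way could then be handed to AjaxUtils.batchPost as the postTasks argument.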

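Beneath the wrapper exposed to business code sits the business-specific logic. The utility class is easy to implement (note that up to 3 ajax requests are allowed to run concurrently):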

var AJAX_POOLED_MANAGER = new PooledAjaxManager(3);

class AjaxUtils {
  // ... other static helpers such as post() omitted ...

  static batchPost(postTasks, callback, batchCallback) {
    AJAX_POOLED_MANAGER.addBatch(BaseUtils.uniqId(), postTasks, callback, batchCallback);
  }
}

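The business-layer implementation must also stay clean. If every business-level pool needed a large chunk of code, the Task Pool would have no design to speak of: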

class PooledAjaxManager extends BatchTaskPool {
  constructor(limit) {
    super(limit);
  }

  // perform the business call; invoke resolve once the request finishes
  process(taskId, task, resolve) {
    AjaxUtils.post(task.url, task.data, function(success, data) {
      resolve({success: success, data: data});
    });
  }
}

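A quick explanation of the code above: this is the business-specific manager, and it embodies the idea of decoupling from the generic Task Pool. It does only two things:

1. Each manager can specify its own concurrency limit.

2. process implements the actual business call; all the data that call needs is already in the task object passed into the pool earlier.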

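The important part here is the resolve parameter. JavaScript is full of asynchronous processing and callbacks, so making async simpler matters a great deal. We therefore provide a resolve function: calling it signals that the task is finished, much like the idea behind Promise. resolve also carries the task's result back to the business layer, where it arrives as the result parameter of the per-task completion callback in the top-level wrapper.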
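The similarity to Promise can be made explicit with a small adapter. This is only a sketch: runAsPromise and fakeProcess are hypothetical names, and fakeProcess stands in for a real ajax call. The only thing taken from the article is the (taskId, task, resolve) shape of process.

```javascript
// Adapt a resolve-style process function to a Promise.
// `process` has the shape used in the article: (taskId, task, resolve).
function runAsPromise(process, taskId, task) {
  return new Promise(function (resolve) {
    process(taskId, task, resolve);
  });
}

// Hypothetical process function standing in for a real ajax call.
function fakeProcess(taskId, task, resolve) {
  setTimeout(function () {
    resolve({ success: true, data: task.data });
  }, 0);
}

const pendingResult = runAsPromise(fakeProcess, 't1', { data: 'hello' });
pendingResult.then(function (result) {
  console.log(result.success, result.data); // logs: true hello
});
```

One difference worth noting: there is no reject counterpart here. In the article's PooledAjaxManager a failure is reported through the result object itself (its success field), so the task always ends through resolve.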

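The implementation of AjaxUtils.post is not shown because it is just a thin wrapper around jQuery's $.post; the only difference is that jQuery's success and error callbacks are merged into a single callback.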
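Since the method is not shown, the following is only a guess at its shape, not the author's code. It assumes the unified callback receives (success, data), as the call in PooledAjaxManager suggests. What gets passed as data on failure is not stated in the article, so passing the status text is an assumption. The jQuery object is injected (makePost and fakeJq are hypothetical names) so the sketch can run outside a browser.

```javascript
// Build a post(url, data, callback) function on top of a jQuery-like object.
// `jq` is injected so the sketch runs without a browser; in a page you
// would pass the global jQuery object instead.
function makePost(jq) {
  return function post(url, data, callback) {
    jq.post(url, data)
      .done(function (responseData) { callback(true, responseData); })
      // Assumption: on failure, hand the status text to the callback.
      .fail(function (jqXHR, textStatus) { callback(false, textStatus); });
  };
}

// Minimal stand-in for jQuery, only to demonstrate the call shape.
function fakeJq(shouldFail) {
  return {
    post: function (url, data) {
      const chain = {
        done: function (fn) { if (!shouldFail) fn({ echoed: data }); return chain; },
        fail: function (fn) { if (shouldFail) fn({}, 'error'); return chain; }
      };
      return chain;
    }
  };
}

const postCalls = [];
function recordCall(success, data) { postCalls.push([success, data]); }

// '/save' is a placeholder URL.
makePost(fakeJq(false))('/save', { n: 1 }, recordCall);
makePost(fakeJq(true))('/save', { n: 1 }, recordCall);
```

Either way the caller sees exactly one callback per request, which is what lets process call resolve from a single place.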

One layer further down is the implementation of the generic Task Pool itself. Time to show the code (shaped by the needs of the concrete business case):

class BatchTaskPool {
  constructor(limit) {
    this.callbacks = new MapArray(); // { taskId: [ callback ... ] }
    this.taskBatches = new MapArray(); // { taskId: [ batchId ... ] }
    this.batchCallbacks = {}; // { batchId: batchCallback }
    this.batchTasks = {}; // { batchId: { taskId: true } }
    this.taskPool = new MergeableTaskPool({
      limit: limit || 5,
      process: this.process.bind(this),
      taskCallback: this.onTaskComplete.bind(this),
      complete: this.onTaskPoolIdle.bind(this)
    });
  }

  // add single task
  add(task, callback) {
    // register the callback first, in case process resolves synchronously
    this.callbacks.add(task.id, callback);
    this.taskPool.push(task.id, task);
  }

  // add batch tasks
  addBatch(batchId, tasks, callback, batchCallback) {
    // map batchId to { taskId: true }
    let batchTaskMap = this.batchTasks[batchId] || {};
    let manager = this;
    this.batchTasks[batchId] = batchTaskMap;
    this.batchCallbacks[batchId] = batchCallback;
    // finish all bookkeeping before pushing anything, so that a task that
    // resolves synchronously cannot complete the batch too early
    BaseUtils.each(tasks, function(task){
      manager.callbacks.add(task.id, callback);
      manager.taskBatches.add(task.id, batchId);
      batchTaskMap[task.id] = true;
    });
    BaseUtils.each(tasks, function(task){
      manager.taskPool.push(task.id, task);
    });
  }

  // process single task
  process(taskId, task, resolve) {
    console.warn('method "process" should be overridden!');
    setTimeout(resolve, 10);
  }

  // single task complete
  onTaskComplete(taskId, task, result) {
    let self = this;
    let callbacks = this.callbacks.remove(taskId);
    // remove (not just get) so finished task ids do not accumulate
    let taskBatches = this.taskBatches.remove(taskId);
    BaseUtils.each(callbacks, function(callback){
      ScCallback(callback, self, task, result);
    });
    // remove taskId from every batch it belongs to
    BaseUtils.each(taskBatches, function(batchId){
      let batchIds = self.batchTasks[batchId];
      if( batchIds ) {
        delete batchIds[taskId];
        if( BaseUtils.isEmpty(batchIds)) {
          let batchCallback = self.batchCallbacks[batchId];
          delete self.batchCallbacks[batchId];
          delete self.batchTasks[batchId];
          // call through self: `this` is not the pool inside this callback
          self.onBatchComplete(batchId, batchCallback);
        }
      }
    });
  }

  // batch complete
  onBatchComplete(batchId, batchCallback) {
    ScCallback(batchCallback, this, batchId);
  }

  onTaskPoolIdle() {
  }
}

The main job of this class is to implement the batch-callback logic, as the data structures initialized in the constructor show.
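Stripped of the pool itself, the batch bookkeeping comes down to tracking which task ids are still outstanding per batch and firing the batch callback when that set becomes empty. The following is a simplified sketch of that idea using the built-in Map and Set. BatchTracker is a hypothetical name and not part of the article's code.

```javascript
// Track which task ids are still outstanding for each batch and fire the
// batch callback when the last one completes.
class BatchTracker {
  constructor() {
    this.pending = new Map();   // batchId -> Set of outstanding taskIds
    this.callbacks = new Map(); // batchId -> batch callback
  }

  register(batchId, taskIds, batchCallback) {
    this.pending.set(batchId, new Set(taskIds));
    this.callbacks.set(batchId, batchCallback);
  }

  // Returns true if this completion finished the batch.
  complete(batchId, taskId) {
    const outstanding = this.pending.get(batchId);
    if (!outstanding) return false; // unknown or already finished batch
    outstanding.delete(taskId);
    if (outstanding.size > 0) return false;
    const callback = this.callbacks.get(batchId);
    this.pending.delete(batchId);
    this.callbacks.delete(batchId);
    if (typeof callback === 'function') callback(batchId);
    return true;
  }
}

const finishedBatches = [];
const tracker = new BatchTracker();
tracker.register('b1', ['t1', 't2'], function (batchId) {
  finishedBatches.push(batchId);
});
tracker.complete('b1', 't1'); // t2 is still outstanding
tracker.complete('b1', 't2'); // last task: the batch callback fires
```

BatchTaskPool additionally keeps a taskId-to-batchId index (taskBatches), because the underlying pool reports completions by task id only.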

class MergeableTaskPool {
  constructor(options) {
    this.options = options || {};
    options = this.options;
    this.active = 0;
    this.limit = options.limit || 5;
    this.running = new MapArray();
    this.queue = new LinkedSet();
    this.cached = new MapArray();
  }

  // add a task to the pool; it is queued when the running pool is full
  push(key, task) {
    // key already in the running pool: merge it, no need to queue
    if( this.running.get(key)) {
      return this.running.add(key, task);
    }
    // execute the task at once while the running pool has spare quota
    if( this.active < this.limit) {
      return this.execute(key, task);
    }
    // otherwise, queue the task to be scheduled later
    return this.cache(key, task);
  }

  // execute a task at once, without waiting
  execute(key, task) {
    this.running.add(key, task);
    this.active ++;
    // run the process callback; only the first task recorded under a key is executed
    ScCallback(this.options.process, this.options, key, task, this.resolve.bind(this, key));
  }

  // add a task to the queue of tasks waiting to be scheduled
  cache(key, task) {
    this.queue.push(key);
    this.cached.add(key, task);
  }

  // schedule queued tasks to keep the running pool fully used
  schedule() {
    let count = this.limit - this.active;
    let scheduleCount = Math.min(count, this.queue.length());
    for( let i = 0; i < scheduleCount; i ++ ) {
      let key = this.queue.shift();
      let tasks = this.cached.remove(key);
      if( tasks && tasks[0]) {
        this.execute(key, tasks[0]);
      }
    }
  }

  // mark the running task under key as finished
  resolve(key, result) {
    if( this.running.get(key)) {
      let tasks = this.running.remove(key);
      this.active --;
      let task = (tasks && tasks.length && tasks[0]) || {};
      this.taskFinished(key, task, result);
      // call the complete callback when no more tasks need scheduling
      if( this.active === 0 && this.queue.isEmpty() ) {
        return this.taskPoolEmpty();
      }
    }
    setTimeout(this.schedule.bind(this), 10);
  }

  taskFinished(key, task, result) {
    ScCallback(this.options.taskCallback, this, key, task, result);
  }

  // call the complete callback when no more tasks need scheduling
  taskPoolEmpty() {
    ScCallback(this.options.complete, this);
  }
}

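This is the lowest-level task pool implementation. It includes one piece of special handling, mainly to avoid processing the same task more than once: a task pushed under a key that is already running is merged into the running entry instead of being executed again.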

This also shows how resolve is actually implemented; the 10 ms setTimeout before rescheduling is still open to debate.

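The choice of basic data structures also matters in the implementation. The ones used here are about as simple as they come, but with them in place the rest takes half the effort. Abstracting and accumulating such building blocks during everyday work is a good habit.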
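The article does not include MapArray or LinkedSet, so the sketch below reconstructs them purely from how the pool code calls them (add/get/remove and push/shift/length/isEmpty). The internals are assumptions; the author's versions may differ.

```javascript
// MapArray: maps a key to an array of values.
class MapArray {
  constructor() { this.map = new Map(); }

  add(key, value) {
    if (!this.map.has(key)) this.map.set(key, []);
    this.map.get(key).push(value);
  }

  // Returns undefined for unknown keys, so callers can use the result as a
  // presence test (the pool does this with running.get(key)).
  get(key) { return this.map.get(key); }

  // Removes the key and returns the values it held (undefined if absent).
  remove(key) {
    const values = this.map.get(key);
    this.map.delete(key);
    return values;
  }
}

// LinkedSet: FIFO queue of unique keys. A JavaScript Set iterates in
// insertion order, so it can act as both the set and the queue.
class LinkedSet {
  constructor() { this.keys = new Set(); }

  push(key) { this.keys.add(key); } // re-adding an existing key is a no-op

  shift() {
    const first = this.keys.values().next();
    if (first.done) return undefined;
    this.keys.delete(first.value);
    return first.value;
  }

  length() { return this.keys.size; }
  isEmpty() { return this.keys.size === 0; }
}

const sampleMap = new MapArray();
sampleMap.add('a', 1);
sampleMap.add('a', 2);

const sampleQueue = new LinkedSet();
sampleQueue.push('x');
sampleQueue.push('y');
sampleQueue.push('x'); // duplicate key, ignored
```

The uniqueness of LinkedSet is what keeps a key from being queued twice, while MapArray collects every task pushed under that key.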

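Although this design and implementation revolve around one concrete business case, it is worth thinking about its wider uses while building it. For example, consider this question: here the upper layer creates a dedicated ajax manager just for ajax requests, but what if the tasks in a business have little in common and still need batch processing? The answer is simple: pack the business function into the task object as well. The process method of the upper-layer manager then only needs to call task.func and make sure resolve gets called.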
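A minimal sketch of that last idea follows. To stay self-contained it uses TinyPool, a hypothetical stand-in for the article's pool classes that keeps only a concurrency limit and an overridable process hook. The (task, resolve) signature of task.func is also an assumption, since the article only says to call task.func.

```javascript
// Stand-in for the article's pool classes, reduced to what this example
// needs: a concurrency limit and an overridable process() hook.
class TinyPool {
  constructor(limit) {
    this.limit = limit;
    this.active = 0;
    this.queue = [];
  }

  add(task, callback) {
    this.queue.push({ task: task, callback: callback });
    this.schedule();
  }

  schedule() {
    while (this.active < this.limit && this.queue.length > 0) {
      const entry = this.queue.shift();
      this.active++;
      let settled = false;
      this.process(entry.task.id, entry.task, (result) => {
        if (settled) return; // ignore repeated resolve calls
        settled = true;
        this.active--;
        entry.callback(entry.task, result);
        this.schedule();
      });
    }
  }

  process(taskId, task, resolve) { resolve(); }
}

// The manager knows nothing about the work; each task brings its own func.
class FuncTaskManager extends TinyPool {
  process(taskId, task, resolve) {
    try {
      task.func(task, resolve);
    } catch (error) {
      // make sure resolve is called even when func throws
      resolve({ success: false, error: error });
    }
  }
}

const funcResults = [];
const funcManager = new FuncTaskManager(2);
function recordResult(task, result) {
  funcResults.push([task.id, result.success]);
}

funcManager.add({
  id: 'sum',
  func: function (task, resolve) { resolve({ success: true, data: 1 + 2 }); }
}, recordResult);
funcManager.add({
  id: 'boom',
  func: function () { throw new Error('failed'); }
}, recordResult);
```

The try/catch is the part that "makes sure resolve gets called": without it, a throwing task would hold its slot in the pool forever.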

0 0