[Translation] High Performance JavaScript (014)

Source: Internet  Editor: 程序博客网  Date: 2024/05/16 14:33

Recursion Patterns

 

    When you run into a call stack size limit, your first step should be to identify any instances of recursion in the code. To that end, there are two recursive patterns to be aware of. The first is direct recursion, in which a function calls itself, as the factorial() function shown earlier does. The general pattern is as follows:

 

function recurse(){
  recurse();
}
recurse();

    This pattern is typically easy to identify when errors occur. A second, subtler pattern involves two functions:

 

function first(){
  second();
}
function second(){
  first();
}
first();

    In this recursion pattern, two functions each call the other, so that an infinite loop is formed. This is the more troubling pattern and a far more difficult one to identify in large code bases.

 

    Most call stack errors are related to one of these two recursion patterns. A frequent cause of stack overflow is an incorrect terminal condition, so the first step after identifying the pattern is to validate the terminal condition. If the terminal condition is correct, then the algorithm contains too much recursion to be run safely in the browser and should be changed to use iteration, memoization, or both.
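As an illustration of validating the terminal condition in the two-function pattern, here is a minimal sketch. The isEven(), isOdd(), and safeIsEven() helpers are hypothetical names invented for this example and do not appear in the original text. The recursion stops only when n reaches exactly 0, so a negative or fractional argument would step past the terminal condition and overflow the stack; the wrapper rejects such input first.

```javascript
// Hypothetical helpers for illustration only.
// Two functions that call each other (the second recursion pattern).
function isEven(n) {
  if (n === 0) {          // terminal condition: ends the mutual recursion
    return true;
  }
  return isOdd(n - 1);
}

function isOdd(n) {
  if (n === 0) {          // terminal condition for the other half
    return false;
  }
  return isEven(n - 1);
}

// Reject inputs that can never reach n === 0 (negative, fractional, NaN),
// because they would recurse until the call stack limit is hit.
function safeIsEven(n) {
  if (typeof n !== "number" || n < 0 || Math.floor(n) !== n) {
    throw new RangeError("expected a non-negative integer");
  }
  return isEven(n);
}
```

Even with a correct terminal condition, a very large n would still exceed the call stack limit here, which is the case where switching to iteration is the better fix.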

 

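Translator's note: memoization, yes, that really is the word (rendered in the Chinese translation as 制表, "tabulation"); it is not memorization!

        The term comes from "memo" (memorandum), not from the Tab key on the keyboard; "tabulation" usually names a related bottom-up technique that fills in a table ahead of time.

        It is explained in detail later. In short, a cache records the result of each recursive call, and if a value has been calculated before,

        the result is looked up in the cache instead of being recalculated.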

 

Iteration

 

    Any algorithm that can be implemented using recursion can also be implemented using iteration. Iterative algorithms typically consist of several different loops performing different aspects of the process, and thus introduce their own performance issues. However, using optimized loops in place of long-running recursive functions can improve performance, because a loop iteration has lower overhead than a function call.
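To make the trade-off concrete, here is a small sketch that places a recursive factorial next to an iterative one. The recursive body is assumed to match the factorial() shown earlier in the book (terminal condition at n == 0); factorialIterative() is a name introduced only for this example.

```javascript
// Recursive form: one stack frame per step, so depth grows with n.
// Assumed to match the factorial() from the earlier section.
function factorial(n) {
  if (n === 0) {
    return 1;
  }
  return n * factorial(n - 1);
}

// Iterative form (hypothetical name): a single frame regardless of n.
function factorialIterative(n) {
  var result = 1;
  for (var i = 2; i <= n; i++) {
    result *= i;
  }
  return result;
}
```

Both return the same values for small inputs. For a large argument the iterative version simply keeps looping (the numeric result overflows to Infinity), whereas the recursive version can exceed the call stack limit, depending on the engine.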

 

    As an example, the merge sort algorithm is most frequently implemented using recursion. A simple JavaScript implementation of merge sort is as follows:

 

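//merges two sorted arrays into one sorted array
//note: shift() removes the consumed items from the input arrays
function merge(left, right){
  var result = [];
  while (left.length > 0 && right.length > 0){
    if (left[0] < right[0]){
      result.push(left.shift());
    } else {
      result.push(right.shift());
    }
  }
  //at most one input still has items; append whatever remains
  return result.concat(left).concat(right);
}
function mergeSort(items){
  //terminal condition: empty and single-item arrays are already sorted
  //(checking only for length == 1 would recurse forever on an empty array)
  if (items.length <= 1) {
    return items;
  }
  var middle = Math.floor(items.length / 2),
      left = items.slice(0, middle),
      right = items.slice(middle);
  return merge(mergeSort(left), mergeSort(right));
}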

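    The code for this merge sort is fairly simple and straightforward, but the mergeSort() function itself ends up getting called very frequently: an array of n items results in 2 * n - 1 calls to mergeSort(). The original text concludes from this that an array with more than 1,500 items would cause a stack overflow error in Firefox. Strictly speaking, those calls are never all on the stack at once; the recursion depth of this implementation grows only as about log2(n) + 1, so the call count mainly measures function-call overhead, and the stack limit becomes the concern in algorithms whose recursion depth grows in step with the input size.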
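To see how the total number of calls relates to how deep the recursion is at any one moment, the recursive merge sort can be instrumented with two counters. This is only a sketch: measure(), calls, depth, and maxDepth are names introduced for this example, and the base case is written as length <= 1 so that an empty array also terminates.

```javascript
// Same merge step as in the listing above.
function merge(left, right) {
  var result = [];
  while (left.length > 0 && right.length > 0) {
    if (left[0] < right[0]) {
      result.push(left.shift());
    } else {
      result.push(right.shift());
    }
  }
  return result.concat(left).concat(right);
}

// Hypothetical helper: sorts n items (initially in descending order)
// and reports the total call count and the deepest nesting reached.
function measure(n) {
  var calls = 0, depth = 0, maxDepth = 0;

  function mergeSort(items) {
    calls++;
    depth++;
    if (depth > maxDepth) {
      maxDepth = depth;
    }
    var result;
    if (items.length <= 1) {
      result = items;
    } else {
      var middle = Math.floor(items.length / 2);
      result = merge(mergeSort(items.slice(0, middle)),
                     mergeSort(items.slice(middle)));
    }
    depth--;           // this call is about to leave the stack
    return result;
  }

  var items = [];
  for (var i = n; i > 0; i--) {
    items.push(i);
  }
  var sorted = mergeSort(items);
  return { calls: calls, maxDepth: maxDepth, sorted: sorted };
}

var report = measure(1500);
// report.calls is 2999 (2 * n - 1); report.maxDepth is 12
```

For 1,500 items the function is invoked 2,999 times, but no more than 12 invocations are ever active at once, because each level halves the array.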

 

    Running into a stack overflow error doesn't necessarily mean the entire algorithm has to change; it simply means that recursion isn't the best implementation. The merge sort algorithm can also be implemented using iteration, such as:

 

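//uses the same merge() function from previous example
function mergeSort(items){
  if (items.length <= 1) {
    return items;
  }
  //start with one single-item array per element
  var work = [];
  for (var i=0, len=items.length; i < len; i++){
    work.push([items[i]]);
  }
  work.push([]); //in case of odd number of items
  //each pass merges neighboring pairs, halving the number of arrays;
  //Math.floor keeps lim an integer so the loop ends exactly at 1
  //(a fractional lim would re-merge leftover arrays and duplicate items)
  for (var lim=len; lim > 1; lim = Math.floor((lim+1)/2)){
    for (var j=0,k=0; k < lim; j++, k+=2){
      work[j] = merge(work[k], work[k+1]);
    }
    work[j] = []; //in case of odd number of items
  }
  return work[0];
}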

    This implementation of mergeSort() does the same work as the previous one without using recursion. Although the iterative version of merge sort may be somewhat slower than the recursive option, it doesn't have the same call stack impact as the recursive version. Switching recursive algorithms to iterative ones is just one of the options for avoiding stack overflow errors.

 

Memoization

 

    Work avoidance is the best performance optimization technique. The less work your code has to do, the faster it executes. Along those lines, it also makes sense to avoid work repetition: performing the same task multiple times wastes execution time. Memoization avoids work repetition by caching previous calculations for later reuse, which makes it a useful technique for recursive algorithms.

 

    When recursive functions are called multiple times during code execution, there tends to be a lot of work duplication. The factorial() function, introduced earlier in "Recursion", is a good example of how work can be repeated multiple times by recursive functions. Consider the following code:

 

var fact6 = factorial(6);
var fact5 = factorial(5);
var fact4 = factorial(4);

    This code produces three factorials and results in the factorial() function being called a total of 18 times. The worst part of this code is that all of the necessary work is completed on the first line. Since the factorial of 6 is equal to 6 multiplied by the factorial of 5, the factorial of 5 is being calculated twice. Even worse, the factorial of 4 is being calculated three times. It makes far more sense to save those calculations and reuse them instead of starting over with each function call.
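The figure of 18 calls can be checked with a counter. The sketch below assumes that factorial() recurses down to a terminal condition of n == 0, as in the earlier section; the calls variable is added only for this illustration.

```javascript
var calls = 0;   // counter added for illustration

// Assumed shape of the earlier factorial(): terminal condition at n == 0.
function factorial(n) {
  calls++;
  if (n === 0) {
    return 1;
  }
  return n * factorial(n - 1);
}

var fact6 = factorial(6);   // 7 calls: n = 6, 5, 4, 3, 2, 1, 0
var fact5 = factorial(5);   // 6 more calls
var fact4 = factorial(4);   // 5 more calls
// calls is now 18
```

Every call after the first line repeats a multiplication chain that was already carried out while computing factorial(6).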

 

    You can rewrite the factorial() function to make use of memoization in the following way:

 

function memfactorial(n){
  if (!memfactorial.cache){
    memfactorial.cache = {
      "0": 1,
      "1": 1
    };
  }
  if (!memfactorial.cache.hasOwnProperty(n)){
    memfactorial.cache[n] = n * memfactorial(n - 1);
  }
  return memfactorial.cache[n];
}

    The key to this memoized version of the factorial function is the creation of a cache object. This object is stored on the function itself and is prepopulated with the two simplest factorials: those of 0 and 1. Before calculating a factorial, this cache is checked to see whether the calculation has already been performed. No cache value means the calculation must be done for the first time and the result stored in the cache for later use. This function is used in the same manner as the original factorial() function:

 

var fact6 = memfactorial(6);
var fact5 = memfactorial(5);
var fact4 = memfactorial(4);

    This code returns three different factorials but makes a total of only eight calls to memfactorial(). Since all of the necessary calculations are completed on the first line, the next two lines need not perform any recursion because cached values are returned.
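The count of eight can be verified in the same way. This sketch repeats the memfactorial() listing so that it runs on its own and adds a calls counter, which is not part of the original function.

```javascript
var calls = 0;   // counter added for illustration

function memfactorial(n) {
  calls++;
  if (!memfactorial.cache) {
    memfactorial.cache = {
      "0": 1,
      "1": 1
    };
  }
  if (!memfactorial.cache.hasOwnProperty(n)) {
    memfactorial.cache[n] = n * memfactorial(n - 1);
  }
  return memfactorial.cache[n];
}

var fact6 = memfactorial(6);   // 6 calls: n = 6, 5, 4, 3, 2, then 1 (cached)
var fact5 = memfactorial(5);   // 1 call, answered from the cache
var fact4 = memfactorial(4);   // 1 call, answered from the cache
// calls is now 8
```

The recursion on the first line stops at n = 1 because that value is prefilled, and the second and third lines each cost a single cache lookup.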

 

    The memoization process may be slightly different for each recursive function, but generally the same pattern applies. To make memoizing a function easier, you can define a memoize() function that encapsulates the basic functionality. For example:

 

function memoize(fundamental, cache){
  cache = cache || {};
  var shell = function(arg){
    if (!cache.hasOwnProperty(arg)){
      cache[arg] = fundamental(arg);
    }
    return cache[arg];
  };
  return shell;
}

    This memoize() function accepts two arguments: a function to memoize and an optional cache object. The cache object can be passed in if you'd like to prefill some values; otherwise a new cache object is created. A shell function is then created that wraps the original (fundamental) and ensures that a new result is calculated only if it has never previously been calculated. This shell function is returned so that you can call it directly, such as:

 

//memoize the factorial function
var memfactorial = memoize(factorial, { "0": 1, "1": 1 });
//call the new function
var fact6 = memfactorial(6);
var fact5 = memfactorial(5);
var fact4 = memfactorial(4);

    Generic memoization of this type is less optimal than manually updating the algorithm for a given function, because the memoize() function caches only the result of a call to the shell function with specific arguments. Recursive calls, therefore, are saved only when the shell function is called multiple times with the same arguments. (Translator's note: if the wrapped function itself recurses, its inner recursive calls bypass the shell and cannot reuse these intermediate results.) For this reason, it's better to manually implement memoization in those functions that have significant performance issues rather than apply a generic memoization solution.
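One way to let the inner recursive calls benefit from the generic helper is to have the function body recurse through the memoized binding instead of through the original function. The following is a minimal sketch of that idea, not something the original text prescribes; memoize() is repeated so the block runs on its own, and the calls counter is added only for illustration.

```javascript
function memoize(fundamental, cache) {
  cache = cache || {};
  var shell = function (arg) {
    if (!cache.hasOwnProperty(arg)) {
      cache[arg] = fundamental(arg);
    }
    return cache[arg];
  };
  return shell;
}

var calls = 0;   // counts executions of the wrapped body only

// The body calls factorial, which is the shell returned by memoize(),
// so every intermediate result passes through the cache.
var factorial = memoize(function (n) {
  calls++;
  return n * factorial(n - 1);
}, { "0": 1, "1": 1 });

var fact6 = factorial(6);   // body runs for n = 6, 5, 4, 3, 2
var fact5 = factorial(5);   // cached, body does not run
var fact4 = factorial(4);   // cached, body does not run
// calls is now 5
```

The design relies on the body looking up the name factorial at call time, by which point it refers to the shell. A function that recurses through its own unwrapped name would not gain this benefit.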

 

Summary

 

    Just as with other programming languages, the way that you factor your code and the algorithm you choose affect the execution time of JavaScript. Unlike other programming languages, JavaScript has a restricted set of resources from which to draw, so optimization techniques are even more important.

 

• The for, while, and do-while loops all have similar performance characteristics, and so no one loop type is significantly faster or slower than the others.

• Avoid the for-in loop unless you need to iterate over a number of unknown object properties.

• The best ways to improve loop performance are to decrease the amount of work done per iteration and decrease the number of loop iterations.

• Generally speaking, switch is faster than if-else, but it isn't always the best solution.

• Lookup tables are a faster alternative to multiple condition evaluation using if-else or switch.

• Browser call stack size limits the amount of recursion that JavaScript is allowed to perform; stack overflow errors prevent the rest of the code from executing.

• If you run into a stack overflow error, change the method to an iterative algorithm or make use of memoization to avoid work repetition.

 

    The larger the amount of code being executed, the larger the performance gain realized from using these strategies.