How to Generate Uniformly Distributed Random Integers

Source: Internet  Posted by: 连接不到windows  Editor: 程序博客网  Time: 2024/05/16 09:38

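A few days ago I saw a post on the 水木 forum asking how to generate a random number in the range 0 to 56 in hardware. At first glance the problem does not look hard, but on closer thought it is quite tricky to implement, especially if you want to use as little chip area as possible.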

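That question led me to a somewhat simpler one: how do you generate a random integer in [0, N-1] in a program? The C standard library provides the rand() function, which returns a random integer in [0, RAND_MAX], and in theory those integers are uniformly distributed. We will use it as the basis for constructing a uniformly distributed random integer in [0, N-1].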

To generate a random integer in [0, N-1], most books give this method:

rand() % N

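The numbers produced this way do lie in [0, N-1], but they are not necessarily uniformly distributed. The result is exactly uniform only when N divides RAND_MAX + 1, the number of distinct values rand() can return. On common implementations RAND_MAX + 1 is a power of two (the C standard does not require this), so in practice N has to be of the form 2^m, with m a non-negative integer and 2^m no larger than RAND_MAX + 1. The reason is that rand() in effect produces a random bit sequence in which every bit is 0 or 1 with equal probability, so a random number in [0, N-1] is uniform only when it can be represented by a subsequence of those bits.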

In other words, the method above can directly produce uniformly distributed random numbers in [0, 63], but not in [0, 56].
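The imbalance can be checked by brute force. The sketch below is only an illustration: it assumes an idealised generator that returns each value in [0, range - 1] exactly once, and the helper name residue_counts is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Count how often each residue x % n occurs when x takes every value
// in [0, range - 1] exactly once (an idealised, perfectly uniform rand()).
std::vector<int> residue_counts(int range, int n)
{
    assert(range > 0 && n > 0);
    std::vector<int> counts(n, 0);
    for (int x = 0; x < range; ++x)
        ++counts[x % n];
    return counts;
}
```

Assuming range = 32768 (that is, RAND_MAX = 0x7FFF), we have 32768 = 57 × 574 + 50, so with n = 57 the residues 0 to 49 each occur 575 times while 50 to 56 occur only 574 times. With n = 64 every residue occurs exactly 512 times.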

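But since we can generate uniformly distributed random numbers in [0, 63], we can simply discard every sample that falls in [57, 63]; what remains is uniformly distributed over [0, 56]. Following this idea, we can write the code below: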

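    // Needs <cstdlib> and <iostream>, plus "using namespace std;".
    int x;
    do
    {
        x = rand() % 64;   // uniform over [0, 63] when RAND_MAX + 1 is a multiple of 64
    } while (x > 56);      // discard 57..63 and draw again
    cout << x << endl;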

Along the same lines, we can write a function that returns a uniformly distributed random integer in [0, N-1].

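    #include <cstdlib>

    // Returns a uniformly distributed integer in [0, n-1].
    // Requires 0 < n <= RAND_MAX + 1.
    int randn(int n)
    {
        // rand() can return RAND_MAX + 1 distinct values. Unsigned arithmetic
        // keeps RAND_MAX + 1 from overflowing when RAND_MAX == INT_MAX.
        const unsigned int range = (unsigned int)RAND_MAX + 1u;
        const unsigned int limit = range - range % n;  // largest multiple of n <= range
        unsigned int x;
        do
        {
            x = (unsigned int)rand();
        } while (x >= limit);  // reject the incomplete last block of values
        return (int)(x % n);
    }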

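In my development environment (MinGW), RAND_MAX is 0x7FFF, that is, 32767. n = 16385 is essentially the worst case for the function above: only 16385 of the 32768 possible rand() values are accepted, so roughly half of the results are thrown away. Here is a small test program.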

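    #include <cstdlib>
    #include <iostream>

    int count = 0;  // total number of rand() calls

    int randn(int n)
    {
        const unsigned int range = (unsigned int)RAND_MAX + 1u;
        const unsigned int limit = range - range % n;  // largest multiple of n <= range
        unsigned int x;
        do
        {
            x = (unsigned int)rand();
            count++;
        } while (x >= limit);
        return (int)(x % n);
    }

    int main()
    {
        int i, x;
        srand(0);
        for (i = 0; i < 100000; ++i)
        {
            x = randn(16385);
        }
        std::cout << count << std::endl;
        return 0;
    }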

On my machine the result is 199995. In other words, producing 100,000 random numbers took almost 200,000 calls to rand().
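The measured count agrees with a back-of-the-envelope estimate. The sketch below assumes rand() is perfectly uniform over range values and that every value at or above the largest multiple of n is rejected. The function name expected_calls is hypothetical.

```cpp
#include <cassert>

// Expected number of rand() calls per accepted value, assuming rand() is
// uniform over `range` values and values >= the largest multiple of n
// (that is still <= range) are rejected.
double expected_calls(unsigned int range, unsigned int n)
{
    assert(n > 0 && n <= range);
    const unsigned int limit = range - range % n;  // accepted values: [0, limit)
    return static_cast<double>(range) / limit;     // mean of a geometric distribution
}
```

For range = 32768 and n = 16385 this gives 32768 / 16385 ≈ 1.99988 calls per number, or about 199,988 calls for 100,000 numbers, which is close to the 199995 observed.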

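I also found a Java implementation online; it is the nextInt(int n) method of the JDK's java.util.Random class. The code is as follows: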

    /**
     * Returns a pseudorandom, uniformly distributed {@code int} value
     * between 0 (inclusive) and the specified value (exclusive), drawn from
     * this random number generator's sequence.  The general contract of
     * {@code nextInt} is that one {@code int} value in the specified range
     * is pseudorandomly generated and returned.  All {@code n} possible
     * {@code int} values are produced with (approximately) equal
     * probability.  The method {@code nextInt(int n)} is implemented by
     * class {@code Random} as if by:
     *  <pre> {@code
     * public int nextInt(int n) {
     *   if (n <= 0)
     *     throw new IllegalArgumentException("n must be positive");
     *
     *   if ((n & -n) == n)  // i.e., n is a power of 2
     *     return (int)((n * (long)next(31)) >> 31);
     *
     *   int bits, val;
     *   do {
     *       bits = next(31);
     *       val = bits % n;
     *   } while (bits - val + (n-1) < 0);
     *   return val;
     * }}</pre>
     *
     * <p>The hedge "approximately" is used in the foregoing description only
     * because the next method is only approximately an unbiased source of
     * independently chosen bits.  If it were a perfect source of randomly
     * chosen bits, then the algorithm shown would choose {@code int}
     * values from the stated range with perfect uniformity.
     * <p>
     * The algorithm is slightly tricky.  It rejects values that would result
     * in an uneven distribution (due to the fact that 2^31 is not divisible
     * by n). The probability of a value being rejected depends on n.  The
     * worst case is n=2^30+1, for which the probability of a reject is 1/2,
     * and the expected number of iterations before the loop terminates is 2.
     * <p>
     * The algorithm treats the case where n is a power of two specially: it
     * returns the correct number of high-order bits from the underlying
     * pseudo-random number generator.  In the absence of special treatment,
     * the correct number of <i>low-order</i> bits would be returned.  Linear
     * congruential pseudo-random number generators such as the one
     * implemented by this class are known to have short periods in the
     * sequence of values of their low-order bits.  Thus, this special case
     * greatly increases the length of the sequence of values returned by
     * successive calls to this method if n is a small power of two.
     *
     * @param n the bound on the random number to be returned.  Must be
     *        positive.
     * @return the next pseudorandom, uniformly distributed {@code int}
     *         value between {@code 0} (inclusive) and {@code n} (exclusive)
     *         from this random number generator's sequence
     * @exception IllegalArgumentException if n is not positive
     * @since 1.2
     */
    public int nextInt(int n) {
        if (n <= 0)
            throw new IllegalArgumentException("n must be positive");

        if ((n & -n) == n)  // i.e., n is a power of 2
            return (int)((n * (long)next(31)) >> 31);

        int bits, val;
        do {
            bits = next(31);
            val = bits % n;
        } while (bits - val + (n-1) < 0);
        return val;
    }
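The power-of-two branch can be restated in C++ to see what "high-order bits" means in the Javadoc. This is only a sketch: the parameter bits stands in for the result of Java's next(31), and the function name scale_by_power_of_two is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Power-of-two branch of nextInt, restated: multiply a 31-bit value by n
// in 64-bit arithmetic and drop the low 31 bits of the product.
// `bits` stands in for the result of next(31), so 0 <= bits < 2^31.
std::int32_t scale_by_power_of_two(std::int32_t bits, std::int32_t n)
{
    assert(bits >= 0 && n > 0 && (n & (n - 1)) == 0);  // n must be 2^k
    return static_cast<std::int32_t>((static_cast<std::int64_t>(n) * bits) >> 31);
}
```

For n = 2^k the result equals bits >> (31 - k), that is, the top k bits of the 31-bit value. With n = 64, for instance, the largest input 0x7FFFFFFF maps to 63.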

In this code, next(31) generates a 31-bit random integer. As for the loop condition

while (bits - val + (n-1) < 0)

I have not worked out where it comes from, but I tested it and the results are fine.
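One possible reading of the condition: bits - val is the first value of the block of n consecutive integers that contains bits, so bits - val + (n-1) is the last value of that block. If the block sticks out past 2^31 - 1 it is incomplete, and in Java the 32-bit sum wraps around to a negative number, which is what the test detects. The C++ sketch below emulates that check in 64-bit arithmetic (signed overflow is undefined in C++) and places it next to an explicit-limit check. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Java-style test, emulated with 64-bit integers: in Java the 32-bit sum
// bits - val + (n-1) is negative exactly when the true sum exceeds INT32_MAX.
bool rejected_by_overflow_test(std::int32_t bits, std::int32_t n)
{
    assert(bits >= 0 && n > 0);
    const std::int64_t val = bits % n;
    const std::int64_t block_end = std::int64_t(bits) - val + (n - 1);
    return block_end > INT32_MAX;
}

// Explicit-limit test: reject anything at or beyond the largest
// multiple of n that does not exceed 2^31.
bool rejected_by_limit_test(std::int32_t bits, std::int32_t n)
{
    assert(bits >= 0 && n > 0);
    const std::int64_t range = std::int64_t(1) << 31;
    return bits >= range - range % n;
}
```

The two tests agree for every non-negative bits and positive n, which suggests the Java condition is the same rejection rule as in randn, written so that no separate limit has to be computed.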

Also, the way the Java code uses

(n & -n) == n

to test whether n is an integer power of 2 is very clever. Left to myself, I would never have come up with something this elegant.
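The trick relies on two's-complement negation: -n equals ~n + 1, so n & -n keeps only the lowest set bit of n, and that bit equals n exactly when n has a single bit set. A minimal C++ sketch follows, using unsigned arithmetic so that the negation is well defined. The helper names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Lowest set bit of n. Unsigned negation is well defined in C++ and equals
// ~n + 1, the same bit pattern that Java's two's-complement -n produces.
std::uint32_t lowest_set_bit(std::uint32_t n)
{
    return n & (0u - n);
}

// n is a power of two exactly when its lowest set bit is its only set bit.
// n must be positive, which nextInt checks before it performs this test.
bool is_power_of_two(std::uint32_t n)
{
    assert(n > 0);
    return lowest_set_bit(n) == n;
}
```

For example, 56 is 111000 in binary, so its lowest set bit is 8 and it is not a power of two, whereas 64 is 1000000 and passes the test.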

That said, the Java implementation runs about as efficiently as my simple version, and by comparison mine is somewhat easier to read.