Bit counting: counting the 1 bits in an integer


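Today I came across a routine in the chess program Beowulf that counts the 1 bits in a 64-bit integer. It uses a table, inbits, holding the number of 1 bits in every 8-bit value, splits the 64-bit word into eight bytes, looks each byte up, and adds the results. The code is below. Each call costs eight table lookups and seven additions with no loop, so in theory it is fast enough, unless you build a larger table (for example, one indexed by 16-bit values).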

/* A list of the number of bits in numbers from 0-255.  This is used in the
 * bit counting algorithm.  Thanks to Dann Corbit for this one. */
static int inbits[256] = {
  0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
};

/* This algorithm thanks to Dann Corbit.  Apparently it's faster than
 * the standard one.  */
int Count(const BITBOARD B) {
  return inbits[(unsigned char) B] +
    inbits[(unsigned char) (B >> 8)] +
    inbits[(unsigned char) (B >> 16)] +
    inbits[(unsigned char) (B >> 24)] +
    inbits[(unsigned char) (B >> 32)] +
    inbits[(unsigned char) (B >> 40)] +
    inbits[(unsigned char) (B >> 48)] +
    inbits[(unsigned char) (B >> 56)];
}
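The table does not have to be typed in by hand. As a rough sketch (the names bits8, init_bits8 and count_by_table are made up for this illustration and are not from the chess program), it can be filled at startup with the recurrence bits(i) = bits(i / 2) + (i & 1), and the same byte-by-byte lookup then works on any 64-bit value:

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
static unsigned char bits8[256];

/* Fill the table using bits(i) = bits(i / 2) + (lowest bit of i). */
static void init_bits8(void)
{
    bits8[0] = 0;
    for (int i = 1; i < 256; i++)
        bits8[i] = (unsigned char)(bits8[i >> 1] + (i & 1));
}

/* Add up the table entries for the eight bytes of x.
 * init_bits8() must have been called first. */
static int count_by_table(uint64_t x)
{
    int total = 0;
    for (int shift = 0; shift < 64; shift += 8)
        total += bits8[(x >> shift) & 0xFF];
    return total;
}
```

This version uses a loop for brevity; the original spells out the eight lookups so that no loop is needed.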
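  While wondering whether there was a more concise way, I was reminded of an interview question I was once asked: check in one line of code whether a number is a power of 2. I could not solve it on the spot, and it took me another three days to work out the test: 0 == ( x & (x-1) ), for nonzero x. Too late to help by then, of course.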
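A minimal sketch of that one-line test, using the AND form x & (x - 1). The function name is_power_of_two is only an illustrative choice. Subtracting 1 flips the lowest 1 bit and everything below it, so the AND clears exactly that bit, and the result is zero only when x had a single 1 bit. Zero has to be excluded separately.

```c
#include <stdint.h>

/* Hypothetical helper name. Returns 1 if x is a power of two, else 0.
 * x & (x - 1) clears the lowest 1 bit of x; if nothing is left,
 * x had exactly one 1 bit. x == 0 is rejected explicitly. */
static int is_power_of_two(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```

For example, 8 is 1000 in binary and 7 is 0111, so 8 & 7 is 0. For 12 (1100) and 11 (1011), 12 & 11 is 1000, which is nonzero.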

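  It then struck me that this test could be used to count the 1 bits in an integer, and two blocks further down the street I had it: x & (x-1) clears the lowest 1 bit of x, so repeating it until x is zero counts the 1 bits. On average this is slower than the algorithm above, and in the worst case (x = 2^64 - 1) the loop runs 64 times before it has the result. The loop also pays for a jump on every pass. Still, it beats checking the bits one at a time, and it has the merit of being concise.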

typedef unsigned long long u64;

int bitcount( u64 x )
{
    int bitcnt;
    for( bitcnt = 0 ; x ; bitcnt++ )
    {
        x = x & (x-1);   /* clear the lowest 1 bit */
    }
    return bitcnt;
}

This is Figure 5-3, "Counting 1-bits in a sparsely populated word", in Hacker's Delight.
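To make the comparison with a bit-by-bit check concrete, here is a small sketch that counts loop passes rather than bits. Both function names are invented for this illustration. The bit-by-bit loop keeps shifting until the highest 1 bit has gone past, while the clear-lowest-bit loop runs once per 1 bit.

```c
#include <stdint.h>

/* Hypothetical name. Loop passes of a bit-by-bit check: one pass per
 * bit position up to and including the highest 1 bit. */
static int naive_steps(uint64_t x)
{
    int steps = 0;
    while (x) {
        x >>= 1;          /* move on to the next bit */
        steps++;
    }
    return steps;
}

/* Hypothetical name. Loop passes of the clear-lowest-bit method:
 * one pass per 1 bit, so this value is also the bit count. */
static int sparse_steps(uint64_t x)
{
    int steps = 0;
    while (x) {
        x &= x - 1;       /* clear the lowest 1 bit */
        steps++;
    }
    return steps;
}
```

For a word with only the top bit set, the first loop runs 64 times and the second runs once. When every bit is set, both run 64 times.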


One possible improvement is to cut down on the loop jumps, which is also why Beowulf adds up its eight lookups without a loop. You get the idea.

int bitcount2( u64 x )
{
    int bitcnt;
    for( bitcnt = 0 ; x ; )
    {
        /* four steps per loop test; a step on x == 0 adds nothing */
        bitcnt += !!x;   x = x & (x-1);
        bitcnt += !!x;   x = x & (x-1);
        bitcnt += !!x;   x = x & (x-1);
        bitcnt += !!x;   x = x & (x-1);
    }
    return bitcnt;
}


Addendum (2012-6-18)

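I have recently been reading Hacker's Delight. The algorithm that opens section 5-1 is a very good way to count the 1 bits when the word length is known. For a 32-bit integer it goes as follows.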

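x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF);

This algorithm needs no lookup table and reaches the result in log2(32) = 5 steps, using 20 arithmetic and logical operations in total, not counting assignments. Some of the steps run no risk of a carry spilling into the neighboring field and corrupting the result, so their masks can be dropped or merged, and the optimized algorithm needs only 15 operations.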

int populate32(unsigned x) {
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0F0F0F0F;
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x0000003F;
}

A 64-bit version could be:
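The first line of the shortened routine replaces the mask-and-add step with a subtraction. A small sketch of why the two agree (all four function names are invented for this illustration): a 2-bit field with bits a and b has the value 2a + b, and subtracting a leaves a + b, which is the number of 1 bits in the field. Since a never exceeds the field's value, no borrow crosses into the next field.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Number of 1 bits in a 2-bit field v (0..3), the mask-and-add way. */
static unsigned pair_count_add(unsigned v)
{
    return (v & 1u) + ((v >> 1) & 1u);
}

/* The same count by subtraction: v = 2a + b, so v - a = a + b. */
static unsigned pair_count_sub(unsigned v)
{
    return v - ((v >> 1) & 1u);
}

/* The two forms applied to all sixteen 2-bit fields of a 32-bit word. */
static uint32_t step1_add(uint32_t x)
{
    return (x & 0x55555555u) + ((x >> 1) & 0x55555555u);
}

static uint32_t step1_sub(uint32_t x)
{
    return x - ((x >> 1) & 0x55555555u);
}
```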

int populate64(unsigned long long x) {   /* the argument must be a 64-bit type */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    x = x + (x >> 8);
    x = x + (x >> 16);
    x = x + (x >> 32);
    return (int)(x & 0x0000007F);
}
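One way to gain some confidence in a routine like this is to compare it against the clear-lowest-bit loop on many inputs. The sketch below is only an illustration: the function names are invented, the parallel count is restated with uint64_t so that the block stands on its own, and the xorshift-style scrambler is just a cheap source of varied test words, not a statement about random number quality.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Reference count: clear the lowest 1 bit until nothing is left. */
static int count_by_clearing(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Parallel count on a 64-bit word, same steps as the routine above. */
static int count_parallel(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    x = x + (x >> 8);
    x = x + (x >> 16);
    x = x + (x >> 32);
    return (int)(x & 0x7F);
}

/* Returns 1 if both methods agree on `trials` scrambled words
 * and on the two extreme inputs, otherwise 0. */
static int methods_agree(uint64_t seed, int trials)
{
    uint64_t x = seed ? seed : 1;   /* the scrambler must not start at 0 */
    for (int i = 0; i < trials; i++) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        if (count_by_clearing(x) != count_parallel(x))
            return 0;
    }
    return count_parallel(0) == 0 && count_parallel(UINT64_MAX) == 64;
}
```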

	
				
		