Finite State Machine

Source: Internet | Published by: 通达信布林线指标源码 | Editor: 程序博客网 | Time: 2024/05/21 00:18

Today I came across a LeetCode problem involving finite state machines:
Given an array of integers, every element appears three times except for one. Find that single one.
Note: Your algorithm should have a linear runtime complexity. Could you implement it without using extra memory?
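Previously I had only seen the version where every number in the array appears twice and exactly one appears once: XOR all the elements together in a loop, and the final result is the single one. With three occurrences instead of two I was stumped, mainly because I had never used a finite state machine, or rather had no notion of the concept at all, so this time I want to write it down properly.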
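For reference, the two-occurrence version mentioned above can be sketched as follows. This is a minimal sketch, not code from the original problem page; the function name single_number_twice and the const int * / size_t signature are my own choices.

```c
#include <stddef.h>

/* Assumes every value appears exactly twice except one that appears once.
   Since x ^ x == 0 and x ^ 0 == x, every pair cancels and only the
   unpaired value is left in the accumulator. */
int single_number_twice(const int *nums, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc ^= nums[i];
    }
    return acc;
}
```

The same trick fails for three occurrences because x ^ x ^ x == x, so triples do not cancel; that is what the state machine below works around.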
woshidaishu's explanation of this problem in the Discussion section is very good, so I quote it here:

The code seems tricky and hard to understand at first glance. However, if you consider the problem in Boolean algebra form, everything becomes clear.
What we need to do is to store the number of ‘1’s of every bit. Since each of the 32 bits follows the same rules, we just need to consider 1 bit. We know a number appears 3 times at most, so we need 2 bits to store that. Now we have 4 states, 00, 01, 10 and 11, but we only need 3 of them.
In this solution, 00, 01 and 10 are chosen. Let ‘ones’ represent the first bit and ‘twos’ represent the second bit. Then we need to set rules for ‘ones’ and ‘twos’ so that they act as we hope. The complete loop is 00->10->01->00 (0->1->2->3/0).
(1). For ‘ones’, we can get ‘ones = ones ^ A[i]; if (twos == 1) then ones = 0’, which can be transformed to ‘ones = (ones ^ A[i]) & ~twos’.
(2). Similarly, for ‘twos’, we can get ‘twos = twos ^ A[i]; if (ones* == 1) then twos = 0’ and ‘twos = (twos ^ A[i]) & ~ones’. Notice that ‘ones*’ is the value of ‘ones’ after calculation; that is why twos is calculated later.
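The text here gives no code for the three-occurrence case itself. Rules (1) and (2) can be put together roughly as follows; this is a sketch under the assumption that every value appears exactly three times except one that appears once, and the function name and signature are mine rather than the quoted author's.

```c
#include <stddef.h>

/* Per bit, (ones, twos) cycles 00 -> 10 -> 01 -> 00 as 1-bits arrive. */
int single_number_thrice(const int *nums, size_t n) {
    int ones = 0, twos = 0;
    for (size_t i = 0; i < n; i++) {
        ones = (ones ^ nums[i]) & ~twos;  /* rule (1): uses the previous twos */
        twos = (twos ^ nums[i]) & ~ones;  /* rule (2): uses the updated ones  */
    }
    /* Bits seen 3k+1 times are in state 10, so they are exactly the bits of ones. */
    return ones;
}
```

For example, with the input {2, 2, 3, 2} the bits of 2 pass through all three states and return to 00, while the bits of 3 stop in state 10, so the function returns 3.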
Here is another example. If a number appears 5 times at most, we can write a program using the same method. Now we need 3 bits and the loop is 000->100->010->110->001. The code looks like this:

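int singleNumber(int A[], int n) {
    int na = 0, nb = 0, nc = 0;            /* per-bit state, low bit first: (na, nb, nc) */
    for(int i = 0; i < n; i++){
        nb = nb ^ (A[i] & na);             /* middle bit: carry from the previous na */
        na = (na ^ A[i]) & ~nc;            /* low bit: held at 0 when the previous nc is 1 */
        nc = nc ^ (A[i] & ~na & ~nb);      /* high bit: uses the updated na and nb */
    }
    return na & ~nb & ~nc;                 /* bits left in state 100 (seen once) */
}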

or even like that:

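int singleNumber(int A[], int n) {
    /* Same three state bits as above (twos = na, threes = nb, ones = nc),
       but started in state 110, so one occurrence lands on 001 and only 'ones' is set. */
    int twos = 0xffffffff, threes = 0xffffffff, ones = 0;
    for(int i = 0; i < n; i++){
        threes = threes ^ (A[i] & twos);
        twos = (twos ^ A[i]) & ~ones;
        ones = ones ^ (A[i] & ~twos & ~threes);
    }
    return ones;
}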

The explanation above is quite good but not detailed enough, so here are a few additional points:

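The state transitions here are written with the low bit first and the high bit last. Note that this is the reverse of the usual digital-circuit convention, which puts the high bit first.
Next, how to understand the five-occurrence extension. XOR is essentially bitwise addition without carry, so XORing na, nb and nc with A[i] adds A[i] into each bit position. That addition is subject to constraints, namely the carry from the lower bit and the number of states in the machine.
The middle bit nb changes state only in response to the carry from the low bit na: it toggles only when na was 1 after the previous step, which gives nb = nb ^ (A[i] & na); The reason nb is computed before na is that the na it needs is the result of the previous step, not of the current one.
The transitions are 000->100->010->110->001->000. The high bit toggles only when the low and middle bits are both 0 after the update, which gives nc = nc ^ (A[i] & ~na & ~nb); Here na and nb are the values from the current step and have nothing to do with the previous one: whenever the low and middle bits both come out as 0 in the current step (for an input bit of 1), the high bit nc toggles. That is why nc is computed after na and nb.
Because na is the lowest bit, it toggles on every input. Only the 001->000 transition needs special handling, because it does not follow the rules of addition: there the new na depends on the previous value of the high bit nc, and na stays unchanged only when the previous nc was 1. This gives na = (na ^ A[i]) & ~nc;
Code like this has to be worked out case by case, but there is a pattern. Only the lowest and highest bits need their own reasoning; a middle bit is affected only by the carry from the previous low bit. It can also help to look at each bit separately. For the problem above: na: 0->1->0->1->0, nb: 0->0->1->1->0, nc: 0->0->0->0->1.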
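The transition sequence discussed in these notes can be checked mechanically by applying the three update lines to a single bit. The sketch below does that; the helper names step_mod5 and trace_mod5 and the string output format are hypothetical choices of mine, while the three assignments are copied from the quoted code.

```c
#include <stddef.h>

/* One update of the quoted five-occurrence machine for a single bit.
   s[0] = na (low), s[1] = nb (middle), s[2] = nc (high); x must be 0 or 1. */
static void step_mod5(unsigned s[3], unsigned x) {
    s[1] = s[1] ^ (x & s[0]);          /* nb: uses the previous na */
    s[0] = (s[0] ^ x) & ~s[2];         /* na: uses the previous nc */
    s[2] = s[2] ^ (x & ~s[0] & ~s[1]); /* nc: uses the updated na and nb */
}

/* Writes the states visited while feeding `steps` consecutive 1-bits,
   low bit first and joined by "->", into buf. Stops early if buf is too
   small. Returns the number of characters written, excluding the NUL. */
size_t trace_mod5(char *buf, size_t cap, int steps) {
    unsigned s[3] = {0, 0, 0};
    size_t used = 0;
    for (int i = 0; i <= steps; i++) {
        if (used + 6 > cap) break;     /* "->" + 3 digits + NUL */
        if (i > 0) {
            step_mod5(s, 1u);
            buf[used++] = '-';
            buf[used++] = '>';
        }
        for (int b = 0; b < 3; b++) {
            buf[used++] = (char)('0' + s[b]);
        }
    }
    if (cap > 0) buf[used] = '\0';
    return used;
}
```

Calling trace_mod5(buf, sizeof buf, 5) with a 64-byte buffer fills it with 000->100->010->110->001->000, and reading the columns of that string top to bottom gives the per-bit sequences for na, nb and nc listed above.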
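Supplement / summary:
Each XOR with an array element performs a carry-less bitwise addition, and that is what drives the state transition.
Every bit other than the lowest and highest depends on the previous value of the bit below it (nb = nb ^ (A[i] & na);), while the lowest bit depends on the previous value of the highest bit (na = (na ^ A[i]) & ~nc;). The highest bit does depend on the lower bits as well, but if it also used their previous values, every bit would depend on the previous values of the others and the loop would give no way to decide which to compute first. So the highest bit is instead computed from the current values of the other bits (nc = nc ^ (A[i] & ~na & ~nb);), which is why it is computed last.
In short: the bits other than the lowest and highest are computed in order from low to high. The lowest bit is computed second to last, because the bits before it need its previous value, and the highest bit is computed last, because it needs the values the other bits take in the current step.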
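To see that the ordering in this summary matters, the sketch below runs the quoted five-occurrence updates either in the original order or with the lowest bit deliberately moved ahead of the middle bit. The function name single_number_5 and the na_first flag are hypothetical, added only for this illustration; the update expressions themselves are unchanged.

```c
#include <stddef.h>

/* Assumes every value appears five times except one that appears once.
   na_first == 0: order from the quoted code (nb, then na, then nc).
   na_first != 0: na is updated before nb, so nb reads the NEW na. */
int single_number_5(const int *nums, size_t n, int na_first) {
    int na = 0, nb = 0, nc = 0;
    for (size_t i = 0; i < n; i++) {
        int x = nums[i];
        if (na_first) {
            na = (na ^ x) & ~nc;
            nb = nb ^ (x & na);        /* wrong: sees the updated na */
        } else {
            nb = nb ^ (x & na);        /* sees the previous na */
            na = (na ^ x) & ~nc;       /* sees the previous nc */
        }
        nc = nc ^ (x & ~na & ~nb);     /* sees the updated na and nb */
    }
    return na & ~nb & ~nc;
}
```

On the input {3, 3, 9, 3, 3, 3} the original order returns 9. With na_first set, each bit still cycles through five states, but a single occurrence lands on 110 instead of 100, so the final mask no longer picks out the single value and the call returns 0.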

0 0