HDU - 3695 Computer Virus on Planet Pandora: Aho-Corasick automaton + optimization


Computer Virus on Planet Pandora

 HDU - 3695


Aliens on planet Pandora also write computer programs like us. Their programs only consist of capital letters (‘A’ to ‘Z’) which they learned from the Earth. On planet Pandora, hackers make computer viruses, so they also have anti-virus software. Of course they learned the virus scanning algorithm from the Earth. Every virus has a pattern string which consists of only capital letters. If a virus’s pattern string is a substring of a program, or the pattern string is a substring of the reverse of that program, they can say the program is infected by that virus. Given a program and a list of virus pattern strings, please write a program to figure out how many viruses the program is infected by. 
Input
There are multiple test cases. The first line in the input is an integer T ( T<= 10) indicating the number of test cases. 

For each test case: 

The first line is an integer n (0 < n <= 250) indicating the number of virus pattern strings. 

Then n lines follow, each representing a virus pattern string. Every pattern string stands for a virus. It’s guaranteed that those n pattern strings are all different, so there are n different viruses. The length of a pattern string is no more than 1,000 and a pattern string consists of at least one letter. 

The last line of a test case is the program. The program may be described in a compressed format. A compressed program consists of capital letters and “compressors”. A “compressor” is in the following format: 

[qx] 

q is a number (0 < q <= 5,000,000) and x is a capital letter. It means q consecutive letter xs in the original uncompressed program. For example, [6K] means ‘KKKKKK’ in the original program. So, if a compressed program is like: 

AB[2D]E[7K]G

It actually is ABDDEKKKKKKKG after being decompressed to the original format. 

The length of the program is at least 1 and at most 5,100,000, no matter in the compressed format or after it is decompressed to original format. 
Output
For each test case, print an integer K in a line meaning that the program is infected by K viruses. 
Sample Input
3
2
AB
DCB
DACB
3
ABC
CDE
GHI
ABCCDEFIHG
4
ABB
ACDEE
BBB
FEEE
A[2B]CD[4E]F
Sample Output
0
3
2
Hint
In the second case in the sample input, the reverse of the program is ‘GHIFEDCCBA’, and ‘GHI’ is a substring of the reverse, so the program is infected by virus ‘GHI’.
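
The infection rule from the statement and the hint can be written down directly as a brute-force check. This is only a minimal sketch for small inputs, useful as a reference when testing a faster solution; the name isInfectedBy is made up for illustration and does not appear in the problem or in the solution.

```cpp
#include <string>

// True if `virus` occurs in `program` or in the reverse of `program`.
// Brute force via std::string::find; far too slow for the real limits,
// but handy for cross-checking answers on tiny cases.
bool isInfectedBy(const std::string& program, const std::string& virus) {
    std::string reversed(program.rbegin(), program.rend());
    return program.find(virus) != std::string::npos ||
           reversed.find(virus) != std::string::npos;
}
```

For the second sample case, isInfectedBy("ABCCDEFIHG", "GHI") holds only because of the reversed program.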


Source
HDU - 3695
My Solution
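Problem: given n pattern strings and a text (which may be given in compressed form, with runs written as [number]char), count how many patterns match in the text, either directly or reversed.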
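
Expanding the compressed text is a separate step from the matching itself. A minimal sketch of it, assuming well-formed input (every '[' is followed by digits, one letter and ']'); the helper name decompress is hypothetical, and it works on std::string rather than on a fixed char buffer:

```cpp
#include <cctype>
#include <string>

// Expand a compressed program such as "AB[2D]E[7K]G" into "ABDDEKKKKKKKG".
// Assumes the input is well formed; `decompress` is an illustrative name.
std::string decompress(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '[') {
            ++i;  // skip '['
            std::size_t q = 0;
            while (i < in.size() &&
                   std::isdigit(static_cast<unsigned char>(in[i]))) {
                q = q * 10 + static_cast<std::size_t>(in[i] - '0');
                ++i;
            }
            char x = in[i++];  // the repeated letter
            out.append(q, x);  // q copies of x
            ++i;               // skip ']'
        } else {
            out.push_back(in[i++]);  // plain letter, copied as is
        }
    }
    return out;
}
```

With the third sample case, decompress("A[2B]CD[4E]F") gives "ABBCDEEEEF".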
Aho-Corasick automaton + optimization
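Build an Aho-Corasick automaton from the n patterns, then scan the text twice, once forward and once backward. The important part is the optimization that avoids repeated visits: once a danger node (a node where a pattern ends) has been counted, mark it as -1, so that the next time the fail-chain walk reaches it we break immediately and move on to the next character of the text. Also, a common way to write the automaton stores the alphabet in a map<char, int> mp; here that has to be replaced by indexing directly with s[i] - 'A', because a map lookup is O(log n) while direct indexing is O(1). Even so, it felt like the solution only barely made the time limit ⊙﹏⊙‖∣.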
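
The scan-twice idea with mark-once counting can be sketched compactly as follows. This is not the author's code: it uses std::vector storage, all names (Automaton, scan, countInfections) are illustrative, and as a variation it marks every visited node as -1, not only danger nodes. That is safe because once a node has been visited, its whole fail chain has been visited too, and it bounds the total fail-chain work by the number of nodes.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

// Minimal Aho-Corasick sketch over 'A'..'Z'; names are illustrative.
struct Automaton {
    std::vector<std::array<int, 26>> next;  // goto table, completed by build()
    std::vector<int> fail;                  // failure links
    std::vector<int> danger;  // >0: patterns ending here, -1: already visited

    Automaton() { newNode(); }  // node 0 is the root

    int newNode() {
        std::array<int, 26> row;
        row.fill(0);
        next.push_back(row);
        fail.push_back(0);
        danger.push_back(0);
        return static_cast<int>(next.size()) - 1;
    }

    void insert(const std::string& p) {
        int u = 0;
        for (char ch : p) {
            int c = ch - 'A';
            if (!next[u][c]) {
                int v = newNode();  // allocate first: push_back may reallocate
                next[u][c] = v;
            }
            u = next[u][c];
        }
        ++danger[u];
    }

    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (next[0][c]) q.push(next[0][c]);
        while (!q.empty()) {
            int r = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int u = next[r][c];
                if (!u) { next[r][c] = next[fail[r]][c]; continue; }
                fail[u] = next[fail[r]][c];
                q.push(u);
            }
        }
    }

    // Scan one text; every pattern is counted at most once over all scans.
    int scan(const std::string& text) {
        int found = 0, now = 0;
        for (char ch : text) {
            now = next[now][ch - 'A'];
            // Stop at the root or at a node whose chain was already walked.
            for (int t = now; t && danger[t] != -1; t = fail[t]) {
                found += danger[t];
                danger[t] = -1;
            }
        }
        return found;
    }
};

// Count viruses found in the (already decompressed) program or its reverse.
int countInfections(const std::vector<std::string>& viruses,
                    const std::string& program) {
    Automaton ac;
    for (const auto& v : viruses) ac.insert(v);
    ac.build();
    int total = ac.scan(program);
    total += ac.scan(std::string(program.rbegin(), program.rend()));
    return total;
}
```

Because the marks persist between the forward and the backward scan, a pattern that occurs in both directions is still counted once.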
Complexity: slightly more than O(n), where n is the length of the decompressed program.
#include <iostream>
#include <cstdio>
#include <cctype>   // isalpha
#include <string>
#include <cstring>
#include <queue>
#include <map>
#include <sstream>
using namespace std;
typedef long long LL;
const int CHAR_SIZE = 26;
const int MAX_SIZE = 2.5e5 + 8;   // up to 250 patterns * 1000 letters, plus the root

struct AC_Machine{
    int ch[MAX_SIZE][CHAR_SIZE], danger[MAX_SIZE], fail[MAX_SIZE];
    int sz;
    inline void init(){
        sz = 1;
        memset(ch[0], 0, sizeof ch[0]);
        memset(danger, 0, sizeof danger);
    }
    // Insert one pattern into the trie.
    inline void _insert(char *s){
        int n = strlen(s);
        int u = 0, c;
        for(int i = 0; i < n; i++){
            c = s[i] - 'A';
            if(!ch[u][c]){
                memset(ch[sz], 0, sizeof ch[sz]);
                danger[sz] = 0;
                ch[u][c] = sz++;
            }
            u = ch[u][c];
        }
        danger[u]++;
    }
    // BFS over the trie: compute fail links and fill in missing transitions.
    inline void _build(){
        queue<int> Q;
        fail[0] = 0;
        for(int c = 0, u; c < CHAR_SIZE; c++){
            u = ch[0][c];
            if(u){Q.push(u); fail[u] = 0;}
        }
        int r;
        while(!Q.empty()){
            r = Q.front();
            Q.pop();
            //danger[r] |= danger[fail[r]];
            for(int c = 0, u; c < CHAR_SIZE; c++){
                u = ch[r][c];
                if(!u){ch[r][c] = ch[fail[r]][c]; continue; }
                fail[u] = ch[fail[r]][c];
                Q.push(u);
            }
        }
    }
}ac;

const int maxn = 5.1e6 + 8;
char s[maxn];

int main(){
    #ifdef LOCAL
    freopen("4.in", "r", stdin);
    //freopen("4.out", "w", stdout);
    #endif // LOCAL
    int T, n, len, i, v, ans, now, tmp;
    char ch;
    scanf("%d", &T);
    while(T--){
        scanf("%d", &n);
        ac.init();
        while(n--){
            scanf("%s", s);
            ac._insert(s);
        }
        ac._build();
        // Read the program character by character, expanding [qx] as we go.
        len = 0, v = 1; getchar();
        while(true){
            ch = getchar();
            if(ch == '['){
                scanf("%d", &v);          // repeat count for the next letter
            }
            else if(ch == ']') ;
            else if(isalpha(ch)){
                while(v--){
                    s[len] = ch;
                    len++;
                }
                v = 1;
            }
            else break;                   // newline or EOF ends the program
        }
        s[len] = '\0';
        //printf("%s\n", s);
        ans = 0;  now = 0;
        // Forward scan.
        for(i = 0; i < len; i++){
            now = ac.ch[now][s[i] - 'A'];
            tmp = now;
            while(tmp){
                if(ac.danger[tmp] > 0){
                    ans += ac.danger[tmp];
                    ac.danger[tmp] = -1;  // counted: never count it again
                }
                else if(ac.danger[tmp] == -1)break;
                tmp = ac.fail[tmp];
            }
        }
        now = 0;
        // Backward scan, which matches patterns against the reversed program.
        for(i = len - 1; i >= 0; i--){
            now = ac.ch[now][s[i] - 'A'];
            tmp = now;
            while(tmp){
                if(ac.danger[tmp] > 0){
                    ans += ac.danger[tmp];
                    ac.danger[tmp] = -1;
                }
                else if(ac.danger[tmp] == -1)break;
                tmp = ac.fail[tmp];
            }
        }
        printf("%d\n", ans);
    }
    return 0;
}

Thank you!
                                                                                                                                             ------from ProLights

