Oulipo

Source: Internet. Editor: 程序博客网. Time: 2024/05/16 04:48

Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a text, perhaps even a meaningful one, on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long, nonsensical texts; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
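To make the definition concrete, here is a minimal brute-force sketch that tries every start position in T and counts the ones where W matches, overlaps included. The function name countNaive is hypothetical (it does not appear in the original post), and its O(|W|·|T|) worst case is likely too slow for the stated limits; it only illustrates what is being counted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not from the original post: counts occurrences of w
// in t (overlaps included) by testing every possible start position.
// Worst case O(|W| * |T|), so this is a reference for small inputs only.
std::size_t countNaive(const std::string& w, const std::string& t) {
    assert(!w.empty());                      // the problem guarantees 1 <= |W|
    if (w.size() > t.size()) return 0;
    std::size_t count = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start) {
        // compare(pos, len, str) returns 0 when t[pos, pos+len) equals str
        if (t.compare(start, w.size(), w) == 0) ++count;
    }
    return count;
}
```

For W = AZA and T = AZAZAZA this counts the matches starting at positions 0, 2 and 4, so it returns 3; the three matches share characters, which is what "occurrences may overlap" means.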

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

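1
3
0

Approach

Let p be the match pointer into w and cur the match pointer into t; initially p = 0 and cur = 0.

1. If the current characters of t and w are equal, advance both pointers (p++, cur++).
2. If they differ, there are two cases. If p >= 0, there are still characters of w to fall back on, so move the pointer left according to the next array (p = next[p]). If p < 0, the fallback chain is exhausted, so the next character of t is matched against the first character of w (++cur, p = 0). The p < 0 case must be tested before w[p] is read, otherwise w[-1] is accessed.
3. If a full match is found (p equals the length of w), increment the occurrence count of w in t and again move the pointer left according to the next array (p = next[p]).

Declaring the arrays inside main actually caused a Time Limit Exceeded verdict!

Code:

#include <iostream>
#include <cstring>
using namespace std;

const int maxw = 10010;
const int maxt = 1000010;
// Global buffers: declaring these inside main led to TLE (see the note above).
char w[maxw], t[maxt];
int suffix[maxw];

// Count occurrences of w in s; next[] is the failure table with next[0] == -1.
int match(char w[], char s[], int next[])
{
    int cnt = 0;
    int wlen = strlen(w);
    int slen = strlen(s);
    int p = 0, cur = 0;
    while (cur < slen)
    {
        if (p < 0)              // fallback exhausted: restart at w[0] on the next text character
        {
            ++cur;
            p = 0;
        }
        else if (s[cur] == w[p])
        {
            ++cur;
            ++p;
        }
        else
            p = next[p];        // mismatch: fall back along the failure table
        if (p == wlen)          // full match: count it and allow overlapping matches
        {
            ++cnt;
            p = next[p];
        }
    }
    return cnt;
}

int main()
{
    int n;
    cin >> n;
    while (n--)
    {
        cin >> w >> t;
        int wlen = strlen(w);   // computed once instead of on every loop test
        suffix[0] = -1;
        suffix[1] = 0;
        int p = 0;
        for (int cur = 2; cur <= wlen; cur++)
        {
            while (p >= 0 && w[p] != w[cur - 1])
                p = suffix[p];
            suffix[cur] = ++p;
        }
        cout << match(w, t, suffix) << endl;
    }
    return 0;
}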
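To see what the next (failure) table used above holds, the following sketch builds it for a std::string and returns it as a vector. The name buildNext and the std::string/std::vector packaging are assumptions made for illustration; the convention next[0] = -1, with next[k] being the length of the longest proper border of the first k characters, follows the posted code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not from the original post: builds the failure table
// with the same convention as the posted code (next[0] == -1).
// next[k] is the length of the longest proper prefix of w[0..k) that is
// also a suffix of w[0..k).
std::vector<int> buildNext(const std::string& w) {
    assert(!w.empty());                       // the problem guarantees 1 <= |W|
    std::vector<int> next(w.size() + 1, 0);
    next[0] = -1;
    int p = -1;                               // border length of the prefix handled so far
    for (std::size_t cur = 0; cur < w.size(); ++cur) {
        while (p >= 0 && w[p] != w[cur]) p = next[p];   // shrink the border until it extends
        next[cur + 1] = ++p;
    }
    return next;
}
```

For the word AZAZA this yields -1, 0, 0, 1, 2, 3. The last entry, 3, says that after a full match the scan can resume with the first three characters (AZA) already matched, which is how overlapping occurrences are counted without rescanning the text.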

Solution 2:

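#include <iostream>
#include <cstring>
using namespace std;

const int maxn = 10005;
const int maxn1 = 1000005;
char s[maxn1];   // text
char t[maxn];    // word (pattern)
int lens, lent;

// Build the failure table of the pattern t, with next[0] == -1.
void getnext(char t[], int next[])
{
    int i = 0, j = -1;
    next[0] = -1;
    while (i < lent)
    {
        if (j == -1 || t[i] == t[j])
        {
            i++;
            j++;
            next[i] = j;
        }
        else
            j = next[j];
    }
}

// Count the (possibly overlapping) occurrences of t in s.
int numkmp(char s[], char t[])
{
    int i = 0, j = 0, k = 0;
    int next[maxn];
    getnext(t, next);
    while (i < lens)
    {
        if (j == -1 || s[i] == t[j])
        {
            i++;
            j++;
        }
        else
        {
            j = next[j];
        }
        if (j == lent)   // full match: count it and fall back to allow overlaps
        {
            k++;
            j = next[j];
        }
    }
    return k;
}

int main()
{
    int n;
    cin >> n;
    while (n--)
    {
        cin >> t;
        lent = strlen(t);
        cin >> s;
        lens = strlen(s);
        cout << numkmp(s, t) << endl;
    }
    return 0;
}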
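For experimenting outside the judge, the same scan can be wrapped in a single function that takes std::string arguments, so it can be checked directly against the sample cases. This is only a sketch: countKmp is a hypothetical name, and the std::string/std::vector packaging is an assumption rather than the author's code, although the table construction and the matching loop follow the solution above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical wrapper, not from the original post: KMP occurrence counting
// with the failure table and the scan in one function.
std::size_t countKmp(const std::string& w, const std::string& t) {
    assert(!w.empty());                        // the problem guarantees 1 <= |W|
    const int m = static_cast<int>(w.size());

    // Failure table with next[0] == -1, built as in getnext.
    std::vector<int> next(m + 1);
    next[0] = -1;
    for (int i = 0, j = -1; i < m;) {
        if (j == -1 || w[i] == w[j]) next[++i] = ++j;
        else j = next[j];
    }

    // Scan the text; each text character is consumed exactly once.
    std::size_t count = 0;
    int j = 0;
    for (std::size_t i = 0; i < t.size();) {
        if (j == -1 || t[i] == w[j]) { ++i; ++j; }
        else j = next[j];
        if (j == m) {                          // full match found
            ++count;
            j = next[j];                       // fall back so overlapping matches are counted
        }
    }
    return count;
}
```

On the three sample cases, countKmp("BAPC", "BAPC"), countKmp("AZA", "AZAZAZA") and countKmp("VERDI", "AVERDXIVYERDIAN") return 1, 3 and 0, matching the sample output.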

0 0