HDU 3336 Count the string (an application of the KMP next[] array + a KMP template)

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 15:13

Count the string

Time Limit: 2000/1000 ms (Java/Other)  Memory Limit: 32768/32768 K (Java/Other)

Total Submission(s): 22  Accepted Submission(s): 11

Problem Description

It is well known that AekdyCoin is good at string problems as well as number theory problems. When given a string s, we can write down all the non-empty prefixes of this string. For example: s: "abab". The prefixes are: "a", "ab", "aba", "abab". For each prefix, we can count the times it matches in s. So we can see that prefix "a" matches twice, "ab" matches twice too, "aba" matches once, and "abab" matches once. Now you are asked to calculate the sum of the match times for all the prefixes. For "abab", it is 2 + 2 + 1 + 1 = 6. The answer may be very large, so output the answer mod 10007.
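To make the definition concrete, here is a minimal brute-force sketch that counts the matches of every prefix directly. The function name count_prefix_matches_naive is hypothetical and not part of the original solution. It is far too slow for n up to 200000 and is only meant as a reference for checking small inputs by hand.

```c
#include <string.h>

/* Hypothetical reference implementation: for every prefix length len,
   slide that prefix over s and count the positions where it matches.
   Cubic time in the worst case, so only suitable for small inputs. */
int count_prefix_matches_naive(const char *s)
{
    int n = (int)strlen(s);
    int total = 0;
    for (int len = 1; len <= n; ++len) {
        for (int start = 0; start + len <= n; ++start) {
            /* s[start .. start+len-1] equals the prefix of length len? */
            if (strncmp(s + start, s, (size_t)len) == 0) {
                total = (total + 1) % 10007;
            }
        }
    }
    return total;
}
```

Calling count_prefix_matches_naive("abab") counts 2 + 2 + 1 + 1 matches, the value worked out in the statement.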

Input

The first line is a single integer T, indicating the number of test cases. For each case, the first line is an integer n (1 <= n <= 200000), which is the length of string s. A line follows giving the string s. The characters in the strings are all lower-case letters.

Output

For each case, output only one number: the sum of the match times for all the prefixes of s mod 10007.

Sample Input

1

4

abab

Sample Output

6

Author

foreverlin@HNU

Source

HDOJ Monthly Contest – 2010.03.06

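Problem: given a string, find the sum of the numbers of occurrences of all of its prefixes.
This problem had me stuck. At first I did not know how to approach it: counting the occurrences of every prefix in the whole string directly would certainly TLE. Then I found that it can be solved straight from the next array, because next[i] is the length of the longest proper prefix of the first i characters that is also a suffix of them. Let dt[i] be the number of prefixes of the string that occur ending exactly at the i-th character. The recurrence is dt[i] = dt[next[i]] + 1, with dt[0] = 0: the number of prefixes ending at character i equals the number of prefixes ending at character next[i], plus one for the prefix of length i itself. This takes some thought and is not easy to explain.
For example:
        i      1 2 3 4 5 6
        string a b a b a b
        dt[i]  1 1 2 2 3 3
        In aba, the prefixes that end at the last character are a and aba, so dt[3] is 2. In ababa they are a, aba and ababa, so dt[5] is 3. When i = 5, next[5] = 3, so dt[5] = dt[3] + 1 = 3, which is dt[i] = dt[next[i]] + 1.
Once the part above is understood the rest is simple: just plug in the next array template.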
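The recurrence dt[i] = dt[next[i]] + 1 can be sketched as follows. This is a minimal sketch that uses 1-indexed positions so that it lines up with the table; the function name sum_dt and the array name nxt are hypothetical, and dt[0] = 0 is assumed as the base case.

```c
#include <stdlib.h>
#include <string.h>

#define MOD 10007

/* Hypothetical helper: returns (dt[1] + ... + dt[n]) mod 10007 for s,
   or -1 if memory allocation fails.
   nxt[i] = length of the longest proper prefix of s[1..i] that is also
   a suffix of it (1-indexed lengths); dt[0] = 0. */
int sum_dt(const char *s)
{
    int n = (int)strlen(s);
    int *nxt = calloc((size_t)n + 1, sizeof *nxt);
    int *dt = calloc((size_t)n + 1, sizeof *dt);
    if (nxt == NULL || dt == NULL) {
        free(nxt);
        free(dt);
        return -1;
    }

    int sum = 0;
    int k = 0; /* border length carried over from the previous position */
    for (int i = 1; i <= n; ++i) {
        if (i > 1) {
            /* s[i-1] is the i-th character; s[k] follows a border of length k */
            while (k > 0 && s[i - 1] != s[k]) k = nxt[k];
            if (s[i - 1] == s[k]) ++k;
        }
        nxt[i] = k;
        dt[i] = dt[nxt[i]] + 1;    /* the recurrence from the text */
        sum = (sum + dt[i]) % MOD; /* reduce at every step to avoid overflow */
    }

    free(nxt);
    free(dt);
    return sum;
}
```

For "ababab" the dt values are 1 1 2 2 3 3, so sum_dt returns 12; for the sample "abab" it returns 6.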

http://www.cnblogs.com/jackge/archive/2013/04/20/3032942.html

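Sample analysis:

dt[i] = (number of proper prefixes of P[1..i] that are also suffixes of P[1..i]) + 1 (for P[1..i] itself)

P[1..i]   prefixes ending at position i   dt[i]

a         a                               1

ab        ab                              1

aba       a, aba                          2

abab      ab, abab                        2

ababa     a, aba, ababa                   3   (= the proper prefixes of this string that are also its suffixes, dt[next[i]], plus 1 for the string itself)

ababab    ab, abab, ababab                3

In total there are 3 occurrences of a, 3 of ab, 2 of aba, 2 of abab, 1 of ababa and 1 of ababab.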

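     i      1 2 3 4 5 6 7
     string a b a b t a b
     dt[i]  1 1 2 2 1 2 2

P[1..i]   prefixes ending at position i   dt[i]

a         a                               1

ab        ab                              1

aba       a, aba                          2

abab      ab, abab                        2   (dt[i] = dt[next[i]] + 1: dt[abab] = dt[ab] + 1, the 1 counting abab itself)

ababt     ababt                           1

ababta    a, ababta                       2 !

ababtab   ab, ababtab                     2 !

In total there are 3 occurrences of a, 3 of ab, 1 of aba, 1 of abab, 1 of ababt, 1 of ababta and 1 of ababtab.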
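The per-prefix totals listed in the two examples can also be computed directly. The sketch below uses a different technique from the dt recurrence: it starts every prefix with one occurrence (itself) and then pushes counts down the next links from long prefixes to short ones. The function name prefix_occurrences and the caller-supplied arrays nxt and cnt are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical helper: fills cnt[1..n] with the number of occurrences of
   each prefix of s (cnt[len] is for the prefix of length len).
   nxt and cnt must each have room for n + 1 ints. Returns n. */
int prefix_occurrences(const char *s, int *nxt, int *cnt)
{
    int n = (int)strlen(s);

    /* nxt[i]: longest proper prefix of s[1..i] that is also its suffix */
    nxt[0] = 0;
    for (int i = 1, k = 0; i <= n; ++i) {
        if (i > 1) {
            while (k > 0 && s[i - 1] != s[k]) k = nxt[k];
            if (s[i - 1] == s[k]) ++k;
        }
        nxt[i] = k;
    }

    /* every prefix occurs once, as itself, at the start of the string */
    cnt[0] = 0;
    for (int i = 1; i <= n; ++i) cnt[i] = 1;

    /* each occurrence of the prefix of length i ends with an occurrence of
       the prefix of length nxt[i]; going from long to short makes cnt[i]
       final before it is passed on (cnt[0] is just a sink) */
    for (int i = n; i >= 1; --i) cnt[nxt[i]] += cnt[i];

    return n;
}
```

For "ababtab" this fills cnt[1..7] with 3 3 1 1 1 1 1, and the values add up to 11, the same as dt[1] + … + dt[7].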

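dt[1] + … + dt[i] is the total number of prefix occurrences in P[1..i] (that is, the answer).

dt[i] is the number of prefix occurrences that have to be added when P[i] is appended to P[1…i-1].

(Because this string, a suffix that is equal to a prefix, has appeared once more, all the prefix occurrences counted for it have to be added again.)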

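For example

aba       a, aba            2

ababa     a, aba, ababa     3

After a is appended to abab, aba appears again, so the prefixes counted for aba (a, aba) have to be added once more. (ab does not need to be added, because it was already counted in abab.)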

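For example

aba       a, aba

abab      ab, abab          2   (appending b to aba produces a new string equal to a prefix, ab, plus the string itself)

Why is it next[i] in dt[i] = dt[next[i]] + 1? The string only ever grows at its end, so any new occurrence of a prefix must also sit at the end, as a suffix. The longest suffix that equals a prefix has length next[i], and its value was computed earlier (for example, dt[ababa] = dt[aba] + 1, and dt[aba] is already known), so it can simply be added.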
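Another way to see why the recurrence works: dt[i] is the number of steps needed to walk from i down to 0 along the next links, since each stop on that walk is one prefix that ends at position i. The sketch below makes that walk explicit; build_next and chain_length are hypothetical names, and positions are 1-indexed as in the examples.

```c
#include <string.h>

/* Hypothetical helper: fills nxt[0..n], where nxt[i] is the length of the
   longest proper prefix of s[1..i] that is also its suffix.
   nxt must have room for strlen(s) + 1 ints. */
void build_next(const char *s, int *nxt)
{
    int n = (int)strlen(s);
    nxt[0] = 0;
    for (int i = 1, k = 0; i <= n; ++i) {
        if (i > 1) {
            while (k > 0 && s[i - 1] != s[k]) k = nxt[k];
            if (s[i - 1] == s[k]) ++k;
        }
        nxt[i] = k;
    }
}

/* Counts the prefixes ending at position pos by following
   pos -> nxt[pos] -> nxt[nxt[pos]] -> ... until 0 is reached. */
int chain_length(const int *nxt, int pos)
{
    int steps = 0;
    for (int len = pos; len > 0; len = nxt[len]) ++steps;
    return steps;
}
```

For "ababab" and pos = 5 the walk is 5 -> 3 -> 1 -> 0, so chain_length returns 3, matching dt[ababa] = dt[aba] + 1 = 3. Storing dt[i] simply avoids repeating the walk for every position.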

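#include <string.h>
#include <stdio.h>

#define MAXN 200010

int m;            /* length of the string, read from the input */
char P[MAXN];     /* the string */
int next[MAXN];   /* next[q]: length of the longest proper prefix of P[0..q] that is also its suffix */
int dist[MAXN];   /* dist[i]: number of prefixes ending at the i-th character (1-indexed), dist[0] = 0 */

void makenext()
{
    int q, k;
    next[0] = 0;
    /* m = strlen(P); */
    for (q = 1, k = 0; q < m; q++) {
        while (k > 0 && P[q] != P[k])
            k = next[k - 1];
        if (P[q] == P[k])
            k++;
        next[q] = k;
    }
}

int main()
{
    int t, i, sum;
    scanf("%d", &t);
    while (t--) {
        memset(dist, 0, sizeof(dist));
        sum = 0;
        scanf("%d", &m);
        scanf("%s", P);   /* P already decays to char *, so no & here */
        makenext();
        for (i = 0; i < m; i++) {
            dist[i + 1] = dist[next[i]] + 1;
            /* reduce at every step: the raw sum can reach about 2 * 10^10
               for n = 200000 and would overflow a 32-bit int */
            sum = (sum + dist[i + 1]) % 10007;
        }
        printf("%d\n", sum);
    }
    return 0;
}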

Here is the template I referred to (I added the comments myself)

http://www.cnblogs.com/c-cloud/p/3224788.html

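#include <stdio.h>
#include <string.h>

// next[i] is the length of the longest proper prefix of P[0]...P[i] that is also its suffix
void makeNext(const char P[], int next[])
{
    int q, k;
    int m = strlen(P);
    next[0] = 0;
    for (q = 1, k = 0; q < m; ++q)
    {
        // keep shrinking the candidate to find the largest k with
        // P[0]...P[k-1] == P[q-k]...P[q-1] that can be extended by P[q]
        while (k > 0 && P[q] != P[k])
            k = next[k-1];  // when next[k-1] > 0 the head and the tail still share a part:
                            // P[0]...P[next[k-1]-1] == P[q-next[k-1]]...P[q-1]
        if (P[q] == P[k])
        {
            k++;
        }
        next[q] = k;
    }
}

// On a mismatch, KMP realigns the start of the pattern against the text just
// before the mismatching character, keeping the matched part as long as
// possible, so the text position never has to move back.
void kmp(const char T[], const char P[], int next[])
{
    int n, m;
    int i, q;
    n = strlen(T);
    m = strlen(P);
    makeNext(P, next);
    for (i = 0, q = 0; i < n; ++i)  // no backtracking in T; only q moves to reposition the pattern
    {
        // fall back to the longest k with P[0]...P[k-1] == T[i-k]...T[i-1].
        // In the example "abcdab abcdabcdabde", when i == 6:
        // first P[q] != T[i] (' '), so q falls back to 2 ('c');
        // then P[q] != T[i] (' ') again, so q falls back to 0 ('a') and the loop exits.
        // After a full match q == m, so P[q] is the terminating '\0' and the
        // loop falls back through next[m-1].
        while (q > 0 && P[q] != T[i])
            q = next[q-1];
        if (P[q] == T[i])  // third comparison: 'a' != ' ', so matching restarts at the
        {                  // next character (no need to go back to the earlier 'b')
            q++;
        }
        if (q == m)
        {
            printf("Pattern occurs with shift:%d\n", (i-m+1));
        }
    }
}

int main()
{
    int i;
    int next[20] = {0};
    char T[] = "abcdab abcdabcdabde";
    char P[] = "abcdabd";
    printf("%s\n", T);
    printf("%s\n", P);
    // makeNext(P,next);
    kmp(T, P, next);
    for (i = 0; i < (int)strlen(P); ++i)
    {
        printf("%d ", next[i]);
    }
    printf("\n");
    return 0;
}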
