KMP Algorithm: Code for the next and nextval Arrays, with POJ 3461
Source: Internet  Editor: 程序博客网  Time: 2024/05/19 00:15
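Yesterday at noon I finally understood how to compute the arrays by hand, and then, following an example from the textbook, I solved a KMP matching problem.
I solved it two ways, with next and with nextval. The only real difference is the code fragment that builds the array.
w denotes the given pattern string.
The next array is built as follows: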
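int next[maxw], j = 0, i;
int wlen = strlen(w);  // computed once rather than on every loop test
next[0] = -1;
next[1] = 0;
for (i = 2; i <= wlen; ++i)
{
    // fall back until w[j] can extend the current prefix, or j drops to -1
    while (j >= 0 && w[j] != w[i - 1]) j = next[j];
    next[i] = ++j;  // longest proper prefix of w[0..i-1] that is also its suffix
}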
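To make the table concrete, here is a minimal sketch of the same construction wrapped in a function, so it can be tried on small patterns. The name `buildNext` and the use of `std::string` and `std::vector` are my own assumptions for illustration; the convention (next[0] = -1, one extra entry for the full length) follows the fragment above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns a table with |w| + 1 entries where
// next[0] = -1 and next[i] (i >= 1) is the length of the longest proper
// prefix of w[0..i-1] that is also a suffix of it.
std::vector<int> buildNext(const std::string& w) {
    assert(!w.empty());
    const int n = static_cast<int>(w.size());
    std::vector<int> next(n + 1);
    next[0] = -1;
    int j = -1;
    for (int i = 0; i < n; ++i) {
        // fall back until w[j] matches w[i], or there is nothing left to try
        while (j >= 0 && w[j] != w[i]) j = next[j];
        next[i + 1] = ++j;
    }
    return next;
}
```

For the pattern "AZA" from the sample input this gives -1, 0, 0, 1, and for "ABAB" it gives -1, 0, 0, 1, 2.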
The nextval array is built as follows:
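int nextval[maxw], i = 0, j = -1;
int wlen = strlen(w);  // computed once rather than on every loop test
nextval[0] = -1;
while (i < wlen)
{
    if (j == -1 || w[i] == w[j])
    {
        ++i;
        ++j;
        if (w[i] != w[j]) nextval[i] = j;
        else nextval[i] = nextval[j];  // same character would fail again, so skip further back
    }
    else j = nextval[j];
}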
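As a small illustration of how nextval differs from next, here is a sketch of the same loop as a function. The name `buildNextval` and the container types are assumptions of mine, not part of the original code; the explicit `i == n` check replaces the comparison against the terminating '\0' that the C-string version relies on.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: nextval table with |w| + 1 entries, nextval[0] = -1.
// Where w[i] equals the character at the usual fallback position, the entry
// points one step further back, because that comparison would fail again.
std::vector<int> buildNextval(const std::string& w) {
    assert(!w.empty());
    const int n = static_cast<int>(w.size());
    std::vector<int> nextval(n + 1);
    nextval[0] = -1;
    int i = 0, j = -1;
    while (i < n) {
        if (j == -1 || w[i] == w[j]) {
            ++i;
            ++j;
            if (i == n || w[i] != w[j]) nextval[i] = j;
            else nextval[i] = nextval[j];
        } else {
            j = nextval[j];
        }
    }
    return nextval;
}
```

For "AAAAB" this yields -1, -1, -1, -1, 3, 0, whereas the plain next table for the same pattern is -1, 0, 1, 2, 3, 0. The last entry is the same in both, which is what lets the matcher continue after a full match.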
Here is an example problem from POJ:
Memory Limit: 65536K
Total Submissions: 29204
Accepted: 11704
Description
The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:
Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…
Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.
So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
Input
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:
- One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
- One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
Output
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.
Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
Sample Output
1
3
0
The task is simply to count how many times the pattern occurs in the text during matching; occurrences may overlap.
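Before trusting a KMP solution, it can help to cross-check it against a brute-force (BF) counter on small inputs. The following is only a sketch for that purpose; the function name `countBruteForce` is hypothetical, and its worst case is O(|W|·|T|), so it is not meant for the full limits of this problem.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference counter: tries every start position in t and
// compares character by character. Overlapping occurrences are counted.
int countBruteForce(const std::string& w, const std::string& t) {
    assert(!w.empty());
    int cnt = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start) {
        std::size_t k = 0;
        while (k < w.size() && t[start + k] == w[k]) ++k;
        if (k == w.size()) ++cnt;  // all of w matched at this start position
    }
    return cnt;
}
```

On the three sample cases it returns 1, 3 and 0, matching the sample output.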
next version:
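#include <cstdio>
#include <cstring>

const int maxw = 10000 + 10;
const int maxt = 1000000 + 10;

char w[maxw], t[maxt];  // file scope, so the 1 MB text buffer is not on the stack

// Counts the (possibly overlapping) occurrences of w in s using the next table.
int match(const char w[], const char s[], const int next[])
{
    int cnt = 0, p = 0, cur = 0;
    int slen = (int)strlen(s), wlen = (int)strlen(w);
    while (cur < slen)
    {
        if (p == -1 || s[cur] == w[p])  // test p == -1 first so w[-1] is never read
        {
            ++cur;
            ++p;
        }
        else
        {
            p = next[p];
        }
        if (p == wlen)  // full match: count it, then fall back so overlaps are found
        {
            ++cnt;
            p = next[p];
        }
    }
    return cnt;
}

int main()
{
    int loop;
    scanf("%d", &loop);
    while (loop--)
    {
        scanf("%s%s", w, t);
        int next[maxw], p = 0, cur;
        int wlen = (int)strlen(w);  // computed once instead of on every loop test
        next[0] = -1;
        next[1] = 0;
        for (cur = 2; cur <= wlen; ++cur)
        {
            while (p >= 0 && w[p] != w[cur - 1]) p = next[p];
            next[cur] = ++p;
        }
        printf("%d\n", match(w, t, next));
    }
    return 0;
}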
nextval version:
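#include <cstdio>
#include <cstring>

const int maxw = 10000 + 10;
const int maxt = 1000000 + 10;

char w[maxw], t[maxt];  // file scope, so the 1 MB text buffer is not on the stack

// Counts the (possibly overlapping) occurrences of w in s using the nextval table.
int match(const char w[], const char s[], const int next[])
{
    int cnt = 0, p = 0, i = 0;
    int slen = (int)strlen(s), wlen = (int)strlen(w);
    while (i < slen)
    {
        if (p == -1 || s[i] == w[p])  // test p == -1 first so w[-1] is never read
        {
            ++i;
            ++p;
        }
        else
        {
            p = next[p];
        }
        if (p == wlen)  // full match: count it, then fall back so overlaps are found
        {
            ++cnt;
            p = next[p];
        }
    }
    return cnt;
}

int main()
{
    int loop;
    scanf("%d", &loop);
    while (loop--)
    {
        scanf("%s%s", w, t);
        int nextval[maxw], i = 0, j = -1;
        int wlen = (int)strlen(w);  // computed once instead of on every loop test
        nextval[0] = -1;
        while (i < wlen)
        {
            if (j == -1 || w[i] == w[j])
            {
                ++i;
                ++j;
                // at i == wlen, w[i] is '\0', so nextval[wlen] is always set to j
                if (w[i] != w[j]) nextval[i] = j;
                else nextval[i] = nextval[j];
            }
            else j = nextval[j];
        }
        printf("%d\n", match(w, t, nextval));
    }
    return 0;
}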
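Comparing the memory usage and running time of the two submissions, I found that for this problem the nextval version took less time but used more memory.
In short, nextval is a refinement of next: when the character at the fallback position equals the one that just failed, it skips that position, which saves some comparisons. The overall complexity is still linear in |W| + |T|.
That is about all I understand from my first pass at KMP. It is certainly not as easy to grasp as the BF (brute-force) algorithm, but KMP is fast and handy~~
I hope I can keep at it like this. Even if my mind is not as quick as other people's, I am still happy as long as I can work things out by putting in more time~~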
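One way to see the effect is to count character comparisons with each table on the same input. The sketch below is only an illustration under my own assumptions: the names `buildNext`, `buildNextval`, `Stats` and `kmpCount` are hypothetical, and the counter measures comparisons in the matching loop only, not judge time or memory.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: plain next table, next[0] = -1, |w| + 1 entries.
std::vector<int> buildNext(const std::string& w) {
    const int n = static_cast<int>(w.size());
    std::vector<int> next(n + 1);
    next[0] = -1;
    int j = -1;
    for (int i = 0; i < n; ++i) {
        while (j >= 0 && w[j] != w[i]) j = next[j];
        next[i + 1] = ++j;
    }
    return next;
}

// Hypothetical helper: nextval table in the same layout.
std::vector<int> buildNextval(const std::string& w) {
    const int n = static_cast<int>(w.size());
    std::vector<int> nextval(n + 1);
    nextval[0] = -1;
    int i = 0, j = -1;
    while (i < n) {
        if (j == -1 || w[i] == w[j]) {
            ++i;
            ++j;
            if (i == n || w[i] != w[j]) nextval[i] = j;
            else nextval[i] = nextval[j];
        } else {
            j = nextval[j];
        }
    }
    return nextval;
}

struct Stats {
    int matches;            // occurrences of w in t, overlaps included
    long long comparisons;  // character comparisons made while matching
};

// Runs the matching loop with either table and counts comparisons.
Stats kmpCount(const std::string& w, const std::string& t,
               const std::vector<int>& table) {
    assert(!w.empty() && table.size() == w.size() + 1);
    const int n = static_cast<int>(w.size());
    const int m = static_cast<int>(t.size());
    Stats r{0, 0};
    int i = 0, p = 0;
    while (i < m) {
        if (p == -1) {          // nothing left to fall back to: move on in t
            ++i;
            p = 0;
        } else {
            ++r.comparisons;
            if (t[i] == w[p]) { ++i; ++p; }
            else p = table[p];
        }
        if (p == n) {           // full match, fall back to allow overlaps
            ++r.matches;
            p = table[p];
        }
    }
    return r;
}
```

With w = "AAAAB" and t = "AAABAAAAB", both tables find the single occurrence, but this sketch makes 12 comparisons with next and 9 with nextval, because the mismatch at the 'B' in the text jumps straight past the run of 'A's.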