AC automaton + DP: replace the '?' characters in a string to maximize the number of dictionary matches (CodeChef: Lucy and Question Marks)
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 13:30
Lucy and Question Marks
Long ago, Lucy wrote some sentences in her textbook. She recently found those notes, but so much time had passed that some letters had become difficult to read. Her notes are given to you as a string with question marks in place of the letters that are impossible to read.
Lucy remembers that those sentences definitely made some sense, so now she wants to restore them. She thinks that the best way to restore them is to replace all the question marks with Latin letters in such a way that the total number of occurrences of all the strings from her dictionary in the result is maximal. A word may appear in her dictionary two or more times; in that case, count each of its occurrences as many times as the word appears in the dictionary.
You will be given the string itself and the dictionary. Please output the maximal possible number of occurrences of dictionary words and the lexicographically minimal string with this number of occurrences.
Input
The first line of the input contains an integer T denoting the number of test cases. The description of T test cases follows.
The first line of every test case consists of two integers N and M: the length of the string written by Lucy and the number of words in the dictionary. The second line of the test case consists of the string itself: N characters, each either a question mark or a lowercase Latin letter.
Then M lines follow. Each line consists of a single string of lowercase Latin letters: a word from the dictionary.
Output
For each test case, output two lines. The first line should contain the maximal number of occurrences. The second line should contain the lexicographically minimal string with the maximal number of occurrences of the words from the dictionary.
Example
Input:
3
7 4
???????
ab
ba
aba
x
5 3
?ac??
bacd
cde
xa
8 2
?a?b?c?d
ecxd
zzz

Output:
9
abababa
2
bacde
1
aaabecxd
Scoring
Subtask 1 (16 points): T = 50, 1 <= N <= 8, 1 <= M <= 10. Only the characters a, b and c and question marks occur in the string. Only the characters a, b, and c occur in the dictionary words. All the words in the dictionary consist of no more than 10 letters.
Subtask 2 (32 points): T = 50, 1 <= N <= 100, 1 <= M <= 100. Only the characters a, b and question marks occur in the string. Only the characters a and b occur in the dictionary words. All the words in the dictionary consist of no more than 10 letters.
Subtask 3 (52 points): T = 10, 1 <= N <= 1000, 1 <= M <= 1000. Total length of all the dictionary strings will not exceed 1000.
The time limit for the last subtask is 2 sec. For the first two subtasks it is 1 sec.
QMARKS - Editorial
Problem Link:
Practice
Contest
Difficulty:
Easy-Medium
Pre-requisites:
Aho-Corasick, DP
Explanation:
To pass the first subtask, it is sufficient to implement an exponential-time brute-force solution. To go further, some knowledge of the Aho-Corasick algorithm is required. Many articles on Aho-Corasick can be found on the net.
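As an illustration of the brute force mentioned above, here is a minimal sketch that tries every filling of the question marks and counts matches naively. The function names `count_naive`, `enumerate` and `brute_force`, and the default alphabet `"abc"` (taken from the subtask 1 limits), are assumptions made for this example and are not part of the editorial.

```cpp
#include <string>
#include <utility>
#include <vector>

// Counts occurrences (overlaps allowed) of every dictionary word in text.
int count_naive(const std::string& text, const std::vector<std::string>& dict) {
    int total = 0;
    for (const auto& w : dict)
        for (std::size_t pos = text.find(w); pos != std::string::npos;
             pos = text.find(w, pos + 1))
            ++total;
    return total;
}

// Tries every filling of the '?' positions, left to right, letters ascending,
// so complete strings are visited in lexicographic order.
void enumerate(std::string& cur, std::size_t i, const std::vector<std::string>& dict,
               const std::string& alphabet, std::pair<int, std::string>& best) {
    if (i == cur.size()) {
        int score = count_naive(cur, dict);
        if (score > best.first) best = {score, cur};  // strict '>' keeps the earliest string
        return;
    }
    if (cur[i] != '?') {
        enumerate(cur, i + 1, dict, alphabet, best);
        return;
    }
    for (char c : alphabet) {
        cur[i] = c;
        enumerate(cur, i + 1, dict, alphabet, best);
    }
    cur[i] = '?';  // restore the placeholder for the caller
}

// Returns {maximal number of occurrences, lexicographically smallest optimal string}.
// The alphabet must be sorted in ascending order.
std::pair<int, std::string> brute_force(std::string pattern,
                                        const std::vector<std::string>& dict,
                                        const std::string& alphabet = "abc") {
    std::pair<int, std::string> best{-1, ""};
    enumerate(pattern, 0, dict, alphabet, best);
    return best;
}
```

The running time grows as (alphabet size)^(number of question marks), so this is only usable for tiny inputs, but it can serve as a reference when checking a faster solution.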
Let's solve the inverse problem first. Suppose you have a set of strings D and a string S, and you need to calculate the total number of occurrences of all the strings from D in S. This is a standard problem for the Aho-Corasick algorithm. The standard solution builds a trie from the set of strings D with O(total length of all the strings from D) nodes. Then suffix links are calculated, and with their help it is possible to calculate, for every node of the trie, the number of dictionary strings that end at that node or at any of its suffixes (the nodes reached by following suffix links). The next step is turning the trie into an automaton with O(states * alphabet) transitions. After this, you have an automaton on which you can make N steps, one per character of S, adding up the count stored in each visited state to obtain the number of occurrences of all the required substrings. This is a brief description of the solution to the inverse problem. A more detailed description can be found in almost any Aho-Corasick tutorial, because this "inverse" problem is actually a well-known one.
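The counting procedure described above can be sketched as follows. This is one possible BFS-based construction, not the code from the editorial (the solutions further down compute links lazily with memoization); the names `Automaton`, `hits` and `count_occurrences` are chosen for this example, and the input is assumed to contain only the letters 'a' to 'z'.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

// Aho-Corasick automaton over 'a'..'z'; node 0 is the root.
struct Automaton {
    std::vector<std::array<int, 26>> next;  // trie edges, completed to full transitions by build()
    std::vector<int> link;                  // suffix links
    std::vector<long long> hits;            // words ending at a node or at any of its suffixes

    Automaton() { add_node(); }

    int add_node() {
        std::array<int, 26> row;
        row.fill(-1);
        next.push_back(row);
        link.push_back(0);
        hits.push_back(0);
        return (int)next.size() - 1;
    }

    void add_word(const std::string& w) {
        int v = 0;
        for (char ch : w) {
            int c = ch - 'a';
            if (next[v][c] == -1) {
                int u = add_node();  // may reallocate, so index again afterwards
                next[v][c] = u;
            }
            v = next[v][c];
        }
        ++hits[v];  // duplicate dictionary words are counted separately
    }

    // BFS from the root: fill suffix links, accumulate hits, complete transitions.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c) {
            if (next[0][c] == -1) next[0][c] = 0;
            else q.push(next[0][c]);  // depth-1 nodes link to the root
        }
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            hits[v] += hits[link[v]];  // link[v] is shallower, so its count is final
            for (int c = 0; c < 26; ++c) {
                int u = next[v][c];
                if (u == -1) next[v][c] = next[link[v]][c];
                else { link[u] = next[link[v]][c]; q.push(u); }
            }
        }
    }
};

// Total number of occurrences of all dictionary words in text.
long long count_occurrences(const std::vector<std::string>& dict, const std::string& text) {
    Automaton ac;
    for (const auto& w : dict) ac.add_word(w);
    ac.build();
    long long total = 0;
    int v = 0;
    for (char ch : text) {
        v = ac.next[v][ch - 'a'];
        total += ac.hits[v];
    }
    return total;
}
```

For the first sample, `count_occurrences({"ab", "ba", "aba", "x"}, "abababa")` gives 9: three occurrences each of "ab", "ba" and "aba".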
Now, how do we solve the original problem? There is a DP solution. As mentioned before, there are O(total length of the strings from D) states in the automaton, so it is possible to have a DP state of the form (number of letters already processed, current position in the automaton). The transition is then quite straightforward: if the current symbol is a question mark, there are 26 possible choices. Otherwise the choice is unique: the only symbol you can use is the current one. This way you get the maximal number of occurrences.
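Written out as a recurrence, the DP might look like the following. The notation is introduced here for illustration and does not come from the editorial: w(v) is the number of dictionary words that end at state v or at any of its suffixes, \delta(v, c) is the automaton transition, and C_i is the set of letters allowed at position i (all 26 letters for a question mark, otherwise the single given letter).

```latex
f(n, v) = w(v), \qquad
f(i, v) = w(v) + \max_{c \in C_{i+1}} f\bigl(i + 1, \delta(v, c)\bigr)
\quad (0 \le i < n)
```

The answer is f(0, root), and w(root) = 0 as long as the dictionary contains no empty word. The table has (N + 1) * (number of states) entries and each entry looks at up to 26 transitions.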
To restore the string itself, you can act greedily. Iterate through the symbols of the string S, starting from the first one. If the current character is a letter, there is only one choice. Otherwise, iterate through all the possible characters, 'a' to 'z', and choose the transition to the state with the maximal DP value (if there are several such transitions, choose the one with the minimal character). This works when the DP state is (the size of the current suffix, the position in the automaton): adding a symbol is just a transition from one suffix to a shorter one, so the DP already contains all the necessary information about the remaining part of the string.
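The DP over suffixes and the greedy restoration can be put together as in the sketch below. It is a compact variant written for this explanation (BFS-built automaton, 0-indexed states), not the setter's or tester's code; the names `restore`, `go`, `gain` and `best` are assumptions of the example.

```cpp
#include <algorithm>
#include <array>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Returns {maximal number of occurrences, lexicographically smallest optimal string}.
std::pair<int, std::string> restore(const std::string& pattern,
                                    const std::vector<std::string>& dict) {
    // Trie with root 0; go[v][c] == -1 marks a missing edge until the BFS below.
    std::vector<std::array<int, 26>> go(1);
    go[0].fill(-1);
    std::vector<int> link(1, 0), gain(1, 0);
    for (const auto& w : dict) {
        int v = 0;
        for (char ch : w) {
            int c = ch - 'a';
            if (go[v][c] == -1) {
                int u = (int)go.size();
                go.emplace_back();
                go[u].fill(-1);
                link.push_back(0);
                gain.push_back(0);
                go[v][c] = u;
            }
            v = go[v][c];
        }
        ++gain[v];
    }
    // BFS: suffix links, accumulated gains, completed transitions.
    std::queue<int> q;
    for (int c = 0; c < 26; ++c) {
        if (go[0][c] == -1) go[0][c] = 0;
        else q.push(go[0][c]);
    }
    while (!q.empty()) {
        int v = q.front();
        q.pop();
        gain[v] += gain[link[v]];
        for (int c = 0; c < 26; ++c) {
            int u = go[v][c];
            if (u == -1) go[v][c] = go[link[v]][c];
            else { link[u] = go[link[v]][c]; q.push(u); }
        }
    }
    const int n = (int)pattern.size(), states = (int)go.size();
    // best[i][v]: gain[v] plus the most occurrences obtainable from pattern[i..n-1]
    // when the automaton is in state v after the first i characters.
    std::vector<std::vector<int>> best(n + 1, std::vector<int>(states));
    best[n] = gain;
    for (int i = n - 1; i >= 0; --i)
        for (int v = 0; v < states; ++v) {
            int lo = pattern[i] == '?' ? 0 : pattern[i] - 'a';
            int hi = pattern[i] == '?' ? 25 : lo;
            int top = best[i + 1][go[v][lo]];
            for (int c = lo + 1; c <= hi; ++c) top = std::max(top, best[i + 1][go[v][c]]);
            best[i][v] = gain[v] + top;
        }
    // Greedy walk from the root: smallest letter whose target state keeps the optimum.
    std::string out = pattern;
    int v = 0;
    for (int i = 0; i < n; ++i) {
        int choice = pattern[i] == '?' ? 0 : pattern[i] - 'a';
        if (pattern[i] == '?')
            for (int c = 1; c < 26; ++c)  // strict '>' keeps the smallest letter on ties
                if (best[i + 1][go[v][c]] > best[i + 1][go[v][choice]]) choice = c;
        out[i] = (char)('a' + choice);
        v = go[v][choice];
    }
    return {best[0][0], out};
}
```

On the three sample tests this returns (9, "abababa"), (2, "bacde") and (1, "aaabecxd"). Keeping the whole table costs O(N * states) memory, which is what makes the left-to-right greedy choice possible.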
Setter's Solution:
Can be found here
Tester's Solution:
Can be found here
// Solution 1: memoized suffix links and transitions, full table dp[i][j].
#include <iostream>
#include <cstring>
#include <cstdio>
#include <algorithm>
using namespace std;

int T, n, m, i, num, q, ls, j, c, choi;
int trie[1005][26], enwei[1005], G[1005][26], dp[1005][1005];
int Link[1005], pv[1005], pch[1005], ew[1111];
char a[1005], s[1005];

int getlink(int k);
int Go(int k, int j);

int getlink(int k) { // standard suffix link calculation (node 1 is the root)
    if (Link[k] == 0) {
        if (k == 1 || pv[k] == 1) Link[k] = 1;
        else Link[k] = Go(getlink(pv[k]), pch[k]);
    }
    return Link[k];
}

int Go(int k, int j) { // Aho-Corasick automaton transition
    if (G[k][j] == 0) {
        if (trie[k][j] != 0) G[k][j] = trie[k][j];
        else G[k][j] = k == 1 ? 1 : Go(getlink(k), j);
    }
    return G[k][j];
}

int main() {
    scanf("%d", &T);
    for (; T; T--) {
        scanf("%d%d", &n, &m);
        for (i = 1; i <= n; i++) {
            a[i] = getchar();
            while ((a[i] < 'a' || a[i] > 'z') && (a[i] != '?')) a[i] = getchar();
        }
        num = 1;
        for (i = 1; i <= m; i++) {
            // The original used gets(), which was removed in C++14;
            // scanf("%s") also skips the pending newline.
            scanf("%1004s", s);
            ls = strlen(s);
            q = 1;
            for (j = 0; j < ls; j++) {
                if (!trie[q][s[j] - 'a']) {       // building the trie
                    trie[q][s[j] - 'a'] = ++num;  // new transition
                    pv[num] = q;
                    pch[num] = s[j] - 'a';        // parent vertex and character for the node
                    q = num;
                } else {
                    q = trie[q][s[j] - 'a'];
                }
            }
            ++enwei[q]; // number of strings that end in this node
        }
        // number of strings that end in the node or in any of its suffixes
        for (i = 1; i <= num; i++) {
            j = i;
            ew[j] = 0;
            while (j > 1) {
                ew[i] += enwei[j];
                j = getlink(j);
            }
        }
        for (i = 1; i <= num; i++) enwei[i] = ew[i];
        // dp[i][j]: enwei[j] plus the best result for characters i+1..N
        // when the automaton is in node j after the first i characters
        for (i = 0; i <= n; i++)
            for (j = 1; j <= num; j++) dp[i][j] = -1000000000; // dp initialization
        for (j = 1; j <= num; j++) dp[n][j] = enwei[j];
        for (i = n - 1; i >= 0; i--)
            for (j = 1; j <= num; j++) { // dp calculation
                if (a[i + 1] == '?') {
                    for (c = 0; c < 26; c++)
                        dp[i][j] = max(dp[i][j], enwei[j] + dp[i + 1][Go(j, c)]);
                } else {
                    dp[i][j] = max(dp[i][j], enwei[j] + dp[i + 1][Go(j, a[i + 1] - 'a')]);
                }
            }
        // optimal result: nothing is processed yet and we start in the root
        printf("%d\n", dp[0][1]);
        for (q = 1, i = 1; i <= n; i++) {
            if (a[i] != '?') {
                choi = a[i] - 'a'; // a fixed letter leaves only one option
            } else {
                // otherwise take the smallest letter that leads to the best dp value
                choi = 0;
                for (j = 0; j < 26; j++)
                    if (dp[i][Go(q, j)] > dp[i][Go(q, choi)]) choi = j;
            }
            putchar('a' + choi);
            q = Go(q, choi);
        }
        puts("");
        // reset the per-test data
        for (i = 1; i <= num; i++) {
            enwei[i] = Link[i] = pv[i] = pch[i] = ew[i] = 0;
            for (j = 0; j < 26; j++) trie[i][j] = G[i][j] = 0;
        }
    }
    return 0;
}

// Solution 2: the same approach; dictionary words are read character by character,
// so the input is expected to end every line with a plain '\n'.
#include <cstdio>
#include <memory.h>
#include <cmath>
#include <iostream>
#include <algorithm>
#include <string>
using namespace std;

const int inf = 1e8;
int i, j, n, m, v, cnt;
char a[1033];
int t[1033][26];
int pch[1033], pv[1033];
int terminal[1033];
int reach[1033], link[1033];
int mem[1033][26];
int f[1003][1003];
char q;

int go(int v, char c);

int get_link(int v) { // suffix link, memoized; node 1 is the root
    if (link[v] == 0) {
        if (v == 1 || pv[v] == 1) link[v] = 1;
        else link[v] = go(get_link(pv[v]), pch[v]);
    }
    return link[v];
}

int go(int v, char c) { // automaton transition, memoized
    if (mem[v][c] == 0) {
        if (t[v][c] != 0) mem[v][c] = t[v][c];
        else if (v == 1) mem[v][c] = 1;
        else mem[v][c] = go(get_link(v), c);
    }
    return mem[v][c];
}

int main() {
    int tc;
    scanf("%d", &tc);
    while (tc--) {
        memset(mem, 0, sizeof(mem));
        memset(t, 0, sizeof(t));
        memset(link, 0, sizeof(link));
        memset(terminal, 0, sizeof(terminal));
        memset(reach, 0, sizeof(reach));
        scanf("%d%d\n", &n, &m);
        for (i = 1; i <= n; i++) a[i] = getchar();
        scanf("\n");
        int cnt = 1, v;
        for (i = 1; i <= m; i++) { // insert each word into the trie
            q = getchar();
            v = 1;
            while (q != '\n') {
                q -= 'a';
                if (t[v][q] == 0) {
                    cnt++;
                    t[v][q] = cnt;
                    pch[cnt] = q;
                    pv[cnt] = v;
                }
                v = t[v][q];
                q = getchar();
            }
            terminal[v]++;
        }
        for (i = 1; i <= n; i++)
            for (j = 1; j <= cnt; j++) f[i][j] = -inf;
        // reach[i]: words that end in node i or in any of its suffixes
        for (i = 1; i <= cnt; i++) {
            v = i;
            while (v > 1) {
                reach[i] += terminal[v];
                v = get_link(v);
            }
        }
        // f[i][j]: reach[j] plus the best result for characters i..n from node j
        for (i = 1; i <= cnt; i++) f[n + 1][i] = reach[i];
        for (i = n; i > 0; i--)
            for (j = 1; j <= cnt; j++) {
                if (a[i] != '?') {
                    f[i][j] = f[i + 1][go(j, a[i] - 'a')] + reach[j];
                } else {
                    for (q = 0; q < 26; q++)
                        f[i][j] = max(f[i][j], f[i + 1][go(j, q)] + reach[j]);
                }
            }
        printf("%d\n", f[1][1]);
        v = 1;
        for (i = 1; i <= n; i++) { // greedy restoration, smallest letter first
            if (a[i] == '?') {
                q = 'a';
                int best = f[i + 1][go(v, 0)];
                for (char c = 1; c < 26; c++)
                    if (f[i + 1][go(v, c)] > best) {
                        best = f[i + 1][go(v, c)];
                        q = c + 'a';
                    }
            } else {
                q = a[i];
            }
            printf("%c", q);
            v = go(v, q - 'a');
        }
        printf("\n");
    }
}