HDU 3341 Lost's revenge (AC automaton + DP [lots of optimization = =])
Source: Internet  Editor: 程序博客网  Time: 2024/06/06 03:10
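Problem:
You are given n DNA strings and, last, one text string. The text string may be rearranged in any order. What is the maximum number of DNA-string occurrences the rearranged text can contain? (Overlapping occurrences each count.)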
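To make the statement concrete, here is a minimal brute-force sketch that tries every distinct rearrangement of the text and counts occurrences directly. The names `countOccurrences` and `bruteForceBest` are hypothetical helpers, not part of the solution below, and the approach is only feasible for very short texts; it is meant as a reference for checking small cases by hand.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Count occurrences of pat in text, overlapping occurrences included.
int countOccurrences(const std::string& text, const std::string& pat) {
    int hits = 0;
    for (std::string::size_type pos = text.find(pat);
         pos != std::string::npos;
         pos = text.find(pat, pos + 1)) {
        ++hits;
    }
    return hits;
}

// Hypothetical reference solver: tries every distinct rearrangement of text.
// Running time grows factorially, so use it only on tiny inputs.
int bruteForceBest(const std::vector<std::string>& genes, std::string text) {
    std::sort(text.begin(), text.end());  // start from the smallest permutation
    int best = 0;
    do {
        int total = 0;
        for (const std::string& g : genes) total += countOccurrences(text, g);
        best = std::max(best, total);
    } while (std::next_permutation(text.begin(), text.end()));
    return best;
}
```

On the first sample (genes AC, CG, GT and text CGAT) the rearrangement ACGT contains all three genes once, which is why the expected answer is 3.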
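Approach:
One text matched against many patterns means an AC automaton.
Then think about DP.
Because the arrangement is free, the question becomes how many matches can be reached using x1 A's, x2 C's, x3 G's and x4 T's. Tracking counts instead of positions removes any ordering problem.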
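At first I considered dp[i][j][k]: at position i of the string being built, currently at automaton node j, with ACGT count state k, the maximum number of matches.
But I was only thinking about how to compress the ACGT state; the time complexity of this formulation is not affordable.
After reading solutions online, I found everyone uses a two-dimensional DP.
The optimization is pretty impressive.
The first dimension is actually useless (the position i is just the total number of letters used, which the count state already determines), so it only wastes space.
We simply let dp[i][j] be the maximum number of matches when we are at automaton node i with ACGT count state j.
If you think about it, every node has its own copy of each state j, and every transition uses one more letter, so transitions can never loop or repeat.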
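There is also a small catch in the transitions.
Enumerate the ACGT count state in the outer loop, and the automaton node i in the inner loop.
The DP starts from small counts and moves step by step toward larger counts, so the count state j has to be enumerated first.
If the node were the outer loop, a node could be reached again from a node processed later (fail pointers in the automaton point to shallower nodes), so its values would be used before they are final. Just keep that in mind.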
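The loop order can be sketched in isolation as follows. This is a simplified illustration under assumptions, not the code from this post: `bestMatches`, `go` and `gain` are hypothetical names, the automaton is assumed to be already built (`go[u][c]` is the completed goto table and `gain[u]` the number of patterns ending at node `u`, fail chain included), and count states are numbered as mixed-radix digits so that using one more letter always yields a larger state id.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical sketch: DP over (count state, node) with the state as outer loop.
// go[u][c] : completed goto table of an already built AC automaton (root = 0)
// gain[u]  : number of patterns that end when the automaton enters node u
// cnt[c]   : how many copies of letter c the text contains
int bestMatches(const std::vector<std::array<int, 4>>& go,
                const std::vector<int>& gain,
                const std::array<int, 4>& cnt) {
    // Mixed-radix place values: state id = sum of used[c] * w[c].
    std::array<int, 4> w;
    int total = 1;
    for (int c = 3; c >= 0; --c) { w[c] = total; total *= cnt[c] + 1; }

    const int nodes = static_cast<int>(go.size());
    std::vector<std::vector<int>> dp(total, std::vector<int>(nodes, -1));
    dp[0][0] = 0;  // nothing used yet, standing at the root

    for (int s = 0; s < total; ++s) {          // outer loop: count state
        std::array<int, 4> used;
        for (int c = 0, r = s; c < 4; ++c) { used[c] = r / w[c]; r %= w[c]; }
        for (int u = 0; u < nodes; ++u) {      // inner loop: automaton node
            if (dp[s][u] < 0) continue;        // unreachable
            for (int c = 0; c < 4; ++c) {
                if (used[c] == cnt[c]) continue;  // letter c is used up
                const int v = go[u][c];
                const int t = s + w[c];           // always greater than s
                dp[t][v] = std::max(dp[t][v], dp[s][u] + gain[v]);
            }
        }
    }
    // The last state id corresponds to "every letter used".
    return *std::max_element(dp[total - 1].begin(), dp[total - 1].end());
}
```

Because every transition goes from state `s` to a strictly larger state `t`, all values for `s` are final by the time the outer loop reaches it, whatever order the nodes are visited in.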
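Oh right, one more thing: how to store and look up the ACGT count state.
This problem is tight not only on time but also on memory.
So this step needs to be optimized carefully.
The string length is at most 40.
If the dp table used the raw counts [40][40][40][40] as its state dimension, it would MLE.
But not every letter can reach 40; only the total is 40.
So after reading the string, we count how many times each letter occurs, then run four nested loops over the possible counts to record the states.
Each combination of the four loop variables is mapped to a single number, assigned consecutively. (The Hash table in the code below only maps counts to these compact ids; the dp table is sized by the number of compact ids.)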
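As an alternative to a four-dimensional lookup table, the same numbering can be computed arithmetically with a mixed-radix ("variable base") encoding, where the digit for each letter has base `cnt + 1`. The sketch below is an assumption-labeled illustration; `CountCodec` is a hypothetical helper and does not appear in the solution code.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: mixed-radix numbering of (A, C, G, T) usage counts.
struct CountCodec {
    std::array<int, 4> cnt;     // available copies of A, C, G, T
    std::array<int, 4> weight;  // place value of each digit
    int total;                  // number of distinct count states

    explicit CountCodec(const std::array<int, 4>& available) : cnt(available) {
        total = 1;
        for (int c = 3; c >= 0; --c) {
            weight[c] = total;      // T is the least significant digit
            total *= cnt[c] + 1;    // digit c ranges over 0..cnt[c]
        }
    }

    // Counts -> compact id in [0, total).
    int encode(const std::array<int, 4>& used) const {
        int id = 0;
        for (int c = 0; c < 4; ++c) {
            assert(0 <= used[c] && used[c] <= cnt[c]);
            id += used[c] * weight[c];
        }
        return id;
    }

    // Compact id -> counts.
    std::array<int, 4> decode(int id) const {
        std::array<int, 4> used;
        for (int c = 0; c < 4; ++c) {
            used[c] = id / weight[c];
            id %= weight[c];
        }
        return used;
    }
};
```

With this numbering, using one more copy of letter `c` simply adds `weight[c]` to the id, so the ids come out in the same lexicographic order that four nested loops would assign.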
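Let's work it out: the length is 40, and the number of states is the product over the four letters of (count + 1), which is clearly largest when each letter appears 10 times. So there are at most 11 * 11 * 11 * 11 = 14641 states, and an array of size 15000 is plenty.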
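The bound can be double-checked by brute force over all ways to split the length among the four letters. This is just a small verification sketch; `worstCaseStates` is a hypothetical name.

```cpp
#include <algorithm>

// Hypothetical check: the largest possible number of count states for a text
// of exactly `len` letters, i.e. max of (a+1)(b+1)(c+1)(d+1) with a+b+c+d = len.
int worstCaseStates(int len) {
    int best = 0;
    for (int a = 0; a <= len; ++a) {
        for (int b = 0; a + b <= len; ++b) {
            for (int c = 0; a + b + c <= len; ++c) {
                const int d = len - a - b - c;  // remaining letters
                best = std::max(best, (a + 1) * (b + 1) * (c + 1) * (d + 1));
            }
        }
    }
    return best;
}
```

For `len = 40` the maximum is reached at the even split 10/10/10/10, matching the 11^4 estimate; shorter texts can only give fewer states.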
Here is the code for reference; it is pretty rough:
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <queue>
#include <vector>
using namespace std;

// Map a nucleotide to 0..3.
int get(char ch){
    if (ch == 'A') return 0;
    if (ch == 'C') return 1;
    if (ch == 'G') return 2;
    return 3; // 'T' (the input contains only A, C, G, T)
}

const int maxn = 500 + 7;
int Hash[41][41][41][41]; // (A, C, G, T) counts -> compact state id
int cur;                  // number of count states
int dp[maxn][15000];      // dp[node][state] = max matches, -1 if unreachable
int mp[15000];            // compact state id -> counts packed in base 100

struct Trie{
    int L, root;
    int next[maxn][4];
    int fail[maxn];
    int flag[maxn]; // number of patterns ending exactly at this node
    int sum[maxn];  // number of patterns ending here, fail chain included

    void init(){
        L = 0;
        root = newnode();
    }

    int newnode(){
        for (int i = 0; i < 4; ++i){
            next[L][i] = -1;
        }
        flag[L] = 0;
        sum[L] = 0;
        return L++;
    }

    void insert(char* s){
        int len = strlen(s);
        int nod = root;
        for (int i = 0; i < len; ++i){
            int id = get(s[i]);
            if (next[nod][id] == -1){
                next[nod][id] = newnode();
            }
            nod = next[nod][id];
        }
        flag[nod]++;
    }

    // Build fail pointers and complete the goto table.
    void bfs(){
        fail[root] = root;
        queue<int>q;
        for (int i = 0; i < 4; ++i){
            if (next[root][i] == -1){
                next[root][i] = root;
            }
            else {
                fail[next[root][i]] = root;
                q.push(next[root][i]);
            }
        }
        while(!q.empty()){
            int u = q.front(); q.pop();
            for (int i = 0; i < 4; ++i){
                if (next[u][i] == -1){
                    next[u][i] = next[fail[u] ][i];
                }
                else {
                    fail[next[u][i] ] = next[fail[u] ][i];
                    q.push(next[u][i]);
                }
            }
        }
        // Accumulate matches along the fail chain of every node.
        for (int i = 0; i < L; ++i){
            int tmp = i;
            while(tmp != root){
                sum[i] += flag[tmp];
                tmp = fail[tmp];
            }
        }
    }

    void solve(char* s){
        cur = 0;
        int cnt[4] = {0};
        int goal;
        int len = strlen(s);
        for (int i = 0; i < len; ++i){
            cnt[get(s[i]) ]++;
        }
        // Number every reachable combination of letter counts.
        for (int i = 0; i <= cnt[0]; ++i){
            for (int j = 0; j <= cnt[1]; ++j){
                for (int k = 0; k <= cnt[2]; ++k){
                    for (int l = 0; l <= cnt[3]; ++l){
                        Hash[i][j][k][l] = cur++;
                        int v = 0;
                        v = v * 100 + i;
                        v = v * 100 + j;
                        v = v * 100 + k;
                        v = v * 100 + l;
                        mp[cur-1] = v;
                    }
                }
            }
        }
        goal = Hash[cnt[0] ][cnt[1] ][cnt[2] ][cnt[3] ];
        memset(dp,-1,sizeof dp);
        dp[0][0] = 0;
        int la[4] = {0};
        // Outer loop: count state; inner loop: automaton node.
        for (int k = 0; k < cur; ++k){
            // Decode the letter counts of state k.
            int v = mp[k];
            la[3] = v % 100; v /= 100;
            la[2] = v % 100; v /= 100;
            la[1] = v % 100; v /= 100;
            la[0] = v % 100;
            for (int j = 0; j < L; ++j){
                if (dp[j][k] == -1) continue;
                for (int l = 0; l < 4; ++l){
                    if (la[l] == cnt[l]) continue; // letter l is used up
                    int nx = next[j][l];
                    la[l]++;
                    int id = Hash[la[0] ][la[1] ][la[2] ][la[3] ];
                    la[l]--;
                    dp[nx][id] = max(dp[nx][id], dp[j][k] + sum[nx]);
                }
            }
        }
        int ans = 0;
        for (int i = 0; i < L; ++i){
            ans = max(ans, dp[i][goal]);
        }
        printf("%d\n", ans);
    }
}ac;

char s[60];

int main(){
    int n, ks = 0;
    while(~scanf("%d", &n) && n){
        ac.init();
        for (int i = 0; i < n; ++i){
            scanf("%s", s);
            ac.insert(s);
        }
        scanf("%s", s);
        ac.bfs();
        printf("Case %d: ", ++ks);
        ac.solve(s);
    }
    return 0;
}
/**1ACACAC**/
Lost's revenge
Time Limit: 15000/5000 MS (Java/Others)  Memory Limit: 65535/65535 K (Java/Others)
Total Submission(s): 4366  Accepted Submission(s): 1214
One noon, when Lost was lying on the bed, the Spring Brother poster on the wall(Lost is a believer of Spring Brother) said hello to him! Spring Brother said, "I'm Spring Brother, and I saw AekdyCoin shames you again and again. I can't bear my believers were being bullied. Now, I give you a chance to rearrange your gene sequences to defeat AekdyCoin!".
It's soooo crazy and unbelievable to rearrange the gene sequences, but Lost has no choice. He knows some genes called "number theory gene" will affect one "level of number theory". And two of the same kind of gene in different position in the gene sequences will affect two "level of number theory", even though they overlap each other. There is nothing but revenge in his mind. So he needs you help to calculate the most "level of number theory" after rearrangement.
For each testcase, first line is number of "number theory gene" N(1<=N<=50). N=0 denotes the end of the input file.
Next N lines means the "number theory gene", and the length of every "number theory gene" is no more than 10.
The last line is Lost's gene sequences, its length is also less or equal 40.
All genes and gene sequences are only contains capital letter ACGT.
Sample Input:
3
AC
CG
GT
CGAT
1
AA
AAA
0
Sample Output:
Case 1: 3
Case 2: 2