【POJ 3691】【hdu 2457】DNA repair: problem summary, solution & code (C++)


DNA repair

Time Limit: 2000MS Memory Limit: 65536K

Description

Biologists finally invent techniques of repairing DNA that contains segments causing kinds of inherited diseases. For the sake of simplicity, a DNA is represented as a string containing characters ‘A’, ‘G’ , ‘C’ and ‘T’. The repairing techniques are simply to change some characters to eliminate all segments causing diseases. For example, we can repair a DNA “AAGCAG” to “AGGCAC” to eliminate the initial causing disease segments “AAG”, “AGC” and “CAG” by changing two characters. Note that the repaired DNA can still contain only characters ‘A’, ‘G’, ‘C’ and ‘T’.

You are to help the biologists repair a DNA by changing the least number of characters.

Input
The input consists of multiple test cases. Each test case starts with a line containing one integer N (1 ≤ N ≤ 50), which is the number of DNA segments causing inherited diseases.
The following N lines give N non-empty strings of length not greater than 20 containing only characters in “AGCT”, which are the DNA segments causing inherited disease.
The last line of the test case is a non-empty string of length not greater than 1000 containing only characters in “AGCT”, which is the DNA to be repaired.

The last test case is followed by a line containing a single zero.

Output
For each test case, print a line containing the test case number (beginning with 1) followed by the number of characters which need to be changed. If it’s impossible to repair the given DNA, print -1.

Sample Input

2
AAA
AAG
AAAG
2
A
TG
TGAATG
4
A
G
C
T
AGT
0

Sample Output

Case 1: 1
Case 2: 4
Case 3: -1


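Problem summary:
Given a set of forbidden DNA pattern strings and an original string, find the minimum number of characters that must be changed so that the original string no longer contains any forbidden pattern as a substring.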
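
To make the task concrete, here is a minimal brute-force sketch that can serve as a reference on tiny inputs. It is not the approach used in this post: it simply tries every string over the four letters and keeps the closest one that is free of forbidden patterns. The names `containsForbidden` and `bruteForceRepair` are hypothetical helpers introduced only for this illustration, and the length cap of 10 is an arbitrary guard against the 4^len blow-up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): true if `s` contains
// any of the forbidden patterns as a substring.
bool containsForbidden(const std::string& s, const std::vector<std::string>& bad) {
    for (const std::string& p : bad)
        if (s.find(p) != std::string::npos) return true;
    return false;
}

// Hypothetical brute-force reference: enumerate all 4^len candidate strings,
// skip those containing a forbidden pattern, and return the smallest number
// of positions in which a surviving candidate differs from `text`.
// Returns -1 if every candidate contains a forbidden pattern.
int bruteForceRepair(const std::vector<std::string>& bad, const std::string& text) {
    assert(text.size() <= 10);  // arbitrary cap: the search is exponential
    const std::string alphabet = "AGCT";
    const int n = static_cast<int>(text.size());
    long long total = 1;
    for (int i = 0; i < n; ++i) total *= 4;

    int best = -1;
    std::string cand(n, 'A');
    for (long long code = 0; code < total; ++code) {
        long long c = code;
        int diff = 0;
        for (int i = 0; i < n; ++i) {      // decode `code` as a base-4 number
            cand[i] = alphabet[c % 4];
            c /= 4;
            if (cand[i] != text[i]) ++diff;
        }
        if (containsForbidden(cand, bad)) continue;
        if (best == -1 || diff < best) best = diff;
    }
    return best;
}
```

On the three sample cases this reference agrees with the expected output (1, 4 and -1). It is only practical for very short strings, which is why the real solution needs the automaton-based DP.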


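Solution:
This was my first AC automaton + DP problem, and I stupidly got stuck for more than two hours because I forgot to call the function I had already written to build the fail pointers. I'm amazed that I didn't spot the mistake in those two hours; I really wanted to smash my computer.
First, write down the DP:
1. dp[i][j] is the minimum number of changes made to the string after processing its first i characters and ending at node j of the trie.
2. With this definition, dp[i][j] transitions to dp[i+1][each of j's four children (A, T, C, G)]. When moving to tr[j].ch[k], first check whether tr[j].ch[k] is the end of a virus (a forbidden pattern, including nodes that have a forbidden pattern as a suffix, which the fail-pointer construction marks as well); such nodes are never entered. Otherwise, check whether child k equals t[i], and add one to the cost only if they differ. See the code for details; the DP is fairly easy to follow once you think it through.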
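
The two rules above can also be written compactly as a recurrence. The notation here is introduced only for this sketch and is not the author's: δ(j, c) is the node reached from node j on character c (that is, tr[j].ch[c] after the fail pointers have been built), D is the set of marked (forbidden) nodes, t_i is the i-th character of the original string counting from 0, and [c ≠ t_i] is 1 when the characters differ and 0 otherwise.

```latex
\begin{aligned}
dp[0][0] &= 0, \qquad dp[0][j] = \infty \quad (j \neq 0),\\
dp[i+1][\delta(j,c)] &= \min\bigl(dp[i+1][\delta(j,c)],\; dp[i][j] + [\,c \neq t_i\,]\bigr)
  \quad \text{for } c \in \{A,T,C,G\},\ \delta(j,c) \notin D,\\
\text{answer} &= \min_{j}\, dp[len][j] \quad (\text{or } -1 \text{ if this minimum is } \infty).
\end{aligned}
```

In the code, 9999 plays the role of ∞; that is safe because at most 1000 characters can ever be changed.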


Code:

#include <algorithm>
#include <cstdio>
#include <cstring>
#include <queue>
using namespace std;

struct node
{
    int ch[4];
    void init()
    {
        for (int i = 0; i < 4; i++) ch[i] = 0;
    }
} tr[2010];

queue<int> q;
int ll = 0, ans, n, tot, flag[1010], fail[1010], dp[1005][1010];
char s[205], t[1005];

// Map a nucleotide to a child index 0..3.
int getval(char x)
{
    if (x == 'A') return 0;
    if (x == 'T') return 1;
    if (x == 'C') return 2;
    if (x == 'G') return 3;
    return 0; // not reached: the input contains only A, T, C, G
}

// Insert the pattern stored in s into the trie and mark its last node.
void add()
{
    int now = 0;
    int len = strlen(s);
    for (int i = 0; i < len; i++)
    {
        int tmp = getval(s[i]);
        if (!tr[now].ch[tmp])
        {
            tot++;
            tr[now].ch[tmp] = tot;
            tr[tot].init();
            flag[tot] = 0;
            fail[tot] = 0;
        }
        now = tr[now].ch[tmp];
    }
    flag[now] = 1;
}

// Build the fail pointers by BFS. Missing children are redirected through
// the fail pointer, and a node is marked if its fail node is marked.
void acatm()
{
    for (int i = 0; i < 4; i++)
        if (tr[0].ch[i]) q.push(tr[0].ch[i]);
    while (!q.empty())
    {
        int now = q.front(); q.pop();
        for (int i = 0; i < 4; i++)
        {
            if (tr[now].ch[i])
            {
                fail[tr[now].ch[i]] = tr[fail[now]].ch[i];
                if (flag[tr[fail[now]].ch[i]]) flag[tr[now].ch[i]] = 1;
                q.push(tr[now].ch[i]);
            }
            else tr[now].ch[i] = tr[fail[now]].ch[i];
        }
    }
}

int main()
{
    // Compare with 1 so that the loop also stops on EOF (scanf returns -1 there).
    while (scanf("%d", &n) == 1)
    {
        ll++;
        if (n == 0) return 0;
        tot = 0; tr[tot].init();
        for (int i = 1; i <= n; i++)
        {
            scanf("%s", s);
            add();
        }
        acatm();
        scanf("%s", t);
        int len = strlen(t);
        for (int i = 0; i <= len; i++)
            for (int j = 0; j <= tot; j++)
                dp[i][j] = 9999;
        dp[0][0] = 0;
        for (int i = 0; i < len; i++)
            for (int j = 0; j <= tot; j++)
            {
                int tmp = getval(t[i]); // think about why this is t[i] rather than t[i+1]
                if (dp[i][j] >= 9999) continue;
                for (int k = 0; k < 4; k++)
                    if (!flag[tr[j].ch[k]])
                    {
                        int tmp2 = tr[j].ch[k];
                        dp[i + 1][tmp2] = min(dp[i + 1][tmp2], dp[i][j] + (k != tmp));
                    }
            }
        ans = 9999;
        for (int i = 0; i <= tot; i++)
            ans = min(ans, dp[len][i]); // think about why this is dp[len][x] rather than dp[len-1][x]
        printf("Case %d: ", ll);
        if (ans == 9999) printf("-1\n");
        else printf("%d\n", ans);
    }
    return 0;
}

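Because the DP is defined over the first i characters processed rather than over the i-th character, dp[1][x] actually means that position 0 has been processed. So when updating dp[2][y] we compare against t[1] (the second character of the string), and the answer must therefore be taken over dp[len][x].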
