Aho-Corasick automaton (good): Codeforces 86C

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 19:18

C. Genetic engineering
time limit per test: 2 seconds
memory limit per test: 256 megabytes
input: standard input
output: standard output

"Multidimensional spaces are completely out of style these days, unlike genetics problems" — thought physicist Woll and changed his subject of study to bioinformatics. Analysing results of sequencing he faced the following problem concerning DNA sequences. We will further think of a DNA sequence as an arbitrary string of uppercase letters "A", "C", "G" and "T" (of course, this is a simplified interpretation).

Let w be a long DNA sequence and let s1, s2, ..., sm be a collection of short DNA sequences. Let us say that the collection filters w iff w can be covered with the sequences from the collection. Certainly, substrings corresponding to the different positions of the string may intersect or even cover each other. More formally: denote by |w| the length of w, let symbols of w be numbered from 1 to |w|. Then for each position i in w there exists a pair of indices l, r (1 ≤ l ≤ i ≤ r ≤ |w|) such that the substring w[l ... r] equals one of the elements s1, s2, ..., sm of the collection.
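The coverage condition above can be checked directly for a single string. The following is a minimal brute-force sketch, not part of the original post; the function name `isFiltered` is an assumption made for illustration, and positions are 0-based in the code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns true if every
// position of w lies inside at least one occurrence of some pattern.
// Occurrences are allowed to overlap.
bool isFiltered(const std::string& w, const std::vector<std::string>& patterns) {
    std::vector<bool> covered(w.size(), false);
    for (const std::string& p : patterns) {
        if (p.empty() || p.size() > w.size()) continue;
        // Try every start position l where p could fit inside w.
        for (std::size_t l = 0; l + p.size() <= w.size(); ++l) {
            if (w.compare(l, p.size(), p) == 0) {
                // Mark the whole occurrence w[l .. l + |p| - 1] as covered.
                for (std::size_t i = l; i < l + p.size(); ++i) covered[i] = true;
            }
        }
    }
    for (bool c : covered) {
        if (!c) return false;
    }
    return true;
}
```

For example, `isFiltered("CATACT", {"CAT", "TACT"})` holds because CAT covers positions 1 to 3 and TACT covers positions 3 to 6, while "TACTAC" fails because its last two characters are never covered.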

Woll wants to calculate the number of DNA sequences of a given length filtered by a given collection, but he doesn't know how to deal with it. Help him! Your task is to find the number of different DNA sequences of length n filtered by the collection {si}.

Answer may appear very large, so output it modulo 1000000009.

Input

First line contains two integer numbers n and m (1 ≤ n ≤ 1000, 1 ≤ m ≤ 10): the length of the string and the number of sequences in the collection, respectively.

Next m lines contain the collection sequences si, one per line. Each si is a nonempty string of length not greater than 10. All the strings consist of uppercase letters "A", "C", "G", "T". The collection may contain identical strings.

Output

Output should contain a single integer: the number of strings filtered by the collection modulo 1000000009 (10^9 + 9).

Sample test(s)
input
2 1
A
output
1
input
6 2
CAT
TACT
output
2
Note

In the first sample, a string has to be filtered by "A". Clearly, there is only one such string: "AA".

In the second sample, there exist exactly two different strings satisfying the condition (see the pictures below).


Problem: count the strings of length n that can be built from the given strings, where the pieces may overlap.

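The idea is borrowed from someone else's solution. My first attempt was wrong: I did not know how to handle the overlapping part at the end.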

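First, we can start with a two-dimensional state: dp[i][j] is the number of ways to build a string of length i that ends at node j of the automaton. This state is clearly not enough, though: the current character may not be covered yet, but a pattern completed by a later character can reach back and cover it. So we add a third dimension, which can be thought of as reserving some length for the characters that are not yet covered. The state is therefore dp[i][j][k]: the string has length i, we are at node j, and the last k characters are still uncovered. With the state fixed, what remains is the transition. On the automaton I store one extra value: val[i] is the length of the longest pattern that is matched on arriving at node i.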
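The transition is not spelled out in the text, so here it is as read off the code in this post. Appending character c moves from node j to t = ch[j][c]; the k + 1 pending characters (the k old ones plus the new one) all become covered exactly when the longest pattern ending at t has length at least k + 1.

```latex
dp[0][0][0] = 1
\\[4pt]
t = ch[j][c], \qquad
k' = \begin{cases} 0, & val[t] \ge k + 1 \\ k + 1, & \text{otherwise} \end{cases}
\\[4pt]
dp[i+1][t][k'] \leftarrow \bigl(dp[i+1][t][k'] + dp[i][j][k]\bigr) \bmod 1000000009
\\[4pt]
\text{answer} = \sum_{j} dp[n][j][0] \bmod 1000000009
```

Since no pattern is longer than 10, a state with k ≥ 10 can never become covered again, so k only needs to range over 0 to 10.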

#include<iostream>
#include<cstdio>
#include<cstring>
#include<queue>
#include<algorithm>
using namespace std;
typedef long long LL;
const int maxn=15*15;
const int SIGMA_SIZE=4;
const int MOD=1000000009;
int N,M;
char s[20];
// dp[i][j][k]: strings of length i, at node j, with the last k characters uncovered
LL dp[1111][111][15];
struct AC
{
    int ch[maxn][26],val[maxn];
    int fail[maxn];
    int sz;
    void clear(){memset(ch[0],0,sizeof(ch[0]));sz=1;}
    int idx(char x)
    {
        if(x=='A')return 0;
        if(x=='T')return 1;
        if(x=='G')return 2;
        return 3;
    }
    void insert(char *s,int id)
    {
        int n=strlen(s);
        int u=0;
        for(int i=0;i<n;i++)
        {
            int c=idx(s[i]);
            if(!ch[u][c])
            {
                memset(ch[sz],0,sizeof(ch[sz]));
                val[sz]=0;
                ch[u][c]=sz++;
            }
            u=ch[u][c];
        }
        val[u]=n;   // length of the pattern that ends at this node
    }
    void getfail()
    {
        queue<int> q;
        fail[0]=0;
        int u=0;
        for(int c=0;c<SIGMA_SIZE;c++)
        {
            u=ch[0][c];
            if(u){fail[u]=0;q.push(u);}
        }
        while(!q.empty())
        {
            int r=q.front();q.pop();
            // inherit the longest pattern that is a proper suffix of this node
            val[r]=max(val[r],val[fail[r]]);
            for(int c=0;c<SIGMA_SIZE;c++)
            {
                u=ch[r][c];
                // missing edge: redirect it to the fail node's transition
                if(!u){ch[r][c]=ch[fail[r]][c];continue;}
                q.push(u);
                int v=fail[r];
                while(v&&!ch[v][c])v=fail[v];
                fail[u]=ch[v][c];
            }
        }
    }
    void solve()
    {
        memset(dp,0,sizeof(dp));
        dp[0][0][0]=1;
        for(int i=0;i<N;i++)
        {
            for(int j=0;j<sz;j++)
            {
                for(int k=0;k<=10;k++)
                for(int c=0;c<SIGMA_SIZE;c++)
                {
                    int t=ch[j][c];
                    // a pattern of length >= k+1 ends here: all pending characters are covered
                    if(val[t]>=k+1)dp[i+1][t][0]=(dp[i+1][t][0]+dp[i][j][k])%MOD;
                    else dp[i+1][t][k+1]=(dp[i+1][t][k+1]+dp[i][j][k])%MOD;
                }
            }
        }
        LL ans=0;
        for(int i=0;i<sz;i++)
            ans=(ans+dp[N][i][0])%MOD;
        cout<<ans<<endl;
    }
}ac;
int main()
{
    scanf("%d%d",&N,&M);
    ac.clear();
    for(int i=1;i<=M;i++)
    {
        scanf("%s",s);
        ac.insert(s,i);
    }
    ac.getfail();
    ac.solve();
    return 0;
}



