SGU 142 (String Hashing)


 SGU 142

Description

Kevin has invented a new algorithm to crypt and decrypt messages, which he thinks is unbeatable. The algorithm uses a very large key-string, out of which a keyword is found out after applying the algorithm. Then, based on this keyword, the message is easily crypted or decrypted. So, if one would try to decrypt some messages crypted with this algorithm, then knowing the keyword would be enough. Someone has found out how the keyword is computed from the large key-string, but because he is not a very experienced computer programmer, he needs your help. The key-string consists of N characters from the set {'a','b'}. The keyword is the shortest non-empty string made up of the letters 'a' and 'b', which is not contained as a contiguous substring (also called subsequence) inside the key-string. It is possible that more than one such string exists, but the algorithm is designed in such a way that any of these strings can be used as a keyword. Given the key-string, your task is to find one keyword.


Input


The first line contains the integer number N, the number of characters inside the key-string (1 <= N <= 500 000). The next line contains N characters from the set {'a','b'} representing the string.


Output


The first line of output should contain the number of characters of the keyword. The second line should contain the keyword.


Sample Input


11
aabaaabbbab


Sample Output


4
aaaa


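Problem: Given a string S of length N (N ≤ 500000) consisting only of the letters 'a' and 'b', find a string T of length L, also made up of 'a' and 'b', such that T is not a substring of S and L is as small as possible.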

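Approach: For a given length L, S contains at most N-L+1 distinct substrings of length L, while there are 2^L strings of length L in total. If N-L+1 < 2^L, some string T of length L cannot be a substring of S. Since N ≤ 500000, we have N-19+1 < 2^19 = 524288, so L=19 is always feasible, and it suffices to examine substrings of length at most 19. Convert S to a binary string, with 0 for a and 1 for b. Prefix the bits of each substring with a leading 1 so that substrings of different lengths (for example "a" and "aa") get different codes, and set match[code] to true for every substring of length at most 19. Then enumerate the candidate strings T in increasing order of code, which visits shorter strings first; the first T with !match[T] is the answer.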
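The counting bound and the encoding described above can be sketched in isolation. This is only an illustrative sketch: the helper names safeLength, encode and decode are made up here and do not appear in the original solution. It assumes the encoding starts from a sentinel value of 1, as the accepted code does with num=1.

```cpp
#include <cassert>
#include <string>

// Smallest L with n - L + 1 < 2^L. By the pigeonhole argument, some string
// of this length cannot occur in any key-string of length n.
int safeLength(int n) {
    int L = 1;
    while (n - L + 1 >= (1LL << L)) ++L;
    return L;
}

// Encode a string over {a, b} as an integer: a leading 1 bit (sentinel),
// followed by 0 for 'a' and 1 for 'b'. The sentinel keeps "a" (binary 10)
// distinct from "aa" (binary 100).
int encode(const std::string& t) {
    int code = 1;
    for (char c : t) {
        assert(c == 'a' || c == 'b');
        code = (code << 1) | (c == 'b');
    }
    return code;
}

// Inverse of encode: read bits from the low end until only the sentinel is left.
std::string decode(int code) {
    std::string t;
    for (; code > 1; code >>= 1)
        t.insert(t.begin(), (code & 1) ? 'b' : 'a');
    return t;
}
```

For example, safeLength(500000) gives 19, and the sample answer "aaaa" corresponds to code 16. Because every string of length L maps into the range [2^L, 2^(L+1)), scanning codes upward from 2 visits all strings of length 1, then length 2, and so on.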


Accepted (AC) code:

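#include <cstdio>
#include <cstring>
using namespace std;

const int maxn = 500005;

// Static storage: together these arrays are too large to place safely on the stack.
static char s[maxn];
static bool match[1 << 20];

int main()
{
    //freopen("1.txt","r",stdin);
    int n;
    if (scanf("%d", &n) != 1 || scanf("%s", s) != 1)
        return 0;

    // len = number of bits in the binary representation of n, so 2^len > n.
    // Computed with integer shifts to avoid rounding errors from log().
    int len = 0;
    for (int t = n; t > 0; t >>= 1)
        len++;
    int total = 1 << (len + 1);

    memset(match, false, sizeof(match));
    for (int i = 0; i < n; i++)        // mark every substring of length at most len
    {
        int num = 1;                   // the leading 1 is a sentinel that encodes the length
        for (int j = 0; j < len && i + j < n; j++)
        {
            if (s[i + j] == 'a')
                num <<= 1;
            else
                num = num << 1 | 1;
            match[num] = true;
        }
    }

    int s1 = 2;                        // smallest code, the one-letter string "a"
    while (s1 < total && match[s1])
        s1++;

    // The answer length is the bit position of the sentinel.
    int ans_len = 0;
    for (int t = s1; t > 1; t >>= 1)
        ans_len++;

    printf("%d\n", ans_len);
    for (int j = ans_len - 1; j >= 0; j--)
        printf("%c", ((s1 >> j) & 1) ? 'b' : 'a');
    printf("\n");
    return 0;
}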