Problem 1: Beautiful String


This problem comes from hihocoder: http://hihocoder.com/contest/hiho58/problem/1

Time limit: 10000ms
Per-test-case time limit: 1000ms
Memory limit: 256MB

Description

We say a string is beautiful if it has the equal amount of 3 or more continuous letters (in increasing order.)

Here are some example of valid beautiful strings: "abc", "cde", "aabbcc", "aaabbbccc".

Here are some example of invalid beautiful strings: "abd", "cba", "aabbc", "zab".

Given a string of alphabets containing only lowercase alphabets (a-z), output "YES" if the string contains a beautiful sub-string, otherwise output "NO".

Input

The first line contains an integer number between 1 and 10, indicating how many test cases are followed.

For each test case: First line is the number of letters in the string; Second line is the string. String length is less than 10MB.

Output

For each test case, output a single line "YES"/"NO" to tell if the string contains a beautiful sub-string.

Hint

Huge input. Slow IO method such as Scanner in Java may get TLE.

Sample input

4
3
abc
4
aaab
6
abccde
3
abb

Sample output

YES
NO
YES
NO

Key idea

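The most basic approach is to test every substring of S against the definition: enumerate the start and end of the substring, then check whether it is valid.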

If S has length N, there are O(N^2) substrings and each check takes O(N), so the time complexity is O(N^3).

For i = 0..N-1
    For j = i..N-1
        check(S[i..j])
    End For
End For

This approach exceeds the time limit as soon as N gets even moderately large.
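As a reference point, the brute-force idea above could be written in C++ roughly as follows. This is only a sketch: the function names isBeautiful and hasBeautifulBruteForce are made up for illustration, and the check assumes the input contains only lowercase letters, as the problem guarantees.

```cpp
#include <cstddef>
#include <string>

// Returns true if s[i..j] (inclusive) is itself a beautiful string:
// at least three runs of equal length whose letters increase by one each run.
bool isBeautiful(const std::string& s, std::size_t i, std::size_t j) {
    std::size_t len = j - i + 1;
    // Measure the length k of the first run.
    std::size_t k = 1;
    while (k < len && s[i + k] == s[i]) ++k;
    // Every run must have length k, and there must be at least three runs.
    if (len % k != 0 || len / k < 3) return false;
    for (std::size_t p = 0; p < len; ++p) {
        // Position p belongs to run number p / k, whose letter is s[i] + p / k.
        if (s[i + p] - s[i] != static_cast<int>(p / k)) return false;
    }
    return true;
}

// O(N^3): try every substring and test it directly.
bool hasBeautifulBruteForce(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        for (std::size_t j = i; j < s.size(); ++j) {
            if (isBeautiful(s, i, j)) return true;
        }
    }
    return false;
}
```

It is useful mainly as an oracle for checking a faster solution on short strings.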

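Looking closer, equal letters in a valid substring are always contiguous, so we can use (c, n) to denote a run of n identical letters c. For example, "aaa" becomes (a,3) and "bb" becomes (b,2).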

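Writing the whole string S in this form gives a sequence {(c[1], n[1]),(c[2],n[2]),…,(c[t],n[t])}. A valid substring can be written the same way, as {(a,n),(b,n),(c,n)}, where a, b, c stand for any three consecutive letters. A beautiful string with more than three runs always contains one with exactly three, so looking at three runs at a time is enough.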
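The (c, n) encoding described above is a plain run-length encoding. A minimal C++ sketch, assuming the string fits in memory as a std::string (the function name encodeRuns is hypothetical and does not come from the original solution):

```cpp
#include <string>
#include <utility>
#include <vector>

// Run-length encode s into (letter, count) pairs in a single O(N) pass.
std::vector<std::pair<char, int>> encodeRuns(const std::string& s) {
    std::vector<std::pair<char, int>> runs;
    for (char ch : s) {
        if (!runs.empty() && runs.back().first == ch) {
            ++runs.back().second;      // same letter: extend the current run
        } else {
            runs.emplace_back(ch, 1);  // different letter: start a new run
        }
    }
    return runs;
}
```

For "aaaabbccc" this yields the three pairs (a,4), (b,2), (c,3).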

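The algorithm then becomes: in the sequence {(c[1], n[1]),(c[2],n[2]),…,(c[t],n[t])}, decide whether there are three adjacent elements such that c[i], c[i+1], c[i+2] are consecutive letters and n[i] == n[i+1] == n[i+2].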

Preprocessing takes O(N) and the resulting sequence has at most N elements, so the overall time complexity drops to O(N).

For i = 1 .. t-2
    If (c[i]+1 == c[i+1] and c[i+1]+1 == c[i+2]) and (n[i] == n[i+1] == n[i+2])
        Return True
    End If
End For

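Running this, however, shows that the algorithm is incorrect. Take "aaaabbccc": its sequence is {(a,4),(b,2),(c,3)}, and the algorithm above finds no valid substring, yet the valid substring "aabbcc" does exist.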

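Clearly the problem lies in how we compare n[i], n[i+1] and n[i+2]. The counterexample shows that within a substring the values of n[i] and n[i+2] can be reduced; the only fixed value is n[i+1], because the middle run must be included in full. When n[i] > n[i+1], dropping some letters from the front makes n[i] == n[i+1]. Likewise, when n[i+2] > n[i+1], we drop letters from the back. So whenever n[i] >= n[i+1] and n[i+2] >= n[i+1], we can always trim the substring so that n[i] == n[i+1] == n[i+2].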

The corrected algorithm is:

For i = 1 .. t-2
    If (c[i]+1 == c[i+1] and c[i+1]+1 == c[i+2]) and (n[i] >= n[i+1] and n[i+1] <= n[i+2])
        Return True
    End If
End For
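To make the difference between the two conditions concrete, here is a small C++ sketch that runs both checks on the same run sequence. The names Run, toRuns, strictCheck and relaxedCheck are illustrative only, and the code uses 0-based indices where the pseudocode uses 1-based ones.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Run = std::pair<char, int>;  // (letter, count)

// Run-length encode s in one pass.
std::vector<Run> toRuns(const std::string& s) {
    std::vector<Run> runs;
    for (char ch : s) {
        if (!runs.empty() && runs.back().first == ch) ++runs.back().second;
        else runs.emplace_back(ch, 1);
    }
    return runs;
}

// True if runs i, i+1, i+2 hold three consecutive increasing letters.
static bool lettersConsecutive(const std::vector<Run>& r, std::size_t i) {
    return r[i].first + 1 == r[i + 1].first && r[i + 1].first + 1 == r[i + 2].first;
}

// First attempt: all three counts must be equal. Misses valid substrings.
bool strictCheck(const std::vector<Run>& r) {
    for (std::size_t i = 0; i + 2 < r.size(); ++i) {
        if (lettersConsecutive(r, i) &&
            r[i].second == r[i + 1].second && r[i + 1].second == r[i + 2].second)
            return true;
    }
    return false;
}

// Corrected version: the outer runs only need to be at least as long as the middle one.
bool relaxedCheck(const std::vector<Run>& r) {
    for (std::size_t i = 0; i + 2 < r.size(); ++i) {
        if (lettersConsecutive(r, i) &&
            r[i].second >= r[i + 1].second && r[i + 2].second >= r[i + 1].second)
            return true;
    }
    return false;
}
```

On "aaaabbccc", strictCheck(toRuns(...)) returns false while relaxedCheck(toRuns(...)) returns true, matching the counterexample discussed above.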

Result analysis

In the actual contest, the pass rate for this problem was only 26%.

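According to the post-contest statistics, most contestants used the naive algorithm and passed only the smaller test cases, scoring anywhere from 10 to 60 points on this problem.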

Interestingly, one contestant who only checked whether three adjacent letters were consecutive still scored 60 points.

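For programs in the 70 to 90 range, a random sample showed that most had found the correct algorithm. The main reason they lost points was state that was not reinitialized between the multiple test cases.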
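One way to guard against that kind of initialization bug is to keep all per-test-case state in one object and reset it explicitly before each case. The sketch below is an assumption about how this could be done, not the code any contestant used: BeautifulDetector and solveCase are hypothetical names, and the detector keeps only the last three runs instead of the full run array.

```cpp
#include <string>

// Streaming detector that remembers only the three most recent runs.
struct BeautifulDetector {
    char ch[3];     // letters of the last three runs, oldest first
    int count[3];   // lengths of those runs (0 means "no run yet")
    bool found;

    BeautifulDetector() { reset(); }

    // Must be called before every test case; forgetting it leaks state.
    void reset() {
        for (int k = 0; k < 3; ++k) { ch[k] = 0; count[k] = 0; }
        found = false;
    }

    void feed(char c) {
        if (count[2] > 0 && ch[2] == c) {
            ++count[2];                      // extend the newest run
        } else {
            ch[0] = ch[1]; count[0] = count[1];
            ch[1] = ch[2]; count[1] = count[2];
            ch[2] = c;     count[2] = 1;     // start a new run
        }
        // The middle run is complete; the outer runs only need to be long enough.
        if (count[0] > 0 &&
            ch[0] + 1 == ch[1] && ch[1] + 1 == ch[2] &&
            count[0] >= count[1] && count[2] >= count[1]) {
            found = true;
        }
    }
};

// Processes one test case with a detector that is reused across cases.
bool solveCase(BeautifulDetector& d, const std::string& s) {
    d.reset();
    for (char c : s) d.feed(c);
    return d.found;
}
```

Reusing a single detector for the four sample cases gives YES, NO, YES, NO; without the reset() call the second case would wrongly inherit found == true from the first.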

Code:

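#include <cstdio>

struct node {
    char ch;    // letter of this run
    int count;  // length of this run
};

// Static storage: an array this large would overflow the stack as a local.
static node str[10 * 1024 * 1024 + 12];

int main() {
    int tn;
    scanf("%d", &tn);
    for (int ye = 0; ye < tn; ++ye) {
        int n;
        scanf("%d", &n);
        // Reset per-test-case state. str[0] is a sentinel, so real runs are 1..cur.
        bool ans = false;
        int cur = 0;
        str[0].ch = 0;
        getchar();  // consume the newline after n (assumes a single '\n')
        for (int i = 0; i < n; ++i) {
            char ch = getchar();
            if (ch != str[cur].ch) {
                // New letter: start a new run.
                ++cur;
                str[cur].ch = ch;
                str[cur].count = 1;
            } else {
                ++str[cur].count;
            }
        }
        // Look for three consecutive letters whose outer runs are at least
        // as long as the middle run.
        for (int i = 1; i + 2 <= cur; ++i) {
            if ((str[i].ch + 1 == str[i + 1].ch) && (str[i + 1].ch + 1 == str[i + 2].ch)
                && (str[i].count >= str[i + 1].count) && (str[i + 2].count >= str[i + 1].count)) {
                ans = true;
                break;
            }
        }
        if (ans)
            printf("YES\n");
        else
            printf("NO\n");
    }
    return 0;
}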

2 0