POJ 3294 Life Forms (suffix array: longest substring occurring in at least k strings)
Source: Internet · Editor: 程序博客网 · Date: 2024/05/17 22:38
Description
You may have wondered why most extraterrestrial life forms resemble humans, differing by superficial traits such as height, colour, wrinkles, ears, eyebrows and the like. A few bear no human resemblance; these typically have geometric or amorphous shapes like cubes, oil slicks or clouds of dust.
The answer is given in the 146th episode of Star Trek - The Next Generation, titled The Chase. It turns out that the vast majority of the quadrant's life forms ended up with a large fragment of common DNA.
Given the DNA sequences of several life forms represented as strings of letters, you are to find the longest substring that is shared by more than half of them.
Input
Standard input contains several test cases. Each test case begins with 1 ≤ n ≤ 100, the number of life forms. n lines follow; each contains a string of lower case letters representing the DNA sequence of a life form. Each DNA sequence contains at least one and not more than 1000 letters. A line containing 0 follows the last test case.
Output
For each test case, output the longest string or strings shared by more than half of the life forms. If there are many, output all of them in alphabetical order. If there is no solution with at least one letter, output "?". Leave an empty line between test cases.
Sample Input
3
abcdefg
bcdefgh
cdefghi
3
xxx
yyy
zzz
0
Sample Output
bcdefg
cdefgh

?
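As a cross-check on the statement and the sample, here is a minimal brute-force sketch. It is not part of the original post, and the function name `longestSharedBrute` is made up for illustration. It tries lengths from longest to shortest and keeps every substring contained in more than half of the inputs. It is far too slow for the real limits (100 strings of up to 1000 letters), but it is handy for validating a faster solution on small cases.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): brute-force reference.
// Returns the longest substrings contained in more than half of `dna`,
// sorted alphabetically; an empty result corresponds to printing "?".
std::vector<std::string> longestSharedBrute(const std::vector<std::string>& dna) {
    const std::size_t need = dna.size() / 2 + 1;  // "more than half"
    std::size_t maxLen = 0;
    for (const std::string& s : dna) maxLen = std::max(maxLen, s.size());

    for (std::size_t len = maxLen; len >= 1; --len) {
        // std::set keeps the candidates unique and in alphabetical order.
        std::set<std::string> candidates;
        for (const std::string& s : dna)
            for (std::size_t i = 0; i + len <= s.size(); ++i)
                candidates.insert(s.substr(i, len));

        std::vector<std::string> found;
        for (const std::string& c : candidates) {
            std::size_t count = 0;  // number of input strings containing c
            for (const std::string& s : dna)
                if (s.find(c) != std::string::npos) ++count;
            if (count >= need) found.push_back(c);
        }
        if (!found.empty()) return found;  // first hit is the longest length
    }
    return {};
}
```

Calling it on the first sample test case (`abcdefg`, `bcdefgh`, `cdefghi`) yields `bcdefg` and `cdefgh`; on the second (`xxx`, `yyy`, `zzz`) it yields nothing, which corresponds to `?`.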
Source
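/*
 * POJ 3294
 * Given n strings, find the longest substring that occurs in more than half
 * of them, and print all such substrings in lexicographic order.
 * Concatenate the n strings, separated by characters that do not occur in
 * the input, and build the suffix array. Then binary-search the answer
 * length: group adjacent suffixes by height and check whether the suffixes
 * of some group come from at least k of the original strings.
 */
#include <cstdio>
#include <cstring>
#include <algorithm>

#define N 101010

// Note: the global name `rank` clashes with std::rank if `using namespace std`
// is added, so std:: is spelled out explicitly below.
int t1[N], t2[N], s[N], c[N], sa[N], height[N], rank[N], b[N];
int vis[105]; // keep this small: it is cleared for every group, a larger array causes TLE
char str[105][1005];

// Prefix-doubling suffix array of s[0..n-1]; all values must be < m.
void build_sa(int *s, int n, int m)
{
    int *x = t1, *y = t2, i, k, p;
    for (i = 0; i < m; i++) c[i] = 0;
    for (i = 0; i < n; i++) c[x[i] = s[i]]++;
    for (i = 1; i < m; i++) c[i] += c[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--c[x[i]]] = i;
    for (k = 1; k <= n; k <<= 1)
    {
        p = 0;
        for (i = n - k; i < n; i++) y[p++] = i;
        for (i = 0; i < n; i++) if (sa[i] >= k) y[p++] = sa[i] - k;
        for (i = 0; i < m; i++) c[i] = 0;
        for (i = 0; i < n; i++) c[x[y[i]]]++;
        for (i = 1; i < m; i++) c[i] += c[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i];
        std::swap(x, y);
        p = 1;
        x[sa[0]] = 0;
        for (i = 1; i < n; i++)
            x[sa[i]] = y[sa[i - 1]] == y[sa[i]] && y[sa[i - 1] + k] == y[sa[i] + k] ? p - 1 : p++;
        if (p >= n) break;
        m = p;
    }
}

// height[i] = longest common prefix of the suffixes sa[i-1] and sa[i].
// n is the text length without the trailing 0 sentinel.
void getheight(int n)
{
    int i, k = 0, j;
    for (i = 0; i <= n; i++) rank[sa[i]] = i;
    for (i = 0; i < n; i++)
    {
        if (k) k--;
        j = sa[rank[i] - 1];
        while (s[i + k] == s[j + k]) k++;
        height[rank[i]] = k;
    }
}

// Returns 1 if some group of suffixes sharing a prefix of length >= len
// contains suffixes from at least m distinct strings.
int check(int len, int m, int n)
{
    int i, ret, tmp;
    memset(vis, 0, sizeof(vis));
    tmp = b[sa[1]];
    ret = 0;
    if (tmp != -1 && !vis[tmp]) { ret++; vis[tmp] = 1; }
    for (i = 2; i <= n; i++)
    {
        if (height[i] < len)
        {
            // A new group starts here: test the finished one, then reset.
            if (ret >= m) return 1;
            ret = 0;
            tmp = b[sa[i]];
            memset(vis, 0, sizeof(vis));
            if (tmp != -1 && !vis[tmp]) { ret++; vis[tmp] = 1; }
        }
        else
        {
            tmp = b[sa[i]];
            if (tmp != -1 && !vis[tmp]) { ret++; vis[tmp] = 1; }
            if (ret >= m) return 1;
        }
    }
    return 0;
}

// Prints the length-len prefix of every qualifying group. Groups are visited
// in suffix-array order, so the output is already in lexicographic order.
void put(int len, int m, int n)
{
    int ret = 0, tmp, i, j;
    tmp = b[sa[1]];
    memset(vis, 0, sizeof(vis));
    if (tmp != -1 && !vis[tmp]) { ret++; vis[tmp] = 1; }
    for (i = 2; i <= n; i++)
    {
        if (height[i] < len)
        {
            if (ret >= m)
            {
                for (j = 0; j < len; j++) printf("%c", s[sa[i - 1] + j]);
                printf("\n");
            }
            ret = 0;
            tmp = b[sa[i]];
            memset(vis, 0, sizeof(vis));
            if (tmp != -1 && !vis[tmp]) { ret++; vis[tmp] = 1; }
        }
        else
        {
            tmp = b[sa[i]];
            if (tmp != -1 && !vis[tmp]) { ret++; vis[tmp] = 1; }
        }
    }
    // The last group is closed by the end of the array (here i == n + 1).
    if (ret >= m)
    {
        for (j = 0; j < len; j++) printf("%c", s[sa[i - 1] + j]);
        printf("\n");
    }
}

int main()
{
    int n, k, l, r, m, ans, i, j, flag = 1;
    while (scanf("%d", &n) == 1 && n != 0)
    {
        if (flag == 1) flag = 0;
        else printf("\n"); // blank line between test cases
        k = 0;
        for (i = 0; i < n; i++)
        {
            scanf("%s", str[i]);
            // b[pos] records which string a position belongs to (-1 = separator).
            for (j = 0; str[i][j] != '\0'; j++) b[k] = i, s[k++] = str[i][j];
            b[k] = -1;
            s[k++] = i + 130; // a distinct separator above 'z' for each string
        }
        if (n == 1)
        {
            // With one string the threshold is 1, and check() would accept any
            // length for a single-suffix group, so answer directly.
            printf("%s\n", str[0]);
            continue;
        }
        k--;
        s[k] = 0; // the last separator becomes the unique smallest sentinel
        build_sa(s, k + 1, 300);
        getheight(k);
        l = 0; r = 1000;
        m = n / 2 + 1; // "more than half"
        ans = -1;
        while (l <= r)
        {
            int mid = (l + r) >> 1;
            if (check(mid, m, k)) { ans = mid; l = mid + 1; }
            else r = mid - 1;
        }
        if (ans <= 0) printf("?\n");
        else put(ans, m, k);
    }
    return 0;
}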
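To make the three steps in the header comment easier to follow (concatenate with unique separators, group adjacent suffixes by height, binary-search the length), here is a compact sketch of the same idea. It is an illustration under stated assumptions, not the post's implementation: the suffix array is built by plain sorting rather than prefix doubling, so it only suits small inputs, and names such as `longestSharedSA`, `owner` and `remain` are made up for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sketch only: naive suffix sorting stands in for the prefix-doubling
// build_sa of the post. All names here are illustrative assumptions.
std::vector<std::string> longestSharedSA(const std::vector<std::string>& dna) {
    // 1. Concatenate; every string is followed by its own separator value.
    std::vector<int> text, owner;  // owner[p] = source string, -1 = separator
    for (std::size_t i = 0; i < dna.size(); ++i) {
        for (char ch : dna[i]) { text.push_back(ch); owner.push_back((int)i); }
        text.push_back(256 + (int)i);
        owner.push_back(-1);
    }
    const int n = (int)text.size();

    // remain[p] = number of letters from p up to the next separator.
    std::vector<int> remain(n + 1, 0);
    for (int p = n - 1; p >= 0; --p)
        remain[p] = owner[p] < 0 ? 0 : remain[p + 1] + 1;

    // 2. Suffix array by direct comparison, then adjacent LCPs ("height").
    std::vector<int> sa(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(text.begin() + a, text.end(),
                                            text.begin() + b, text.end());
    });
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && text[a + k] == text[b + k]) ++k;
        height[i] = k;
    }

    // 3. For a fixed len, scan maximal runs with height >= len and keep the
    //    runs whose suffixes come from at least `need` distinct strings.
    const int need = (int)dna.size() / 2 + 1;  // "more than half"
    auto collect = [&](int len) {
        std::vector<std::string> out;
        for (int i = 0; i < n;) {
            int j = i + 1;
            while (j < n && height[j] >= len) ++j;  // group is sa[i..j-1]
            std::vector<bool> seen(dna.size(), false);
            int distinct = 0, first = -1;
            for (int t = i; t < j; ++t) {
                int p = sa[t];
                if (remain[p] < len) continue;  // too short or a separator
                if (first < 0) first = p;
                if (!seen[owner[p]]) { seen[owner[p]] = true; ++distinct; }
            }
            if (distinct >= need) {
                std::string sub;
                for (int k = 0; k < len; ++k) sub.push_back((char)text[first + k]);
                out.push_back(sub);  // suffix-array order is alphabetical
            }
            i = j;
        }
        return out;
    };

    // 4. Binary search: if a length works, every shorter length works too.
    int lo = 1, hi = 0, best = 0;
    for (const std::string& s : dna) hi = std::max(hi, (int)s.size());
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (!collect(mid).empty()) { best = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    return best > 0 ? collect(best) : std::vector<std::string>{};
}
```

The `remain` filter is a design choice of this sketch: it rejects suffixes that are shorter than `len` within their own string, which keeps single-suffix groups from being counted when the threshold is 1.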