Google Code Jam: Problem A. Alien Language

Source: Internet. Editor: 程序博客网. Time: 2024/04/30 15:02

Problem

After years of study, scientists at Google Labs have discovered an alien language transmitted from a faraway planet. The alien language is very unique in that every word consists of exactly L lowercase letters. Also, there are exactly D words in this language.

Once the dictionary of all the words in the alien language was built, the next breakthrough was to discover that the aliens have been transmitting messages to Earth for the past decade. Unfortunately, these signals are weakened due to the distance between our two planets and some of the words may be misinterpreted. In order to help them decipher these messages, the scientists have asked you to devise an algorithm that will determine the number of possible interpretations for a given pattern.

A pattern consists of exactly L tokens. Each token is either a single lowercase letter (the scientists are very sure that this is the letter) or a group of unique lowercase letters surrounded by parenthesis ( and ). For example: (ab)d(dc) means the first letter is either a or b, the second letter is definitely d and the last letter is either d or c. Therefore, the pattern (ab)d(dc) can stand for either one of these 4 possibilities: add, adc, bdd, bdc.
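To make the token rules concrete, here is a minimal sketch that splits a pattern into its tokens and enumerates every word it can stand for. The helper names `tokenize` and `expand` are illustrative and not part of the problem statement, and the code assumes a well-formed pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a pattern such as "(ab)d(dc)" into one string of candidate letters
// per position: {"ab", "d", "dc"}.
std::vector<std::string> tokenize(const std::string& pattern) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] == '(') {
            std::size_t close = pattern.find(')', i);
            assert(close != std::string::npos);  // assumption: input is well-formed
            tokens.push_back(pattern.substr(i + 1, close - i - 1));
            i = close + 1;
        } else {
            tokens.push_back(std::string(1, pattern[i]));
            ++i;
        }
    }
    return tokens;
}

// Enumerate every word the pattern can stand for; the leftmost token varies slowest.
std::vector<std::string> expand(const std::string& pattern) {
    std::vector<std::string> words(1, "");
    for (const std::string& token : tokenize(pattern)) {
        std::vector<std::string> next;
        for (const std::string& prefix : words)
            for (char c : token)
                next.push_back(prefix + c);
        words.swap(next);
    }
    return words;
}
```

Calling `expand("(ab)d(dc)")` yields the four words listed above. The number of results is the product of the token sizes, so full enumeration is only practical for short patterns.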

Input

The first line of input contains 3 integers, L, D and N separated by a space. D lines follow, each containing one word of length L. These are the words that are known to exist in the alien language. N test cases then follow, each on its own line and each consisting of a pattern as described above. You may assume that all known words provided are unique.

Output

For each test case, output

Case #X: K

where X is the test case number, starting from 1, and K indicates how many words in the alien language match the pattern.

Limits

Small dataset

1 ≤ L ≤ 10
1 ≤ D ≤ 25
1 ≤ N ≤ 10

Large dataset

1 ≤ L ≤ 15
1 ≤ D ≤ 5000
1 ≤ N ≤ 500

Sample

Input

3 5 4
abc
bca
dac
dbc
cba
(ab)(bc)(ca)
abc
(abc)(abc)(abc)
(zyx)bc

Output

Case #1: 2
Case #2: 1
Case #3: 3
Case #4: 0

My code:

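#include <cstdio>
#include <cstring>   // memset
#include <iostream>
#include <string>

const int Max_D = 5000 + 10;
const int Max_L = 15 + 5;

std::string word[Max_D];   // dictionary words
std::string alien;         // the pattern currently being processed
char hash[Max_L][26];      // hash[i][c] == 1 if letter 'a' + c is allowed at position i
int L, D, N;

int check();

int main()
{
    freopen("A-small-practice.in", "r", stdin);
    // freopen("a.in", "r", stdin);
    freopen("a.out", "w", stdout);
    std::cin >> L >> D >> N;
    for (int i = 0; i != D; ++i)
        std::cin >> word[i];
    for (int i = 0; i != N; ++i)
        std::cout << "Case #" << (i + 1) << ": " << check() << std::endl;
    return 0;
}

// Reads one pattern and returns how many dictionary words match it.
int check()
{
    std::memset(hash, 0, sizeof(hash)); // No.1: clear the table
    std::cin >> alien;
    int ptr = 0;
    for (int i = 0; i != L; ++i) {
        if (alien[ptr] == '(') {
            // group token: mark every letter up to the closing ')'
            ++ptr;
            while (alien[ptr] != ')') {
                hash[i][alien[ptr] - 'a'] = 1;
                ++ptr;
            }
            ++ptr;
        } else {
            // single-letter token
            hash[i][alien[ptr] - 'a'] = 1;
            ++ptr;
        }
    }
    int Yes, times = 0; // No.2: test every dictionary word against the table
    for (int i = 0; i != D; ++i) {
        Yes = 1;
        for (int j = 0; j != L; ++j) {
            if (hash[j][word[i][j] - 'a'] != 1) {
                Yes = 0;
                break;
            }
        }
        times += Yes;
    }
    return times;
}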

Approach:

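For this problem the small and large datasets make no difference to the program; just mind the limits and size the arrays so that they cannot overflow. The key is to pick the right direction. A brute-force comparison would first generate every word a pattern can stand for, which is already expensive because the count is the product of the group sizes over up to 15 positions, and would then compare each generated word against the dictionary one by one, so handling even a single pattern is costly.

The approach here uses the fact that alien words are built from the 26 lowercase letters and that each position of a pattern may allow several of them, so a simple hash mapping is built for each pattern. Concretely, it is a two-dimensional array char hash[length][26], where length is the number of positions in the pattern and 26 is the number of letters in the English alphabet.

First, at No.1, the whole array is set to zero. Note the third argument of memset: it is the number of bytes to set, and the simplest choice is usually sizeof of the array itself. Then, for each position, the entries of the letters allowed there are set to 1. Take the pattern (abcd)(sbde)(ds)a(dskl), which has five positions. The first position allows a, b, c, d, so hash[0][0], hash[0][1], hash[0][2], hash[0][3] are set to 1. The second position allows s, b, d, e, so hash[1]['s'-'a'], hash[1]['b'-'a'], hash[1]['d'-'a'], hash[1]['e'-'a'] are set to 1, and so on.

Once this table is built for a pattern, a single pass over the letters of a dictionary word tells whether it matches any of the words the pattern can stand for. For example, if the dictionary contains asdad, look up hash[0]['a'-'a'], hash[1]['s'-'a'], and so on; the word matches only if every entry is 1. That is the general idea of the solution.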
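As a variation on the table described above, the 26 flags of each position can be packed into the bits of one integer. This is a sketch of that variant, not the code used in this post; the names `buildMasks` and `countMatches` are made up for illustration, and the pattern is assumed to be well-formed and lowercase.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Build one 26-bit mask per position: bit (c - 'a') is set when letter c
// is allowed at that position.
std::vector<unsigned> buildMasks(const std::string& pattern) {
    std::vector<unsigned> masks;
    bool inGroup = false;
    unsigned current = 0;
    for (char c : pattern) {
        if (c == '(') {
            inGroup = true;
            current = 0;
        } else if (c == ')') {
            inGroup = false;
            masks.push_back(current);
        } else {
            assert(c >= 'a' && c <= 'z');  // assumption: lowercase letters only
            unsigned bit = 1u << (c - 'a');
            if (inGroup) current |= bit;
            else masks.push_back(bit);
        }
    }
    return masks;
}

// Count the dictionary words whose every letter is allowed at its position.
int countMatches(const std::vector<std::string>& words, const std::string& pattern) {
    const std::vector<unsigned> masks = buildMasks(pattern);
    int count = 0;
    for (const std::string& w : words) {
        if (w.size() != masks.size()) continue;
        bool ok = true;
        for (std::size_t i = 0; i < w.size() && ok; ++i)
            ok = ((masks[i] >> (w[i] - 'a')) & 1u) != 0;
        if (ok) ++count;
    }
    return count;
}
```

With the sample dictionary {abc, bca, dac, dbc, cba}, `countMatches(dict, "(ab)(bc)(ca)")` gives 2, in line with Case #1. Each pattern costs about D × L lookups, which is at most 5000 × 15 for the large dataset.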

I am not very clear on hashing myself, so I looked it up online and am pasting what I found:

Hash table

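Ordinary search methods are based on comparison, and their efficiency depends on the number of comparisons. Ideally, a lookup would fetch the wanted record in a single access, without any comparison. That requires a fixed correspondence f between a record's storage location and its key, so that looking up k only requires computing its image f(k) under f. This correspondence f is called a hash function, and a table built on this idea is called a hash table. A hash table is convenient to access, but collisions arise easily when storing: different keys can map to the same hash address. Choosing the hash function and resolving collisions are the key issues.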
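A small sketch of both ideas, using lowercase letters as keys. `letterSlot` mirrors the c - 'a' indexing used in the solution; `foldedSlot` and its `slots` parameter are hypothetical, added only to show how a collision can arise when the table has fewer addresses than keys.

```cpp
#include <cassert>

// Direct-address hash for lowercase letters: 26 keys map to 26 distinct
// slots, so no two keys share an address.
int letterSlot(char c) {
    assert(c >= 'a' && c <= 'z');
    return c - 'a';
}

// Hypothetical smaller table: the same keys folded into `slots` buckets.
// With fewer buckets than keys, different keys can land on the same address.
int foldedSlot(char c, int slots) {
    assert(slots > 0);
    return letterSlot(c) % slots;
}
```

For instance, `foldedSlot('a', 10)` and `foldedSlot('k', 10)` both give 0, which is a collision, while `letterSlot` never collides. That is why the table in this problem needs no collision handling.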

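Solving this problem also needs fairly large arrays. I first used string and kept getting inexplicable errors; after switching to char arrays it worked. I looked it up, and the explanation I found was that iostream's overload of >> does not take the standard string class into account, which may cause errors. (In standard C++ the >> overload for std::string is declared in <string>, so that header has to be included.) Later the error inexplicably went away again, so I am just noting it here.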
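For reference, a minimal sketch of reading words into std::string with >>, with <string> included explicitly. The function name `readWords` is illustrative; it takes any input stream, so it can be tried on a string stream instead of a file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>   // declares operator>> for std::string
#include <vector>

// Read up to `count` whitespace-separated words from an input stream.
std::vector<std::string> readWords(std::istream& in, int count) {
    assert(count >= 0);
    std::vector<std::string> words;
    std::string w;
    for (int i = 0; i < count && (in >> w); ++i)
        words.push_back(w);
    return words;
}
```

Passing `std::cin` reads from standard input in the same way, and newlines and spaces are treated alike as separators.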