A simple pattern-string search (with support for the wildcard '*')

Source: Internet  Time: 2024/05/10 20:27
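Some of the homework in my data structures course is fairly hard, so I decided to upload the problems that were valuable or that took real effort. I will add comments later, as a memento of my days as a newbie.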

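【Problem Description】

Search the file string.in in the current directory for a given string, and write the matched strings and their line numbers to the file string.out in the current directory. Requirements:

1) The given string is read from the keyboard. It contains only upper-case and lower-case letters, digits, the bracket characters '[' and ']', the character '*', and the character '^'. Its length does not exceed 20.

2) The character '^' may appear only inside brackets, and only as the first character inside them. Apart from '^', the brackets contain at least one letter or digit.

3) The character '*' never appears inside brackets.

4) Brackets appear at most once in the given string. If the brackets do not contain '^', the character at that position matches when it equals any one of the characters inside the brackets; if the brackets contain '^', the character at that position matches only when it differs from every character inside the brackets.

5) The character '*' matches zero or more arbitrary characters.

6) '*' appears at most once in the given string.

7) The scope of '*' is limited to a single line; it never matches across lines.

8) When several strings match through '*', output only one of them: the shortest.

9) The search is case-insensitive.

10) Output the line number first (line numbers start at 1), followed by a colon ':', then the matched strings; separate multiple strings with commas ','. Separate lines with a single newline.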
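Rules 2, 4 and 9 together describe how a single bracket box is tested. A minimal sketch of that test, assuming the caller has already extracted the text between '[' and ']'; the helper name class_accepts and its parameters are my own and do not appear in the program below:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if character c is accepted by the bracket
 * body `set` of length n, where `set` is the text between '[' and ']'
 * (for example "ao" or "^ab"). The comparison ignores case (rule 9). */
int class_accepts(const char *set, size_t n, char c)
{
    int negate = (n > 0 && set[0] == '^');   /* rule 2: '^' only comes first */
    size_t start = negate ? 1 : 0;
    int found = 0;

    for (size_t k = start; k < n; k++) {
        if (tolower((unsigned char)set[k]) == tolower((unsigned char)c)) {
            found = 1;
            break;
        }
    }
    /* rule 4: a plain box needs a hit, a '^' box needs no hit at all */
    return negate ? !found : found;
}
```

For instance, class_accepts("^ab", 3, 'B') rejects 'B' because the comparison is case-insensitive.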
 
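【Input Format】

First read the string to search for from standard input (the keyboard). The file to be searched, string.in, is located in the current directory.

【Output Format】

Write the search results to string.out in the current directory.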
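The line layout required by rule 10 can be sketched as a small formatting routine. This is only an illustration under my own assumptions: the name format_result, the buffer-based interface and the return codes are hypothetical, and the program below writes straight to the file with fprintf instead.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<line_no>:<m1>,<m2>,...\n" into out.
 * Returns 0 on success, -1 if there are no matches or out is too small. */
int format_result(char *out, size_t cap, int line_no,
                  const char *matches[], int n)
{
    if (n <= 0 || cap == 0)
        return -1;               /* lines without matches print nothing */

    int used = snprintf(out, cap, "%d:", line_no);
    if (used < 0 || (size_t)used >= cap)
        return -1;

    for (int k = 0; k < n; k++) {
        /* a comma goes before every match except the first one */
        int w = snprintf(out + used, cap - (size_t)used, "%s%s",
                         k > 0 ? "," : "", matches[k]);
        if (w < 0 || (size_t)w >= cap - (size_t)used)
            return -1;
        used += w;
    }

    if ((size_t)used + 2 > cap)
        return -1;               /* no room for '\n' plus terminator */
    out[used] = '\n';
    out[used + 1] = '\0';
    return 0;
}
```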
 
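【Sample Input 1】

zh[ao]ng
Suppose string.in contains:
Zhang ying ju zhu zai ZhongGuo. 
Ta zheng zai du gao zhong.
Bie ren dou jia ta xiao zhang.

【Sample Output 1】

string.out contains:
1:Zhang,Zhong
2:zhong
3:zhang

【Sample 1 Explanation】

The given string contains brackets, meaning the third character may be either a or o, and the search is case-insensitive. Zhang and Zhong in the first line of the text therefore both match the given string, so the output is 1:Zhang,Zhong. The other lines follow the same reasoning.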
 
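【Sample Input 2】

a[^ab]a
string.in contains:
Do you like banana?
ABA is the abbreviation of American Bankers Association.

【Sample Output 2】

string.out contains:
1:ana,ana

【Sample 2 Explanation】

The brackets in the given string contain '^', meaning the first and third characters must both be a, while the second character must be neither a nor b. In the first line, banana contains two (overlapping) strings ana that match the given string, so the output is 1:ana,ana. In the second line, the second character of ABA is B; because the search is case-insensitive, it equals the b inside the brackets, so ABA does not match.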
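The two ana matches overlap, which only works if every character of the line is tried as a possible starting point. A minimal sketch of that scan, with the sample pattern a[^ab]a hard-coded for brevity; both function names are hypothetical and are not part of the program below:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical, hard-coded check for the sample pattern a[^ab]a at s[0..2]. */
static int matches_sample2(const char *s)
{
    int c0 = tolower((unsigned char)s[0]);
    if (c0 != 'a')
        return 0;
    int c1 = tolower((unsigned char)s[1]);
    if (c1 == '\0' || c1 == 'a' || c1 == 'b')
        return 0;                 /* end of line, or excluded by [^ab] */
    return tolower((unsigned char)s[2]) == 'a';
}

/* Tries every start offset in line, so overlapping matches are all found.
 * Stores the matching offsets in starts[] and returns how many were found. */
int find_sample2_starts(const char *line, int starts[], int max_starts)
{
    int count = 0;
    for (int i = 0; line[i] != '\0' && count < max_starts; i++) {
        if (matches_sample2(line + i))
            starts[count++] = i;
    }
    return count;
}
```

Advancing the start position by one each time, rather than jumping past the previous match, is what lets the second ana (which shares an a with the first) be reported.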
 
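【Sample Input 3】

w*d
string.in contains:
wwwdd
world is a nice word

【Sample Output 3】

string.out contains:
1:wwwd,wwd,wd
2:world,word

【Sample 3 Explanation】


The given string contains '*', so within one line it can match any string that starts with 'w' and ends with 'd'. In the first line, starting from the first 'w', both "wwwd" and "wwwdd" match; by rule 8 above, the shorter one, "wwwd", is chosen. Continuing in the same way from the second and third 'w' gives "wwd" and "wd". The same rule applied to the second line yields "world" and "word".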
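Since '*' appears at most once, the pattern splits into a prefix and a suffix, and the shortest match from a given start is the prefix followed by the earliest later position where the whole suffix fits. The sketch below assumes a pattern with only letters, digits and exactly one '*' (no brackets); the names literal_at and shortest_star_match are hypothetical. It reads a trailing '*' as matching zero characters, which is my interpretation of rule 8 and differs from the program below, where a trailing '*' consumes the rest of the line.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive test that lit[0..n) occurs at s; stops safely at '\0'. */
static int literal_at(const char *s, const char *lit, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        if (s[k] == '\0' ||
            tolower((unsigned char)s[k]) != tolower((unsigned char)lit[k]))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: returns the length of the shortest match that begins
 * at line[start], or -1 if there is none (or the pattern has no '*'). */
int shortest_star_match(const char *line, size_t start, const char *pattern)
{
    const char *star = strchr(pattern, '*');
    if (star == NULL || start > strlen(line))
        return -1;

    size_t pre_len = (size_t)(star - pattern);
    const char *suffix = star + 1;
    size_t suf_len = strlen(suffix);

    if (!literal_at(line + start, pattern, pre_len))
        return -1;

    /* Grow the gap covered by '*' one character at a time; the first gap
     * at which the whole suffix fits gives the shortest match. */
    for (size_t gap = 0; ; gap++) {
        const char *p = line + start + pre_len + gap;
        if (literal_at(p, suffix, suf_len))
            return (int)(pre_len + gap + suf_len);
        if (*p == '\0')
            return -1;
    }
}
```

Checking the whole suffix at each gap, rather than only its first character, matters for patterns such as a*bc against abxbc, where the first b is not followed by c.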


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Lowercase an ASCII letter. Named to_lower so that it does not clash with
// the standard library's tolower().
char to_lower(char s)
{
    if (s >= 'A' && s <= 'Z')
        s += 'a' - 'A';
    return s;
}

// Checks whether the text in scans[], starting at position pos_scans, matches
// the pattern given in regex[]. On success the matched text is stored in
// prints[] and 1 is returned; otherwise 0 is returned.
int regex_match(char scans[], int pos_scans, char regex[], char prints[])
{
    int iter_regex = 0; // position of the scanner in regex[]
    int iter_scans = 0; // offset of the scanner in scans[], relative to pos_scans
    int len_regex = (int)strlen(regex);
    char dic[81];       // characters listed inside a "[]" box
    int i, j;

    while (iter_regex < len_regex)
    {
        if (regex[iter_regex] != '[' && regex[iter_regex] != '*')
        {
            // a plain letter or digit (']' never gets here: the '[' case consumes it)
            if (to_lower(regex[iter_regex]) == to_lower(scans[pos_scans + iter_scans]))
            {
                iter_regex++;
                iter_scans++;
            }
            else break;
        }
        else if (regex[iter_regex] == '[')
        {
            // start of a "[]" box: copy its contents into dic[]
            i = 0;
            iter_regex++;
            while (regex[iter_regex] != ']')
            {
                dic[i++] = regex[iter_regex];
                iter_regex++;
            }
            dic[i] = '\0';
            if (dic[0] == '^')
            {
                // '^' inverts the test: the scanned letter must not appear in the box
                for (j = 1; j < i; j++)
                {
                    if (to_lower(scans[pos_scans + iter_scans]) == to_lower(dic[j]))
                        break;
                }
                if (j == i)
                {
                    // j reached i: no letter in the box matched, so this is a success
                    iter_scans++;
                    iter_regex++; // step past ']'
                }
                else break;
            }
            else
            {
                // no '^': the scanned letter must appear in the box
                int flag = 0;
                for (j = 0; j < i; j++)
                {
                    if (to_lower(scans[pos_scans + iter_scans]) == to_lower(dic[j]))
                    {
                        flag = 1;
                        break;
                    }
                }
                if (flag)
                {
                    iter_regex++; // step past ']'
                    iter_scans++;
                }
                else break;
            }
        }
        else if (regex[iter_regex] == '*')
        {
            // '*' matches zero or more arbitrary letters
            if (iter_regex == len_regex - 1)
            {
                // '*' is the last character of regex[]:
                // all the remaining letters in scans[] match
                iter_regex++;
                while (scans[pos_scans + iter_scans] != '\0') iter_scans++;
                break;
            }
            else if (regex[iter_regex + 1] != '[')
            {
                // '*' is followed by a plain letter: move forward in scans[]
                // until that letter is found or the line ends
                while (to_lower(scans[pos_scans + iter_scans]) != to_lower(regex[iter_regex + 1]))
                {
                    iter_scans++;
                    if (scans[pos_scans + iter_scans] == '\0') break;
                }
                if (to_lower(scans[pos_scans + iter_scans]) == to_lower(regex[iter_regex + 1]))
                {
                    // the letter was found: consume '*' and the letter together
                    iter_scans++;
                    iter_regex += 2;
                }
                else break; // reached the end of scans[]: the match fails
            }
            else
            {
                // '*' is followed by '[': skip both so that dic[] holds only
                // the contents of the box (otherwise '[' would hide a leading '^')
                i = 0;
                iter_regex += 2;
                while (regex[iter_regex] != ']')
                {
                    dic[i++] = regex[iter_regex];
                    iter_regex++;
                }
                dic[i] = '\0';
                while (scans[pos_scans + iter_scans] != '\0')
                {
                    if (dic[0] == '^')
                    {
                        // look for the first letter that does not appear in the box
                        for (j = 1; j < i; j++)
                        {
                            if (to_lower(scans[pos_scans + iter_scans]) == to_lower(dic[j]))
                            {
                                // the letter is in the box: try the next letter in scans[]
                                iter_scans++;
                                break;
                            }
                        }
                        if (j == i)
                        {
                            // the letter is not in the box: "*[^...]" is matched
                            iter_scans++;
                            iter_regex++; // step past ']'
                            break;
                        }
                    }
                    else
                    {
                        // look for the first letter that appears in the box
                        int flag = 0;
                        for (j = 0; j < i; j++)
                        {
                            if (to_lower(scans[pos_scans + iter_scans]) == to_lower(dic[j]))
                            {
                                flag = 1;
                                break;
                            }
                        }
                        if (flag)
                        {
                            iter_regex++; // step past ']'
                            iter_scans++;
                            break;
                        }
                        else iter_scans++; // not in the box: try the next letter
                    }
                }
            }
        }
        if (scans[pos_scans + iter_scans] == '\0')
        {
            // the line is used up; a trailing '*' can still match zero letters
            if (iter_regex == len_regex - 1 && regex[iter_regex] == '*')
                iter_regex++;
            break;
        }
    }

    if (iter_regex == len_regex)
    {
        // all of regex[] was consumed, so the match succeeded
        for (j = 0; j < iter_scans; j++)
            prints[j] = scans[pos_scans + j];
        prints[j] = '\0';
        return 1;
    }
    else return 0;
}

int main()
{
    FILE *fin, *fout;
    char regex[21];
    char scans[81];
    char prints[161];
    int line = 0;
    int i;

    if ((fin = fopen("string.in", "r")) == NULL)
        exit(1);
    if ((fout = fopen("string.out", "w")) == NULL)
        exit(1);
    // the pattern is at most 20 characters long, so cap the read at 20
    if (scanf("%20s", regex) != 1)
        exit(1);

    while (fgets(scans, 81, fin) != NULL)
    {
        // drop the line ending so that it can never become part of a match
        scans[strcspn(scans, "\r\n")] = '\0';
        line++;
        int flag = 1; // 1 until the first match on this line has been printed
        for (i = 0; scans[i] != '\0'; i++)
        {
            // try every position in the line as the start of a match
            if (regex_match(scans, i, regex, prints))
            {
                if (flag) fprintf(fout, "%d:", line);
                else fprintf(fout, ",");
                fprintf(fout, "%s", prints);
                flag = 0;
            }
        }
        if (!flag) fprintf(fout, "\n");
    }

    fclose(fin);
    fclose(fout);
    return 0;
}
