Sicily 1941. Parser
Source: Internet. Editor: 程序博客网. Time: 2024/05/22 16:59
Constraints
Time Limit: 1 secs, Memory Limit: 32 MB
Description
The CSV ("Comma Separated Value") file format is often used to exchange data between disparate applications. The file format, as it is used in Microsoft Excel, has become a pseudo standard throughout the industry, even among non-Microsoft platforms.
The CSV file format in this problem is defined below:
A CSV file consists of zero or more records, and a record consists of zero or more fields.
Each record ends with a line feed character (ASCII/LF=0x0A). However, fields may contain embedded line-breaks (see below), so a record may span more than one line.
Fields are separated with commas. Example: John,Doe,120 any st.,"Anytown, WW",08123
Leading and trailing space characters in a field are ignored unless the field is delimited by double quotes. So John , Doe ,... resolves to "John" and "Doe", etc. Space characters can be spaces or tabs.
Fields with embedded commas must be delimited with double-quote characters. In the above example, "Anytown, WW" has to be delimited in double quotes because it has an embedded comma.
Fields that contain double-quote characters must be entirely surrounded by double quotes, and each embedded double quote must be represented by a pair of consecutive double quotes. So, John "Da Man" Doe would convert to "John ""Da Man""",Doe, 120 any st.,...
A field that contains embedded line-breaks must be surrounded by double quotes. So:
Field 1: Conference room 1
Field 2:
John,
Please bring the M. Mathers file for review
-J.L.
Field 3: 10/18/2002
...
can be converted to:
Conference room 1, "John,
Please bring the M. Mathers file for review
-J.L. ",10/18/2002,...
Note that this is a single CSV record, even though it takes up more than one line in the CSV file. This works because the line breaks are embedded inside the double quotes of the field.
Fields with leading or trailing spaces must be delimited with double-quote characters. So, to preserve the leading and trailing spaces around the last name above: John ," Doe ",...
Fields may always be delimited with double quotes. The delimiters will always be discarded.
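The quoting rules above are easiest to see from the writer's side. The following is a minimal sketch, not part of the original problem: encodeField is a hypothetical helper name, and it quotes a field only when one of the rules above requires it (embedded comma, double quote, line-break, or a leading/trailing space or tab).

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not from the problem):
// encode one field according to the quoting rules listed above.
std::string encodeField(const std::string& field) {
    bool needsQuotes = false;
    if (!field.empty()) {
        const char first = field.front();
        const char last = field.back();
        // Leading or trailing spaces/tabs survive only inside quotes.
        if (first == ' ' || first == '\t' || last == ' ' || last == '\t')
            needsQuotes = true;
    }
    std::string escaped;
    for (char c : field) {
        // Commas, line-breaks and quotes all force quoting.
        if (c == ',' || c == '\n' || c == '"') needsQuotes = true;
        if (c == '"') escaped += '"';  // an embedded quote becomes a pair
        escaped += c;
    }
    return needsQuotes ? "\"" + escaped + "\"" : escaped;
}
```

For instance, under these assumptions encodeField("Anytown, WW") yields the field wrapped in double quotes, while encodeField("John") is returned unchanged. A parser for this problem has to undo exactly these steps.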
Your task is to write a program as a CSV parser which can parse the input CSV file correctly.
Input
Your program should read the content of input file and parse it according to the CSV format described above. Input is ended by EOF.
Every field contains no more than 50 characters, and there are at most 500 fields in the input file.
Output
If the input file isn’t a legal CSV format file, simply output “Wrong Format” in one line, otherwise you should output all fields. Please output one line-break (‘\n’) after each field. E.g. suppose the input file is like:
field11, field12, …, field1n
field21, field22, …, field2n
…
fieldm1, fieldm2, …, fieldmn
So the output should actually be like:
field11
field12
…
field1n
field21
field22
…
field2n
…
fieldm1
fieldm2
…
fieldmn
Note that one field may contain embedded line-breaks, so in the output file one field may occupy several lines. (You can refer to the Sample Input 3 and Sample Output 3 for details).
Sample Input
Sample Input 1:
field11, field12
field21, field22
Sample Input 2:
John, Doe , "Anytown, WW" , "John ""Da Mon"" Von"
Sample Input 3:
Conference room 1, "John,
Please bring the M.Mathers file for review
-J.L.", 3/20/2006
Sample Input 4:
John, "Wrong field" sample", Bob
Sample Output
Sample Output 1:
field11
field12
field21
field22
Sample Output 2:
John
Doe
Anytown, WW
John "Da Mon" Von
Sample Output 3:
Conference room 1
John,
Please bring the M.Mathers file for review
-J.L.
3/20/2006
Notes: In this case, the second field in the input is surrounded by double quotes and can therefore contain line-breaks. In the sample output the second field appears exactly as it does in the input, and a line-break is added after it, before the third field.
Sample Output 4:
Wrong Format
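// Problem#: 1941
// Submission#: 3589015
// The source code is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
// URI: http://creativecommons.org/licenses/by-nc-sa/3.0/
// All Copyright reserved by Informatic Lab of Sun Yat-sen University
#include <stdio.h>
#include <string.h>

const int MAX_BUFFER_LEN = 30000;
const int MAX_FIELD_NUM = 500;
const int MAX_FIELD_LEN = 100;

char buf[MAX_BUFFER_LEN];                   // the whole input
char str[MAX_FIELD_LEN];                    // the field currently being built
int n;                                      // number of characters in buf
char fields[MAX_FIELD_NUM][MAX_FIELD_LEN];  // parsed fields, in order
int nField;

// Read the whole input into buf.
void input() {
    n = 0;
    int c;  // int rather than char, so EOF cannot be confused with a data byte
    while (n < MAX_BUFFER_LEN && (c = getchar()) != EOF) buf[n++] = (char)c;
}

// Store str[0..len) as the next field. Quote delimiters are marked in str
// with the sentinel -1 (this relies on plain char being signed).
void addField(char str[], int len) {
    char * p = str;
    // Skip the opening-quote sentinel. The len check keeps an empty field
    // from picking up a stale sentinel left by the previous field.
    if (len > 0 && p[0] < 0) p++;
    // Trim trailing spaces and tabs outside the quotes.
    while (len > 0 && (str[len - 1] == ' ' || str[len - 1] == '\t')) len--;
    // Drop the closing-quote sentinel.
    if (len > 0 && str[len - 1] < 0) len--;
    str[len] = '\0';
    strcpy(fields[nField], p);
    nField++;
}

void parse() {
    bool isValid = true;
    bool isQuoted = false;
    nField = 0;
    int count = 0;  // length of the field being built
    for (int i = 0; i < n; i++) {
        switch (buf[i]) {
            case '\n':
            case ',':
                // Inside quotes these are data; otherwise they end the field.
                if (isQuoted) str[count++] = buf[i];
                else {
                    addField(str, count);
                    count = 0;
                }
                break;
            case '"':
                if (isQuoted) {
                    if (i + 1 < n && buf[i + 1] == '"') {
                        // A pair of quotes stands for one literal quote.
                        str[count++] = '"';
                        i++;
                    } else {
                        str[count++] = -1;  // closing quote
                        isQuoted = false;
                    }
                } else {
                    if (count == 0) {
                        str[count++] = -1;  // opening quote
                        isQuoted = true;
                    } else isValid = false;  // quote in the middle of a field
                }
                break;
            case ' ':
            case '\t':
                // Leading spaces of an unquoted field are skipped.
                if (isQuoted || count > 0) str[count++] = buf[i];
                break;
            default:
                str[count++] = buf[i];
                break;
        }
        if (!isValid) break;
    }
    if (isQuoted) isValid = false;  // unterminated quoted field
    if (!isValid) printf("Wrong Format\n");
    else {
        // Flush the last field when the input does not end with a line feed.
        if (count > 0 || nField == 0) addField(str, count);
        for (int i = 0; i < nField; i++) printf("%s\n", fields[i]);
    }
}

int main() {
    input();
    parse();
    return 0;
}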
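For comparison, here is a rough sketch of the same idea written as an explicit state machine over std::string, so no sentinel characters or fixed-size buffers are needed. It is not the submitted solution: parseCsv and the State names are made up for illustration, and it makes one extra assumption of its own, namely that any non-blank text after a closing quote is treated as "Wrong Format". Like the code above, it ends a field at every unquoted comma or line feed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch only: parseCsv is a hypothetical name. Fills `fields` and returns
// true, or returns false where the judge would expect "Wrong Format".
bool parseCsv(const std::string& in, std::vector<std::string>& fields) {
    enum class State { Start, Unquoted, Quoted, AfterQuote };
    State st = State::Start;
    std::string cur;
    fields.clear();

    // Finish the current field; only unquoted fields get trailing blanks trimmed.
    auto endField = [&]() {
        if (st == State::Unquoted)
            while (!cur.empty() && (cur.back() == ' ' || cur.back() == '\t'))
                cur.pop_back();
        fields.push_back(cur);
        cur.clear();
        st = State::Start;
    };

    for (std::size_t i = 0; i < in.size(); ++i) {
        const char c = in[i];
        const bool blank = (c == ' ' || c == '\t');
        switch (st) {
        case State::Quoted:
            if (c != '"') cur += c;  // commas and line-breaks are data here
            else if (i + 1 < in.size() && in[i + 1] == '"') { cur += '"'; ++i; }
            else st = State::AfterQuote;
            break;
        case State::AfterQuote:
            if (c == ',' || c == '\n') endField();
            else if (!blank) return false;  // assumption: text after a closing quote is illegal
            break;
        case State::Start:
        case State::Unquoted:
            if (c == ',' || c == '\n') endField();
            else if (c == '"') {
                if (st == State::Unquoted) return false;  // quote inside an unquoted field
                st = State::Quoted;
            } else if (!blank || st == State::Unquoted) {
                cur += c;  // leading blanks are skipped while still in Start
                st = State::Unquoted;
            }
            break;
        }
    }
    if (st == State::Quoted) return false;  // unterminated quoted field
    if (st != State::Start || fields.empty()) endField();  // flush at EOF
    return true;
}
```

A caller would read the whole input into a string, call parseCsv, and then print either each field followed by a line-break or the single line "Wrong Format".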