Parsing CSV Files with Python


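1. Characteristics of CSV:

Each record occupies one line of text

Fields are separated by a delimiter (usually a comma)

Only data is stored, with no formatting or formulas

No special software is needed to read it; any text editor will do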
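
As a rough illustration of the first two points, the sketch below treats a small piece of CSV text as plain lines and splits each line on the comma. The sample records are made up for this example and are not taken from any file used in this post.

```python
# Hypothetical sample data, invented for illustration only.
sample = "name,instrument\nJohn,guitar\nRingo,drums\n"

# One record per line: split the text into lines first.
lines = sample.splitlines()

# Fields are separated by the delimiter: split each line on ",".
header = lines[0].split(",")
rows = [line.split(",") for line in lines[1:]]

print(header)
for row in rows:
    print(row)
```

Because the format is plain text, nothing beyond basic string methods is needed here, as long as no field contains a comma itself.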


2. Parsing a CSV file manually

# Your task is to read the input DATAFILE line by line, and for the first 10 lines (not including the header)
# split each line on "," and then for each line, create a dictionary
# where the key is the header title of the field, and the value is the value of that field in the row.
# The function parse_file should return a list of dictionaries,
# each data line in the file being a single list entry.
# Field names and values should not contain extra whitespace, like spaces or newline characters.
# You can use the Python string method strip() to remove the extra whitespace.
# You have to parse only the first 10 data lines in this exercise,
# so the returned list should have 10 entries!
import os

DATADIR = ""
DATAFILE = "beatles-diskography.csv"


def parse_file(datafile):
    data = []
    with open(datafile, "r") as f:
        # Read the header row and split it into field names
        header = f.readline().split(",")
        counter = 0
        for line in f:
            # Stop after the first 10 data lines
            if counter == 10:
                break
            fields = line.split(",")
            entry = {}
            for i, value in enumerate(fields):
                # strip() removes surrounding whitespace and the trailing newline
                entry[header[i].strip()] = value.strip()
            data.append(entry)
            counter += 1
    return data

3. Parsing a CSV file with the csv module

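A value in the data may itself contain the delimiter (a comma), which breaks parsing based on a simple split. The csv module handles such cases automatically.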

# -*- coding: UTF-8 -*-
import os
import pprint
import csv

DATADIR = ""
DATAFILE = "beatles-diskography.csv"


def parse_csv(datafile):
    data = []
    # In Python 3 the csv module expects a text-mode file opened with newline=""
    with open(datafile, "r", newline="") as sd:
        # DictReader builds one dictionary per row, keyed by the header field names
        r = csv.DictReader(sd)
        for line in r:
            data.append(line)
    return data


if __name__ == '__main__':
    datafile = os.path.join(DATADIR, DATAFILE)
    d = parse_csv(datafile)
    pprint.pprint(d)
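
To illustrate why embedded delimiters matter, the following minimal sketch compares a plain split on "," with csv.reader on a record whose first field contains a comma. The sample text is hypothetical and is read from an in-memory io.StringIO buffer rather than from beatles-diskography.csv.

```python
import csv
import io

# Hypothetical sample: the first field contains a comma,
# so it is wrapped in double quotes.
sample = 'Title,Released\n"Hello, Goodbye",1967\n'

# Naive approach: splitting on "," cuts the quoted field in two.
lines = sample.splitlines()
naive_fields = lines[1].split(",")

# csv.reader understands the quoting and keeps the field intact.
rows = list(csv.reader(io.StringIO(sample)))
csv_fields = rows[1]

print(naive_fields)  # ['"Hello', ' Goodbye"', '1967']
print(csv_fields)    # ['Hello, Goodbye', '1967']
```

The naive split yields three pieces for a two-column record, which would misalign the values with the header; csv.reader returns the two fields as intended.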

