利用LOAD DATA将csv文件中的数据导入MySQL

来源:互联网 发布:大数据支撑平台 编辑:程序博客网 时间:2024/06/02 04:56

先贴代码:

#coding=utf-8import csv, os, reimport MySQLdb_PATH = 'D:/bike_sharing_data/trip_data/trip_history/'_TABLE_NAME = 'trip_history_raw'filelist = os.listdir(_PATH)conn = MySQLdb.connect(    host='localhost',    port=3306,    user='root',    passwd='****',    db='bike_sharing')cursor = conn.cursor()pattern = re.compile(r'^20170[4-7]')for filename in filelist:    path = _PATH + filename    print filename    if pattern.match(filename):        data = 'LOAD DATA LOCAL INFILE \''+ path +'\' INTO TABLE '+ _TABLE_NAME + ' FIELDS TERMINATED BY \',\' ENCLOSED BY \'"\' LINES TERMINATED BY \'\n\' IGNORE 1 LINES (tripduration, starttime, stoptime, start_station_id, @1, @2, @3, end_station_id, @4, @5, @6, @7, usertype, @8, @9);'    else:        data = 'LOAD DATA LOCAL INFILE \''+ path +'\' INTO TABLE '+ _TABLE_NAME + ' FIELDS TERMINATED BY \',\' ENCLOSED BY \'"\' LINES TERMINATED BY \'\r\n\' IGNORE 1 LINES (tripduration, starttime, stoptime, start_station_id, @1, @2, @3, end_station_id, @4, @5, @6, @7, usertype, @8, @9);'    cursor.execute(data)    conn.commit()cursor.close()

这里使用LOAD DATA的方式将数据导入MySQL中,因为这种方法要比一行一行直接insert或者批量insert效率要高。

LOAD DATA格式:

LOAD DATA LOCAL INFILE 'path'     INTO TABLE '_TABLE_NAME'    FIELDS TERMINATED BY ','    ENCLOSED BY '"'    LINES TERMINATED BY '\n'    IGNORE 1 LINES    (tripduration, starttime, stoptime, start_station_id, @1, @2, @3, end_station_id, @4, @5, @6, @7, usertype, @8, @9);

FIELDS TERMINATED BY ',':表示字段分隔符;

ENCLOSED BY '"':表示每个字段由双引号包围;

LINES TERMINATED BY '\n':表示行分隔符,在windows下需使用LINES TERMINATED BY '\r\n',否则只能读取奇数(或偶数)行;特别地,若csv文件中不存在换行符,windows下要用TERMINATED BY '\n',否则无法插入数据

IGNORE 1 LINES:表示忽略文件的第一行,因为csv文件的第一行是列名,所以插入数据时要忽略掉;

(tripduration, starttime, stoptime, start_station_id, @1, @2, @3, end_station_id, @4, @5, @6, @7, usertype, @8, @9):当我们只需要将csv(或其他格式)文件中的部分列导入MySQL表中时,可以使用@a过滤掉其他列的数据;其中的列名为数据库中表的列名,排列顺序与csv文件的列保持一致。

原创粉丝点击