ValueError: No engine for filetype: 'csv': the fix, and a rewrite of code listing 7-2 from the book

Source: Internet  Time: 2024/05/18 01:49

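With some pointers from someone else, I solved a problem.

The dataset used is air_data.csv.

Only part of the dataset is shown here, just enough for the program to run.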

MEMBER_NO, FFP_DATE, FIRST_FLIGHT_DATE, GENDER, FFP_TIER, WORK_CITY, WORK_PROVINCE, WORK_COUNTRY, AGE, LOAD_TIME, FLIGHT_COUNT, BP_SUM, EP_SUM_YR_1, EP_SUM_YR_2, SUM_YR_1, SUM_YR_2, SEG_KM_SUM, WEIGHTED_SEG_KM, LAST_FLIGHT_DATE, AVG_FLIGHT_COUNT, AVG_BP_SUM, BEGIN_TO_FIRST, LAST_TO_END, AVG_INTERVAL, MAX_INTERVAL, ADD_POINTS_SUM_YR_1, ADD_POINTS_SUM_YR_2, EXCHANGE_COUNT, avg_discount, P1Y_Flight_Count, L1Y_Flight_Count, P1Y_BP_SUM, L1Y_BP_SUM, EP_SUM, ADD_Point_SUM, Eli_Add_Point_Sum, L1Y_ELi_Add_Points, Points_Sum, L1Y_Points_Sum, Ration_L1Y_Flight_Count, Ration_P1Y_Flight_Count, Ration_P1Y_BPS, Ration_L1Y_BPS, Point_NotFlight
549932006/11/022008/12/2460北京CN312014/03/31210505308074460239560234188580717558440.142014/03/3126.2563163.5213.48325358918335236640340.96163904310310724619725911174460399921144521111006197603702110.509523810.490476190.4872206910.5127773350
280652007/02/192007/08/03北京CN422014/03/31140362480041288171483167434293678367777.22014/03/2517.545310275.19424460417012000291.252314446872177358185122412881200053288532884157682384100.5142857140.4857142860.4892890940.51070814733
551062007/02/012007/08/3060北京CN402014/03/31135351159039711163618164982283712355966.52014/03/2116.87543894.87510115.29850746318349112000201.2546755166570169072182087397111549155202517114063612337980.5185185190.4814814810.4814671370.51853001526
211892008/08/222008/08/235Los AngelesCAUS642014/03/3123337314034890116350125500281336306900.882013/12/262.87542164.25219727.863636367300111.090869565131018610415121034890034890348903722041861000.4347826090.5652173910.5517216840.44827535112


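#-*- coding: utf-8 -*-
# Data cleaning: filter out records that do not satisfy the rules
import pandas as pd

datafile = '../data/air_data.csv'  # raw airline data; the first row holds the column labels
cleanedfile = '../tmp/data_cleaned.csv'  # file the cleaned data is saved to

# Read the raw data as UTF-8 (convert the file to UTF-8 in a text editor first)
data = pd.read_csv(datafile, encoding='utf-8')

# Keep only rows where both fare columns are non-null
data = data[data['SUM_YR_1'].notnull() & data['SUM_YR_2'].notnull()]

# Keep only records with a non-zero fare, or with average discount rate
# and total kilometres flown both equal to 0
index1 = data['SUM_YR_1'] != 0
index2 = data['SUM_YR_2'] != 0
index3 = (data['SEG_KM_SUM'] == 0) & (data['avg_discount'] == 0)  # this rule is an AND
data = data[index1 | index2 | index3]  # this rule is an OR

# Export the result as tab-separated text
data.to_csv(cleanedfile, sep="\t", encoding="utf-8")
#data.to_excel(cleanedfile)  # original export line; fails for a .csv path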
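The two filtering rules can be checked on a tiny frame before running them on the real file. The sketch below is only an illustration: the four rows are invented and are not taken from air_data.csv, and the names toy, keep, cleaned and kept_rows are made up here. Only the column names and the rules come from the script above.

```python
import numpy as np
import pandas as pd

# Toy rows, invented for illustration only (not taken from air_data.csv).
toy = pd.DataFrame({
    'SUM_YR_1':     [100.0, 0.0, 0.0, np.nan],
    'SUM_YR_2':     [0.0,   0.0, 0.0, 50.0],
    'SEG_KM_SUM':   [1200,  0,   800, 300],
    'avg_discount': [0.9,   0.0, 0.7, 0.8],
})

# Rule 1: drop rows where either fare column is missing.
toy = toy[toy['SUM_YR_1'].notnull() & toy['SUM_YR_2'].notnull()]

# Rule 2: keep rows with a non-zero fare in either year,
# or with zero distance AND zero discount at the same time.
keep = (
    (toy['SUM_YR_1'] != 0)
    | (toy['SUM_YR_2'] != 0)
    | ((toy['SEG_KM_SUM'] == 0) & (toy['avg_discount'] == 0))
)
cleaned = toy[keep]

kept_rows = cleaned.index.tolist()
print(kept_rows)
```

Row 3 is dropped by the null check, and row 2 is dropped because both fares are zero while the distance is not, so only rows 0 and 1 remain.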

If the script is saved as a file and run with to_excel as the export statement, the following error appears:

ValueError: No engine for filetype: 'csv'

If the code is run in the Python console instead, the following error appears:

  File "<input>", line 0
SyntaxError: encoding declaration in Unicode string


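The first error is caused by the last statement using to_excel: it chooses its writer engine from the file extension, and no Excel engine handles '.csv'. Changing the call to to_csv fixes it. The second error is not caused by the export call. As the message says, it is raised when source code handed to the interpreter as a Unicode string contains an encoding declaration, so leaving out the '#-*- coding: utf-8 -*-' line when pasting into the console avoids it.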

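Also, to_csv writes delimited text whatever extension the output path has, so it can produce a file named .csv or .xls. A file written this way with an .xls name is still plain text, not a real Excel workbook. For a real Excel file, use to_excel with an .xls or .xlsx path.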
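One way to avoid passing a .csv path to to_excel again is to pick the export method from the file extension. The helper below is a minimal sketch, not part of pandas or of the book's code: the name save_table is hypothetical, and it assumes only that the object passed in has to_csv and to_excel methods, as a pandas DataFrame does.

```python
import os


def save_table(frame, path, **kwargs):
    """Write `frame` with the export method that matches the file extension.

    `save_table` is a hypothetical helper; `frame` is assumed to expose
    to_csv() and to_excel(), as a pandas DataFrame does.
    Returns the name of the method that was called.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext in ('.xls', '.xlsx'):
        # Excel extensions go to to_excel, which selects an engine from them.
        frame.to_excel(path, **kwargs)
        return 'to_excel'
    # Everything else (.csv, .tsv, .txt, no extension) is written as text.
    frame.to_csv(path, **kwargs)
    return 'to_csv'
```

With the script above, the export line would become save_table(data, cleanedfile, encoding="utf-8"), which ends up calling to_csv because cleanedfile ends in .csv.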
