Reading and Writing Data with Pandas

Source: Internet. Editor: 程序博客网. Date: 2024/06/05 22:40

IO Tools (Text, CSV, HDF5, …)

Reading

Reading CSV

pd.read_csv(path, sep=",", header=0, index_col=None, names=None, encoding=None, parse_dates=False)

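Row and column settings

header : int or list of ints, default 'infer'
Row number (0-indexed) to use as the header. For example, header=None means the data has no header row; header=1 uses the second line as the header and discards the line before it.
names : array-like, default None
Column names to use.
index_col : int or sequence or False, default None
Column to use as the index.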
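
To make the interaction between header, names and index_col concrete, here is a minimal sketch. The sample data and column names are made up for illustration, and io.StringIO stands in for a file path.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = "id,name,score\n1,Ann,90\n2,Bob,85\n"

# header=0: the first line supplies the column names.
df_default = pd.read_csv(io.StringIO(csv_text), header=0)

# header=None: the file is treated as having no header row, so the
# original header line becomes ordinary data and columns are numbered 0..2.
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# names replaces the column labels; header=0 tells pandas to discard
# the header line that is already present in the file.
df_renamed = pd.read_csv(
    io.StringIO(csv_text), header=0, names=["uid", "user", "points"]
)

# index_col selects a column (by position or label) as the row index.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col="id")

print(df_no_header)
print(df_indexed)
```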

Data format settings

dtype : Type name or dict of column -> type, default None
Data type for the columns, e.g. {'a': np.float64, 'b': np.int32}.
parse_dates : boolean or list of ints or names or list of lists or dict, default False
Controls which columns are parsed as dates:
1. If True -> try parsing the index.
2. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
3. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
4. If {'foo': [1, 3]} -> parse columns 1, 3 as date and call result 'foo'.
A fast path exists for ISO 8601-formatted dates. Note that pandas 2.2 and later deprecate forms 3 and 4 in favor of calling pd.to_datetime on the columns after reading.
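
The following sketch shows dtype and the two simplest parse_dates forms together. The column names (day, a, b) and the data are assumptions chosen for illustration.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical sample data with one date column and two numeric columns.
csv_text = "day,a,b\n2024-01-01,1.5,10\n2024-01-02,2.5,20\n"

# dtype fixes the type of individual columns; parse_dates=["day"]
# asks pandas to parse the named column as dates.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"a": np.float64, "b": np.int32},
    parse_dates=["day"],
)

# parse_dates=True parses the index, so it is combined with index_col.
df_by_day = pd.read_csv(io.StringIO(csv_text), index_col="day", parse_dates=True)

print(df.dtypes)
print(type(df_by_day.index))
```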

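Limiting the range of data read

skiprows : list-like or integer, default None
Line numbers to skip (0-indexed, when list-like), or the number of lines to skip at the start of the file (when an int).
skipfooter : int, default 0
Number of lines to skip at the end of the file.
nrows : int, default None
Number of rows to read. Useful for reading only part of a large file.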
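
A short sketch of the three range options, using made-up data with a junk first line and a trailing summary line. Passing engine="python" alongside skipfooter is deliberate, since the default C parser does not support skipfooter.

```python
import io
import pandas as pd

# Hypothetical export: one preamble line, a header, three data rows,
# and a summary line at the bottom.
csv_text = (
    "exported by some tool\n"  # line 0
    "id,value\n"               # line 1
    "1,10\n"                   # line 2
    "2,20\n"                   # line 3
    "3,30\n"                   # line 4
    "total,60\n"               # line 5
)

# Integer skiprows: skip the first line. skipfooter drops the last line.
df_body = pd.read_csv(
    io.StringIO(csv_text), skiprows=1, skipfooter=1, engine="python"
)

# nrows: read only the first two data rows after the header.
df_head = pd.read_csv(io.StringIO(csv_text), skiprows=1, nrows=2)

# List-like skiprows: skip line 0 and line 3 (0-indexed), then read two rows.
df_picked = pd.read_csv(io.StringIO(csv_text), skiprows=[0, 3], nrows=2)

print(df_body)
print(df_picked)
```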

Character encoding

encoding : str, default None
Encoding used to decode the file, e.g. 'utf-8' or 'gbk'. It must match the encoding the file was actually saved in.
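
As an illustration, the sketch below builds GBK-encoded bytes in memory (standing in for a file saved by a tool that writes GBK) and reads them back by naming the same encoding. The sample content is an assumption.

```python
import io
import pandas as pd

# Hypothetical file content, encoded as GBK rather than UTF-8.
raw_bytes = "name,city\n张三,北京\n李四,上海\n".encode("gbk")

# The encoding passed to read_csv has to match how the bytes were written.
df = pd.read_csv(io.BytesIO(raw_bytes), encoding="gbk")

print(df)
```

With a real file the call has the same shape: pass the path as the first argument and keep the encoding argument.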

Reading txt (tab-delimited files)
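
The original notes leave this section empty. One common approach, sketched here with made-up data, is to reuse read_csv with sep="\t"; pd.read_table does the same thing with a tab as its default separator.

```python
import io
import pandas as pd

# Hypothetical tab-delimited content standing in for a .txt file.
txt = "id\tname\n1\tAnn\n2\tBob\n"

# read_csv handles any single-character delimiter through sep.
df = pd.read_csv(io.StringIO(txt), sep="\t")

# read_table is the same reader with sep="\t" as the default.
df_table = pd.read_table(io.StringIO(txt))

print(df)
```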

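Reading Excel

pandas.read_excel(io, sheet_name=0, header=0, skiprows=None, skipfooter=0, index_col=None, na_values=None, dtype=None)
io: path to the Excel file; mind the extension (.xls for Excel 2003, .xlsx for Excel 2007 and later)
sheet_name: name or 0-based position of the sheet to read (spelled sheetname in old pandas releases)
header: row number to use as the header
skiprows: rows to skip at the start of the sheet
skipfooter: number of rows to skip at the end, default 0 (spelled skip_footer in old pandas releases)
index_col: column to use as the index
na_values: additional values to treat as missing (NaN)
dtype: data types for the columns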

Saving

Saving CSV

DataFrame.to_csv(path, sep=",", header=True, index=True, encoding=None)
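
A minimal sketch of how header and index change the output. The DataFrame is made up, and the path is omitted so that to_csv returns the CSV text as a string instead of writing a file.

```python
import pandas as pd

# Hypothetical data to save.
df = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 85]})

# Defaults (header=True, index=True): the row index is written as
# an unnamed first column.
with_index = df.to_csv()

# index=False leaves the row index out.
without_index = df.to_csv(index=False)

# header=False additionally drops the column-name line.
data_only = df.to_csv(index=False, header=False)

print(with_index)
print(without_index)
```

To write to disk, pass a path as the first argument, for example df.to_csv("scores.csv", index=False, encoding="utf-8"), where scores.csv is a placeholder file name.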

Saving txt
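
The original notes leave this section empty. A reasonable sketch, assuming a tab-delimited layout is wanted, is to call to_csv with sep="\t". The file name scores.txt and the data are placeholders, and a temporary directory keeps the example self-contained.

```python
import os
import tempfile
import pandas as pd

# Hypothetical data to save.
df = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scores.txt")

    # to_csv writes any delimiter; a tab produces a tab-delimited txt file.
    df.to_csv(path, sep="\t", index=False, encoding="utf-8")

    # Inspect the first line of the file that was written.
    with open(path, encoding="utf-8") as fh:
        first_line = fh.readline().rstrip("\r\n")

    # Read the file back with the same separator.
    round_trip = pd.read_csv(path, sep="\t", encoding="utf-8")

print(repr(first_line))
print(round_trip)
```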

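Saving Excel

DataFrame.to_excel(excel_writer, sheet_name='Sheet1', columns=None, header=True, index=True, startrow=0, startcol=0, engine=None, encoding=None)
excel_writer: file path or ExcelWriter object
sheet_name: name of the sheet
columns: columns to write
header=True: whether to write the header row
index=True: whether to write the index
startrow=0: row of the upper-left cell where writing starts
startcol=0: column of the upper-left cell where writing starts
engine=None: writer engine, which depends on the target format (xls/xlsx), e.g. openpyxl or xlsxwriter for .xlsx
encoding=None: encoding of the output file (deprecated in pandas 1.5 and no longer accepted in pandas 2.0 and later)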
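
The sketch below round-trips a small DataFrame through an .xlsx file using to_excel and read_excel. It assumes a current pandas release (hence sheet_name) and that the optional openpyxl package is installed, which pandas needs for .xlsx files. The helper name excel_round_trip, the file name and the data are all hypothetical.

```python
import os
import tempfile
import pandas as pd

# Hypothetical data to save.
df = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 85]})


def excel_round_trip(frame, sheet="Sheet1"):
    """Write frame to a temporary .xlsx file and read it back.

    Assumes openpyxl is installed; pandas raises ImportError otherwise.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "scores.xlsx")
        # index=False keeps the row index out of the sheet.
        frame.to_excel(path, sheet_name=sheet, index=False)
        # Read the same sheet back, using its first row as the header.
        return pd.read_excel(path, sheet_name=sheet, header=0, index_col=None)
```

Calling excel_round_trip(df) returns a DataFrame with the same columns and values as df. To place the table somewhere other than the top-left corner of the sheet, startrow and startcol can be added to the to_excel call.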

http://pandas.pydata.org/pandas-docs/stable/io.html

0 0