pandas Ultimate Edition 2: Selecting DataFrame Data with .at, .iat, .loc, .iloc

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 04:26

The main ways to access data in pandas are .at, .iat, .loc, and .iloc.

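1. Select a single column with df['A'] or df.A. To select several columns you must use double square brackets, that is, pass a list of column names inside the indexing brackets.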

In [25]: df
Out[25]: 
                   A         B         C         D
2017-08-08 -0.251957  1.408053 -0.085674  0.365377
2017-08-09 -0.141047  0.445702 -0.560573  0.251600
2017-08-10 -0.218148  1.619402 -1.799525 -0.697128
2017-08-11 -0.542612 -1.401814 -0.158879  0.969136
2017-08-12  0.268779 -1.376531  0.950630 -1.041401
2017-08-13  0.488717 -0.267509  0.112117 -1.166227

In [26]: df['A']
Out[26]: 
2017-08-08   -0.251957
2017-08-09   -0.141047
2017-08-10   -0.218148
2017-08-11   -0.542612
2017-08-12    0.268779
2017-08-13    0.488717
Freq: D, Name: A, dtype: float64

In [27]: df[['B','D']]
Out[27]: 
                   B         D
2017-08-08  1.408053  0.365377
2017-08-09  0.445702  0.251600
2017-08-10  1.619402 -0.697128
2017-08-11 -1.401814  0.969136
2017-08-12 -1.376531 -1.041401
2017-08-13 -0.267509 -1.166227

In [28]: df[['B','D']].head(3)
Out[28]: 
                   B         D
2017-08-08  1.408053  0.365377
2017-08-09  0.445702  0.251600
2017-08-10  1.619402 -0.697128
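The post never shows how df was built, so the following is only a sketch of a comparable frame and of the single-bracket versus double-bracket behaviour described above. The construction (six daily dates starting 2017-08-08, columns A to D) is inferred from the printed output, and the values are placeholders rather than the random numbers shown in the session.

```python
import numpy as np
import pandas as pd

# Assumed setup: mirrors the shape of the post's df, with placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

col_a = df["A"]            # one column name in single brackets -> Series
col_a_attr = df.A          # attribute access returns the same Series
cols_bd = df[["B", "D"]]   # a list of names inside the brackets -> DataFrame
one_col_frame = df[["A"]]  # a one-element list still yields a DataFrame

print(type(col_a).__name__, type(cols_bd).__name__, type(one_col_frame).__name__)
print(cols_bd.head(3))
```

Attribute access such as df.A only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so the bracket form is the safer default.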

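2. Select rows by slicing, using : to denote the range. A slice of integer positions excludes its end point, while a slice of index labels includes both end points.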

In [29]: df[1:3]
Out[29]: 
                   A         B         C         D
2017-08-09 -0.141047  0.445702 -0.560573  0.251600
2017-08-10 -0.218148  1.619402 -1.799525 -0.697128

In [30]: df['20170808':'20170811']
Out[30]: 
                   A         B         C         D
2017-08-08 -0.251957  1.408053 -0.085674  0.365377
2017-08-09 -0.141047  0.445702 -0.560573  0.251600
2017-08-10 -0.218148  1.619402 -1.799525 -0.697128
2017-08-11 -0.542612 -1.401814 -0.158879  0.969136
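The two row slices above behave differently at the end point, which is easy to miss in the printed output. A minimal sketch, assuming a frame shaped like the post's df but filled with placeholder values, compares each shorthand with its explicit .iloc or .loc equivalent:

```python
import numpy as np
import pandas as pd

# Assumed setup: same shape as the post's df, placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

by_position = df[1:3]                    # integer slice: positions 1 and 2 only
by_label = df["20170808":"20170811"]     # label slice: both end dates included

# The explicit accessors make the intent unambiguous.
same_by_position = df.iloc[1:3]
same_by_label = df.loc["2017-08-08":"2017-08-11"]

print(len(by_position), len(by_label))
```

Writing the slice through .iloc or .loc is usually clearer to a reader than the bare df[...] form, because the accessor name states whether positions or labels are meant.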

3. Comparing loc and iloc: loc selects by label, while iloc selects by integer position.

In [45]: df
Out[45]: 
                   A         B         C         D
2017-08-08 -0.251957  1.408053 -0.085674  0.365377
2017-08-09 -0.141047  0.445702 -0.560573  0.251600
2017-08-10 -0.218148  1.619402 -1.799525 -0.697128
2017-08-11 -0.542612 -1.401814 -0.158879  0.969136
2017-08-12  0.268779 -1.376531  0.950630 -1.041401
2017-08-13  0.488717 -0.267509  0.112117 -1.166227

(1) Select the single row at position 2 (dates is used as the row index). Both calls return a Series.

In [43]: df.loc[dates[2]]
Out[43]: 
A   -0.218148
B    1.619402
C   -1.799525
D   -0.697128
Name: 2017-08-10 00:00:00, dtype: float64

In [44]: df.iloc[2]
Out[44]: 
A   -0.218148
B    1.619402
C   -1.799525
D   -0.697128
Name: 2017-08-10 00:00:00, dtype: float64
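With a date index the two accessors happen to agree, which can hide the difference between label and position. The sketch below repeats the lookup on a placeholder frame and then adds a small hypothetical frame, shuffled, whose integer labels are out of order, so that loc and iloc return different rows for the same number.

```python
import numpy as np
import pandas as pd

# Assumed setup: same shape as the post's df, placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

row_by_label = df.loc[dates[2]]   # look up the label 2017-08-10
row_by_pos = df.iloc[2]           # look up the third row by position

# Hypothetical frame (not from the post): integer labels in shuffled order.
shuffled = pd.DataFrame({"x": [10, 20, 30]}, index=[2, 0, 1])
label_two = shuffled.loc[2, "x"]     # row whose label is 2 -> first row
position_two = shuffled.iloc[2, 0]   # row at position 2    -> last row

print(row_by_label.name, label_two, position_two)
```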


(2) Select a single value. For scalar access, at and iat are faster than loc and iloc.

In [37]: df.loc[dates[1],'B']
Out[37]: 0.44570222309756741

In [38]: df.at[dates[1],'B']
Out[38]: 0.44570222309756741

In [41]: df.iloc[1,1]
Out[41]: 0.44570222309756741

In [42]: df.iat[1,1]
Out[42]: 0.44570222309756741
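One way to check the efficiency claim on your own machine is a rough timing with the standard timeit module. This is only a sketch on a placeholder frame: the absolute numbers depend on the pandas version and hardware, so no expected timings are given here.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed setup: same shape as the post's df, placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

# All four accessors read the same cell (second row, column B).
v_loc = df.loc[dates[1], "B"]
v_at = df.at[dates[1], "B"]
v_iloc = df.iloc[1, 1]
v_iat = df.iat[1, 1]

# Rough timing of repeated scalar reads; results vary between environments.
accessors = {
    "loc": lambda: df.loc[dates[1], "B"],
    "at": lambda: df.at[dates[1], "B"],
    "iloc": lambda: df.iloc[1, 1],
    "iat": lambda: df.iat[1, 1],
}
for name, read in accessors.items():
    seconds = timeit.timeit(read, number=2000)
    print(f"{name:>4}: {seconds:.4f} s for 2000 reads")

# at and iat also assign a single cell in place.
df.at[dates[1], "B"] = 99.0
updated = df.iat[1, 1]
```

Note that at and iat accept only a single row and a single column, so they cannot replace loc and iloc for slices or lists.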

(3) Slicing data

Return rows 4-5 and columns 1-2 (counting from 1), which are positions 3-4 and 0-1:

In [46]: df.loc['20170811':'20170812',['A','B']]
Out[46]: 
                   A         B
2017-08-11 -0.542612 -1.401814
2017-08-12  0.268779 -1.376531

In [47]: df.iloc[3:5,0:2]
Out[47]: 
                   A         B
2017-08-11 -0.542612 -1.401814
2017-08-12  0.268779 -1.376531
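When only the labels are known, the matching positions can be derived instead of counted by hand. The following sketch, on a placeholder frame, uses Index.get_loc and Index.get_indexer to translate the label-based selection into the equivalent iloc call; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumed setup: same shape as the post's df, placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

by_label = df.loc["20170811":"20170812", ["A", "B"]]

# Translate labels into integer positions.
start = df.index.get_loc(pd.Timestamp("2017-08-11"))
stop = df.index.get_loc(pd.Timestamp("2017-08-12"))
col_positions = df.columns.get_indexer(["A", "B"])

# A positional slice excludes its end point, hence stop + 1.
by_position = df.iloc[start:stop + 1, col_positions]

print(start, stop, list(col_positions))
```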


Select specific rows and columns by passing lists of labels (loc) or lists of positions (iloc):

In [49]: df.loc[[dates[0],dates[5]],['B','D']]
Out[49]: 
                   B         D
2017-08-08  1.408053  0.365377
2017-08-13 -0.267509 -1.166227

In [50]: df.iloc[[0,5],[1,3]]
Out[50]: 
                   B         D
2017-08-08  1.408053  0.365377
2017-08-13 -0.267509 -1.166227
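List selection returns rows and columns in the order the list gives them, which a slice cannot do. A short sketch on a placeholder frame shows this, along with the KeyError that recent pandas versions (1.0 and later, as far as I know) raise when a listed label is missing:

```python
import numpy as np
import pandas as pd

# Assumed setup: same shape as the post's df, placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

picked = df.loc[[dates[0], dates[5]], ["B", "D"]]
picked_by_pos = df.iloc[[0, 5], [1, 3]]

# The result follows the order of the lists, not the order of the frame.
reordered = df.loc[[dates[5], dates[0]], ["D", "B"]]

# A label that is absent from the index is rejected rather than ignored.
missing = pd.Timestamp("2017-09-01")
try:
    df.loc[[dates[0], missing], ["B"]]
    missing_raised = False
except KeyError:
    missing_raised = True

print(reordered)
```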

Slicing rows with loc and iloc:

In [54]: df.loc['20170809':'20170811',:]
Out[54]: 
                   A         B         C         D
2017-08-09 -0.141047  0.445702 -0.560573  0.251600
2017-08-10 -0.218148  1.619402 -1.799525 -0.697128
2017-08-11 -0.542612 -1.401814 -0.158879  0.969136

In [55]: df.iloc[1:4,:]
Out[55]: 
                   A         B         C         D
2017-08-09 -0.141047  0.445702 -0.560573  0.251600
2017-08-10 -0.218148  1.619402 -1.799525 -0.697128
2017-08-11 -0.542612 -1.401814 -0.158879  0.969136

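Slicing columns with loc and iloc. Look closely at the difference: the label slice 'A':'C' includes its end point C, whereas the positional slice 0:3 stops before position 3.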

In [63]: df.loc[:,'A':'C']
Out[63]: 
                   A         B         C
2017-08-08 -0.251957  1.408053 -0.085674
2017-08-09 -0.141047  0.445702 -0.560573
2017-08-10 -0.218148  1.619402 -1.799525
2017-08-11 -0.542612 -1.401814 -0.158879
2017-08-12  0.268779 -1.376531  0.950630
2017-08-13  0.488717 -0.267509  0.112117

In [64]: df.iloc[:,0:3]
Out[64]: 
                   A         B         C
2017-08-08 -0.251957  1.408053 -0.085674
2017-08-09 -0.141047  0.445702 -0.560573
2017-08-10 -0.218148  1.619402 -1.799525
2017-08-11 -0.542612 -1.401814 -0.158879
2017-08-12  0.268779 -1.376531  0.950630
2017-08-13  0.488717 -0.267509  0.112117

In [65]: df.iloc[:,[0,1,2]]
Out[65]: 
                   A         B         C
2017-08-08 -0.251957  1.408053 -0.085674
2017-08-09 -0.141047  0.445702 -0.560573
2017-08-10 -0.218148  1.619402 -1.799525
2017-08-11 -0.542612 -1.401814 -0.158879
2017-08-12  0.268779 -1.376531  0.950630
2017-08-13  0.488717 -0.267509  0.112117
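The three column selections above can be checked for equality directly. The sketch below does that on a placeholder frame and adds a hypothetical reordered copy to show that a label slice follows the order in which the columns are stored, not alphabetical order.

```python
import numpy as np
import pandas as pd

# Assumed setup: same shape as the post's df, placeholder values.
dates = pd.date_range("2017-08-08", periods=6)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4),
                  index=dates, columns=list("ABCD"))

label_slice = df.loc[:, "A":"C"]      # end label C is included
pos_slice = df.iloc[:, 0:3]           # end position 3 is excluded
pos_list = df.iloc[:, [0, 1, 2]]      # explicit list of positions

# Hypothetical reordered copy (not from the post): columns stored as C, A, B, D.
reordered = df[["C", "A", "B", "D"]]
span = reordered.loc[:, "C":"A"]      # walks the stored order from C to A

print(list(label_slice.columns), list(span.columns))
```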




