[Python Data Analysis and Presentation] (4) Basic Operations of the pandas Library

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 21:16

Series

A Series consists of a set of data together with an index for that data.

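    import numpy as np
    import pandas as pd

    # Create a Series from a list. index is the second positional parameter,
    # so the "index=" keyword may be omitted.
    a = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])
    # a    9
    # b    8
    # c    7
    # d    6
    # dtype: int64

    # Create a Series from a scalar. The index must be supplied here,
    # because it determines the length of the Series.
    a = pd.Series(25, index=['a', 'b', 'c', 'd'])

    # Create a Series from a dict.
    d = {'a': 1, 'b': 5, 'c': 7}
    b = pd.Series(d)
    # Labels missing from the dict ('d') become NaN; keys not listed ('a') are dropped.
    b = pd.Series(d, index=["b", "c", "d"])
    # b    5.0
    # c    7.0
    # d    NaN
    # dtype: float64

    # Create a Series from an ndarray.
    e = pd.Series(np.arange(0, 5), index=np.arange(9, 4, -1))
    # 9    0
    # 8    1
    # 7    2
    # 6    3
    # 5    4
    # dtype: int32   (platform dependent; int64 on many systems)

    b.values   # array([  5.,   7.,  nan])
    b.index    # Index(['b', 'c', 'd'], dtype='object')
    # Positional access. The plain b[1] form falls back to position only when
    # the custom index is not numeric, and recent pandas versions deprecate it.
    b.iloc[1]  # 7.0
    b['b']     # 5.0, access by custom label

    f = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])
    # a    9
    # b    8
    # c    7
    # d    6
    # dtype: int64

    np.exp(f)  # NumPy array operations can be applied to a Series
    # a    8103.083928
    # b    2980.957987
    # c    1096.633158
    # d     403.428793
    # dtype: float64

    f[:3]
    # a    9
    # b    8
    # c    7
    # dtype: int64

    f[f > 7]
    # a    9
    # b    8
    # dtype: int64

    g = pd.Series([1, 2, 3, 5], index=['b', 'c', 'd', 'e'])
    f + g  # note how the result is aligned on the index labels
    # a    NaN
    # b    9.0
    # c    9.0
    # d    9.0
    # e    NaN
    # dtype: float64

    f.name = "f Series object"
    f.index.name = "index name"
    # index name
    # a    9
    # b    8
    # c    7
    # d    6
    # Name: f Series object, dtype: int64

    f['a'] = 18  # values can be modified in place
    # index name
    # a    18
    # b     8
    # c     7
    # d     6
    # Name: f Series object, dtype: int64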
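The remark about numeric custom indexes is easier to see with a small example. The following is a minimal sketch that rebuilds a Series like `e` above and contrasts label access (`.loc`) with positional access (`.iloc`); the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Values 0..4 with the integer labels 9, 8, 7, 6, 5 (same construction as `e`).
e = pd.Series(np.arange(0, 5), index=np.arange(9, 4, -1))

by_label = e.loc[8]      # the element whose label is 8
by_position = e.iloc[4]  # the fifth element, whatever its label is

# With an integer index, plain e[0] is treated as a label lookup.
# No label 0 exists here, so it raises KeyError instead of returning
# the first element.
try:
    e[0]
    plain_lookup_failed = False
except KeyError:
    plain_lookup_failed = True

print(by_label, by_position, plain_lookup_failed)
```

Using `.loc` and `.iloc` explicitly avoids having to remember which rule the plain `[]` form applies for a given index type.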

DataFrame

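A DataFrame is a set of columns that share the same index. In other words, it is a two-dimensional array with both row and column indexes.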

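    # Create a DataFrame from a two-dimensional ndarray.
    a = pd.DataFrame(np.arange(0, 10).reshape(5, 2))
    a
    #    0  1
    # 0  0  1
    # 1  2  3
    # 2  4  5
    # 3  6  7
    # 4  8  9

    # Create a DataFrame from a dict of Series; rows are aligned on the index.
    dt = {"one": pd.Series([1, 2, 3], index=['a', 'b', 'c']),
          "two": pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])}
    d = pd.DataFrame(dt)
    #    one  two
    # a  1.0    9
    # b  2.0    8
    # c  3.0    7
    # d  NaN    6

    d.columns = ["一", "二"]  # rename the columns (Chinese for "one" and "two")
    d.columns

    # Create a DataFrame from a dict of lists.
    d1 = {"one": [1, 2, 3, 4], "two": [5, 6, 7, 8]}
    pd.DataFrame(d1, index=['a', 'b', 'c', 'd'])
    #    one  two
    # a    1    5
    # b    2    6
    # c    3    7
    # d    4    8

    # Reorder rows and columns. reindex also accepts a fill_value argument
    # for labels that do not exist in the original object.
    d.reindex(index=['a', 'd', 'c', 'b'], columns=["二", "一"])
    #    二    一
    # a  9  1.0
    # d  6  NaN
    # c  7  3.0
    # b  8  2.0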

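The .index and .columns attributes are Index objects, which are immutable. The .drop method removes the specified rows or columns from a Series or DataFrame; to remove a column, pass the argument axis=1.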
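As a quick illustration of both points, here is a minimal sketch using a small DataFrame with made-up data (the names `df`, `s` and the result variables are assumptions for this example only).

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4], "two": [5, 6, 7, 8]},
                  index=["a", "b", "c", "d"])
s = pd.Series([9, 8, 7, 6], index=["a", "b", "c", "d"])

rows_dropped = df.drop(["a", "c"])   # axis=0 is the default: drop rows by label
col_dropped = df.drop("two", axis=1) # axis=1: drop a column by label
s_dropped = s.drop("b")              # a Series only has one axis

# drop returns a new object; the original DataFrame is left unchanged.
print(rows_dropped)
print(col_dropped)
print(df.shape)

# An Index cannot be modified element by element.
try:
    df.index[0] = "z"
    index_is_immutable = False
except TypeError:
    index_is_immutable = True
print(index_is_immutable)
```

To change labels, assign a whole new list to `df.index` or `df.columns`, or use `reindex` as in the listing above.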

Common index operations

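Method               Description
.append(idx)         Concatenate another Index object, producing a new Index object
.difference(idx)     Compute the set difference, producing a new Index object
.intersection(idx)   Compute the intersection
.union(idx)          Compute the union
.delete(loc)         Delete the element at position loc
.insert(loc, c)      Insert element c at position loc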
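The methods in the table can be tried on two small Index objects. This is a minimal sketch; `idx` and `other` are made-up examples, and every method returns a new Index rather than modifying the original.

```python
import pandas as pd

idx = pd.Index(["a", "b", "c", "d"])
other = pd.Index(["c", "d", "e"])

appended = idx.append(other)      # concatenation; duplicates are kept
diff = idx.difference(other)      # labels in idx that are not in other
common = idx.intersection(other)  # labels present in both
union = idx.union(other)          # labels present in either
deleted = idx.delete(2)           # remove the element at position 2 ('c')
inserted = idx.insert(1, "x")     # insert 'x' at position 1

for name, result in [("append", appended), ("difference", diff),
                     ("intersection", common), ("union", union),
                     ("delete", deleted), ("insert", inserted)]:
    print(name, list(result))
```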

pandas arithmetic

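    b1 = pd.DataFrame(np.arange(0, 20).reshape(4, 5))
    #     0   1   2   3   4
    # 0   0   1   2   3   4
    # 1   5   6   7   8   9
    # 2  10  11  12  13  14
    # 3  15  16  17  18  19

    c = pd.Series(np.arange(4))
    # 0    0
    # 1    1
    # 2    2
    # 3    3
    # dtype: int32   (platform dependent; int64 on many systems)

    # By default the operation runs along axis 1: the Series index is matched
    # against the column labels and c is subtracted from every row.
    # Column 4 has no matching label in c, so it becomes NaN.
    b1 - c
    #       0     1     2     3   4
    # 0   0.0   0.0   0.0   0.0 NaN
    # 1   5.0   5.0   5.0   5.0 NaN
    # 2  10.0  10.0  10.0  10.0 NaN
    # 3  15.0  15.0  15.0  15.0 NaN

    # With axis=0 the Series index is matched against the row labels instead.
    b1.sub(c, axis=0)
    #     0   1   2   3   4
    # 0   0   1   2   3   4
    # 1   4   5   6   7   8
    # 2   8   9  10  11  12
    # 3  12  13  14  15  16

Arithmetic methods: .add(), .sub(), .mul(), .div().

Comparison operations: operands of the same dimension must have the same shape; operands of different dimensions are broadcast, by default along axis 1.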
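The comparison rules can be sketched with the same `b1` and `c`. This example is an illustration under assumptions: `other` and `r` are made-up operands, and the method form `.gt()` is used for the Series comparisons because it takes an `axis` argument in the same way as `.sub()`.

```python
import numpy as np
import pandas as pd

b1 = pd.DataFrame(np.arange(0, 20).reshape(4, 5))
c = pd.Series(np.arange(4))   # one value per row of b1
r = pd.Series(np.arange(5))   # one value per column of b1
other = pd.DataFrame(np.full((4, 5), 10))

# Same dimension: both operands are 4x5, compared element by element.
same_shape = b1 > other

# Different dimensions, default axis: r is matched against the column
# labels and broadcast down the rows.
by_column = b1.gt(r)

# Different dimensions, axis=0: c is matched against the row labels
# and broadcast across the columns.
by_row = b1.gt(c, axis=0)

print(same_shape)
print(by_column)
print(by_row)
```

Each result is a DataFrame of booleans with the same shape as `b1`, so it can be used directly as a mask.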
