http://blog.csdn.net/u014607457/article/details/51290582

Source: Internet  Editor: 程序博客网  Time: 2024/05/19 16:05

1. Introduction

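A DataFrame provides a table-like structure made up of several Series; inside a DataFrame each Series is called a column. In other words, a DataFrame is a two-dimensional structure rather than a one-dimensional one. In the figure, index is the row index, and a and b are both columns, which is equivalent to two Series making up the table. I am a beginner, so corrections are welcome if I have misunderstood anything.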
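As a small sketch of the description above (the column names a and b follow the figure; the values and row labels are made up for illustration), two Series can be combined into a DataFrame and each column read back as a Series:

```python
import pandas as pd

# Two Series that share the same index; the values are illustrative only.
a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([4, 5, 6], index=['x', 'y', 'z'])

# Each dict key becomes a column name, each Series becomes a column.
df = pd.DataFrame({'a': a, 'b': b})

print(df)
print(list(df.columns))   # column labels
print(list(df.index))     # row labels
print(type(df['a']))      # a single column is a Series again
print(df.ndim)            # number of dimensions of the table
```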



2. Some operations

2.1 Building a DataFrame from a list of numpy arrays gives the same result as building it from a list of pandas Series.

Program:

# coding: utf-8
import pandas as pd
import numpy as np

s1 = np.array([2, 3, 4, 5])
s2 = np.array([5, 6, 7, 8])
print(pd.DataFrame([s1, s2]))
print('*' * 40)
t1 = pd.Series([2, 3, 4, 5])
t2 = pd.Series([5, 6, 7, 8])
print(pd.DataFrame([t1, t2]))

Result:
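Rather than comparing the two printouts by eye, the claim can be checked in code. This is a minimal sketch that compares shapes and values with numpy; the names from_arrays, from_series and same_values are introduced here for illustration.

```python
import numpy as np
import pandas as pd

from_arrays = pd.DataFrame([np.array([2, 3, 4, 5]), np.array([5, 6, 7, 8])])
from_series = pd.DataFrame([pd.Series([2, 3, 4, 5]), pd.Series([5, 6, 7, 8])])

# Each array / Series becomes one row, so both frames are 2 rows by 4 columns.
print(from_arrays.shape, from_series.shape)

# Compare the underlying values element by element.
same_values = np.array_equal(from_arrays.values, from_series.values)
print(same_values)
```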



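2.2 Building from a dict whose values are arrays or Series

 Program:

# coding: utf-8
import pandas as pd
import numpy as np

s1 = np.array([2, 3, 4, 5])
s2 = np.array([5, 6, 7, 8])
# Each dict key becomes a column name.
print(pd.DataFrame({'A': s1, 'B': s2}))

 Result: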



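Note: if the Series used to build the DataFrame have different lengths, any position that has no value for a given index label is filled with NaN. In the dict form this applies to Series, which are aligned on their index; plain arrays of unequal length raise a ValueError instead.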
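The note can be illustrated with two Series of different lengths. The sketch below also checks how plain numpy arrays of unequal length behave in a dict; the variable names long_s, short_s and raised are illustrative.

```python
import numpy as np
import pandas as pd

long_s = pd.Series([2, 3, 4, 5])
short_s = pd.Series([5, 6])

# Series are aligned on their index; labels 2 and 3 have no value in short_s.
df = pd.DataFrame({'A': long_s, 'B': short_s})
print(df)
print(df['B'].isnull().sum())  # number of missing values in column B

# Plain arrays carry no index to align on, so unequal lengths raise an error.
try:
    pd.DataFrame({'A': np.array([2, 3, 4, 5]), 'B': np.array([5, 6])})
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)
```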

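2.3 if-then operation (.ix[], now .loc[]):

 .loc[condition, region the "then" applies to]

 Note: .ix was deprecated in pandas 0.20 and removed in pandas 1.0; .loc accepts the same condition and column label here.

  eg:

# coding: utf-8
import pandas as pd
import numpy as np

s1 = np.array([2, 3, 4, 5])
s2 = np.array([5, 6, 7, 8])
m = pd.DataFrame({'A': s1, 'B': s2}, index=['A', 'B', 'C', 'D'])
# Where column A is greater than 2, set column B to -2.
m.loc[m.A > 2, 'B'] = -2
print(m)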
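The two parts of the indexer can also be written separately, which may make the if-then reading clearer. This sketch uses .loc, and the name mask is introduced here for illustration.

```python
import numpy as np
import pandas as pd

m = pd.DataFrame({'A': np.array([2, 3, 4, 5]),
                  'B': np.array([5, 6, 7, 8])},
                 index=['A', 'B', 'C', 'D'])

# "if": a boolean Series, True for the rows where column A is greater than 2.
mask = m['A'] > 2
print(mask)

# "then": overwrite column B only in the rows selected by the mask.
m.loc[mask, 'B'] = -2
print(m)
```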


2.4 numpy.where() operation

 numpy.where(condition, then, else)

eg:

# coding: utf-8
import pandas as pd
import numpy as np

s1 = np.array([2, 3, 4, 5])
s2 = np.array([5, 6, 7, 8])
m = pd.DataFrame({'A': s1, 'B': s2}, index=['A', 'B', 'C', 'D'])
# New column "then": 5 where A is greater than 2, otherwise 0.
m["then"] = np.where(m.A > 2, 5, 0)
print(m)
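The then and else arguments of numpy.where do not have to be scalars. As a sketch of that (the column name picked is made up for this example), they can also be whole columns, in which case the value is chosen row by row:

```python
import numpy as np
import pandas as pd

m = pd.DataFrame({'A': np.array([2, 3, 4, 5]),
                  'B': np.array([5, 6, 7, 8])},
                 index=['A', 'B', 'C', 'D'])

# Scalar form: 5 where A > 2, otherwise 0.
m['then'] = np.where(m['A'] > 2, 5, 0)

# Column form: take B where A > 2, otherwise take A.
m['picked'] = np.where(m['A'] > 2, m['B'], m['A'])
print(m)
```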

2.5 Selecting from a DataFrame by condition

2.5.1 Direct selection

eg:

# coding: utf-8
import pandas as pd
import numpy as np

s1 = np.array([2, 3, 4, 5])
s2 = np.array([5, 6, 7, 8])
m = pd.DataFrame({'A': s1, 'B': s2}, index=['A', 'B', 'C', 'D'])

# Two equivalent ways to keep the rows where A >= 3.
t = m[m.A >= 3]
s = m.loc[m.A >= 3]

print(t)
print('@' * 20)
print(s)


 

s and t display the same result.
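A quick way to confirm this in code, followed by a sketch of how several conditions can be combined (the second condition on B and the name both are assumptions added for illustration):

```python
import numpy as np
import pandas as pd

m = pd.DataFrame({'A': np.array([2, 3, 4, 5]),
                  'B': np.array([5, 6, 7, 8])},
                 index=['A', 'B', 'C', 'D'])

t = m[m['A'] >= 3]
s = m.loc[m['A'] >= 3]
print(t.equals(s))  # True when both frames hold the same data

# Conditions are combined with & (and) or | (or); each needs its own parentheses.
both = m[(m['A'] >= 3) & (m['B'] < 8)]
print(both)
```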

2.6 Forming groups with groupby

import pandas as pd

d = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                  'size': list('SSMMMLL'),
                  'weight': [8, 10, 11, 1, 20, 12, 12],
                  'adult': [False] * 5 + [True] * 2})

# For each animal, list the size of the one with the largest weight.
group = d.groupby("animal").apply(lambda subf: subf['size'][subf['weight'].idxmax()])
print(group)
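The same question can also be answered without apply. This is an alternative sketch, not the approach used above: idxmax gives the row label of the heaviest animal in each group, and .loc then looks up those rows. The names heaviest and result are introduced here.

```python
import pandas as pd

d = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                  'size': list('SSMMMLL'),
                  'weight': [8, 10, 11, 1, 20, 12, 12],
                  'adult': [False] * 5 + [True] * 2})

# Row label of the maximum weight within each animal group.
heaviest = d.groupby('animal')['weight'].idxmax()
print(heaviest)

# Look up those rows, then keep the size indexed by animal.
result = d.loc[heaviest.tolist()].set_index('animal')['size']
print(result)
```

When two rows tie for the maximum weight, idxmax returns the first of them, so only one row per animal is kept.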


2.7 Getting a single group out of the groups

import pandas as pd

df = pd.DataFrame({'animal': 'cat dog cat fish dog cat dog'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 4 + [True] * 3})

group = df.groupby("animal")
# Pull out only the rows that belong to the "dog" group.
dog = group.get_group("dog")
print(dog)
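Besides get_group, the grouped object can be inspected or iterated. A small sketch, reusing the data above (the names sizes, name and sub are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'animal': 'cat dog cat fish dog cat dog'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 4 + [True] * 3})
group = df.groupby('animal')

# Mapping from each group name to the row labels that belong to it.
print(group.groups)

# Iterating yields (name, sub-DataFrame) pairs, one per group.
sizes = {}
for name, sub in group:
    sizes[name] = len(sub)
print(sizes)

# get_group returns the rows of one group, keeping their original row labels.
dog = group.get_group('dog')
print(list(dog.index))
```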



