Python Mini-Exercise 2: A pandas.DataFrame Usage Demo

Source: Internet. Editor: 程序博客网. Time: 2024/06/03 20:10


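This article uses a single worked example to introduce the common operations on pandas.DataFrame. The questions are summarized and adapted from the Nanjing University course on Coursera, "用Python玩转数据".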

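Straight to the point: the example first calls quotes_historical_yahoo_ochl from the matplotlib.finance package, which uses the Yahoo Finance API to fetch Microsoft's stock data for the past two years, and builds a DataFrame from the result. It then runs some simple analysis on that data, for example computing the mean closing price of Microsoft stock for each month of 2015. Working through several questions of this kind exercises most of the commonly used DataFrame operations.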
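
To make the data layout concrete before the full listing, here is a minimal sketch of the construction step on its own. It assumes each row is a (date, open, close, high, low, volume) tuple whose date is a day ordinal, as described above; the three rows are invented for illustration and are not real MSFT quotes.

```python
from datetime import date

import pandas as pd

# Hypothetical rows in (date, open, close, high, low, volume) order.
# The prices and volumes are made up; only the date ordinals are meaningful.
rows = [
    (735618.0, 46.3, 46.4, 46.7, 45.9, 1000),
    (735619.0, 46.0, 45.9, 46.2, 45.5, 1200),
    (735650.0, 43.0, 43.5, 43.9, 42.8, 900),
]
columns = ['date', 'open', 'close', 'high', 'low', 'volume']
df = pd.DataFrame(rows, columns=columns)

# Convert each ordinal to a 'yy/mm/dd' label and use the labels as the index.
df.index = [date.fromordinal(int(d)).strftime('%y/%m/%d') for d in df['date']]

# The raw ordinal column is redundant once the index carries the date.
df = df.drop(columns='date')
print(df)
```

Because the labels are zero-padded strings, they sort in calendar order, which is what makes label slices such as '15/01/01':'15/12/31' work on this index.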

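The code is as follows (written for Python 3.5):

"""
Created on Mon Jan 16 17:26:05 2017
Exercise: operations on Microsoft stock data
@author: AS
"""
# Note: matplotlib.finance was deprecated in matplotlib 2.0 and removed in 2.2,
# so this import only works in an older environment like the one the article used.
from matplotlib.finance import quotes_historical_yahoo_ochl
from datetime import date
import pandas as pd

today = date.today()
start = (today.year - 2, today.month, today.day)
quotesMS = quotes_historical_yahoo_ochl('MSFT', start, today)    # Microsoft stock data for the last two years
attributes = ['date', 'open', 'close', 'high', 'low', 'volume']  # column names
quotesdfMS = pd.DataFrame(quotesMS, columns=attributes)          # build the DataFrame
print('First 5 rows of the Microsoft stock data read through the Yahoo Finance API')
print(quotesdfMS[:5])

# Convert the stored date ordinals into formatted strings (avoid shadowing the builtin `list`)
dates = []
for i in range(len(quotesMS)):
    x = date.fromordinal(int(quotesMS[i][0]))  # e.g. 735618 becomes 2015-01-20
    y = x.strftime('%y/%m/%d')                 # e.g. '15/01/20'
    dates.append(y)
quotesdfMS.index = dates                        # use the formatted dates as the index
quotesdfMS = quotesdfMS.drop(['date'], axis=1)  # drop the now redundant date column

print('\nThe 5 days in 2015 (January 1 to December 31) with the highest closing price')
# DataFrame.sort() was removed from pandas; sort_values() is its replacement
print(quotesdfMS.loc['15/01/01':'15/12/31'].sort_values('close', ascending=False)[:5])

print('\nFirst half of 2015 sorted by volume in ascending order, first 5 rows')
# Labels must be zero-padded to match the index; the first half of the year ends on June 30
print(quotesdfMS.loc['15/01/01':'15/06/30'].sort_values('volume')[:5])

print('\nMean closing price for each month of 2015 (January 1 to December 31)')
quotesdfMS15 = quotesdfMS.loc['15/01/01':'15/12/31'].copy()  # copy so the new column is not written to a view
# Characters 3-4 of the index label are the month, e.g. '15/01/20' gives '01'
quotesdfMS15['month'] = [int(label[3:5]) for label in quotesdfMS15.index]
print(quotesdfMS15.groupby('month')['close'].mean())  # group by month, then average the closing price

print('\nNumber of days in each month of 2015 on which the stock closed above its open')
tmpdf = quotesdfMS.loc['15/01/01':'15/12/31'].copy()
tmpdf['month'] = [int(label[3:5]) for label in tmpdf.index]
print(tmpdf[tmpdf.close > tmpdf.open]['month'].value_counts())

print('\nThe 5 lowest and 5 highest closing-price days of 2015, combined')
by_close = tmpdf.sort_values('close')  # avoid shadowing the builtin `sorted`
print(pd.concat([by_close[:5], by_close[-5:]]))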

0 0