Python study notes
Source: Internet  Published by: 量子网络  Editor: 程序博客网  Time: 2024/06/05 00:12
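NumPy:
import numpy as np
# Three ways to create a 1-D array (separate alternatives, not one tuple assignment)
a = np.arange(20)            # integers 0..19
r = np.random.rand(5)        # 5 uniform random samples from [0, 1)
l = np.linspace(0, 2, 9)     # 9 evenly spaced points from 0 to 2 inclusive
a = a.reshape(4, 5)
type(a)
a.ndim, a.shape, a.size, a.dtype
d = (4, 5)
np.zeros(d)
np.ones(d, dtype=int)
raw = [[0,1,2,3,4], [5,6,7,8,9]]
b = np.array(raw)
# Convert arrays to matrices (np.mat is an older alias of np.asmatrix; prefer asmatrix on recent NumPy)
a = np.asmatrix(a)
b = np.asmatrix(b)
b = np.matrix('1.0 2.0; 3.0 4.0')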
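All '+', '-', '*', '/' operations act element-wise on the whole array.
The '+=', '-=', '*=', '/=' operators are supported in NumPy as well. Common element-wise functions:
print(np.exp(a))
print(np.sqrt(a))
print(np.square(a))
print(np.power(a, 3))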
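As a minimal sketch of the element-wise behaviour described above (the array values are made up for illustration, and the example uses plain ndarrays; with np.matrix objects '*' means matrix multiplication instead):

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
b = np.ones((2, 3)) * 2                       # every element is 2.0

added = a + b          # element-wise sum
product = a * b        # element-wise product, not a matrix product
a += 1                 # in-place update: a itself is modified
squares = np.square(a) # element-wise square of the updated a

print(added)
print(product)
print(squares)
```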
a = np.arange(20).reshape(4,5)
print("a:")
print(a)
print("sum of all elements in a: " + str(a.sum()))
print("maximum element in a: " + str(a.max()))
print("minimum element in a: " + str(a.min()))
print("maximum element in each row of a: " + str(a.max(axis=1)))
print("minimum element in each column of a: " + str(a.min(axis=0)))
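# Previous-day value: drop the last element and pad the front with NaN (for a pandas column this matches stock_prices['kdj_j'].shift(1))
stock_prices['kdj_j_yest'] = np.insert(np.array(stock_prices['kdj_j'][:-1]), 0, np.nan)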
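Boolean operations: Python's boolean keywords and, or and not cannot be overloaded, so element-wise boolean operations on arrays go through the corresponding ufuncs, whose names all start with "logical_": np.logical_and, np.logical_not, np.logical_or, np.logical_xor. (The overloadable operators &, |, ~ and ^ also work element-wise on boolean arrays.)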
>>> a == b
array([False, False, True, False, False], dtype=bool)
>>> a > b
array([False, False, False, True, True], dtype=bool)
>>> np.logical_or(a==b, a>b) # same as a >= b
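The notes do not show how a and b were defined. A minimal sketch, assuming two small integer arrays chosen here so that they reproduce the outputs above:

```python
import numpy as np

# Assumed inputs (not given in the notes), picked to match the printed results
a = np.arange(5)            # [0, 1, 2, 3, 4]
b = np.arange(4, -1, -1)    # [4, 3, 2, 1, 0]

eq = a == b                          # element-wise equality
gt = a > b                           # element-wise comparison
either = np.logical_or(eq, gt)       # ufunc form of "eq or gt"
also = eq | gt                       # operator form, valid for boolean arrays

print(eq)
print(gt)
print(either)
```

Writing `eq or gt` instead would raise a ValueError, because the truth value of a multi-element array is ambiguous.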
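print(a[0][1])          # chained indexing: row 0, then element 1
print(a[0, 1])          # the same element with a single (row, column) index
b = a.copy()            # independent copy; changing b does not affect a
a[:, [1, 3]]            # all rows, columns 1 and 3
a[:, 2][a[:, 0] > 5]    # column 2 of the rows whose first column exceeds 5
np.nan_to_num(a)        # replace NaN with 0 and infinities with large finite numbers
c = np.hstack([a, b])   # join side by side (more columns)
d = np.vstack([a, b])   # stack vertically (more rows)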
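Creating a Series:
import pandas as pd
from pandas import Series, DataFrame
s = Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'], name='my_series')
d = {'a': 0., 'b': 1, 'c': 2}
s_from_dict = Series(d)                   # dict keys become the index
Series(d, index=['b', 'c', 'd', 'a'])     # reorder by index; the missing label 'd' gets NaN
Series(4., index=['a', 'b', 'c', 'd', 'e'])   # a scalar is repeated for every label
Selecting from a Series and boolean filtering:
s[:2]                # first two elements by position
s.iloc[[2, 0, 4]]    # elements at positions 2, 0 and 4
s[['e', 'a']]        # select by label; every label must exist (s.reindex(['e', 'i']) gives NaN for a missing 'i')
s[s > 0.5]           # keep only the values greater than 0.5
'e' in s             # membership test on the index labels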
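A small sketch of the Series operations listed above, using fixed values instead of random ones so the results are predictable (the numbers are made up for illustration):

```python
import pandas as pd

s = pd.Series([0.1, 0.9, 0.4, 0.7, 0.2], index=['a', 'b', 'c', 'd', 'e'])

first_two = s.iloc[:2]          # positional slice
picked = s.iloc[[2, 0, 4]]      # positional selection in a chosen order
large = s[s > 0.5]              # boolean filter
has_e = 'e' in s                # tests the index labels, not the values

# Building from a dict with an explicit index: labels absent from the dict become NaN
from_dict = pd.Series({'a': 0., 'b': 1, 'c': 2}, index=['b', 'c', 'd', 'a'])

print(large)
print(from_dict)
```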
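Creating a DataFrame:
df = DataFrame(columns=('lib', 'qty1', 'qty2'))   # empty frame with named columns
# From a dict of Series: the indexes are aligned and missing entries become NaN
d = {'one': Series([1., 2., 3.], index=['a', 'b', 'c']), 'two': Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = DataFrame(d)
df = DataFrame(d, index=['r', 'd', 'a'], columns=['two', 'three'])   # or pick rows and columns; unknown labels give NaN
# From a dict of lists
d = {'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]}
df = DataFrame(d, index=['a', 'b', 'c', 'd'])
# From a list of dicts (one dict per row)
d = [{'a': 1.6, 'b': 2}, {'a': 3, 'b': 6, 'c': 9}]
df = DataFrame(d)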
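To make the alignment behaviour of these constructors concrete, here is a brief sketch that reuses the same sample data and inspects where NaN appears:

```python
import pandas as pd

d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)   # union of both indexes; 'one' has no value for row 'd'

# Restricting rows and columns: labels that do not exist are filled with NaN
sub = pd.DataFrame(d, index=['r', 'd', 'a'], columns=['two', 'three'])

# One dict per row: keys missing from a row become NaN in that row
records = pd.DataFrame([{'a': 1.6, 'b': 2}, {'a': 3, 'b': 6, 'c': 9}])

print(df)
print(sub)
print(records)
```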
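Row concatenation:
df = DataFrame()   # start from an empty frame
index = ['alpha', 'beta', 'gamma', 'delta', 'eta']   # assumed row labels; the notes do not define `index`
for i in range(5):
    a = DataFrame([np.linspace(i, 5*i, 5)], index=[index[i]])
    df = pd.concat([df, a], axis=0)
Column concatenation:
a = Series(range(5))
b = Series(np.linspace(4, 20, 5))
df = pd.concat([a, b], axis=1)
(or pd.merge(df1, df2, left_index=True, right_index=True, how='outer'))
Adding a column:
val = Series([-1.2, -1.5, -1.7], index=['two', 'four', 'five'])
frame2['debt'] = val   # values are aligned on frame2's index; rows without a match become NaN
Adding a row: df.loc[len(df)] = row, or res = res.append(new_series, ignore_index=True). DataFrame.append was removed in pandas 2.0; there, pd.concat([res, new_series.to_frame().T], ignore_index=True) does the same job.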
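The concatenation steps above can be sketched as follows. The row labels are hypothetical (the notes never define them), and the list of frames is built first and concatenated once, which avoids repeated copying inside a loop:

```python
import numpy as np
import pandas as pd

labels = ['alpha', 'beta', 'gamma', 'delta', 'eta']   # hypothetical row labels

# Row concatenation: one single-row frame per label, stacked along axis 0
rows = [pd.DataFrame([np.linspace(i, 5 * i, 5)], index=[labels[i]]) for i in range(5)]
by_rows = pd.concat(rows, axis=0)

# Column concatenation: two Series become two columns, aligned on their index
a = pd.Series(range(5))
b = pd.Series(np.linspace(4, 20, 5))
by_cols = pd.concat([a, b], axis=1)

# Adding one row by assigning to a label that does not exist yet
by_cols.loc[len(by_cols)] = [5, 24.0]

print(by_rows)
print(by_cols)
```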
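Deleting rows and columns: del df['column-name'], or df.drop([column name or list of names], inplace=True, axis=1). (Methods that would modify the original object and return a new one usually take an optional inplace parameter, which defaults to False. With inplace=True the original object is changed directly and nothing is returned. With inplace=False the original object is left unchanged, and the result has to be assigned to a new name or back to the original name.)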
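A short sketch of the inplace behaviour described above, on a made-up three-column frame:

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2], 'two': [3, 4], 'three': [5, 6]})

# inplace=False (the default): a new frame is returned, df keeps all its columns
dropped = df.drop(['two'], axis=1)
cols_after_default = list(df.columns)

# inplace=True: df itself is modified and the call returns None
result = df.drop(['two'], axis=1, inplace=True)

# del also removes a column from df directly
del df['three']

print(list(dropped.columns))
print(list(df.columns))
```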
DataFrame attributes
print(df.index)
print(df.values)
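DataFrame selection and slicing:
Working rule so far: with chained brackets [][] the order is column then row; with a single [ , ] indexer (loc, iloc, at, iat) the order is row then column; iloc and loc with one argument select rows.
print(df['b'].iloc[2])      # column 'b', then the row at position 2
print(df['b']['gamma'])     # column 'b', then the row labelled 'gamma'
print(df.iloc[1])           # row at position 1
print(df.loc['beta'])       # row labelled 'beta'
print(df[1:3])              # a slice inside [] selects rows by position
bool_vec = [True, False, True, True, False]
print("Selecting by boolean vector:")
print(df[bool_vec])
print(df[['b', 'd']].iloc[[1, 3]])
print(df.iloc[[1, 3]][['b', 'd']])
print(df[['b', 'd']].loc[['beta', 'delta']])
print(df.loc[['beta', 'delta']][['b', 'd']])
Fastest scalar access: at, iat
print(df.iat[2, 3])
print(df.at['gamma', 'd'])
Mixed label/position access: ix (deprecated in pandas 0.20 and removed in 1.0; the loc/iloc equivalents are shown)
print(df.loc['gamma', df.columns[4]])                    # was df.ix['gamma', 4]
print(df.loc[['delta', 'gamma'], df.columns[[1, 4]]])    # was df.ix[['delta', 'gamma'], [1, 4]]
print(df.iloc[[1, 2]][['b', 'e']])                       # was df.ix[[1, 2], ['b', 'e']]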
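The selection rules above can be checked on a small frame. This sketch assumes a 5x5 frame with row labels 'alpha' to 'eta' and columns 'a' to 'e'; only 'beta', 'gamma', 'delta' and the column letters appear in the notes, the rest is made up:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(25).reshape(5, 5),
                  index=['alpha', 'beta', 'gamma', 'delta', 'eta'],   # assumed labels
                  columns=list('abcde'))

col_then_row = df['b']['gamma']      # chained brackets: column first, then row
row_then_col = df.loc['gamma', 'b']  # single indexer: row first, then column
by_position = df.iloc[2, 1]          # same cell by integer position

fast_label = df.at['gamma', 'd']     # scalar access by label
fast_pos = df.iat[2, 3]              # scalar access by position

# Selecting rows and columns in one loc call avoids the chained lookup
block = df.loc[['beta', 'delta'], ['b', 'd']]

print(col_then_row, row_then_col, by_position)
print(block)
```

A single loc or iloc call is generally preferable to chained brackets when assigning, since a chained lookup may operate on a temporary copy.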
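Extending the index
df = df.reindex(df.index.union(['e']))                # add label 'e' (union returns a sorted index)
df = df.reindex(list(df.index) + ['c', 'd', 'e'])     # list.append returns None, so concatenate the lists instead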
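A minimal sketch of index extension on a made-up two-row frame. It also shows why passing the result of list.append to reindex cannot work:

```python
import pandas as pd

df = pd.DataFrame({'x': [1.0, 2.0]}, index=['a', 'b'])

# New labels get NaN in every column
extended = df.reindex(df.index.union(['e']))
extended_many = df.reindex(list(df.index) + ['c', 'd', 'e'])

# list.append mutates the list and returns None, so it yields no new label list
broken = list(df.index).append(['c', 'd', 'e'])

print(extended)
print(extended_many)
print(broken)
```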
pd.set_option('display.width', 200)
dates = pd.date_range('20150101', periods=5)
df = pd.DataFrame(np.random.randn(5, 4),index=dates,columns=list('ABCD'))
df2 = pd.DataFrame({ 'A' : 1., 'B': pd.Timestamp('20150214'), 'C': pd.Series(1.6,index=list(range(4)),dtype='float64'), 'D' : np.array([4] * 4, dtype='int64'), 'E' : 'hello pandas!' })
stock_list = ['000001.XSHE', '000002.XSHE', '000568.XSHE', '000625.XSHE', '000768.XSHE', '600028.XSHG', '600030.XSHG', '601111.XSHG', '601390.XSHG', '601998.XSHG']
raw_data = DataAPI.MktEqudGet(secID=stock_list, beginDate='20150101', endDate='20150131', pandas='1')
df = raw_data[['secID', 'tradeDate', 'secShortName', 'openPrice', 'highestPrice', 'lowestPrice', 'closePrice', 'turnoverVol']]
print(df.shape)
print(df.head())
print(df.tail(3))
print(df.describe())
print(df.sort_values(by='tradeDate').head())   # df.sort(columns=...) no longer exists in current pandas
df = df.sort_values(by=['tradeDate', 'secID'], ascending=[False, True])
print(df[df.closePrice > df.closePrice.mean()].head())
print(df[df['secID'].isin(['601628.XSHG', '000001.XSHE', '600030.XSHG'])].head())
print(df.dropna(subset=['closePrice']).shape)   # drop rows where closePrice is NaN
print(df.dropna(thresh=6).shape)                # keep rows with at least 6 non-NaN values
print(df.dropna(how='all').shape)               # drop rows where every value is NaN
print(df.dropna().shape)                        # drop rows with any NaN
print(df.fillna(value=20150101).head())
print(df['closePrice'].value_counts().head())
print(df[['closePrice']].apply(lambda x: (x - x.min()) / (x.max() - x.min())).head())   # min-max scaling to [0, 1]
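dat1 = df[['secID', 'tradeDate', 'closePrice']].head()
dat2 = df[['secID', 'tradeDate', 'closePrice']].iloc[2]   # a single row, returned as a Series
dat = pd.concat([dat1, dat2.to_frame().T], ignore_index=True)   # replaces dat1.append(dat2, ignore_index=True), removed in pandas 2.0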
dat1 = df[['secID', 'tradeDate', 'closePrice']]
dat2 = df[['secID', 'tradeDate', 'turnoverVol']]
dat = dat1.merge(dat2, on=['secID', 'tradeDate'])
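df_grp = df.groupby('secID')
grp_mean = df_grp.mean(numeric_only=True)   # average only the numeric columns
df2 = df.sort_values(by=['secID', 'tradeDate'], ascending=[True, False])
print(df2.drop_duplicates(subset='secID'))                # keeps the first row per secID, here the latest date
print(df2.drop_duplicates(subset='secID', keep='last'))   # keep='last' replaces the old take_last=True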
dat = df[df['secID'] == '600028.XSHG'].set_index('tradeDate')['closePrice']
dat.plot(title="Close Price of SINOPEC (600028) during Jan, 2015")
obj.combine_first(other) fills the NA values in obj with data from other, like applying a patch, and it aligns the two objects on their index automatically.
marketIndex[field]=(marketIndex[field]+stock_prices[field]).combine_first(marketIndex[field])
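A small sketch of combine_first on made-up Series. The second part mirrors the pattern of the line above: add two Series, then fall back to the running total wherever the sum is NaN because one side had no entry.

```python
import numpy as np
import pandas as pd

obj = pd.Series([1.0, np.nan, 3.0], index=['a', 'b', 'c'])
other = pd.Series([20.0, 30.0, 40.0], index=['b', 'c', 'd'])

# NaN in obj is patched from other; labels only in other are added
patched = obj.combine_first(other)

# Accumulation pattern: labels missing from `new` would turn the sum into NaN,
# so the previous total is kept for them
total = pd.Series([10.0, 20.0], index=['a', 'b'])
new = pd.Series([1.0], index=['a'])
updated = (total + new).combine_first(total)

print(patched)
print(updated)
```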
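Python dict (map) operations:
# The notes name the variable dict; d is used here so the built-in dict type is not shadowed.
d = {"a": "apple", "b": "banana"}   # sample data, not in the original notes
d2 = {"g": "grape"}                 # sample data, not in the original notes
d["w"] = "watermelon"
d.get("c", "apple")      # returns the default "apple" because key "c" is missing
print(d.pop("b"))        # removes key "b" and returns its value
d.update(d2)             # merges d2 into d
d.items()
d.keys()
d.values()
print(d.popitem())       # removes and returns one (key, value) pair
# Sort by key
print(sorted(d.items(), key=lambda kv: kv[0]))
# Sort by value
print(sorted(d.items(), key=lambda kv: kv[1]))
# Shallow copy
d2 = d.copy()
# Deep copy (copy.copy would only make another shallow copy)
import copy
d3 = copy.deepcopy(d)
d.clear()                # removes every entry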
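The difference between a shallow and a deep copy only shows up when the dict holds mutable values. A minimal sketch with made-up data:

```python
import copy

original = {'fruits': ['apple', 'banana'], 'count': 2}

shallow = original.copy()        # new dict, but the list inside is shared
deep = copy.deepcopy(original)   # new dict and a new list

original['fruits'].append('cherry')   # mutates the shared list
original['count'] = 3                 # rebinds a key in original only

print(shallow)
print(deep)
```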
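Counting the positions where two Python lists hold the same element
a = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
b = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
sum(x == y for x, y in zip(a, b))   # Python 2 form: map(cmp, a, b).count(0); cmp no longer exists in Python 3
What if only the positions where both lists hold 1 should be counted?
list(map(lambda x, y: x + y, a, b)).count(2)
or sum(map(lambda x, y: 1 if x == y == 1 else 0, a, b))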
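Both counts can be written with zip, which runs unchanged on Python 2 and Python 3. A short sketch using the two lists from the notes:

```python
a = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
b = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

# Positions where the two lists agree (True counts as 1 in a sum)
same = sum(x == y for x, y in zip(a, b))

# Positions where both lists hold 1
both_one = sum(1 for x, y in zip(a, b) if x == y == 1)

print(same)      # 7
print(both_one)  # 1
```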
Reference: https://uqer.io/community/share/54ca15f9f9f06c276f651a56