NumPy: Commonly Used Functions

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 10:46

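1. To write an array to a file, use numpy.savetxt('filename', array), which writes the array to the file named filename. To read a file, use numpy.loadtxt('filename', delimiter=',', usecols=sequence, unpack=True/False), where delimiter may be a comma or any other separator. Both functions also work with CSV files, the format in which most data is stored.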
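As a minimal sketch of the round trip described above, the following writes a small array to a temporary CSV file and reads it back. The sample values, the file name demo.csv, and the fmt string are assumptions made for illustration only.

```python
import os
import tempfile

import numpy

# Hypothetical sample data: two columns (price, volume).
data = numpy.array([[344.17, 344.4],
                    [345.17, 345.4],
                    [346.17, 346.4]])

# Write to a temporary CSV file so the sketch leaves nothing behind.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.csv")
numpy.savetxt(path, data, delimiter=",", fmt="%.2f")

# usecols picks columns by index; unpack=True returns one array per column.
price, volume = numpy.loadtxt(path, delimiter=",", usecols=(0, 1), unpack=True)

# Clean up the temporary file and directory.
os.remove(path)
os.rmdir(tmpdir)
```

With unpack=False (the default), loadtxt would instead return a single two-dimensional array with one row per line of the file.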

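2. numpy.average(arrayone, weights=arraytwo) computes the mean of arrayone weighted by arraytwo. numpy.mean(array) computes the plain arithmetic mean of array. These can be used, for example, to compute a volume-weighted average price or a time-weighted average price.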

    >>> import numpy
    >>> c,v=numpy.loadtxt('apple.csv', delimiter=',', unpack=True)
    >>> c
    array([ 344.17,  345.17,  346.17,  347.17,  348.17,  349.17,  350.17,
            351.17,  352.17])
    >>> v
    array([ 344.4,  345.4,  346.4,  347.4,  348.4,  349.4,  350.4,  351.4,
            352.4])
    >>> k=numpy.average(c,weights=v)
    >>> k
    348.18913509376193
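To make the weighting explicit, here is a small sketch that compares numpy.average with the formula sum(c*v)/sum(v) written out by hand. The prices and volumes are hypothetical values chosen for easy arithmetic, not data from apple.csv.

```python
import numpy

c = numpy.array([344.17, 345.17, 346.17])  # hypothetical prices
v = numpy.array([100.0, 300.0, 600.0])     # hypothetical volumes

# Volume-weighted average price via numpy.average.
vwap = numpy.average(c, weights=v)

# The same quantity written out: sum of price*volume over total volume.
manual = numpy.sum(c * v) / numpy.sum(v)

# With equal weights the weighted average reduces to the plain mean.
equal = numpy.average(c, weights=numpy.ones_like(c))
```

Because the largest volume sits on the highest price, vwap ends up above numpy.mean(c).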

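3. numpy.max(array) and numpy.min(array) return the maximum and minimum of array. numpy.ptp(array) returns the range (peak to peak), that is, the difference between the maximum and the minimum. numpy.median(array) computes the median of the sorted array. numpy.var(array) computes the variance and numpy.std(array) the standard deviation. (Note the difference between population variance and sample variance: the population variance divides the sum of squared deviations by the population size n, while the sample variance divides it by n-1, the sample size minus one, which is called the degrees of freedom. Dividing by n-1 makes the sample variance an unbiased estimator. NumPy exposes this choice through the ddof argument: numpy.var and numpy.std default to ddof=0, the population form, and ddof=1 gives the sample form.) The ndarray method array.mean() also computes the mean of array directly.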

    >>> c
    array([ 1.,  1.,  1.,  5.,  1.,  1.])
    >>> v
    array([ 1.,  1.,  1.,  5.,  1.,  1.])
    >>> numpy.max(c)
    5.0
    >>> numpy.max(v)
    5.0
    >>> numpy.min(v)
    1.0
    >>> numpy.ptp(c)
    4.0
    >>> c
    array([ 1.,  1.,  1.,  5.,  1.,  1.])
    >>> numpy.median(c)
    1.0
    >>> numpy.var(c)
    2.2222222222222219
    >>> numpy.var(c)==numpy.mean((c-c.mean())**2)  # verify var()
    True
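The population versus sample distinction can be checked directly. The sketch below reuses the array c from the session above and relies on the ddof argument of numpy.var and numpy.std; the variable names are chosen here for illustration.

```python
import numpy

c = numpy.array([1.0, 1.0, 1.0, 5.0, 1.0, 1.0])
n = c.size
ss = numpy.sum((c - c.mean()) ** 2)  # sum of squared deviations

pop_var = numpy.var(c)             # ddof=0 (default): divides by n
sample_var = numpy.var(c, ddof=1)  # ddof=1: divides by n - 1
sample_std = numpy.std(c, ddof=1)  # square root of the sample variance
```

Here pop_var equals ss/n (about 2.22, matching the session above) and sample_var equals ss/(n-1) (about 2.67).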

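4. numpy.diff(array) computes the differences between adjacent elements of array. numpy.log(array) computes the natural logarithm of each element. NumPy is geared toward floating-point numerical computation. Note the converters parameter of numpy.loadtxt(), which maps a column index to a function that converts that column's text into a number. numpy.where(array>num) returns the indices of the elements of array that are greater than num. numpy.take(array, arrayindexs) extracts the values of array at the indices in arrayindexs. numpy.argmax(array) returns the index of the largest value in array, and numpy.argmin(array) returns the index of the smallest. The use of numpy.apply_along_axis() deserves closer study, including whether it brings any performance gain.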
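As a compact sketch of the calls listed above, the following computes simple and logarithmic returns and then selects elements by index. The prices are the first five opening values from apple.csv; the threshold 48.0 and the variable names are assumptions for illustration.

```python
import numpy

price = numpy.array([46.5, 47.5, 48.5, 49.5, 50.5])

# Simple returns: difference between neighbours divided by the earlier value.
simple_returns = numpy.diff(price) / price[:-1]

# Log returns: difference of the natural logarithms of neighbours.
log_returns = numpy.diff(numpy.log(price))

# numpy.where returns a tuple of index arrays; [0] selects the first axis.
above = numpy.where(price > 48.0)[0]
selected = numpy.take(price, above)

hi = numpy.argmax(price)  # index of the largest value
lo = numpy.argmin(price)  # index of the smallest value
```

Indexing with [0] keeps selected one-dimensional; passing the whole tuple from numpy.where to numpy.take yields a two-dimensional result, which is why the prices print with double brackets in the session that follows.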

    >>> import datetime
    >>> v
    array([ 1.,  1.,  1.,  5.,  1.,  1.])
    >>> numpy.diff(v)
    array([ 0.,  0.,  4., -4.,  0.])
    >>> numpy.diff(v)/v[:-1]
    array([ 0. ,  0. ,  4. , -0.8,  0. ])
    >>> def datestr2num(s):  # date conversion function: turns a date string into a weekday number
    ...     return datetime.datetime.strptime(s,'%Y/%m/%d').date().weekday()
    ...
    >>> dates,price=numpy.loadtxt('apple.csv',delimiter=',',usecols=(2,0),unpack=True,converters={2:datestr2num})
    >>> dates
    array([ 3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,
            1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.])
    >>> price
    array([ 46.5,  47.5,  48.5,  49.5,  50.5,  51.5,  52.5,  53.5,  54.5,
            55.5,  56.5,  57.5,  58.5,  59.5,  60.5,  61.5,  62.5,  63.5,
            64.5,  65.5,  66.5,  67.5,  68.5,  69.5,  70.5])
    >>> numpy.zeros(5)  # initialize an array
    array([ 0.,  0.,  0.,  0.,  0.])
    >>> for i in range(5):
    ...     indices = numpy.where(dates==i)
    ...     prices=numpy.take(price,indices)
    ...     agv = numpy.mean(prices)
    ...     print("Day",i,'prices',prices,"average",agv)
    ...
    Day 0 prices [[ 48.5  53.5  58.5  63.5  68.5]] average 58.5
    Day 1 prices [[ 49.5  54.5  59.5  64.5  69.5]] average 59.5
    Day 2 prices [[ 50.5  55.5  60.5  65.5  70.5]] average 60.5
    Day 3 prices [[ 46.5  51.5  56.5  61.5  66.5]] average 56.5
    Day 4 prices [[ 47.5  52.5  57.5  62.5  67.5]] average 57.5
    >>> numpy.argmax(prices)
    4
    >>> numpy.argmin(prices)
    0

    +++++++++++++++++++++++++++++++++++++++++++++++++++++
    apple.csv
    opendata          date       high   low    close
    46.5      765.98  2015/1/1   48.99  44.11  47.66
    47.5      766.98  2015/1/2   49.99  45.11  48.66
    48.5      767.98  2015/1/5   50.99  46.11  49.66
    49.5      768.98  2015/1/6   51.99  47.11  50.66
    50.5      769.98  2015/1/7   52.99  48.11  51.66
    51.5      770.98  2015/1/8   53.99  49.11  52.66
    52.5      771.98  2015/1/9   54.99  50.11  53.66
    53.5      772.98  2015/1/12  55.99  51.11  54.66
    54.5      773.98  2015/1/13  56.99  52.11  55.66
    55.5      774.98  2015/1/14  57.99  53.11  56.66
    56.5      775.98  2015/1/15  58.99  54.11  57.66
    57.5      776.98  2015/1/16  59.99  55.11  58.66
    58.5      777.98  2015/1/19  60.99  56.11  59.66
    59.5      778.98  2015/1/20  61.99  57.11  60.66
    60.5      779.98  2015/1/21  62.99  58.11  61.66
    61.5      780.98  2015/1/22  63.99  59.11  62.66
    62.5      781.98  2015/1/23  64.99  60.11  63.66
    63.5      782.98  2015/1/26  65.99  61.11  64.66
    64.5      783.98  2015/1/27  66.99  62.11  65.66
    65.5      784.98  2015/1/28  67.99  63.11  66.66
    66.5      785.98  2015/1/29  68.99  64.11  67.66
    67.5      786.98  2015/1/30  69.99  65.11  68.66
    68.5      787.98  2015/2/2   70.99  66.11  69.66
    69.5      788.98  2015/2/3   71.99  67.11  70.66
    70.5      789.98  2015/2/4   72.99  68.11  71.66
    +++++++++++++++++++++++++++++++++++++++++++++++++++++

    >>> opendata,highdata,lowdata,closedata=numpy.loadtxt('apple.csv',delimiter=',',usecols=(0,3,4,5),unpack=True)
    >>> opendata
    array([ 46.5,  47.5,  48.5,  49.5,  50.5,  51.5,  52.5,  53.5,  54.5,
            55.5,  56.5,  57.5,  58.5,  59.5,  60.5,  61.5,  62.5,  63.5,
            64.5,  65.5,  66.5,  67.5,  68.5,  69.5,  70.5])
    >>> highdata
    array([ 48.99,  49.99,  50.99,  51.99,  52.99,  53.99,  54.99,  55.99,
            56.99,  57.99,  58.99,  59.99,  60.99,  61.99,  62.99,  63.99,
            64.99,  65.99,  66.99,  67.99,  68.99,  69.99,  70.99,  71.99,
            72.99])
    >>> lowdata
    array([ 44.11,  45.11,  46.11,  47.11,  48.11,  49.11,  50.11,  51.11,
            52.11,  53.11,  54.11,  55.11,  56.11,  57.11,  58.11,  59.11,
            60.11,  61.11,  62.11,  63.11,  64.11,  65.11,  66.11,  67.11,
            68.11])
    >>> closedata
    array([ 47.66,  48.66,  49.66,  50.66,  51.66,  52.66,  53.66,  54.66,
            55.66,  56.66,  57.66,  58.66,  59.66,  60.66,  61.66,  62.66,
            63.66,  64.66,  65.66,  66.66,  67.66,  68.66,  69.66,  70.66,
            71.66])
    >>> weekdate=numpy.loadtxt('apple.csv',delimiter=',',usecols=(2,),converters={2:datestr2num})
    >>> weekdate
    array([ 3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,
            1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.])
    >>> numpy.ravel(numpy.where(weekdate==0))[0]
    2
    >>> numpy.ravel(numpy.where(weekdate==4))[-1]
    21
    >>> weekdatearray=numpy.arange(2,22)
    >>> weekdatearray
    array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
           19, 20, 21])
    >>> weekdatearray=numpy.split(weekdatearray,4)
    >>> weekdatearray
    [array([2, 3, 4, 5, 6]), array([ 7,  8,  9, 10, 11]), array([12, 13, 14, 15, 16]), array([17, 18, 19, 20, 21])]
    >>> def sumerize(a,o,h,l,c):
    ...     monday_open=o[a[0]]
    ...     week_high=numpy.max(numpy.take(h,a))
    ...     week_low=numpy.min(numpy.take(l,a))
    ...     friday_close=c[a[-1]]
    ...     return ("apple ",monday_open,week_high,week_low,friday_close)
    ...
    >>> weeksummary=numpy.apply_along_axis(sumerize,1,weekdatearray,opendata,highdata,lowdata,closedata)
    >>> weeksummary
    array([['apple ', '48.5', '54.99', '46.11', '53.66'],
           ['apple ', '53.5', '59.99', '51.11', '58.66'],
           ['apple ', '58.5', '64.99', '56.11', '63.66'],
           ['apple ', '63.5', '69.99', '61.11', '68.66']],
           dtype='|S6')
    >>> numpy.savetxt('applesumeray.csv',weeksummary,delimiter=',',fmt="%s")
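On the open question of numpy.apply_along_axis(): as far as I know, it calls the Python function once per one-dimensional slice, so it is better seen as a convenience than as a speedup. The sketch below is a reduced variant of the weekly summary above, under two assumptions: only the first six high and low values are used, and the helper week_range (a hypothetical name) returns numbers only, which keeps the result a float array instead of the string array (dtype='|S6') seen above. A version using integer-array indexing is included for comparison.

```python
import numpy

high = numpy.array([48.99, 49.99, 50.99, 51.99, 52.99, 53.99])
low = numpy.array([44.11, 45.11, 46.11, 47.11, 48.11, 49.11])

# Two hypothetical "weeks" of three row indices each.
weeks = numpy.arange(6).reshape(2, 3)


def week_range(idx, h, l):
    """Return (week high, week low) for one row of indices."""
    return numpy.array([numpy.max(h[idx]), numpy.min(l[idx])])


# Extra positional arguments (high, low) are passed through to week_range.
summary = numpy.apply_along_axis(week_range, 1, weeks, high, low)

# Same result without a per-row Python call, using integer-array indexing.
vectorized = numpy.column_stack((high[weeks].max(axis=1),
                                 low[weeks].min(axis=1)))
```

Both summary and vectorized have one row per week and two columns (high, low). On large inputs the indexed version would be expected to run faster, though that is worth measuring rather than assuming.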


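5. Using numpy.maximum() and numpy.minimum(): each compares two arrays element by element and returns an array of the larger (or smaller) value at every position.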

    >>> numpy.maximum([2, 3, 4], [1, 5, 2])
    array([2, 5, 4])
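To round this out, the sketch below adds the numpy.minimum counterpart and contrasts the element-wise functions with the reducing numpy.max. The scalar threshold 3 is an assumption used only to illustrate broadcasting.

```python
import numpy

a = numpy.array([2, 3, 4])
b = numpy.array([1, 5, 2])

# Element-wise comparison of two arrays of the same shape.
elementwise_max = numpy.maximum(a, b)
elementwise_min = numpy.minimum(a, b)

# A scalar is broadcast against the array, here flooring every value at 3.
floored = numpy.maximum(a, 3)

# numpy.max, by contrast, reduces a single array to one value.
largest = numpy.max(a)
```

In short, numpy.maximum and numpy.minimum take two inputs and return an array, while numpy.max and numpy.min take one input and return its extreme value.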