Matching the HS300 Index Price with Its Constituent Stocks



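If you look at the closing level of the CSI 300 (HS300) index, you will find that the closing prices of most individual stocks do not move in step with it. The program below offers one way to find the constituent stocks whose closing prices track the index's closing level closely. The main idea is to first estimate the coefficient of a linear relationship between the index close and the stock close, then compute the residual of that relationship and run an ADF (Augmented Dickey-Fuller) test on it. A stock counts as moving similarly to the HS300 index when its residual series tests as stationary, that is, when the ADF statistic is below the 1% critical value.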
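Written out, the two steps look roughly as follows. The symbols are my own notation, not the article's: $y_t$ is the index close and $x_t$ the stock close on day $t$. The regression is shown without an intercept because that is how the program fits it, and the ADF regression is shown in the constant-only form, which is the default of statsmodels' `adfuller`.

```latex
\begin{aligned}
y_t &= \beta x_t + \varepsilon_t, \\
\hat{\varepsilon}_t &= y_t - \hat{\beta} x_t, \\
\Delta \hat{\varepsilon}_t &= \alpha + \gamma\, \hat{\varepsilon}_{t-1}
  + \sum_{i=1}^{p} \delta_i\, \Delta \hat{\varepsilon}_{t-i} + u_t .
\end{aligned}
```

The null hypothesis is $\gamma = 0$ (a unit root, so the residual is not stationary). A stock is kept when the ADF statistic falls below the 1% critical value.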


The index and stock data come from tushare: http://tushare.org/trading.html

The sample period is 2014-10-01 to 2017-10-01. Only 15 stocks were found to track the HS300 index closely. The codes of these stocks are:

['002024', '600038', '002415', '601333', '000060', '600406', '600332', '000540', '000402', '601988', '601998', '601328', '601111', '600048', '000069']

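Here are a few of the price charts. In some of them the rises and falls do not match the index all that closely: of the three charts below, the third matches poorly at times.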


[Three charts of HS300 index versus stock daily closing prices; images not preserved]

The Python program follows:

# -*- coding: utf-8 -*-
'''
Version: V17.1.0
Date: 2017-11-5
@Author: Cheney
'''
# Fetch data from tushare and, for the period 2014.10-2017.10, find the HS300
# constituent stocks whose closing prices move similarly to the HS300 index.

# Part I
import datetime
import os
import traceback

import numpy as np
import pandas as pd
import tushare as ts
import matplotlib.pyplot as plt
import statsmodels.tsa.stattools as sts
import statsmodels.api as sm

t = datetime.datetime.now()
print('Program is starting ... %s' % t)

PLOT_DIR = 'hs30index_pair_stock_plot'


def plot_price_relation(df, start, end, st_a, st_b='hs300'):
    '''
    Draw the HS300 index and a stock's closing prices in two stacked panels.
    df -- DataFrame, index is the date string, columns are the stock and hs300 closes
    start, end -- date range of the comparison (df is expected to cover it already)
    st_a, st_b -- stock code and hs300 code or label
    '''
    fig, (ax, bx) = plt.subplots(nrows=2)
    x_date = [datetime.datetime.strptime(d, '%Y-%m-%d').date() for d in df.index]
    ax.plot(x_date, df[st_b], label=st_b, c='g')
    ax.set_title("%s index and stock %s daily prices relation" % (st_b, st_a))
    ax.set_xticklabels([])
    ax.set_ylabel("HS300Index")
    ax.grid(True)
    ax.legend(loc='best')
    bx.plot(x_date, df[st_a], label=st_a, c='b')
    bx.set_xlabel("Year/Month")
    bx.set_ylabel("Stock Price")
    bx.grid(True)
    bx.legend(loc='best')
    fig.autofmt_xdate()
    # Save the figure to a folder (created on first use), or show it instead
    if not os.path.isdir(PLOT_DIR):
        os.makedirs(PLOT_DIR)
    fig.savefig(os.path.join(PLOT_DIR, '%s+%s.png' % (st_b, st_a)))
    plt.close(fig)
    # plt.show()


def get_df_close(stocka, stockb, start=None, end=None):
    # Build a DataFrame indexed by date that holds only the two close columns.
    # stocka and stockb -- codes such as '600036' or 'hs300'
    sta = ts.get_hist_data(stocka, start=start, end=end)
    stb = ts.get_hist_data(stockb, start=start, end=end)
    df = pd.concat([sta, stb], axis=1)
    # Sort by date so the forward fill runs forward in time, then drop any
    # leading rows that still have no price
    df = df['close'].sort_index().ffill().dropna()
    df.columns = [stocka, stockb]
    return df


# Part II
if __name__ == "__main__":
    start = datetime.datetime(2014, 10, 1).strftime('%Y-%m-%d')
    end = datetime.datetime(2017, 10, 1).strftime('%Y-%m-%d')

    # Get the HS300 constituent stock code list
    hs_name = 'hs300'
    hs = ts.get_hs300s()
    hs_list = hs['code']
    stockADF = {}

    for code in hs_list:
        try:
            # Get the stock and HS300 index close data
            df = get_df_close(code, hs_name, start, end)
            # Fit hs300 = beta * stock by OLS (no intercept term)
            res = sm.OLS(df[hs_name], df[code]).fit()
            betaCoef = res.params.iloc[0]
            # A NaN or infinite coefficient means the fit is unusable
            if not np.isfinite(betaCoef):
                raise ValueError('invalid beta coefficient: %s' % betaCoef)
            # ADF test on the residuals of the linear model
            residual = df[hs_name] - betaCoef * df[code]
            tempStockADF = sts.adfuller(residual)
        except Exception:
            print("Can't catch the res params of stock %s and %s" % (code, hs_name))
            traceback.print_exc()
            continue
        # Keep the data, the ADF statistic and the 1% critical value for plotting
        stockADF[code] = (df, tempStockADF[0], tempStockADF[4]['1%'])

    # Compare the ADF statistic with the 1% critical value to judge whether
    # the residual series is stationary
    for code, (df, adf_value, critical_1pct) in stockADF.items():
        if adf_value < critical_1pct:
            print("The best pairs stocks %s, ADF values %s and percent-1 %s"
                  % (code + hs_name, adf_value, critical_1pct))
            plot_price_relation(df, start, end, code, hs_name)

    print('Program total running time is %s' % (datetime.datetime.now() - t))
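The core of the loop (fit a slope, take the residual, test it for a unit root) can be tried on synthetic data without tushare or a network connection. The sketch below is only an illustration under stated assumptions: it uses the standard library alone, the series are simulated, and instead of statsmodels' `adfuller` it computes a simplified Dickey-Fuller statistic with a constant and no lag terms. The helper names and the value -3.43 (the commonly tabulated large-sample 1% Dickey-Fuller critical value for a regression with a constant) are my own choices, not part of the original program.

```python
import math
import random


def beta_through_origin(x, y):
    """Least-squares slope of y = beta * x (no intercept), as in the program."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def df_statistic(series):
    """t-statistic of rho in diff(e)_t = a + rho * e_{t-1} + u_t (no lag terms)."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    sxx = sum((v - mean_x) ** 2 for v in lagged)
    sxy = sum((v - mean_x) * (d - mean_y) for v, d in zip(lagged, diffs))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    ssr = sum((d - intercept - rho * v) ** 2 for v, d in zip(lagged, diffs))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return rho / std_err


def make_random_walk(rng, n, start):
    """Simulated price path: cumulative sum of unit-variance Gaussian steps."""
    level, path = start, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


rng = random.Random(0)
stock = make_random_walk(rng, 750, start=100.0)
# Synthetic "index" that tracks the stock: 30 * stock plus stationary noise
tracking_index = [30.0 * p + rng.gauss(0.0, 5.0) for p in stock]
# Synthetic "index" that is an independent random walk
unrelated_index = make_random_walk(rng, 750, start=3000.0)

CRITICAL_1PCT = -3.43  # assumed approximate 1% critical value, see lead-in

results = {}
for label, index in (("tracking", tracking_index), ("unrelated", unrelated_index)):
    beta = beta_through_origin(stock, index)
    residual = [y - beta * x for x, y in zip(stock, index)]
    stat = df_statistic(residual)
    results[label] = (beta, stat)
    print(label, "beta=%.3f" % beta, "stat=%.2f" % stat,
          "stationary residual:", stat < CRITICAL_1PCT)
```

The tracking series should give a strongly negative statistic because its residual is essentially the added noise. The independent random walk will usually, though not always, fail the test, which is one reason a visual check of the charts is still worthwhile.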
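This is just a small piece of what I have picked up while learning quantitative trading. If anything here falls short, I would welcome pointers from more experienced readers.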




