Platform research

Source: Internet. Published by: 加强网络宣传队伍建设. Editor: 程序博客网. Time: 2024/05/12 16:34

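import json
# Save the CSI 300 (000300.XSHG) constituents as JSON. write_file and get_index_stocks are not imported here, so they are presumably provided by the platform.
write_file('HS300.stocks.json', json.dumps(get_index_stocks('000300.XSHG')))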

The list can be kept in code, or written out to an ordinary file.

Learning some basic libraries

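Missing data is represented by np.nan and is excluded from computations by default. Missing values can be replaced or removed, for example with fillna or dropna.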
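A minimal sketch of how np.nan behaves in pandas. The series `s` and its values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value
s = pd.Series([1.0, np.nan, 3.0])

mean_skipping_nan = s.mean()            # NaN is skipped by default: (1 + 3) / 2
mean_with_nan = s.mean(skipna=False)    # NaN propagates when skipna=False

filled = s.fillna(0.0)   # replace missing values with a constant
dropped = s.dropna()     # remove missing values entirely

print(mean_skipping_nan, mean_with_nan)
print(filled.tolist(), dropped.tolist())
```

Whether to fill or drop depends on the analysis. For return series, dropping is the approach these notes use later.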

pandas

seaborn is a statistical visualization library for Python. Which interface is it meant for? From a native Python command line it seems impossible.

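df.pct_change? Looking up help with a question mark (in IPython/Jupyter) is really useful. The few basic data routines need to be understood and become familiar.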

Signature: df.pct_change(periods=1, fill_method='pad', limit=None, freq=None, **kwargs)
Docstring:
Percent change over given number of periods.

Parameters
----------
periods : int, default 1
    Periods to shift for forming percent change
fill_method : str, default 'pad'
    How to handle NAs before computing percent changes
limit : int, default None
    The number of consecutive NAs to fill before stopping
freq : DateOffset, timedelta, or offset alias string, optional
    Increment to use from time series API (e.g. 'M' or BDay())

Returns

Sample prices (columns: open, close):

2014-01-02   2.2572   2.2563
2014-01-03   2.2468   2.2315
2014-01-06   2.2248   2.1782

pct_change gives the percent change over time, (today - previous day) / previous day, for open and close:

2014-01-03 -0.004607 -0.010991
2014-01-06 -0.009792 -0.023885
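The numbers above can be reproduced with a short sketch. The column names `open` and `close` are an assumption taken from the "open||close" remark; the three price rows are the ones shown in the notes:

```python
import pandas as pd

# The three sample rows from the notes; column names are assumed.
prices = pd.DataFrame(
    {"open": [2.2572, 2.2468, 2.2248], "close": [2.2563, 2.2315, 2.1782]},
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)

# Percent change relative to the previous row; the first row is NaN
# because there is no previous day to compare with.
pct = prices.pct_change()

# The same calculation written out by hand:
# (today - previous day) / previous day
manual = (prices - prices.shift(1)) / prices.shift(1)

print(pct.round(6))
```

Both `pct` and `manual` agree from the second row on, which confirms that pct_change with periods=1 is simply the day-over-day relative change.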

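Over the long run, relatively few individual stocks show violent short-term swings; for most of them the change from one period to the next is small. On its own, though, this does not tell us anything meaningful!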

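Correlation analysis:
We can compare how different stocks correlate: using linear regression, plot the fitted line for a pair of stocks, the confidence interval (shaded region), the Pearson correlation coefficient, and the p-value.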

After reading this: what do linear regression and confidence interval actually mean?
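One way to make these terms concrete is to compute them directly instead of only plotting them. The sketch below uses SciPy on synthetic returns; the data, the 0.8 slope, and the variable names are all made up for illustration. The interval computed here is the analytic 95% interval for the slope, which is related to but not identical to the shaded band a regression plot draws around the fitted line.

```python
import numpy as np
from scipy import stats

# Synthetic daily returns for two hypothetical stocks (illustration only)
rng = np.random.default_rng(42)
x = rng.normal(0.0, 0.01, 200)               # stock A
y = 0.8 * x + rng.normal(0.0, 0.005, 200)    # stock B, partly driven by A

# Linear regression: y is approximated by slope * x + intercept
fit = stats.linregress(x, y)

# Pearson correlation coefficient and its two-sided p-value
r, p_value = stats.pearsonr(x, y)

# 95% confidence interval for the slope: slope +/- t * standard error,
# with n - 2 degrees of freedom
t_crit = stats.t.ppf(0.975, df=len(x) - 2)
ci_low = fit.slope - t_crit * fit.stderr
ci_high = fit.slope + t_crit * fit.stderr

print(f"slope={fit.slope:.3f}  95% CI=({ci_low:.3f}, {ci_high:.3f})")
print(f"pearson r={r:.3f}  p-value={p_value:.3g}")
```

A small p-value says the observed correlation would be unlikely if the two series were unrelated; the confidence interval says how precisely the slope is pinned down by the data.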

iloc: purely integer-location based indexing for selection by position.
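A short sketch of position-based selection with iloc. The frame reuses the sample prices from these notes, with assumed column names `open` and `close`:

```python
import pandas as pd

prices = pd.DataFrame(
    {"open": [2.2572, 2.2468, 2.2248], "close": [2.2563, 2.2315, 2.1782]},
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)

first_row = prices.iloc[0]       # first row, returned as a Series
last_two = prices.iloc[-2:]      # last two rows, returned as a DataFrame
close_col = prices.iloc[:, 1]    # second column, selected by position
cell = prices.iloc[1, 0]         # row 1, column 0: a single value

print(first_row, last_two, close_col, cell, sep="\n\n")
```

iloc ignores the index labels entirely, so it works the same way whether the index holds dates, strings, or integers.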

Use pct_change() to compute returns, dropna to remove the missing values, and distplot to draw the histogram / distribution plot.

pct: percentage change

dropna: drop NaN, i.e. remove the missing data

distplot: draws the plot!
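The three steps can be sketched as follows. The price series is synthetic and the helper name `plot_return_distribution` is hypothetical. Recent seaborn releases deprecate distplot in favor of histplot and displot, so the helper uses `histplot(kde=True)` as a stand-in for the distplot call mentioned in the notes.

```python
import numpy as np
import pandas as pd

# Synthetic price series (illustration only): 250 days of random returns
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0.0, 0.01, 250)))

# Step 1: returns; the first value is NaN because it has no previous day
# Step 2: dropna removes that missing value
returns = prices.pct_change().dropna()


def plot_return_distribution(values):
    """Step 3: histogram with a density curve (hypothetical helper)."""
    import seaborn as sns  # imported here so steps 1 and 2 run without seaborn

    return sns.histplot(values, kde=True)


print(returns.describe())
```

Calling `plot_return_distribution(returns)` in a notebook should draw the histogram, assuming seaborn and matplotlib are installed.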

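If several groups of data need to be compared through their statistical distributions, violinplot produces a violin plot. Clearly, for a stock to be correlated with the index, it needs to carry a large weight in the index.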

clustermap can also draw a cluster map that groups similar items together; books on machine learning cover clustering in more detail.
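A rough sketch of both plots on synthetic returns. The three "stocks" A, B, and C and the helper name `draw_violin_and_clustermap` are made up: A and B share a common market component, C is independent, so a cluster map of the correlation matrix would be expected to group A with B.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns (illustration only)
rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.01, 250)
returns = pd.DataFrame({
    "A": market + rng.normal(0.0, 0.002, 250),   # follows the market
    "B": market + rng.normal(0.0, 0.002, 250),   # follows the market
    "C": rng.normal(0.0, 0.02, 250),             # independent, more volatile
})

corr = returns.corr()


def draw_violin_and_clustermap(frame):
    """Hypothetical helper: one violin per column, then a clustered heatmap."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots()
    sns.violinplot(data=frame, ax=ax)      # compare the return distributions
    grid = sns.clustermap(frame.corr())    # cluster columns by correlation
    return ax, grid


print(corr.round(3))
```

Calling `draw_violin_and_clustermap(returns)` should produce both figures, assuming seaborn, matplotlib, and SciPy are available.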

returns.corr()

corr: correlation computation

Signature: returns.corr(method='pearson', min_periods=1)
Docstring:
Compute pairwise correlation of columns, excluding NA/null values

Parameters
----------
method : {'pearson', 'kendall', 'spearman'}
    * pearson : standard correlation coefficient (the Pearson correlation coefficient is often used when computing similarity)
    * kendall : Kendall Tau correlation coefficient
    * spearman : Spearman rank correlation
min_periods : int, optional
    Minimum number of observations required per pair of columns
    to have a valid result. Currently only available for pearson
    and spearman correlation

There are three main ways to measure the correlation between random variables: the Pearson, Spearman, and Kendall correlation coefficients.
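To see how the methods differ, here is a small sketch on toy data where y increases with x but not linearly. The frame and its values are made up for illustration. Pearson measures linear association, while Spearman works on ranks and so reports a perfect monotonic relationship.

```python
import pandas as pd

# Toy data: y = x**2, monotonic but not linear
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [1, 4, 9, 16, 25]})

pearson = df.corr(method="pearson").loc["x", "y"]
spearman = df.corr(method="spearman").loc["x", "y"]

# Spearman is Pearson applied to the ranks of the data
spearman_by_ranks = df.rank().corr(method="pearson").loc["x", "y"]

# method="kendall" is selected the same way; pandas relies on SciPy for it.

print(f"pearson={pearson:.4f}  spearman={spearman:.4f}")
```

For return series with outliers, the rank-based coefficients tend to be less sensitive to a few extreme days than Pearson.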

0 0