PCA in Python

Source: Internet  Editor: 程序博客网  Time: 2024/06/08 02:34

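PCA (principal component analysis) is a dimensionality-reduction technique. It uses a small number of new variables to explain most of the variance in the original variables, and it turns correlated variables into mutually uncorrelated components. Uncorrelated is not necessarily the same as statistically independent.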
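The decorrelation described above can be sketched with NumPy alone. This is a minimal illustration, not how sklearn implements PCA internally: the synthetic two-column data set, the random seed, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data (an assumption for illustration): two strongly correlated columns.
x = rng.normal(size=500)
X = np.column_stack([x, 2.0 * x + rng.normal(scale=0.5, size=500)])

# Center the data, then eigendecompose its covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort the components by decreasing variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project the centered data onto the principal axes.
scores = Xc @ eigvecs
score_cov = np.cov(scores, rowvar=False)
explained_ratio = eigvals / eigvals.sum()

print(np.corrcoef(X, rowvar=False)[0, 1])  # original columns are highly correlated
print(score_cov)                           # off-diagonal entries are numerically zero
print(explained_ratio)                     # the first component carries most of the variance
```

The covariance matrix of the projected data is diagonal, which is what "uncorrelated components" means here, and the diagonal holds the variance carried by each component.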

sklearn.decomposition.PCA(n_components=None,copy=True,whiten=False)

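n_components: the number of components to keep. By default (None) all components are kept. An int keeps exactly that many. 'mle' chooses the number automatically by maximum likelihood estimation. A float between 0 and 1 keeps as many components as are needed for the cumulative explained-variance ratio to reach that fraction.
copy: if True, the original data are left unchanged; if False, the data passed to fit may be overwritten.
whiten: whitening, which rescales the projected components so that each one has unit variance.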
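The parameter options above can be tried side by side. The following is a sketch on synthetic data (the data, seed, and variable names are assumptions for illustration; the article's own spreadsheet is not used here).

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 200 samples, 5 features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

# int: keep exactly two components.
X_int = PCA(n_components=2).fit_transform(X)

# float in (0, 1): keep enough components to explain at least 95% of the variance.
pca_ratio = PCA(n_components=0.95).fit(X)

# 'mle': let the estimator choose the number of components.
pca_mle = PCA(n_components='mle').fit(X)

# whiten=True: each projected component is rescaled to unit variance.
X_white = PCA(n_components=2, whiten=True).fit_transform(X)

print(X_int.shape)
print(pca_ratio.n_components_, pca_ratio.explained_variance_ratio_.sum())
print(pca_mle.n_components_)
print(X_white.std(axis=0, ddof=1))
```

After fitting, the attribute n_components_ reports how many components were actually kept, which is useful when the count was chosen by 'mle' or by a variance fraction.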

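# -*- coding: utf-8 -*-
# Principal component analysis for dimensionality reduction
import pandas as pd
from sklearn.decomposition import PCA

# Parameter initialization
inputfile = 'E:/PythonMaterial/chapter4/demo/data/principal_component.xls'
outputfile = 'E:/PythonMaterial/chapter4/demo/data/dimention_reducted.xls'  # data after dimensionality reduction

data = pd.read_excel(inputfile, header=None)  # read the data

pca = PCA()
pca.fit(data)
a = pca.components_  # principal axes of the model (one eigenvector per row)
print(a)
b = pca.explained_variance_ratio_  # fraction of variance explained by each component (contribution rate)
print(" ")
print(b)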
[[-0.56788461 -0.2280431  -0.23281436 -0.22427336 -0.3358618  -0.43679539 -0.03861081 -0.46466998]
 [-0.64801531 -0.24732373  0.17085432  0.2089819   0.36050922  0.55908747 -0.00186891 -0.05910423]
 [-0.45139763  0.23802089 -0.17685792 -0.11843804 -0.05173347 -0.20091919 -0.00124421  0.80699041]
 [-0.19404741  0.9021939  -0.00730164 -0.01424541  0.03106289  0.12563004  0.11152105 -0.3448924 ]
 [ 0.06133747  0.03383817 -0.12652433 -0.64325682  0.3896425   0.10681901 -0.63233277 -0.04720838]
 [-0.02579655  0.06678747 -0.12816343  0.57023937  0.52642373 -0.52280144 -0.31167833 -0.0754221 ]
 [ 0.03800378 -0.09520111 -0.15593386 -0.34300352  0.56640021 -0.18985251  0.69902952 -0.04505823]
 [ 0.10147399 -0.03937889 -0.91023327  0.18760016 -0.06193777  0.34598258  0.02090066 -0.02137393]]

[  7.74011263e-01   1.56949443e-01   4.27594216e-02   2.40659228e-02
   1.50278048e-03   4.10990447e-04   2.07718405e-04   9.24594471e-05]
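One way to read the contribution rates printed above is to accumulate them and see how many components are needed to reach a target. The sketch below copies the ratios from the output; the 95% threshold and the variable names are assumptions chosen for illustration.

```python
import numpy as np

# Explained-variance ratios copied from the printed output above.
ratios = np.array([
    7.74011263e-01, 1.56949443e-01, 4.27594216e-02, 2.40659228e-02,
    1.50278048e-03, 4.10990447e-04, 2.07718405e-04, 9.24594471e-05,
])

# Running total of variance explained by the first k components.
cumulative = np.cumsum(ratios)

# Hypothetical target: keep the smallest k whose cumulative ratio reaches 95%.
threshold = 0.95
k = int(np.searchsorted(cumulative, threshold) + 1)

print(cumulative.round(4))
print(k)  # 3 for these values: the first three components explain about 97.4%
```

The first two components explain about 93.1% of the variance and the first three about 97.4%, so most of the information in the eight original variables survives in three components.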
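pca = PCA(n_components='mle')
newData = pca.fit_transform(data)  # reduce the dimensionality
pd.DataFrame(newData).to_excel(outputfile)  # save the result
pca.inverse_transform(newData)  # if needed, inverse_transform() maps the reduced data back to the original space (an approximation when components were dropped)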
[[ -8.19133694e+00  -1.69040279e+01   3.90991029e+00   7.48106686e+00   5.16142203e-01]
 [ -2.85274026e-01   6.48074989e+00  -4.62870368e+00   5.01369607e+00  -1.65278935e+00]
 [  2.37073907e+01   2.85245701e+00  -4.96523096e-01  -1.57285727e+00  -2.09522277e-01]
 [  1.44320264e+01  -2.29917325e+00  -1.50272151e+00  -1.30763061e+00   1.54047215e+00]
 [ -5.43045680e+00  -1.00070408e+01   9.52086923e+00  -5.63779544e+00  -9.21974743e-01]
 [ -2.41595590e+01   9.36428589e+00   7.26578565e-01  -1.98622218e+00  -9.98528392e-01]
 [  3.66134607e+00   7.60198615e+00  -2.36439873e+00   4.21318409e-02  -8.48196502e-02]
 [ -1.39676121e+01  -1.38912398e+01  -6.44917778e+00  -2.92916826e+00  -1.91994563e-01]
 [ -4.08809359e+01   1.32568529e+01   4.16539368e+00   1.21239981e+00   1.33543444e+00]
 [  1.74887665e+00   4.23112299e+00  -5.89809954e-01  -1.57477365e+00   4.10612180e-01]
 [  2.19432196e+01   2.36645883e+00   1.33203832e+00   4.39763606e+00  -2.61113312e-02]
 [  3.67086807e+01   6.00536554e+00   3.97183515e+00  -1.54808393e+00   3.00572729e-01]
 [ -3.28750663e+00  -4.86380886e+00   1.00424688e+00   8.51193030e-01  -6.27109498e-01]
 [ -5.99885871e+00  -4.19398863e+00  -8.59953736e+00  -2.44159234e+00   6.09616105e-01]]
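How closely inverse_transform() recovers the original data depends on how many components were kept. The sketch below compares a full and a reduced fit on synthetic data (the data, seed, and variable names are assumptions for illustration, since the article's spreadsheet is not available here).

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 50 samples with 4 correlated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))

# Keeping every component: the round trip reproduces X up to floating-point error.
pca_full = PCA().fit(X)
X_full = pca_full.inverse_transform(pca_full.transform(X))

# Keeping two components: the round trip returns an approximation with the original shape.
pca_2 = PCA(n_components=2).fit(X)
X_2 = pca_2.inverse_transform(pca_2.transform(X))

err_full = np.abs(X - X_full).max()
err_2 = np.abs(X - X_2).max()
print(X_2.shape, err_full, err_2)
```

The reconstruction always has the original number of columns, but the variance carried by the discarded components is lost, so the reduced fit cannot restore the data exactly.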