Learning the sklearn matrix decomposition module
Source: Internet | Editor: 程序博客网 | Date: 2024/05/29 19:28
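The sklearn.decomposition module provides matrix decomposition algorithms, including PCA, NMF, and ICA among others. Most of the algorithms in this module can be regarded as dimensionality reduction techniques.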
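As a rough illustration of how these estimators are used for dimensionality reduction, the sketch below runs PCA, NMF, and FastICA through the same fit_transform call. The toy matrix X_demo, the dictionary name reduced, and the chosen hyperparameters (init, max_iter, random_state) are assumptions made for this example, not part of the original post.

```python
import numpy as np
from sklearn.decomposition import PCA, NMF, FastICA

rng = np.random.RandomState(0)
# Hypothetical toy data: 200 samples, 5 non-negative features (NMF needs X >= 0)
X_demo = rng.uniform(size=(200, 5))

models = [
    ('PCA', PCA(n_components=2)),
    ('NMF', NMF(n_components=2, init='nndsvda', max_iter=500, random_state=0)),
    ('ICA', FastICA(n_components=2, max_iter=500, random_state=0)),
]

reduced = {}
for name, model in models:
    # Every estimator exposes fit_transform and returns (n_samples, n_components)
    reduced[name] = model.fit_transform(X_demo)
    print(name, reduced[name].shape)
```

Each estimator reduces the five input features to two columns; what those columns mean differs by algorithm.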
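① Principal component analysis: sklearn.decomposition.PCA(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)
Main parameters:
n_components: the number of principal components to keep. It may be an integer, a float, None, or a string. If None, all components are kept, that is min(n_samples, n_features). If an integer, that many components are kept. If a float between 0 and 1, PCA keeps the smallest number of components whose cumulative explained variance ratio exceeds that threshold. If n_components='mle' and svd_solver='full', the number of components is chosen by maximum likelihood estimation (MLE).
whiten: whether to whiten the projected data, i.e. rescale each retained component to unit variance.
svd_solver: the SVD method to use. The options are 'auto', 'full', 'arpack', and 'randomized'.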
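To make the effect of whiten concrete, here is a minimal sketch comparing the projected data with and without whitening. The random matrix X_demo and the names plain, white, plain_var, and white_var are hypothetical and only serve this illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Hypothetical toy data: 500 samples, 3 correlated features
mixing = rng.normal(size=(3, 3))
X_demo = rng.normal(size=(500, 3)) @ mixing

plain = PCA(n_components=2, whiten=False, svd_solver='full').fit_transform(X_demo)
white = PCA(n_components=2, whiten=True, svd_solver='full').fit_transform(X_demo)

# Sample variance (ddof=1) of each projected component
plain_var = plain.var(axis=0, ddof=1)
white_var = white.var(axis=0, ddof=1)
print('without whitening:', plain_var)
print('with whitening:   ', white_var)
```

Without whitening, the component variances equal explained_variance_ and decrease from the first component to the second. With whitening, each component is rescaled to unit sample variance.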
A simple example
In [1]: import numpy as np
   ...: import matplotlib.pyplot as plt
   ...: from sklearn.datasets import make_blobs
   ...: # 10,000 samples with 3 features, drawn around 4 cluster centers
   ...: X, y = make_blobs(n_samples=10000, n_features=3,
   ...:                   centers=[[3, 3, 3], [0, 0, 0], [1, 1, 1], [2, 2, 2]],
   ...:                   cluster_std=[0.2, 0.1, 0.2, 0.2], random_state=9)
   ...: fig = plt.figure()
   ...: # 3D axes viewed from elevation 30 and azimuth 20
   ...: ax = fig.add_subplot(projection='3d')
   ...: ax.view_init(elev=30, azim=20)
   ...: ax.scatter(X[:, 0], X[:, 1], X[:, 2], marker='o')
   ...: plt.show()
   ...:
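Fitting PCA to the data:
a. n_components=None: keep all components
In [2]: from sklearn.decomposition import PCA
   ...: pca = PCA()
   ...: pca.fit(X)
   ...: print(pca.n_components_)
   ...:
3
After fitting, inspect the variance and the explained variance ratio of the three components: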
In [3]: pca.explained_variance_
Out[3]: array([ 3.78483785, 0.03272285, 0.03201892])
In [4]: pca.explained_variance_ratio_
Out[4]: array([ 0.98318212, 0.00850037, 0.00831751])
b. n_components is an integer M: if M is smaller than the number of features in X, the M components with the largest variance are kept.
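In [5]: from sklearn.decomposition import PCA
   ...: pca = PCA(n_components=2)  # keep 2 components
   ...: pca.fit(X)
   ...: print(pca.explained_variance_)
   ...: print(pca.explained_variance_ratio_)
   ...:
[ 3.78483785 0.03272285]
[ 0.98318212 0.00850037]
c. n_components is a float: PCA keeps the smallest number of leading components whose cumulative explained variance ratio exceeds the threshold n_components. With a threshold of 0.006, the first component alone (ratio 0.983) already exceeds it, so only one component is kept.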
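In [6]: pca = PCA(n_components=0.006)
   ...: pca.fit(X)
   ...: print(pca.explained_variance_)
   ...: print(pca.explained_variance_ratio_)
   ...: print(pca.n_components_)
   ...:
[ 3.78483785]
[ 0.98318212]
1
d. n_components='mle': in the scikit-learn version used here, svd_solver must be set to 'full'. 'arpack' and 'randomized' raise a ValueError, and the default 'auto' fails with a TypeError. (Later scikit-learn releases treat svd_solver='auto' as 'full' when n_components='mle'.)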
In [7]: pca = PCA(n_components='mle',svd_solver='full')
   ...: pca.fit(X)
   ...: print(pca.explained_variance_)
   ...: print(pca.explained_variance_ratio_)
   ...: print(pca.n_components_)
   ...:
[ 3.78483785]
[ 0.98318212]
1
In [8]: pca = PCA(n_components='mle',svd_solver='arpack')
   ...: pca.fit(X)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-b62bafac46ff> in <module>()
      1 pca = PCA(n_components='mle',svd_solver='arpack')
----> 2 pca.fit(X)

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in fit(self, X, y)
    305             Returns the instance itself.
    306         """
--> 307         self._fit(X)
    308         return self
    309

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in _fit(self,X)
    368             return self._fit_full(X, n_components)
    369         elif svd_solver in ['arpack', 'randomized']:
--> 370             return self._fit_truncated(X, n_components, svd_solver)
    371
    372     def _fit_full(self, X, n_components):

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in _fit_truncated(self, X, n_components, svd_solver)
    433             raise ValueError("n_components=%r cannot be a string "
    434                              "with svd_solver='%s'"
--> 435                              % (n_components, svd_solver))
    436         elif not 1 <= n_components <= n_features:
    437             raise ValueError("n_components=%r must be between 1 and "

ValueError: n_components='mle' cannot be a string with svd_solver='arpack'
In [9]: pca = PCA(n_components='mle',svd_solver='randomized')
   ...: pca.fit(X)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-1f9c5b9ac3af> in <module>()
      1 pca = PCA(n_components='mle',svd_solver='randomized')
----> 2 pca.fit(X)

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in fit(self, X, y)
    305             Returns the instance itself.
    306         """
--> 307         self._fit(X)
    308         return self
    309

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in _fit(self,X)
    368             return self._fit_full(X, n_components)
    369         elif svd_solver in ['arpack', 'randomized']:
--> 370             return self._fit_truncated(X, n_components, svd_solver)
    371
    372     def _fit_full(self, X, n_components):

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in _fit_truncated(self, X, n_components, svd_solver)
    433             raise ValueError("n_components=%r cannot be a string "
    434                              "with svd_solver='%s'"
--> 435                              % (n_components, svd_solver))
    436         elif not 1 <= n_components <= n_features:
    437             raise ValueError("n_components=%r must be between 1 and "

ValueError: n_components='mle' cannot be a string with svd_solver='randomized'
In [10]: pca = PCA(n_components='mle')
   ...: pca.fit(X)
   ...:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-92060cf30409> in <module>()
      1 pca = PCA(n_components='mle')
----> 2 pca.fit(X)

d:\softwore\python\lib\site-packages\sklearn\decomposition\pca.py in _fit(self,X)
    358             if max(X.shape) <= 500:
    359                 svd_solver = 'full'
--> 360             elif n_components >= 1 and n_components < .8 * min(X.shape):
    361                 svd_solver = 'randomized'
    362             # This is also the case of n_components in (0,1)

TypeError: unorderable types: str() >= int()
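The values reported in cases a to c can be cross-checked by hand. The sketch below, which assumes the same make_blobs data as the example above, compares explained_variance_ with the eigenvalues of the sample covariance matrix and mimics the float n_components rule with a cumulative sum. The threshold of 0.99 and the names eigvals, ratios, k_manual, and k_sklearn are choices made for this illustration, not values from the original post.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Same synthetic data as in the example above
X, _ = make_blobs(n_samples=10000, n_features=3,
                  centers=[[3, 3, 3], [0, 0, 0], [1, 1, 1], [2, 2, 2]],
                  cluster_std=[0.2, 0.1, 0.2, 0.2], random_state=9)

# Eigenvalues of the sample covariance matrix, sorted in descending order
eigvals = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]
ratios = eigvals / eigvals.sum()

pca = PCA(svd_solver='full').fit(X)
print(eigvals, pca.explained_variance_)
print(ratios, pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio exceeds the threshold
threshold = 0.99  # illustrative value, assumed for this sketch
k_manual = int(np.searchsorted(np.cumsum(ratios), threshold, side='right')) + 1
k_sklearn = PCA(n_components=threshold, svd_solver='full').fit(X).n_components_
print(k_manual, k_sklearn)
```

The eigenvalues should agree with explained_variance_ up to floating-point error, and the hand-computed component count should match the one PCA selects for the same threshold.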