A Brief Overview of Common Unsupervised Dimension Reduction Methods
Unsupervised Dimension Reduction
High-dimensional data is always difficult to handle. On one hand, it requires tremendous computational resources; on the other hand, it is not as easy to interpret as low-dimensional data. Therefore, dimension reduction is one of the key tricks for tackling it.
Linear Dimension Reduction
In order to reduce the dimension of samples, we use a linear transform, such as transforming $x_i \in \mathbb{R}^d$ into $z_i = T x_i \in \mathbb{R}^m$, where $T \in \mathbb{R}^{m \times d}$ and $m < d$.
Before doing that, it is necessary to make sure the average of the training set $\{x_i\}_{i=1}^n$ is zero, i.e. $\frac{1}{n} \sum_{i=1}^n x_i = 0$. If it is not, we simply subtract the mean from every sample.
Principal Component Analysis, PCA
PCA, as you can see in the following contents, is the simplest linear dimension reduction method. Suppose that the rows of the transform matrix are orthonormal, i.e.
$$T T^\top = I_m,$$
where $I_m$ denotes the $m \times m$ identity matrix. Under this constraint, PCA keeps as much of the scatter of the projected samples, $\sum_{i=1}^n \|T x_i\|^2 = \mathrm{tr}(T C T^\top)$, as possible, where $C = X^\top X$ and $X = (x_1, \dots, x_n)^\top$.
In summary, PCA is defined as
$$\max_{T \in \mathbb{R}^{m \times d}} \mathrm{tr}\left(T C T^\top\right) \quad \text{subject to} \quad T T^\top = I_m.$$
Consider the eigenvalues of $C$:
$$C \xi = \lambda \xi.$$
Define the eigenvalues and corresponding eigenvectors as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d \ge 0$ and $\xi_1, \dots, \xi_d$ respectively. Then the solution is given by the $m$ leading eigenvectors,
$$T = (\xi_1, \dots, \xi_m)^\top.$$
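To see why the leading eigenvectors solve this problem, here is a brief sketch of the standard Lagrangian argument for the case $m = 1$ (a reconstruction of the usual textbook step, not part of the original derivation):
$$\mathcal{L}(\xi, \lambda) = \xi^\top C \xi - \lambda \left(\xi^\top \xi - 1\right), \qquad \frac{\partial \mathcal{L}}{\partial \xi} = 2 C \xi - 2 \lambda \xi = 0 \;\Longrightarrow\; C \xi = \lambda \xi.$$
Every stationary point is thus an eigenvector of $C$, and the objective value at a unit eigenvector is $\xi^\top C \xi = \lambda$, so the maximum is attained at the eigenvector with the largest eigenvalue $\lambda_1$.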
Here is a simple example:
n=100;
%x=[2*randn(n,1) randn(n,1)];                        % alternative data set
x=[2*randn(n,1) 2*round(rand(n,1))-1+randn(n,1)/3];  % two horizontal clusters
x=x-repmat(mean(x),[n,1]);                           % center the data
[t,v]=eigs(x'*x,1);                                  % leading eigenvector of the scatter matrix
figure(1); clf; hold on; axis([-6 6 -6 6]);
plot(x(:,1),x(:,2),'rx');                            % samples
plot(9*[-t(1) t(1)], 9*[-t(2) t(2)]);                % projection direction found by PCA
Locality Preserving Projections
In PCA, the structure of clusters in the original training set may be destroyed; locality preserving projections (LPP) is designed to avoid this. It is another linear dimension reduction method.
Define the similarity between $x_i$ and $x_j$ as $W_{i,j} \ge 0$, where $W_{i,j}$ is large for close samples and small for distant ones, and $W_{i,j} = W_{j,i}$. A common choice is the Gaussian similarity
$$W_{i,j} = \exp\left(-\|x_i - x_j\|^2\right),$$
which is the one used in the example code below.
For the purpose of holding the structure of clusters, it is natural to require that similar samples, i.e. pairs with large $W_{i,j}$, stay close after the transform. This leads to minimizing
$$\frac{1}{2} \sum_{i,j=1}^n W_{i,j} \left\|T x_i - T x_j\right\|^2.$$
However, to avoid the trivial solution $T = 0$, we add the constraint
$$T X^\top D X T^\top = I_m,$$
where $X = (x_1, \dots, x_n)^\top$ and $D$ is the diagonal matrix with $D_{i,i} = \sum_{j=1}^n W_{i,j}$.
If we set $L = D - W$, known as the graph Laplacian, the objective above can be rewritten as $\mathrm{tr}\left(T X^\top L X T^\top\right)$.
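The rewriting in terms of $L$ follows from a short calculation, sketched here since it is not spelled out above; it only uses the symmetry of $W$ and the definition of $D$:
$$\frac{1}{2} \sum_{i,j=1}^n W_{i,j} \|T x_i - T x_j\|^2 = \sum_{i,j=1}^n W_{i,j} \|T x_i\|^2 - \sum_{i,j=1}^n W_{i,j} (T x_i)^\top (T x_j) = \mathrm{tr}\left(T X^\top D X T^\top\right) - \mathrm{tr}\left(T X^\top W X T^\top\right) = \mathrm{tr}\left(T X^\top L X T^\top\right).$$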
So how to solve it? Consider the method we used in PCA, now in generalized form. Define the generalized eigenvalues and eigenvectors of
$$X^\top L X \xi = \lambda X^\top D X \xi$$
as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ and $\xi_1, \dots, \xi_d$ respectively. Since the problem is a minimization, the solution is given by the eigenvectors with the smallest generalized eigenvalues:
$$T = (\xi_d, \xi_{d-1}, \dots, \xi_{d-m+1})^\top.$$
n=100;
%x=[2*randn(n,1) randn(n,1)];                        % alternative data set
x=[2*randn(n,1) 2*round(rand(n,1))-1+randn(n,1)/3];  % two horizontal clusters
x=x-repmat(mean(x),[n,1]);                           % center the data
x2=sum(x.^2,2);
W=exp(-(repmat(x2,1,n)+repmat(x2',n,1)-2*x*x'));     % Gaussian similarity matrix
D=diag(sum(W,2)); L=D-W;                             % degree matrix and graph Laplacian
z=x'*D*x; z=(z+z')/2;                                % symmetrize for numerical stability
[t,v]=eigs(x'*L*x,z,1,'sm');                         % smallest generalized eigenvector
figure(1); clf; hold on; axis([-6 6 -6 6]);
plot(x(:,1),x(:,2),'rx');                            % samples
plot(9*[-t(1) t(1)], 9*[-t(2) t(2)]);                % projection direction found by LPP
Kernelized PCA
Let us turn to methods of nonlinear dimension reduction. Due to limited space, we will not analyze them as deeply as the linear ones.
When it comes to nonlinearity, kernel functions are sure to be highlighted. Take the Gaussian kernel function for example:
$$K(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2 h^2}\right),$$
where $h > 0$ is the kernel bandwidth.
Here we will not take the eigenvalues of the scatter matrix $C$, but of the kernel matrix $K \in \mathbb{R}^{n \times n}$, whose entries
$$K_{i,j} = K(x_i, x_j)$$
play the role of inner products between samples in the feature space. However, centralization is necessary:
$$\tilde{K} = H K H,$$
where $H = I_n - \frac{1}{n} \mathbf{1}_{n \times n}$ is the centering matrix, and $\mathbf{1}_{n \times n}$ is the $n \times n$ matrix whose entries are all one. The embedding is then obtained from the leading eigenvalues and eigenvectors of $\tilde{K}$, in the same way as before.
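No example code is given above for kernelized PCA, so here is a minimal sketch in the style of the earlier examples. It reuses the two-cluster data from above and assumes bandwidth $h = 1$; the variable names K, H, Kt, and z are my own choices, not from the original.
n=100;
x=[2*randn(n,1) 2*round(rand(n,1))-1+randn(n,1)/3];  % same two-cluster data as before
x2=sum(x.^2,2);
K=exp(-(repmat(x2,1,n)+repmat(x2',n,1)-2*x*x')/2);   % Gaussian kernel matrix, h = 1
H=eye(n)-ones(n)/n;                                  % centering matrix
Kt=H*K*H; Kt=(Kt+Kt')/2;                             % centered kernel matrix, symmetrized
[t,v]=eigs(Kt,2);                                    % two leading eigenvectors and eigenvalues
z=t*sqrt(v);                                         % embedded samples z_i = sqrt(lambda)*xi, one row per sample
figure(1); clf; hold on;
plot(z(:,1),z(:,2),'rx');                            % data in kernel PCA coordinates
Note that, unlike linear PCA, kernelized PCA only embeds the training samples themselves; embedding a new point requires evaluating the kernel against all training samples.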