The LLE Algorithm Explained


Input X: D by N matrix consisting of N data items in D dimensions.
Output Y: d by N matrix consisting of d < D dimensional embedding coordinates for the input points.

  1. Find neighbours in X space [b,c].
    for i=1:N
      compute the distance from Xi to every other point Xj
      find the K smallest distances
      assign the corresponding points to be neighbours of Xi
    end
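Step 1 can be sketched in NumPy (Python, rather than the MATLAB used later on this page). The helper name `find_neighbours` and the toy data are made up for illustration; as in the MATLAB code below, the squared pairwise distances are computed via the expansion |xi - xj|^2 = |xi|^2 + |xj|^2 - 2*xi'xj:

```python
import numpy as np

def find_neighbours(X, K):
    """Indices of the K nearest neighbours of each column of X.

    X is D x N (columns are data points); plain Euclidean distance,
    as in step 1. Assumes no duplicate points, so each point's own
    zero distance is the unique smallest and can be skipped.
    """
    X2 = np.sum(X**2, axis=0)                       # |xi|^2 for each column
    dist2 = X2[:, None] + X2[None, :] - 2.0 * X.T @ X
    order = np.argsort(dist2, axis=1)
    return order[:, 1:K+1]                          # N x K; column 0 is the point itself

# Hypothetical toy data: two close pairs on a line
X = np.array([[0.0, 0.1, 1.0, 1.1],
              [0.0, 0.0, 0.0, 0.0]])
print(find_neighbours(X, 1))   # each point's single nearest neighbour
```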

  2. Solve for reconstruction weights W.
    for i=1:N
      create matrix Z consisting of all neighbours of Xi [d]
      subtract Xi from every column of Z
      compute the local covariance C = Z'*Z [e]
      solve linear system C*w = 1 for w [f]
      set Wij = 0 if j is not a neighbour of i
      set the remaining elements in the ith row of W equal to w/sum(w)
    end
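A minimal NumPy sketch of step 2, again with hypothetical names and toy data. Unlike the MATLAB code below, this sketch applies the note-[e] regularization unconditionally for simplicity (it is only strictly needed when K > D):

```python
import numpy as np

def reconstruction_weights(X, neighbours, reg=1e-3):
    """Weights that best reconstruct each Xi from its K neighbours.

    X is D x N; neighbours is N x K (indices from step 1).
    Regularization follows note [e]: C += reg * trace(C) * I.
    """
    D, N = X.shape
    K = neighbours.shape[1]
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[:, neighbours[i]] - X[:, [i]]     # shift Xi to the origin
        C = Z.T @ Z                             # local covariance, K x K
        C = C + reg * np.trace(C) * np.eye(K)   # regularize (note [e])
        w = np.linalg.solve(C, np.ones(K))      # solve C*w = 1
        W[i, neighbours[i]] = w / w.sum()       # enforce sum(w) = 1
    return W

# Hypothetical usage: p1 is the midpoint of p0 and p2,
# so it should be reconstructed with weights ~0.5 each
X = np.array([[0.0, 1.0, 2.0],
              [0.0, 0.0, 0.0]])
neighbours = np.array([[1, 2], [0, 2], [0, 1]])
W = reconstruction_weights(X, neighbours)
print(W[1])   # ~[0.5, 0, 0.5]
```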

  3. Compute embedding coordinates Y using weights W.
    create sparse matrix M = (I-W)'*(I-W)
    find bottom d+1 eigenvectors of M
      (corresponding to the d+1 smallest eigenvalues)
    set the qth ROW of Y to be the (q+1)st smallest eigenvector
      (discard the bottom eigenvector [1,1,1,...] with eigenvalue zero)
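Step 3 can be sketched as follows, using a dense eigendecomposition for clarity (the MATLAB implementation on this page uses sparse `eigs` for efficiency, and additionally scales by sqrt(N)). The function name `embed` is hypothetical:

```python
import numpy as np

def embed(W, d):
    """Embedding coordinates from the bottom eigenvectors of M.

    W is the N x N weight matrix from step 2; returns Y as d x N.
    Because each row of W sums to 1, (I-W)*1 = 0, so the constant
    vector is an eigenvector of M with eigenvalue 0; it is discarded.
    """
    N = W.shape[0]
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)    # eigenvalues in ascending order
    return eigvecs[:, 1:d+1].T              # skip the bottom (constant) eigenvector

# Hypothetical usage with a random row-stochastic W (rows sum to 1)
rng = np.random.default_rng(0)
W = rng.random((5, 5))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)
Y = embed(W, 2)
print(Y.shape)   # (2, 5)
```

Since the eigenvectors of the symmetric matrix M are orthogonal, the retained coordinates are automatically orthogonal to the discarded constant eigenvector, i.e. each embedding dimension has zero mean.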

Notes

[a] Notation:
    Xi and Yi denote the ith columns of X and Y
      (in other words, the data and embedding coordinates of the ith point)
    M' denotes the transpose of matrix M
    * denotes matrix multiplication
      (e.g. M'*M is the matrix product of M left-multiplied by its transpose)
    I is the identity matrix
    1 is a column vector of all ones
[b] This can be done in a variety of ways; for example, above we compute
    the K nearest neighbours using Euclidean distance. Other methods, such
    as the epsilon-ball rule, include all points within a certain radius;
    more sophisticated domain-specific and/or adaptive local distance
    metrics are also possible.
[c] Even for simple neighbourhood rules like K-NN or epsilon-ball using
    Euclidean distance, there are highly efficient techniques for computing
    the neighbours of every point, such as KD-trees.
[d] Z consists of all columns of X corresponding to the neighbours of Xi,
    but not Xi itself.
[e] If K > D, the local covariance will not be full rank, and it should be
    regularized by setting C = C + eps*I, where I is the identity matrix and
    eps is a small constant of order 1e-3*trace(C). This ensures that the
    system to be solved in step 2 has a unique solution.
[f] 1 denotes a column vector of all ones.
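The rank deficiency described in note [e] is easy to demonstrate numerically: when K > D, the K x K matrix C = Z'*Z has rank at most D and cannot be inverted without regularization. A small NumPy check (random toy data, not from the original page):

```python
import numpy as np

# D=2 dimensions but K=5 neighbours: C = Z'Z is 5x5 with rank at most 2
rng = np.random.default_rng(1)
Z = rng.standard_normal((2, 5))
C = Z.T @ Z
print(np.linalg.matrix_rank(C))          # rank is at most D = 2, so C is singular

# Note [e]'s regularization makes C positive definite, so C*w = 1 is solvable
C_reg = C + 1e-3 * np.trace(C) * np.eye(5)
w = np.linalg.solve(C_reg, np.ones(5))
```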
% LLE ALGORITHM (using K nearest neighbors)
%
% [Y] = lle(X,K,dmax)
%
% X = data as D x N matrix (D = dimensionality, N = #points)
% K = number of neighbors
% dmax = max embedding dimensionality
% Y = embedding as dmax x N matrix

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [Y] = lle(X,K,d)

[D,N] = size(X);
fprintf(1,'LLE running on %d points in %d dimensions\n',N,D);

% STEP 1: COMPUTE PAIRWISE DISTANCES & FIND NEIGHBORS
fprintf(1,'-->Finding %d nearest neighbours.\n',K);
X2 = sum(X.^2,1);
distance = repmat(X2,N,1)+repmat(X2',1,N)-2*X'*X;
[sorted,index] = sort(distance);
neighborhood = index(2:(1+K),:);

% STEP 2: SOLVE FOR RECONSTRUCTION WEIGHTS
fprintf(1,'-->Solving for reconstruction weights.\n');
if(K>D)
  fprintf(1,'   [note: K>D; regularization will be used]\n');
  tol=1e-3; % regularizer in case constrained fits are ill conditioned
else
  tol=0;
end

W = zeros(K,N);
for ii=1:N
   z = X(:,neighborhood(:,ii))-repmat(X(:,ii),1,K); % shift ith pt to origin
   C = z'*z;                                        % local covariance
   C = C + eye(K,K)*tol*trace(C);                   % regularization (K>D)
   W(:,ii) = C\ones(K,1);                           % solve Cw=1
   W(:,ii) = W(:,ii)/sum(W(:,ii));                  % enforce sum(w)=1
end;

% STEP 3: COMPUTE EMBEDDING FROM EIGENVECTS OF COST MATRIX M=(I-W)'(I-W)
fprintf(1,'-->Computing embedding.\n');
% M=eye(N,N); % use a sparse matrix with storage for 4KN nonzero elements
M = sparse(1:N,1:N,ones(1,N),N,N,4*K*N);
for ii=1:N
   w = W(:,ii);
   jj = neighborhood(:,ii);
   M(ii,jj) = M(ii,jj) - w';
   M(jj,ii) = M(jj,ii) - w;
   M(jj,jj) = M(jj,jj) + w*w';
end;

% CALCULATION OF EMBEDDING
options.disp = 0; options.isreal = 1; options.issym = 1;
[Y,eigenvals] = eigs(M,d+1,0,options);
Y = Y(:,2:d+1)'*sqrt(N); % bottom evect is [1,1,1,1...] with eval 0

fprintf(1,'Done.\n');

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% other possible regularizers for K>D
%   C = C + tol*diag(diag(C));                       % regularization
%   C = C + eye(K,K)*tol*trace(C)*K;                 % regularization

