Code for K-Means

Source: Internet. Editor: 程序博客网. Time: 2024/05/01 14:12

K-means

% Initialize centroids
centroids = kMeansInitCentroids(X, K);
for iter = 1:iterations
  % Cluster assignment step: assign each data point to the closest
  % centroid. idx(i) corresponds to c^(i), the index of the centroid
  % assigned to example i.
  idx = findClosestCentroids(X, centroids);

  % Move centroid step: compute means based on centroid assignments.
  centroids = computeCentroids(X, idx, K);
end

Random initialization

function centroids = kMeansInitCentroids(X, K)
%KMEANSINITCENTROIDS This function initializes K centroids that are to be
%used in K-Means on the dataset X
%   centroids = KMEANSINITCENTROIDS(X, K) returns K initial centroids to be
%   used with the K-Means on the dataset X

% You should return these values correctly
centroids = zeros(K, size(X, 2));

% Instructions: You should set centroids to randomly chosen examples from
%               the dataset X

% Shuffle the example indices, then take the first K rows of X
randidx = randperm(size(X, 1));
centroids = X(randidx(1:K), :);

end

Finding closest centroids

function idx = findClosestCentroids(X, centroids)
%FINDCLOSESTCENTROIDS computes the centroid memberships for every example
%   idx = FINDCLOSESTCENTROIDS(X, centroids) returns the closest centroids
%   in idx for a dataset X where each row is a single example. idx = m x 1
%   vector of centroid assignments (i.e. each entry in range [1..K])

% Set K
K = size(centroids, 1);

% You need to return the following variables correctly.
idx = zeros(size(X, 1), 1);

% Instructions: Go over every example, find its closest centroid, and store
%               the index inside idx at the appropriate location.
%               Concretely, idx(i) should contain the index of the centroid
%               closest to example i. Hence, it should be a value in the
%               range 1..K

temp = zeros(K, 1);   % squared distances from one example to each centroid
for i = 1:size(X, 1)
  for j = 1:K
    temp(j) = sum((X(i, :) - centroids(j, :)) .^ 2);
  end
  [~, id] = min(temp);   % id is the index of the nearest centroid
  idx(i) = id;
end

end
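The double loop above can also be written with one loop over the centroids. The following is a minimal sketch of that alternative on a toy dataset; the data values and the variable names `dist` and `diffs` are illustrative assumptions, not part of the original exercise.

```octave
% Toy data: four 2-D examples and two centroids (illustrative values only).
X = [1 1; 1.5 2; 8 8; 9 9.5];
centroids = [1 1; 9 9];

K = size(centroids, 1);
dist = zeros(size(X, 1), K);   % dist(i, j): squared distance from example i to centroid j
for j = 1:K
  diffs = bsxfun(@minus, X, centroids(j, :));   % subtract centroid j from every row of X
  dist(:, j) = sum(diffs .^ 2, 2);              % sum over the feature columns
end

% For each row, the column holding the smallest distance is the assignment.
[~, idx] = min(dist, [], 2);
disp(idx');
```

Here the first two examples fall to centroid 1 and the last two to centroid 2, which is what the loop version returns for the same inputs.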

Computing centroid means

function centroids = computeCentroids(X, idx, K)
%COMPUTECENTROIDS returns the new centroids by computing the means of the
%data points assigned to each centroid.
%   centroids = COMPUTECENTROIDS(X, idx, K) returns the new centroids by
%   computing the means of the data points assigned to each centroid. It is
%   given a dataset X where each row is a single data point, a vector
%   idx of centroid assignments (i.e. each entry in range [1..K]) for each
%   example, and K, the number of centroids. You should return a matrix
%   centroids, where each row of centroids is the mean of the data points
%   assigned to it.

% Useful variables
[m n] = size(X);

% You need to return the following variables correctly.
centroids = zeros(K, n);

% Instructions: Go over every centroid and compute mean of all points that
%               belong to it. Concretely, the row vector centroids(i, :)
%               should contain the mean of the data points assigned to
%               centroid i.

for i = 1:K
  temp = find(idx == i);   % row indices of the points assigned to centroid i
  % Sum along dimension 1 explicitly: with a single assigned point,
  % sum(X(temp, :)) would collapse the row into one scalar.
  % Note: a centroid with no assigned points becomes NaN (0/0).
  centroids(i, :) = sum(X(temp, :), 1) / numel(temp);
end

end

Image compression with K-means
In a straightforward 24-bit color representation of an image, each pixel is represented as three 8-bit unsigned integers (ranging from 0 to 255) that specify the red, green and blue intensity values. This encoding is often referred to as the RGB encoding.
Our image contains thousands of colors, and you will reduce the number of colors to 16.
Specifically, you only need to store the RGB values of the 16 selected colors, and for each pixel in the image you now need to only store the index of the color at that location (where only 4 bits are necessary to represent 16 possibilities).
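To make the saving concrete, the arithmetic can be sketched as follows. The 128 x 128 image size is an assumption chosen for illustration; the text above does not state the dimensions of the image.

```octave
% Assumed image size, for illustration only.
rows = 128;
cols = 128;
numPixels = rows * cols;
K = 16;                                       % number of colors kept

originalBits   = numPixels * 24;              % 3 channels x 8 bits per pixel
paletteBits    = K * 24;                      % RGB values of the 16 selected colors
indexBits      = numPixels * ceil(log2(K));   % 4 bits per pixel for the color index
compressedBits = paletteBits + indexBits;

fprintf('Original:   %d bits\n', originalBits);
fprintf('Compressed: %d bits\n', compressedBits);
fprintf('Ratio:      %.2f\n', originalBits / compressedBits);
```

Under that assumed size, the original takes 393,216 bits and the compressed form 65,920 bits, a reduction by a factor of about 6. The palette itself costs only 384 bits, so the ratio is dominated by the drop from 24 to 4 bits per pixel.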
You will use the K-means algorithm to select the 16 colors that will be used to represent the compressed image. Concretely, you will treat every pixel in the original image as a data example and use the K-means algorithm to find the 16 colors that best group (cluster) the pixels in the 3-dimensional RGB space. Once you have computed the cluster centroids on the image, you will then use the 16 colors to replace the pixels in the original image.
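The procedure just described can be sketched end to end as below. This is a self-contained illustration, not the exercise's own script: it uses a synthetic random image in place of a real photo, inlines the K-means steps instead of calling the functions above, and assumes 10 iterations. The names `sqdist`, `Xrecovered` and `Arecovered` are hypothetical, and an empty cluster simply keeps its previous centroid here.

```octave
% Synthetic 32x32 RGB image with values in [0, 1] (stand-in for a real photo).
A = rand(32, 32, 3);
[h, w, c] = size(A);
X = reshape(A, h * w, c);        % one row per pixel, columns are R, G, B

K = 16;                          % palette size from the text
iterations = 10;                 % assumed iteration count

% Squared Euclidean distance between every pixel and every centroid (m x K),
% using ||x - c||^2 = ||x||^2 + ||c||^2 - 2*x*c'.
sqdist = @(X, C) bsxfun(@plus, sum(X .^ 2, 2), sum(C .^ 2, 2)') - 2 * (X * C');

% Random initialization: K distinct pixels become the starting centroids.
perm = randperm(h * w);
centroids = X(perm(1:K), :);

for iter = 1:iterations
  [~, idx] = min(sqdist(X, centroids), [], 2);   % cluster assignment step
  for j = 1:K                                    % move centroid step
    members = (idx == j);
    if any(members)              % an empty cluster keeps its previous centroid
      centroids(j, :) = mean(X(members, :), 1);
    end
  end
end

% Compressed form: the 16-row palette plus one palette index per pixel.
[~, idx] = min(sqdist(X, centroids), [], 2);

% Reconstruction: replace every pixel by the color of its centroid.
Xrecovered = centroids(idx, :);
Arecovered = reshape(Xrecovered, h, w, c);
```

After this runs, `Arecovered` has the same shape as `A` but contains at most 16 distinct colors. With a real photo, the image would first be loaded and scaled to [0, 1] before the reshape.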