The KNN Algorithm


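The KNN algorithm, also known as the k-nearest neighbors algorithm, is used for classification (it is not a clustering method, since it relies on samples whose classes are already known). The basic idea is to find the k labeled samples closest to the element to be classified, then assign that element the class that occurs most often among those k samples. To borrow a popular analogy: if you meet a stranger and do not know whether he is a good or a bad person, pick his 5 closest friends, whose character is already known. If most of the 5 are good people, conclude that the stranger is good; otherwise conclude that he is bad. The function and an example run follow.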
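The voting step in the analogy can be sketched in two lines. This is only an illustration: the label values (1 for good, 0 for bad) and the variable names `friendLabels` and `verdict` are assumptions, not part of the original function.

```matlab
% Hypothetical labels of the stranger's 5 closest friends: 1 = good, 0 = bad
friendLabels = [1 1 0 1 0];

% Majority vote: the most frequent label among the k = 5 neighbors
verdict = mode(friendLabels);
```

Here three of the five friends are labeled 1, so `verdict` is 1 and the stranger is judged to be a good person.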

1. Function definition

function [output] = KNN(input, train, k)
% KNN  Classify each row of input by majority vote among its k nearest
%      training samples.
%   input  : n-by-d matrix, one row vector to classify per row
%   train  : N-by-(d+1) matrix of already classified samples
%            N  : number of classified row vectors in train
%            d  : the first d columns of train are the features
%            +1 : the last column of train is the class label
%   k      : number of nearest neighbors that take part in the vote
%   output : n-by-1 column vector of predicted class labels

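[N, dPlus] = size(train);   % dPlus = d + 1
[n, d] = size(input);
Dis = zeros(n, N);

% Build the n-by-N matrix Dis of squared Euclidean distances.
% Squaring preserves the ordering, so no square root is needed.
for i = 1:n
    for j = 1:N
        delta = input(i,:) - train(j,1:d);
        Dis(i,j) = delta * delta';
    end
end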

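% Find the k smallest distances in each row and store the class labels
% of those samples in the n-by-k matrix Dk (assumes k <= N)
Dk = zeros(n, k);
for i = 1:n
    for j = 1:k
        [~, pos] = min(Dis(i,:));      % nearest remaining sample (first one on ties)
        Dk(i,j) = train(pos, dPlus);   % record its class label
        Dis(i,pos) = Inf;              % mask it so it cannot be picked again
    end
end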

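% Assign each input row the most frequent class among its k nearest neighbors
output = zeros(n, 1);
for i = 1:n
    % mode returns the most frequent value (the smallest one on ties);
    % it is a core function, so no toolbox is required
    output(i,1) = mode(Dk(i,:));
end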
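The same three steps (distance matrix, k nearest, majority vote) can also be written without explicit loops. The following is a minimal sketch rather than the function above: the toy data and the names `query`, `feat`, `labels`, `order`, `nearest` and `predicted` are assumptions made for illustration, and `bsxfun` is used so that the snippet does not depend on implicit expansion.

```matlab
% Hypothetical toy data: 4 labeled samples (last column = class) and 2 queries
train = [0 0 1; 0 1 1; 5 5 2; 6 5 2];
query = [0.2 0.1; 5.5 5.2];
k = 3;

feat   = train(:, 1:end-1);   % N-by-d features
labels = train(:, end);       % N-by-1 class labels

% Squared Euclidean distances via |a-b|^2 = |a|^2 + |b|^2 - 2*a*b'
D = bsxfun(@plus, sum(query.^2, 2), sum(feat.^2, 2)') - 2 * query * feat';

% Sort each row in ascending order and keep the indices of the k nearest samples
[~, order] = sort(D, 2);

% Look up the labels of those samples; reshape keeps the result n-by-k
% even when there is only a single query row
nearest = reshape(labels(order(:, 1:k)), [], k);

% Majority vote along each row
predicted = mode(nearest, 2);
```

For this toy data the first query is closest to the two class 1 samples and the second to the two class 2 samples, so `predicted` is `[1; 2]`. Sorting the whole row costs more than k passes of `min` when N is large and k is small, but it is shorter and cannot select the same sample twice.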

2. Running the function

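% Three classes of training points on the lines y = x, y = 2x and y = 3x
x1=0.1:0.1:1;
y1=x1;
x2=1.1:0.1:2;
y2=2*x2;
x3=2.1:0.1:3;
y3=3*x3;
x=[x1';x2';x3'];
y=[y1';y2';y3'];
% Class labels 1, 2 and 3, one per training point
c1=ones(size(x1'));
c2=2*ones(size(x2'));
c3=3*ones(size(x3'));
c=[c1;c2;c3];
train=[x,y,c];   % N-by-3: two feature columns plus the class label
figure
plot(x1,y1,'b',x2,y2,'r',x3,y3,'g');
% Classify the two query points (-2,0) and (2,0) using the 6 nearest neighbors
input=[-2,0;2,0];
[output]=KNN(input,train,6);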
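To see why a query point receives a given class, it can help to list the labels of its nearest neighbors directly. The sketch below rebuilds the same training set so that it runs on its own and inspects the second query point, (2,0); the names `query`, `d2`, `order` and `neighborLabels` are assumptions made for illustration.

```matlab
% Rebuild the training set from section 2 so this snippet runs on its own
x1 = 0.1:0.1:1;  x2 = 1.1:0.1:2;  x3 = 2.1:0.1:3;
train = [x1',   x1',   ones(numel(x1),1);
         x2', 2*x2', 2*ones(numel(x2),1);
         x3', 3*x3', 3*ones(numel(x3),1)];

query = [2, 0];   % second query point from the script above
k = 6;

% Squared Euclidean distance from the query to every training sample
d2 = sum(bsxfun(@minus, train(:,1:2), query).^2, 2);

% Labels of the k nearest samples, nearest first
[~, order] = sort(d2);
neighborLabels = train(order(1:k), 3)';
disp(neighborLabels)
```

With this data all six labels are 1: the nearest class 1 points lie at squared distances between 2 and 2.5, while the closest class 2 point, (1.1,2.2), is at 5.65. The point (2,0) is therefore assigned to class 1.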

0 0