Clustering Algorithms: Clustering by Local Gravitation


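Clustering by Local Gravitation is a clustering paper recently published in IEEE TCYB (IEEE Transactions on Cybernetics). It proposes a new kind of "density", which the authors call the CE value. CE can stand in for the density value used in density-based clustering and gives better, more flexible clustering.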

The code for this paper is available at:

http://ieeexplore.ieee.org/document/7915751/media

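As in conventional density-based clustering, the CE value of every sample point is computed first and a threshold is set. Points whose CE value exceeds the threshold can be treated as the core points of density-based clustering, and the core points are then connected. With the help of two further quantities, LRF and CO, there are considerably more ways to connect points that belong to the same cluster, and on this basis the authors propose the LGC and CLA clustering algorithms.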

(Figure from: Z. Wang et al., "Clustering by Local Gravitation," IEEE Trans. Cybern., to be published, doi:10.1109/TCYB.2017.2695218)
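The threshold-then-connect idea described above can be sketched in a few lines of base MATLAB. This is only an illustration and not the authors' LGC or CLA code: the `score` below is a stand-in density (inverse mean k-NN distance), not the CE value defined in the paper, LRF and CO are not used, and `k` and `thr` are hypothetical choices.

```matlab
% Sketch only: "score" is a stand-in density (inverse mean k-NN distance),
% NOT the CE value from the paper. k and thr are hypothetical choices.
rng(0);                                        % reproducible toy data
X = [randn(100,2) + 3; randn(100,2) - 3];      % two well-separated blobs
n = size(X, 1);
k = 10;

% Pairwise Euclidean distances (base MATLAB, no toolbox needed).
sq = sum(X.^2, 2);
D  = sqrt(max(bsxfun(@plus, sq, sq') - 2 * (X * X'), 0));
D(1:n+1:end) = 0;                              % exact zeros on the diagonal

[sortedD, order] = sort(D, 2);                 % column 1 is the point itself
nbrs  = order(:, 2:k+1);                       % k nearest neighbors of each point
score = 1 ./ mean(sortedD(:, 2:k+1), 2);       % stand-in for CE

thr    = median(score);                        % hypothetical threshold
isCore = score > thr;                          % "core" points

% Symmetric k-NN graph: i and j are linked if either lists the other.
rows = repmat((1:n)', 1, k);
A = sparse(rows(:), nbrs(:), 1, n, n);
A = (A + A') > 0;

% Breadth-first search over core points only; each component is a cluster.
labels = zeros(n, 1);
nClusters = 0;
for s = find(isCore)'
    if labels(s) ~= 0, continue; end
    nClusters = nClusters + 1;
    labels(s) = nClusters;
    queue = s;
    head  = 1;
    while head <= numel(queue)
        p = queue(head);
        head = head + 1;
        q = find(A(:, p));                     % neighbors of p (column vector)
        q = q(isCore(q) & labels(q) == 0);     % unvisited core neighbors
        labels(q) = nClusters;
        queue = [queue; q];
    end
end

% Attach every non-core point to the cluster of its nearest core point.
coreIdx = find(isCore);
[~, nearest] = min(D(~isCore, coreIdx), [], 2);
labels(~isCore) = labels(coreIdx(nearest));
```

After the script runs, `labels` holds one cluster index per row of `X` and `nClusters` the number of connected groups of core points. How many groups appear depends on `k` and `thr`, which is the kind of parameter sensitivity that CLA tries to reduce.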

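The principle behind LGC and CLA is similar to that of DBSCAN. CLA also borrows the strengths of the clustering algorithm published in Science, "Clustering by fast search and find of density peaks", which reduces the number of parameters that have to be set. The authors state that, with a suitable data structure such as a specialized k-d tree, LGC/CLA runs in O(n log n) time. The code they provide, however, runs in O(n^2 log n), because it does not use a k-d tree for the k-nearest-neighbor search and instead sorts the distances directly. As a result, the provided implementations of LGC and CLA are not very efficient.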
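The difference between the two neighbor-search strategies can be illustrated as follows. This is a minimal sketch, not the authors' code: it assumes the Statistics and Machine Learning Toolbox is installed (for `pdist2` and `knnsearch`), and the data size and `k` are arbitrary.

```matlab
% Sketch only: pdist2 and knnsearch come from the Statistics and Machine
% Learning Toolbox; this is not the authors' code.
rng(0);
X = randn(2000, 2);              % toy 2-D data
k = 10;                          % hypothetical neighborhood size

% (a) Full sort of every row of the n-by-n distance matrix: O(n^2 log n).
tic;
D = pdist2(X, X);
[~, order] = sort(D, 2);
nbrsSort = order(:, 2:k+1);      % column 1 is the point itself
tSort = toc;

% (b) k-d tree search: roughly O(n log n) for fixed k in low dimensions.
tic;
nbrsTree = knnsearch(X, X, 'K', k + 1, 'NSMethod', 'kdtree');
nbrsTree = nbrsTree(:, 2:end);   % drop the query point itself
tTree = toc;

fprintf('sort: %.3f s, k-d tree: %.3f s\n', tSort, tTree);
fprintf('same neighbors: %d\n', isequal(nbrsSort, nbrsTree));
```

On data without tied distances both variants should return the same neighbor lists. The k-d tree advantage is expected mainly for low-dimensional data and tends to fade as the dimension grows.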

The results in the paper "Clustering by Local Gravitation" are easy to reproduce. For the synthetic datasets, for example, the authors provide a script named exp2Demo.m; open it and run the code block by block.

The code for LGC is in "LGC.m" and the code for CLA is in "CLA.m".

If you are unsure how to set the parameters for your own data set, try CLA first.

Here is a simple example:

ds = [randn(200,2)+ones(200,2); randn(200,2)-ones(200,2)];  % two Gaussian blobs centered at (1,1) and (-1,-1)

[idx,cNums]=CLA(ds);

plot_cluster(idx,cNums,ds);

0 0