The differences between k-means, k-medoids, k-median and k-center
Source: Internet. Editor: 程序博客网. Time: 2024/05/17 02:58
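k-means, k-medoids, k-median and k-center: don't get dizzy just yet. All four are clustering algorithms, and the differences between them are explained step by step.
k-means, the best known of the four. Because it is simple and effective, it is usually the first choice for clustering.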
N data items --> k clusters
in each cluster, there is an averaged center (the mean), called the centroid.
Objective: minimize the sum of squared distances from each item to its nearest centroid.
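To make the objective concrete, here is a minimal NumPy sketch of the usual alternation between assigning items to the nearest centroid and recomputing each centroid as a mean. The function name `kmeans`, the random initialisation from data points, and the parameters `n_iter` and `seed` are assumptions made for illustration; they do not come from the original post.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Alternate assignment and mean update; returns (centroids, labels, sse)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data items chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: squared Euclidean distance to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each centroid becomes the mean of its members.
        new = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                new[j] = members.mean(axis=0)
        converged = np.allclose(new, centroids)
        centroids = new
        if converged:
            break
    # Final assignment and objective: sum of squared distances.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    sse = d2.min(axis=1).sum()
    return centroids, labels, sse
```

The returned `sse` is exactly the quantity the objective asks to minimize. The result depends on the initialisation, so in practice the procedure is often run several times with different seeds.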
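An EM-style alternation (Lloyd's algorithm) is the most common and simplest way to implement it; it converges to a local optimum, not necessarily the global one.
k-medoids,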
N data items --> k clusters
in each cluster, there is a medoid, which is a real data item from the data set (not an average!).
Objective: minimize the sum of dissimilarities from each item to its nearest medoid (usually plain distances, although squared distances can also be used).
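The medoid idea can be sketched in the same alternating style as k-means: assign each item to its nearest medoid, then pick as the new medoid the member with the smallest total distance to the other members. This is a sketch of the simple alternation only, not of PAM. The function name `kmedoids`, the optional `init` argument, and the use of Euclidean distance as the dissimilarity are assumptions for illustration.

```python
import numpy as np

def kmedoids(X, k, init=None, n_iter=100, seed=0):
    """Alternating k-medoids; returns (medoid_indices, labels, cost)."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances (assumed dissimilarity).
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Medoids are stored as indices into X, so they are always real data items.
    if init is not None:
        medoids = np.array(init)
    else:
        medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)
        new = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if len(members):
                # New medoid: member with the smallest summed distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new[j] = members[within.argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = D[:, medoids].argmin(axis=1)
    cost = D[np.arange(len(X)), medoids[labels]].sum()
    return medoids, labels, cost
```

Because only the distance matrix `D` is used after it is built, the same loop works with any dissimilarity, which is one practical reason to prefer medoids when an average of items is not meaningful. Like k-means, this alternation can stop at a poor local optimum for an unlucky initialisation.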
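Main implementations: PAM, CLARA, CLARANS, EM-style alternation (like k-means)
PAM: swap-based search over medoid/non-medoid pairs; thorough but very slow, and not guaranteed to be globally optimal
CLARA: runs PAM on samples; efficient, but not globally optimal
CLARANS: randomized search; generally better than CLARA
EM-style alternation: very fast, but not globally optimal
k-median,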
N data items --> k clusters
in each cluster, there is a median (a median! not a mean or a medoid).
Objective: minimize the sum of distances from each item to its nearest median (sum of distances! not sum of squared distances).
k-center,
N data items --> k clusters
in each cluster, there is a cluster center.
Objective: minimize the maximum distance from each item to its nearest cluster center (maximum distance! not sum of distances).
According to (Bradley NIPS1997),
k-median assigns n points in m-dimensional real space to k clusters so that the sum of distances from each point to its nearest center is minimized. The center is a vector in m-dimensional real space and need not be one of the n points. The center of a cluster is computed iteratively as the median vector (the coordinate-wise median) of all points in that cluster.
The k-median algorithm uses the same strategy as k-means to update the centers, but with the 1-norm distance.
In contrast, the k-means algorithm uses squared 2-norm distances to generate cluster centers.
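The update described above can be sketched by taking the k-means loop and changing two things: distances are 1-norm, and the center is the coordinate-wise median instead of the mean. The function name `kmedian` and its parameters are assumptions for illustration, not the code from (Bradley NIPS1997).

```python
import numpy as np

def kmedian(X, k, n_iter=100, seed=0):
    """k-means-style alternation with 1-norm distances and median centers."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data items chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: 1-norm (Manhattan) distance to every center.
        d1 = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = d1.argmin(axis=1)
        # Update step: coordinate-wise median of the members.
        new = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                new[j] = np.median(members, axis=0)
        converged = np.allclose(new, centers)
        centers = new
        if converged:
            break
    d1 = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
    labels = d1.argmin(axis=1)
    cost = d1.min(axis=1).sum()  # sum of 1-norm distances
    return centers, labels, cost
```

For example, with the four items (0,0), (1,0), (2,0), (100,0) and k = 1, the median center is (1.5, 0), which is not one of the data items and is barely moved by the outlier, whereas the mean would be pulled to (25.75, 0).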
According to (Arya STOC2001), the k-median problem is to minimize the average distance from data points to their closest cluster centers. The k-center problem is to minimize the maximum distance from data points to their closest cluster centers; it is the min-max analogue of the k-median problem.
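Since the problems differ mainly in how the per-point distances are aggregated, a small sketch that scores one fixed set of centers under each objective may help. The function name `clustering_costs`, the dictionary keys, and the choice of Euclidean distance are assumptions for illustration; the sum is used for k-median, which differs from the average only by the constant factor n.

```python
import numpy as np

def clustering_costs(X, centers):
    """Score the same centers under the three aggregation rules."""
    # Euclidean distance from every point to its nearest center.
    diff = X[:, None, :] - centers[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return {
        "sum_sq": (d ** 2).sum(),  # k-means style: sum of squared distances
        "sum": d.sum(),            # k-median style: sum of distances
        "max": d.max(),            # k-center style: maximum distance
    }
```

A single far-away point dominates `max` completely, weighs heavily in `sum_sq`, and counts only linearly in `sum`, which is why the three problems can prefer different centers for the same data.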
In a general metric space, the k-median problem is known to be NP-hard. Approximation algorithms for it have been widely studied (Arya STOC2001, Guha JCSS2002).
Reposted from: http://blog.sina.com.cn/s/blog_68db53590100nttp.html