A Summary of Kmeans Usage in OpenCV
Source: Internet | Published: 大数据探索性分析 | Editor: 程序博客网 | Time: 2024/05/01 09:42
1. K-Means clustering in OpenCV
K-Means is an algorithm to detect clusters in a given set of points. It does this without you supervising or correcting the results. It works with any number of dimensions as well (that is, it works on a plane, 3D space, 4D space and any other finite dimensional spaces). And OpenCV comes with this algorithm built right into it!
K-means with OpenCV’s C++ interface
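The function you need to call to execute the algorithm is:
double kmeans(const Mat& samples, int clusterCount, Mat& labels, TermCriteria termcrit, int attempts, int flags, Mat* centers);
This function is in the cv namespace, so you can call it as cv::kmeans or bring the namespace into scope with using namespace cv. If you know how K-means works, the parameters should be self-explanatory.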
Parameters
- samples: (input) The actual data points that you need to cluster. It should contain exactly one point per row. That is, if you have 50 points in a 2D plane, then you should have a matrix with 50 rows and 2 columns.
- clusterCount: (input) The number of clusters in the data points.
- labels: (output) Returns the cluster each point belongs to. It can also be used to indicate the initial guess for each point.
- termcrit: (input) This is an iterative algorithm, so you need to specify the termination criteria (number of iterations and desired accuracy).
- attempts: (input) The number of times the algorithm is run with different center placements.
- flags: (input) Possible values include:
  - KMEANS_RANDOM_CENTERS: Centers are generated randomly.
  - KMEANS_PP_CENTERS: Uses the kmeans++ center initialization.
  - KMEANS_USE_INITIAL_LABELS: The first attempt uses the supplied labels to calculate centers. Later attempts use random or semi-random centers (use one of the two flags above to choose which).
- centers: (output) This matrix holds the center of each cluster.
Returns
The function returns the compactness of the final clustering. What is compactness? It is a measure of how well the labeling was done. The smaller the better.
When attempts is 1, the value returned is the compactness of the only attempt that was made. If attempts is more than 1, the final labeling returned is the one with the lowest compactness value.
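To make the parameters above more concrete, here is a minimal sketch of a single attempt of the classic k-means (Lloyd) iteration in plain C++. It is not OpenCV's implementation: the name lloydSketch, the Point typedef, and the choice to pass the initial centers in by hand are assumptions made for illustration. It shows where the termination criteria (maxIter and epsilon) come into play and how the returned compactness relates to the labels and centers.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

typedef std::vector<float> Point;  // one sample = one row of the samples matrix

// Squared Euclidean distance between two points of equal dimension.
static double squaredDistance(const Point& a, const Point& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = static_cast<double>(a[d]) - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper, not part of OpenCV: one attempt of Lloyd's iteration.
// centers holds the initial guess on entry and the result on exit.
// Stops after maxIter iterations or once every center moved less than epsilon.
// Returns the sum of squared distances from each sample to its labeled center.
double lloydSketch(const std::vector<Point>& samples,
                   std::vector<Point>& centers,
                   std::vector<int>& labels,
                   int maxIter, double epsilon)
{
    assert(!samples.empty() && !centers.empty());
    const std::size_t dims = samples[0].size();
    labels.assign(samples.size(), 0);

    for (int iter = 0; iter < maxIter; ++iter) {
        // Assignment step: label each sample with its nearest center.
        for (std::size_t i = 0; i < samples.size(); ++i) {
            double best = std::numeric_limits<double>::max();
            for (std::size_t k = 0; k < centers.size(); ++k) {
                const double dist = squaredDistance(samples[i], centers[k]);
                if (dist < best) {
                    best = dist;
                    labels[i] = static_cast<int>(k);
                }
            }
        }

        // Update step: move each center to the mean of its samples.
        std::vector<Point> means(centers.size(), Point(dims, 0.0f));
        std::vector<int> counts(centers.size(), 0);
        for (std::size_t i = 0; i < samples.size(); ++i) {
            Point& mean = means[labels[i]];
            for (std::size_t d = 0; d < dims; ++d) mean[d] += samples[i][d];
            ++counts[labels[i]];
        }

        double maxShift = 0.0;
        for (std::size_t k = 0; k < centers.size(); ++k) {
            if (counts[k] == 0) continue;  // an empty cluster keeps its old center
            for (std::size_t d = 0; d < dims; ++d)
                means[k][d] /= static_cast<float>(counts[k]);
            maxShift = std::max(maxShift, std::sqrt(squaredDistance(means[k], centers[k])));
            centers[k] = means[k];
        }
        if (maxShift < epsilon) break;  // accuracy criterion reached
    }

    // Compactness of the final labeling against the final centers.
    double compactness = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i)
        compactness += squaredDistance(samples[i], centers[labels[i]]);
    return compactness;
}
```

Calling lloydSketch on the 1D samples 1, 2, 10, 11 with initial centers 0 and 5 groups the first two and the last two samples together. A lower return value means the samples sit closer to their centers.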
Reposted from: http://www.aishack.in/2010/08/k-means-clustering-in-opencv/
2. Kmeans clustering in OpenCV with C++
Kmeans clustering is one of the most widely used unsupervised learning algorithms. If you are not sure what Kmeans is, refer to this article. Also, if you have heard the term Vector Quantization, Kmeans is closely related to it (refer to this article to know more about it). Autonlab has a great ppt on Kmeans clustering.
First, I'll talk about kmeans usage in OpenCV with C++, and then I'll explain it with a program. If you are not yet comfortable with OpenCV in C++, please refer to this article; pretty much everything else is the same as in the C API (where you use IplImage*, etc.).
The function in OpenCV's C++ API has the following signature:
double kmeans(const Mat& samples, int clusterCount, Mat& labels, TermCriteria termcrit, int attempts, int flags, Mat* centers);
Parameters explained as follows:
- samples: It contains the data. Each row represents a feature vector, and each column in a row represents a dimension, so the feature vector can have multiple dimensions. For example, if we have 50 feature vectors of 5 dimensions each, this matrix will have 50 rows and 5 columns. One interesting thing I've noticed is that kmeans doesn't work with the CV_64F type, so use CV_32F.
- clusterCount: It should be specified beforehand. We need to know how many clusters to divide the data into. It is an integer.
- labels: It is an output matrix. For an input matrix of the size above (i.e. 50 x 5), the output matrix will be 50 x 1. It tells which cluster each feature vector belongs to. Cluster indices run from 0 to (number of clusters - 1).
- termcrit: It determines the criteria for stopping the algorithm: maximum iterations, accuracy, etc.
- attempts: The number of attempts made with different initial labellings. Also refer to the documentation for more detail on this parameter.
- flags: It can be
  - KMEANS_RANDOM_CENTERS (for random initialization of cluster centers)
  - KMEANS_PP_CENTERS (for the kmeans++ version of initializing cluster centers)
  - KMEANS_USE_INITIAL_LABELS (for user-defined initialization)
- centers: Matrix holding the center of each cluster. If we divide the 50 x 5 feature vectors into 2 clusters, we will have 2 centers, each with 5 dimensions.
#include "opencv2/core/core.hpp"
#include <iostream>

using namespace cv;
using namespace std;

int main( int /*argc*/, char** /*argv*/ )
{
    cout << "\n Usage in C++ API:\n double kmeans(const Mat& samples, int clusterCount, Mat& labels, TermCriteria termcrit, int attempts, int flags, Mat* centers) \n\n\n" << endl;

    // The sizes must be declared before the matrices that use them.
    int clusterCount = 2;
    int dimensions = 5;
    int sampleCount = 50;

    // Every value starts at 10; the samples are CV_32F.
    Mat points(sampleCount, dimensions, CV_32F, Scalar(10));
    Mat labels;
    // One row per cluster center, one column per dimension.
    Mat centers(clusterCount, dimensions, points.type());

    // values of 1st half of data set stay at 10
    // change the values of 2nd half of the data set; i.e. set it to 20
    for(int i = sampleCount / 2; i < points.rows; i++)
    {
        for(int j = 0; j < points.cols; j++)
        {
            points.at<float>(i,j) = 20;
        }
    }

    // Stop after 10 iterations or once the centers move by less than 1.0; 3 attempts.
    kmeans(points, clusterCount, labels, TermCriteria( TermCriteria::EPS + TermCriteria::MAX_ITER, 10, 1.0), 3, KMEANS_PP_CENTERS, centers);

    // we can print the matrix directly.
    cout << "Data: \n" << points << endl;
    cout << "Center: \n" << centers << endl;
    cout << "Labels: \n" << labels << endl;
    return 0;
}
Reposted from: http://www.developerstation.org/2012/01/kmeans-clustering-in-opencv-with-c.html
3. Kmeans
Finds centers of clusters and groups input samples around the clusters.
- C++: double kmeans(InputArray data, int K, InputOutputArray bestLabels, TermCriteria criteria, int attempts, int flags, OutputArray centers=noArray())
- Python: cv2.kmeans(data, K, criteria, attempts, flags[, bestLabels[, centers]]) → retval, bestLabels, centers
- C: int cvKMeans2(const CvArr* samples, int cluster_count, CvArr* labels, CvTermCriteria termcrit, int attempts=1, CvRNG* rng=0, int flags=0, CvArr* _centers=0, double* compactness=0 )
- Python: cv.KMeans2(samples, nclusters, labels, termcrit, attempts=1, flags=0, centers=None) → float
Parameters:
- samples – Floating-point matrix of input samples, one row per sample.
- data – Data for clustering.
- cluster_count – Number of clusters to split the set by.
- K – Number of clusters to split the set by.
- labels – Input/output integer array that stores the cluster indices for every sample.
- criteria – The algorithm termination criteria, that is, the maximum number of iterations and/or the desired accuracy. The accuracy is specified as criteria.epsilon. As soon as each of the cluster centers moves by less than criteria.epsilon on some iteration, the algorithm stops.
- termcrit – The algorithm termination criteria, that is, the maximum number of iterations and/or the desired accuracy.
- attempts – Flag to specify the number of times the algorithm is executed using different initial labellings. The algorithm returns the labels that yield the best compactness (see the last function parameter).
- rng – CvRNG state initialized by RNG().
- flags –
Flag that can take the following values:
- KMEANS_RANDOM_CENTERS Select random initial centers in each attempt.
- KMEANS_PP_CENTERS Use kmeans++ center initialization by Arthur and Vassilvitskii [Arthur2007].
- KMEANS_USE_INITIAL_LABELS During the first (and possibly the only) attempt, use the user-supplied labels instead of computing them from the initial centers. For the second and further attempts, use the random or semi-random centers. Use one of KMEANS_*_CENTERS flag to specify the exact method.
- centers – Output matrix of the cluster centers, one row per each cluster center.
- _centers – Output matrix of the cluster centers, one row per each cluster center.
- compactness – The returned value that is described below.
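The KMEANS_PP_CENTERS flag refers to kmeans++ seeding. As a rough illustration of the idea only, the sketch below picks the first center uniformly at random and each further center with probability proportional to the squared distance to the nearest center already chosen. This is the textbook procedure, not OpenCV's code; the name seedPlusPlus, the Point typedef, and returning sample indices rather than a matrix are assumptions for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

typedef std::vector<float> Point;  // one sample = one row of the samples matrix

// Squared Euclidean distance between two points of equal dimension.
static double squaredDistance(const Point& a, const Point& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = static_cast<double>(a[d]) - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper, not part of OpenCV: returns the indices of the
// k samples chosen as initial centers by textbook kmeans++ seeding.
std::vector<std::size_t> seedPlusPlus(const std::vector<Point>& samples,
                                      std::size_t k, std::mt19937& rng)
{
    assert(k > 0 && k <= samples.size());
    std::vector<std::size_t> chosen;

    // First center: any sample, uniformly at random.
    std::uniform_int_distribution<std::size_t> uniform(0, samples.size() - 1);
    chosen.push_back(uniform(rng));

    // nearest[i] = squared distance from sample i to its closest chosen center.
    std::vector<double> nearest(samples.size(), std::numeric_limits<double>::max());
    while (chosen.size() < k) {
        double total = 0.0;
        for (std::size_t i = 0; i < samples.size(); ++i) {
            nearest[i] = std::min(nearest[i],
                                  squaredDistance(samples[i], samples[chosen.back()]));
            total += nearest[i];
        }
        if (total <= 0.0) {
            // Every sample coincides with a chosen center: fall back to uniform.
            chosen.push_back(uniform(rng));
            continue;
        }
        // Samples far from all current centers are more likely to be picked.
        std::discrete_distribution<std::size_t> weighted(nearest.begin(), nearest.end());
        chosen.push_back(weighted(rng));
    }
    return chosen;
}
```

With a seeded generator such as std::mt19937 rng(42), the returned indices can be used to copy the corresponding samples into the initial centers. Spreading the seeds out this way tends to give better starting points than purely random centers.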
The function kmeans implements a k-means algorithm that finds the centers of cluster_count clusters and groups the input samples around the clusters. As an output, labels_i contains a 0-based cluster index for the sample stored in the i-th row of the samples matrix.
The function returns the compactness measure that is computed as
$$\sum_i \| \texttt{samples}_i - \texttt{centers}_{\texttt{labels}_i} \|^2$$
after every attempt. The best (minimum) value is chosen and the corresponding labels and the compactness value are returned by the function. Basically, you can use only the core of the function, set the number of attempts to 1, initialize labels each time using a custom algorithm, pass them with the (flags = KMEANS_USE_INITIAL_LABELS) flag, and then choose the best (most-compact) clustering.
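The selection step described above can be sketched in plain C++ as follows. The code computes the compactness as the sum of squared distances from each sample to the center of its labeled cluster, then returns the candidate clustering with the smallest value. The names compactnessOf, Clustering, and pickMostCompact are hypothetical and not part of OpenCV; each candidate stands for the result of one single-attempt run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<float> Point;  // one sample = one row of the samples matrix

// Compactness of one clustering: the sum over all samples of the squared
// distance between the sample and the center of the cluster it is labeled with.
double compactnessOf(const std::vector<Point>& samples,
                     const std::vector<int>& labels,
                     const std::vector<Point>& centers)
{
    assert(samples.size() == labels.size());
    double total = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const Point& center = centers.at(static_cast<std::size_t>(labels[i]));
        assert(center.size() == samples[i].size());
        for (std::size_t d = 0; d < center.size(); ++d) {
            const double diff = static_cast<double>(samples[i][d]) - center[d];
            total += diff * diff;
        }
    }
    return total;
}

// One candidate clustering, for example the result of a single attempt.
struct Clustering {
    std::vector<int> labels;
    std::vector<Point> centers;
};

// Returns the index of the candidate with the smallest compactness.
std::size_t pickMostCompact(const std::vector<Point>& samples,
                            const std::vector<Clustering>& candidates)
{
    assert(!candidates.empty());
    std::size_t bestIndex = 0;
    double bestValue = compactnessOf(samples, candidates[0].labels, candidates[0].centers);
    for (std::size_t c = 1; c < candidates.size(); ++c) {
        const double value = compactnessOf(samples, candidates[c].labels, candidates[c].centers);
        if (value < bestValue) {
            bestValue = value;
            bestIndex = c;
        }
    }
    return bestIndex;
}
```

In a real program each Clustering would be filled from the labels and centers that one kmeans call with attempts set to 1 produced.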
Note
- An example on K-means clustering can be found at opencv_source_code/samples/cpp/kmeans.cpp
- (Python) An example on K-means clustering can be found at opencv_source_code/samples/python2/kmeans.py
partition
Splits an element set into equivalency classes.
- C++: template<typename _Tp, class _EqPredicate> int partition(const vector<_Tp>& vec, vector<int>& labels, _EqPredicate predicate=_EqPredicate())
Parameters:
- vec – Set of elements stored as a vector.
- labels – Output vector of labels. It contains as many elements as vec. Each label labels[i] is a 0-based cluster index of vec[i] .
- predicate – Equivalence predicate (pointer to a boolean function of two arguments or an instance of the class that has the method bool operator()(const _Tp& a, const _Tp& b) ). The predicate returns true when the elements are certainly in the same class, and returns false if they may or may not be in the same class.
The generic function partition implements an algorithm for splitting a set of elements into one or more equivalency classes, as described in http://en.wikipedia.org/wiki/Disjoint-set_data_structure . The function returns the number of equivalency classes.
Reposted from: http://docs.opencv.org/modules/core/doc/clustering.html
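The contract described above (labels as output, number of classes as return value) can be illustrated with a small disjoint-set sketch in plain C++. This is a stand-in written for illustration, not OpenCV's implementation: the names partitionSketch and findRoot are hypothetical, and the sketch assumes a symmetric predicate, so it only tests each pair once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Finds the representative (root) of element i in the disjoint-set forest,
// shortening the path along the way (path halving).
static std::size_t findRoot(std::vector<std::size_t>& parent, std::size_t i)
{
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Hypothetical stand-in for illustration; assumes predicate(a, b) == predicate(b, a).
// Fills labels with a 0-based class index per element and returns the class count.
template <typename T, typename EqPredicate>
int partitionSketch(const std::vector<T>& vec, std::vector<int>& labels,
                    EqPredicate predicate)
{
    const std::size_t n = vec.size();
    std::vector<std::size_t> parent(n);
    for (std::size_t i = 0; i < n; ++i) parent[i] = i;  // every element starts alone

    // Merge the sets of every pair the predicate declares equivalent.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            if (!predicate(vec[i], vec[j])) continue;
            const std::size_t rootI = findRoot(parent, i);
            const std::size_t rootJ = findRoot(parent, j);
            if (rootI != rootJ) parent[rootJ] = rootI;
        }
    }

    // Number the roots 0, 1, ... in order of first appearance.
    labels.assign(n, -1);
    std::vector<int> rootLabel(n, -1);
    int classCount = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t root = findRoot(parent, i);
        if (rootLabel[root] < 0) rootLabel[root] = classCount++;
        labels[i] = rootLabel[root];
    }
    return classCount;
}
```

Note that elements end up in the same class transitively: with a predicate that accepts integers differing by at most 1, the values 1, 2 and 3 share one class even though 1 and 3 are not directly equivalent.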