A fast clustering tool: K-means on the GPU (GPU K-means)

Source: Internet. Edited by: 程序博客网. Time: 2024/06/08 17:41

Download location of the package

https://github.com/serban/kmeans

Usage

./cuda_main -i Image/color100.txt -n 3 -o

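color100.txt is the sample file to be clustered. Each line is one sample: the first number on the line is an index, and the remaining values are the sample's features, so their count is the feature dimension.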

-n sets the number of cluster centers; here it is 3.

To use it on your own data, generate the input file in this format from your own samples.
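As a rough illustration of producing such an input file, here is a minimal Python sketch. The function name write_kmeans_input, the 1-based index, the single-space separator, and the six-decimal formatting are all assumptions made for this example; the post only states that each line holds an index followed by the feature values.

```python
from typing import Iterable, Sequence


def write_kmeans_input(samples: Iterable[Sequence[float]], path: str) -> int:
    """Write one sample per line: an index first, then the feature values.

    Returns the feature dimension. The 1-based index and the space
    separator are assumptions for illustration, not taken from the package.
    """
    dim = None
    with open(path, "w") as out:
        for index, sample in enumerate(samples, start=1):
            if dim is None:
                dim = len(sample)
            elif len(sample) != dim:
                # Every sample must have the same number of features.
                raise ValueError(
                    f"sample {index} has {len(sample)} values, expected {dim}"
                )
            features = " ".join(f"{float(value):.6f}" for value in sample)
            out.write(f"{index} {features}\n")
    if dim is None:
        raise ValueError("no samples given")
    return dim
```

Calling write_kmeans_input(my_samples, "my_samples.txt") (both names hypothetical) writes the file, which could then be passed to cuda_main with -i. The samples are streamed to disk one at a time so that large sets do not need a second in-memory copy.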

In my experiment, clustering 600,000 samples of 384 dimensions each took less than one minute on a K40 GPU.

===============================================

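If it reports an error related to BLOCK_SHARED_MEM_OPTIMIZATION (block shared memory), do the following:

1) Run make clean
2) Edit this line in the Makefile:
   CFLAGS = $(OPTFLAGS) $(DFLAGS) $(INCFLAGS) -DBLOCK_SHARED_MEM_OPTIMIZATION=1
3) Change the flag to -DBLOCK_SHARED_MEM_OPTIMIZATION=0
4) Run make cuda to compile the executable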
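The Makefile edit can also be scripted. The following is a small Python sketch, not part of the package: the helper name disable_shared_mem_optimization is hypothetical, and it assumes the flag appears in the Makefile exactly as -DBLOCK_SHARED_MEM_OPTIMIZATION=1.

```python
import re

FLAG = "BLOCK_SHARED_MEM_OPTIMIZATION"


def disable_shared_mem_optimization(makefile_text: str) -> str:
    """Return the Makefile text with -DBLOCK_SHARED_MEM_OPTIMIZATION=1 set to 0.

    Text that does not contain the flag is returned unchanged.
    """
    return re.sub(rf"(-D{FLAG}=)1\b", r"\g<1>0", makefile_text)


line = "CFLAGS = $(OPTFLAGS) $(DFLAGS) $(INCFLAGS) -DBLOCK_SHARED_MEM_OPTIMIZATION=1"
print(disable_shared_mem_optimization(line))
```

To apply it, read the Makefile into a string, pass it through the function, and write the result back before running make clean and make cuda.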

===============================================

If it reports an error about invalid CUDA parameters, change this line in cuda_kmeans.cu:

const unsigned int numThreadsPerClusterBlock = 128;

The NVIDIA CUDA Toolkit must be installed on the system. Run deviceQuery, which ships with the CUDA Toolkit samples, to look up the thread limits:

~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release/deviceQuery |grep thread

Output:

Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)

The maximum number of threads per block is 1024, so change the value to 1024:

const unsigned int numThreadsPerClusterBlock = 1024;
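Reading the limit out of the deviceQuery output can be automated. Below is a minimal Python sketch under stated assumptions: the function name max_threads_per_block is hypothetical, the label text is taken from the output shown above, and picking the smallest value when several devices are listed is a cautious choice of this example rather than something the post prescribes.

```python
import re


def max_threads_per_block(device_query_output: str) -> int:
    """Return the smallest 'Maximum number of threads per block' value found."""
    values = [
        int(number)
        for number in re.findall(
            r"Maximum number of threads per block:\s*(\d+)", device_query_output
        )
    ]
    if not values:
        raise ValueError("no 'threads per block' line found")
    # With several GPUs, the smallest limit is valid on all of them.
    return min(values)


sample = """\
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
"""
print(max_threads_per_block(sample))  # prints 1024
```

The returned number is the value to put into numThreadsPerClusterBlock by hand.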

Then recompile cuda_main:

make clean

make cuda

Thanks to the NVIDIA education program for providing the K40 GPU.

0 0