Lab Report: Clustering Prediction with CURE

Source: Internet · Editor: 程序博客网 · Date: 2024/06/04 19:47

This week's studio assessment: implement clustering prediction with CURE.
The CURE algorithm (my own understanding):
1. Partition the data into k partitions.
2. Cluster each partition separately, producing an initial set of clusters.
3. From each cluster, choose well-scattered points as its representative points, excluding outliers and points that merge slowly.
4. Shrink the representative points of each newly formed cluster toward the cluster centroid by a fixed fraction.
5. Measure inter-cluster distance by the distance between representative points, sort the pairwise distances, and merge the closest clusters.
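The steps above can be sketched in Python (a minimal illustration of steps 3–5, not the R code below; `n_rep` and the shrink factor `alpha` are hypothetical parameter choices, and numpy is the only dependency):

```python
import numpy as np

def cure_representatives(cluster, n_rep=4, alpha=0.5):
    """Steps 3-4: pick well-scattered points, then shrink them toward the centroid."""
    centroid = cluster.mean(axis=0)
    # Greedy farthest-point selection: start from the point farthest from the centroid.
    reps = [cluster[np.argmax(np.linalg.norm(cluster - centroid, axis=1))]]
    while len(reps) < min(n_rep, len(cluster)):
        # Distance of every point to its nearest already-chosen representative.
        d = np.min([np.linalg.norm(cluster - r, axis=1) for r in reps], axis=0)
        reps.append(cluster[np.argmax(d)])
    # Step 4: shrink each representative toward the centroid by a fraction alpha.
    return [r + alpha * (centroid - r) for r in reps]

def cluster_distance(reps_a, reps_b):
    """Step 5: inter-cluster distance = minimum distance between representative points."""
    return min(np.linalg.norm(a - b) for a in reps_a for b in reps_b)
```

An agglomerative pass would then repeatedly merge the pair of clusters with the smallest `cluster_distance` until the target number of clusters remains.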
library(fpc)
x1 = read.csv("C:/Users/Administrator/Desktop/13.csv", header = FALSE)
View(x1)  # R's data viewer is View(), capitalized
dd = dist(x1, method = "euclidean")  ##### Euclidean distance matrix
t1 = dbscan(dd, eps = 0.42, MinPts = 10, method = "dist")  # dd is a dist object
plot(t1, x1)  # fpc's plot method for dbscan also needs the data
sum = summary(dd)
plot(dd)
hc1 = hclust(dd, method = "single")
hc2 = hclust(dd, method = "complete")
hc3 = hclust(dd, method = "median")
hc4 = hclust(dd, method = "average")
opar = par(mfrow = c(2, 2))
plot(hc1, hang = -1); plot(hc2, hang = -1); plot(hc3, hang = -1); plot(hc4, hang = -1)
par(opar)  ##################################### visualize the different clustering results
class = cutree(hc2, k = 10)  # cut the complete-linkage tree into 10 clusters
ord = order(class)           # renamed to avoid shadowing base::order
x2 = x1[ord, ]
View(x2)
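For comparison, the same dist → hclust → cutree pipeline can be reproduced in Python with scipy (a sketch on synthetic data, since the CSV from the R code is not available here):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
x1 = rng.normal(size=(50, 2))         # stand-in for the CSV data
dd = pdist(x1, metric="euclidean")    # dist(x1, method = "euclidean")
hc2 = linkage(dd, method="complete")  # hclust(dd, method = "complete")
# fcluster with criterion="maxclust" mirrors cutree(hc2, k = 10)
labels = fcluster(hc2, t=10, criterion="maxclust")
print(len(set(labels)))               # number of clusters obtained
```

`pdist` returns the condensed (lower-triangular) distance vector, which is exactly what `linkage` expects, just as `hclust` consumes a `dist` object in R.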

################################################################## extract the ten clusters

L1=as.data.frame(x1[class==1,])
L2=as.data.frame(x1[class==2,])
L3=as.data.frame(x1[class==3,])
L4=as.data.frame(x1[class==4,])
L5=as.data.frame(x1[class==5,])
L6=as.data.frame(x1[class==6,])
L7=as.data.frame(x1[class==7,])
L8=as.data.frame(x1[class==8,])
L9=as.data.frame(x1[class==9,])
L10=as.data.frame(x1[class==10,])

######################################################################## Euclidean distances within each cluster

dd1 = dist(L1, method = "euclidean")
dd2 = dist(L2, method = "euclidean")
dd3 = dist(L3, method = "euclidean")
dd4 = dist(L4, method = "euclidean")
dd5 = dist(L5, method = "euclidean")
dd6 = dist(L6, method = "euclidean")
dd7 = dist(L7, method = "euclidean")
dd8 = dist(L8, method = "euclidean")
dd9 = dist(L9, method = "euclidean")
dd10 = dist(L10, method = "euclidean")
Console output (excerpt): each ddk prints as a lower-triangular distance matrix, with rows labelled by the original row numbers. For cluster 1:

         1       12       14       17
12 39.15354
14 46.20606 47.41308
17 43.78356 37.89459 58.73670
41 29.25748 35.39774 41.31586 35.98611

The matrices for the other clusters print in the same form (the wide matrix for cluster 6 wraps across several column blocks); clusters containing a single point print an empty distance object, shown as dist(0).

############################################################ complete-linkage clustering within each cluster

# note: every call originally reused dd1; each cluster should use its own ddk
# (hclust() will fail on an empty dist from a singleton cluster)
hc1 = hclust(dd1, method = "complete");   plot(hc1, hang = -1)
hc2 = hclust(dd2, method = "complete");   plot(hc2, hang = -1)
hc3 = hclust(dd3, method = "complete");   plot(hc3, hang = -1)
hc4 = hclust(dd4, method = "complete");   plot(hc4, hang = -1)
hc5 = hclust(dd5, method = "complete");   plot(hc5, hang = -1)
hc6 = hclust(dd6, method = "complete");   plot(hc6, hang = -1)
hc7 = hclust(dd7, method = "complete");   plot(hc7, hang = -1)
hc8 = hclust(dd8, method = "complete");   plot(hc8, hang = -1)
hc9 = hclust(dd9, method = "complete");   plot(hc9, hang = -1)
hc10 = hclust(dd10, method = "complete"); plot(hc10, hang = -1)
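The ten near-identical calls can also be expressed as a loop. A Python sketch with scipy (synthetic stand-ins for the data and the 10-way labels, since the CSV is unavailable; singleton clusters are skipped, mirroring the empty dist(0) case):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(2)
x1 = rng.normal(size=(60, 2))
labels = rng.integers(1, 11, size=60)  # stand-in for cutree's 10 cluster labels

subtrees = {}
for k in range(1, 11):
    members = x1[labels == k]
    if len(members) < 2:               # a singleton cluster has no distances
        continue
    # within-cluster complete-linkage tree, mirroring hclust(ddk, method = "complete")
    subtrees[k] = linkage(pdist(members), method="complete")
print(len(subtrees))
```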

plot(hc2, hang = -1)
heatmap(as.matrix(dd), labRow = NA, labCol = NA)  # NA suppresses row/column labels
result = cutree(hc2, k = 3)  # was cutree(model1, ...): model1 was never defined
Still working through the calculations; many parts remain rough.
Open problem: I do not yet understand how to make predictions after clustering.
To be studied further.
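On the open question of prediction: one common approach in the CURE setting is to assign each new point to the cluster whose nearest representative point is closest. A hedged sketch (the `representatives` mapping is hypothetical, standing in for the shrunken representative points from step 4):

```python
import numpy as np

def predict(point, representatives):
    """Assign `point` to the cluster label whose nearest representative is closest.
    `representatives` maps cluster label -> array of representative points."""
    best_label, best_dist = None, float("inf")
    for label, reps in representatives.items():
        d = np.min(np.linalg.norm(reps - point, axis=1))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

reps = {1: np.array([[0.0, 0.0], [1.0, 0.0]]),
        2: np.array([[10.0, 10.0], [9.0, 10.0]])}
print(predict(np.array([0.5, 0.2]), reps))  # nearest representatives belong to cluster 1
```

Using representatives instead of centroids lets prediction respect non-spherical cluster shapes, which is the point of CURE's multiple representative points.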
