The Apriori Association Algorithm

Source: Internet  Published by: 淘宝公众号  Editor: 程序博客网  Time: 2024/06/07 03:16

Sample data
a,c,e
b,d
b,c
a,b,c,d
a,b
b,c
a,b
a,b,c,e
a,b,c
a,c,e

setwd("/users/XXX/desktop/R/chapter5/示例程序")

# Matrix is a dependency of arules
library(Matrix)
library(arules)

# Reading the txt file below may fail. If it does, open the file, move the
# cursor to the end of the last line and press Enter, so that the file ends
# with an empty line.
tr <- read.transactions("menu_orders.txt", format = "basket", sep = ",")
summary(tr)

transactions as itemMatrix in sparse format with
 10 rows (elements/itemsets/transactions) and
 5 columns (items) and a density of 0.54

most frequent items:    # frequency of each item
      b       a       c       e       d (Other)
      8       7       7       3       2       0

element (itemset/transaction) length distribution:
sizes
2 3 4
5 3 2

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    2.0     2.0     2.5     2.7     3.0     4.0

includes extended item information - examples:
  labels
1      a
2      b
3      c

inspect(tr)

     items
[1]  {a,c,e}
[2]  {b,d}
[3]  {b,c}
[4]  {a,b,c,d}
[5]  {a,b}
[6]  {b,c}
[7]  {a,b}
[8]  {a,b,c,e}
[9]  {a,b,c}
[10] {a,c,e}

# Minimum support 0.2, minimum confidence 0.5
rules0 <- apriori(tr, parameter = list(support = 0.2, confidence = 0.5))
rules0

set of 18 rules

inspect(rules0)

     lhs      rhs support confidence lift
[1]  {}    => {c} 0.7     0.7000000  1.0000000
[2]  {}    => {b} 0.8     0.8000000  1.0000000
[3]  {}    => {a} 0.7     0.7000000  1.0000000
[4]  {d}   => {b} 0.2     1.0000000  1.2500000
[5]  {e}   => {c} 0.3     1.0000000  1.4285714
[6]  {e}   => {a} 0.3     1.0000000  1.4285714
[7]  {c}   => {b} 0.5     0.7142857  0.8928571
[8]  {b}   => {c} 0.5     0.6250000  0.8928571
[9]  {c}   => {a} 0.5     0.7142857  1.0204082
[10] {a}   => {c} 0.5     0.7142857  1.0204082
[11] {b}   => {a} 0.5     0.6250000  0.8928571
[12] {a}   => {b} 0.5     0.7142857  0.8928571
[13] {c,e} => {a} 0.3     1.0000000  1.4285714
[14] {a,e} => {c} 0.3     1.0000000  1.4285714
[15] {a,c} => {e} 0.3     0.6000000  2.0000000
[16] {b,c} => {a} 0.3     0.6000000  0.8571429
[17] {a,c} => {b} 0.3     0.6000000  0.7500000
[18] {a,b} => {c} 0.3     0.6000000  0.8571429
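
To see where the support, confidence and lift columns come from, here is a small sketch in base R that recomputes them by hand for rule [15], {a,c} => {e}. The helper name support() and the variable names are my own and are not part of arules; the baskets are just the ten sample rows typed in again.

```r
# The ten sample baskets from the top of the post, one character vector each.
baskets <- list(
  c("a", "c", "e"), c("b", "d"), c("b", "c"), c("a", "b", "c", "d"),
  c("a", "b"), c("b", "c"), c("a", "b"), c("a", "b", "c", "e"),
  c("a", "b", "c"), c("a", "c", "e")
)

# Hypothetical helper: fraction of baskets that contain every item in `items`.
support <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- c("a", "c")
rhs <- "e"

rule_support    <- support(c(lhs, rhs))            # share of baskets with a, c and e
rule_confidence <- rule_support / support(lhs)     # of the baskets with a and c, share that also have e
rule_lift       <- rule_confidence / support(rhs)  # confidence relative to how common e is overall

print(c(support = rule_support, confidence = rule_confidence, lift = rule_lift))
```

This should give 0.3, 0.6 and 2, the same values as row [15]. A lift above 1 means the right-hand side shows up more often together with the left-hand side than it does in the data overall; a lift below 1 (for example {a,c} => {b}) means less often.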

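This is useful in practice. For example, last time I segmented news headlines into words and then needed the degree of association between words; Apriori can be used for exactly that.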
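
A minimal sketch of that headline use case, assuming the headlines have already been segmented into words. The headlines, the variable names and the thresholds here are all made up for illustration and are not my real data; each headline is simply treated as one basket of words.

```r
library(arules)

# Hypothetical, already-segmented headlines: one character vector of words each.
headlines <- list(
  c("stock", "market", "rises"),
  c("stock", "market", "falls"),
  c("central", "bank", "rates"),
  c("stock", "bank", "rates"),
  c("market", "rises", "rates")
)

# Convert the list of word vectors into an arules transactions object.
tr_words <- as(headlines, "transactions")

# minlen = 2 drops rules with an empty left-hand side, such as {} => {c} above.
word_rules <- apriori(
  tr_words,
  parameter = list(support = 0.3, confidence = 0.8, minlen = 2)
)

# List the word associations, strongest lift first.
inspect(sort(word_rules, by = "lift"))
```

On real headlines the vocabulary is much larger, so the support threshold would probably need to be far lower than in this toy example.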
