6.2 Naive Bayes Examples

Source: Internet; Editor: 程序博客网; Time: 2024/06/05 16:49

Example 1: Classifying iris flowers with naive Bayes

# 1. Load the data
data("iris")

# 2. Create the training and test sets
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.2.3
set.seed(2005)
index <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_iris <- iris[index, ]
test_iris <- iris[-index, ]

# 3. Build the model
library(e1071)
model_iris <- naiveBayes(Species ~ ., data = train_iris)

# 4. Evaluate the model
summary(model_iris)
##         Length Class  Mode
## apriori 3      table  numeric
## tables  4      -none- list
## levels  3      -none- character
## call    4      -none- call
# Accuracy on the training set
pred <- predict(model_iris, train_iris, type = "class")
mean(pred == train_iris[, 5])
## [1] 0.952381
# 5. Predict on the test set
pred_iris <- predict(model_iris, test_iris, type = "class")
mean(pred_iris == test_iris[, 5])
## [1] 1
table(pred_iris, test_iris[, 5])
##
## pred_iris    setosa versicolor virginica
##   setosa         15          0         0
##   versicolor      0         15         0
##   virginica       0          0        15
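To make clearer what `predict()` does with numeric predictors, the following is a minimal sketch that rebuilds the class posterior by hand from the fitted model's `apriori` and `tables` components and compares it with `predict(..., type = "raw")`. It assumes that e1071 stores a per-class mean and standard deviation for each numeric feature and scores it with a Gaussian density. For brevity the model is fitted on the full `iris` data rather than on the 70% split used above, and the names `fit`, `x`, `log_post`, `manual` and `raw` are illustrative only.

```r
library(e1071)

data("iris")

# Assumption: fit on the full data set to keep the sketch self-contained.
fit <- naiveBayes(Species ~ ., data = iris)

x <- iris[1, 1:4]  # one observation to score

# Start from the log prior of each class (apriori holds class counts).
log_post <- log(as.numeric(fit$apriori) / sum(fit$apriori))

# Add the log Gaussian density of each feature given each class.
for (feature in names(fit$tables)) {
  stats <- fit$tables[[feature]]  # rows: classes; columns: mean, sd
  log_post <- log_post +
    dnorm(x[[feature]], mean = stats[, 1], sd = stats[, 2], log = TRUE)
}

# Normalise to posterior probabilities (subtract the max for stability).
manual <- exp(log_post - max(log_post))
manual <- manual / sum(manual)
names(manual) <- fit$levels

raw <- predict(fit, x, type = "raw")

print(manual)
print(raw)
```

The class with the largest posterior is the one returned by `type = "class"`; the independence assumption shows up as the simple sum of per-feature log densities.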

Example 2: Classifying and predicting with the play-tennis data

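# 1. Load the data
# stringsAsFactors = TRUE keeps the columns as factors on R >= 4.0, matching the output below
data <- read.csv("F:/R/Rworkspace/NB/playingtennis.csv", stringsAsFactors = TRUE)
str(data)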
## 'data.frame':    14 obs. of  6 variables:
##  $ Day        : Factor w/ 14 levels "D1","D10","D11",..: 1 7 8 9 10 11 12 13 14 2 ...
##  $ Outlook    : Factor w/ 3 levels "Overcast","Rain",..: 3 3 1 2 2 2 1 3 3 2 ...
##  $ Temperature: Factor w/ 3 levels "Cool","Hot","Mild": 2 2 2 3 1 1 1 3 1 3 ...
##  $ Humidity   : Factor w/ 2 levels "High","Normal": 1 1 1 1 2 2 2 1 2 2 ...
##  $ Wind       : Factor w/ 2 levels "Strong","Weak": 2 1 2 2 2 1 1 2 2 2 ...
##  $ PlayTennis : Factor w/ 2 levels "No","Yes": 1 1 2 2 2 1 2 1 2 2 ...
summary(data)
##       Day        Outlook  Temperature   Humidity     Wind   PlayTennis
##  D1     :1   Overcast:4   Cool:4      High  :7   Strong:6   No :5
##  D10    :1   Rain    :5   Hot :4      Normal:7   Weak  :8   Yes:9
##  D11    :1   Sunny   :5   Mild:6
##  D12    :1
##  D13    :1
##  D14    :1
##  (Other):8
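# As the summary shows, the Day attribute is of no use for classification or prediction, so it can be dropped.
# 2. Clean the data
dataset <- data[, 2:6]

# 3. Build the model
library(e1071)
model <- naiveBayes(dataset[, 1:4], dataset[, 5])

# 4. Predict
# Note: new_data has no column names here, so predict() cannot match these
# values to the model's predictors by name; the result then appears to
# reflect only the class prior.
new_data <- data.frame("Rain", "Hot", "High", "Strong")
predict(model, new_data)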
## [1] Yes
## Levels: No Yes
new_data <- data.frame("Sunny", "Mild", "Normal", "Weak")
predict(model, new_data)
## [1] Yes
## Levels: No Yes
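Both calls above return `Yes`, which is also the majority class (9 of 14 days). One likely reason is that `predict()` for e1071's naive Bayes matches new data to the predictors by column name and, for categorical predictors, by factor level code, so unnamed columns are ignored. The sketch below shows one way to build new observations that line up with the model. It is written under stated assumptions: the CSV is not available here, so the table is rebuilt inline from the classic 14-row PlayTennis data (consistent with the `str()` and `summary()` output above, but rows 11 to 14 are not visible in this article), and `tennis`, `model_named` and the helper `make_obs` are hypothetical names.

```r
library(e1071)

# ASSUMPTION: the classic 14-row PlayTennis table, rebuilt inline because the
# original CSV file is not available.
tennis <- data.frame(
  Outlook = c("Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
              "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"),
  Temperature = c("Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                  "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"),
  Humidity = c("High", "High", "High", "High", "Normal", "Normal", "Normal",
               "High", "Normal", "Normal", "Normal", "High", "Normal", "High"),
  Wind = c("Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
           "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"),
  PlayTennis = c("No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                 "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"),
  stringsAsFactors = TRUE
)

model_named <- naiveBayes(tennis[, 1:4], tennis[, 5])

# Hypothetical helper: build a one-row data frame whose columns carry the
# predictor names and the same factor levels as the training data.
make_obs <- function(outlook, temperature, humidity, wind, ref = tennis) {
  data.frame(
    Outlook     = factor(outlook,     levels = levels(ref$Outlook)),
    Temperature = factor(temperature, levels = levels(ref$Temperature)),
    Humidity    = factor(humidity,    levels = levels(ref$Humidity)),
    Wind        = factor(wind,        levels = levels(ref$Wind))
  )
}

obs1 <- make_obs("Rain", "Hot", "High", "Strong")
obs2 <- make_obs("Sunny", "Mild", "Normal", "Weak")

pred1 <- predict(model_named, obs1)
pred2 <- predict(model_named, obs2)

print(pred1)
print(pred2)
print(predict(model_named, obs1, type = "raw"))  # posterior probabilities
```

As a hand check under the same assumption, the unnormalised scores for the first observation are 5/14 × 2/5 × 2/5 × 4/5 × 3/5 ≈ 0.0274 for `No` and 9/14 × 3/9 × 2/9 × 3/9 × 3/9 ≈ 0.0053 for `Yes`, so a model that actually uses the four predictors would be expected to favour `No` for a rainy, hot, humid, windy day.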