Decision Tree Demo

Source: Internet  Editor: 程序博客网  Date: 2024/06/05 02:10
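# -*- coding: utf-8 -*-
import csv

from sklearn import preprocessing, tree
from sklearn.feature_extraction import DictVectorizer

# Read the CSV file: each row's feature values go into a dict (collected in
# featureList), and each row's class label goes into labelList.
with open(r'F:\data.csv', 'r', newline='') as allElectronicsData:
    reader = csv.reader(allElectronicsData)
    headers = next(reader)
    print(headers)

    featureList = []
    labelList = []
    for row in reader:
        # The last column holds the label.
        labelList.append(row[len(row)-1])
        # Column 0 is skipped; the columns in between are features.
        rowDict = {}
        for i in range(1, len(row)-1):
            rowDict[headers[i]] = row[i]
        featureList.append(rowDict)
print(featureList)

# Turn the feature dicts into a one-hot encoded matrix.
vec = DictVectorizer()
dummyX = vec.fit_transform(featureList).toarray()
# get_feature_names_out() needs scikit-learn >= 1.0;
# older versions use get_feature_names().
featureNames = list(vec.get_feature_names_out())
print("dummyX: " + str(dummyX))
print(featureNames)
print("labelList: " + str(labelList))

# Turn the labels into a 0/1 matrix.
lb = preprocessing.LabelBinarizer()
dummyY = lb.fit_transform(labelList)
print("dummyY: " + str(dummyY))

# Classify with a decision tree.
# criterion='entropy' splits on information gain, the measure ID3 uses;
# scikit-learn's tree itself is an optimized CART, not ID3.
clf = tree.DecisionTreeClassifier(criterion='entropy')
clf = clf.fit(dummyX, dummyY)
print("clf: " + str(clf))

# Visualize the model: export it in Graphviz .dot format.
with open("F:filename.dot", 'w') as f:
    tree.export_graphviz(clf, feature_names=featureNames, out_file=f)

# Take the first sample, change two of its encoded features, and predict.
oneRowX = dummyX[0, :]
print("oneRowX: " + str(oneRowX))

# Copy first, so that editing newRowX does not also change dummyX.
newRowX = oneRowX.copy()
newRowX[0] = 1
newRowX[2] = 0
print("newRowX: " + str(newRowX))

predictedY = clf.predict(newRowX.reshape(1, -1))
print("predictedY: " + str(predictedY))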
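To make the `criterion='entropy'` setting more concrete, here is a minimal sketch of the entropy and information-gain calculation that the criterion is based on. It is plain Python and does not reproduce scikit-learn's internals. The function names `entropy` and `information_gain` and the toy rows are assumptions made up for illustration; they are not part of the script above or of data.csv.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    # Group the labels by the value each row takes for `feature`.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, not taken from data.csv.
rows = [
    {"student": "yes", "income": "high"},
    {"student": "yes", "income": "low"},
    {"student": "no", "income": "high"},
    {"student": "no", "income": "low"},
]
labels = ["yes", "yes", "no", "no"]

gain_student = information_gain(rows, labels, "student")
gain_income = information_gain(rows, labels, "income")
print(gain_student, gain_income)
```

In this toy data `student` separates the two classes perfectly while `income` tells us nothing, so an entropy-based tree would prefer to split on `student`. Note that the script above feeds one-hot encoded columns to scikit-learn, so the real tree makes binary splits on those 0/1 columns rather than multi-way splits on the original categories.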

The data in data.csv is as follows:

[Image: contents of data.csv]
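The layout the script expects from data.csv can be read off the code: a header row, a first column that is skipped (presumably a row id), feature columns in the middle, and the class label in the last column. The sketch below parses a small in-memory CSV of that shape. The column names and rows are hypothetical placeholders, not the actual contents of data.csv.

```python
import csv
import io

# Hypothetical stand-in for data.csv; names and values are made up.
SAMPLE_CSV = """id,color,size,label
1,red,small,no
2,blue,large,yes
3,red,large,yes
"""

reader = csv.reader(io.StringIO(SAMPLE_CSV))
headers = next(reader)

feature_list = []
label_list = []
for row in reader:
    # The last column is the class label.
    label_list.append(row[-1])
    # Skip column 0 (the id) and the last column (the label).
    feature_list.append({headers[i]: row[i] for i in range(1, len(row) - 1)})

print(feature_list)
print(label_list)
```

Any CSV with this shape should work with the script, since the feature names are taken from the header row rather than hard-coded.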
