DMX - SQL Server Data Mining: Decision Trees

Source: Internet. Time: 2024/05/20 06:26

In SQL Server, decision trees are fast and widely applicable: they can be used for classification, regression, and association analysis.

Books Online (BOL) has a detailed tutorial, so the steps are not repeated here.

Here is an example prediction query:

select
  TM.fullname,
  vba!format(PredictProbability([Bike Buyer]), 'Percent') as [Probability]
from
  [TM Decision Tree]
natural prediction join
  openquery([AdventureWorksDW2012],
    'select
       FirstName + '' '' + LastName as FullName,
       DateDiff(yy, BirthDate, GetDate()) as Age,
       Education,
       Gender,
       HouseOwnerFlag as [House Owner Flag],
       MaritalStatus as [Marital Status],
       NumberChildrenAtHome as [Number Children At Home],
       Occupation,
       TotalChildren as [TotalChildren],
       NumberCarsOwned as [Number Cars Owned],
       YearlyIncome as [Yearly Income]
     from ProspectiveBuyer') as TM
where Predict([Bike Buyer]) = 1
order by PredictProbability([Bike Buyer]) desc


Once the models are built, their accuracy needs to be checked, starting with cross-validation.

CALL SystemGetCrossValidationResults([Targeted Mailing],[TM Decision Tree],[TM Naive Bayes],[TM Neural Net],2,0,'Bike Buyer',1,0.5)


 

Then compare the accuracy of the models.

CALL SystemGetAccuracyResults ([Targeted Mailing],[TM Decision Tree],[TM Naive Bayes],[TM Neural Net],3,'Bike Buyer',1,0.5)


ModelName | AttributeName | AttributeState | PartitionIndex | PartitionSize | Test | Measure | Value
--- | --- | --- | --- | --- | --- | --- | ---
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | True Positive | 6828
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | False Positive | 2355
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | True Negative | 6997
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | False Negative | 2304
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Likelihood | Log Score | -0.515976044561631
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Likelihood | Lift | 0.177100303313995
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Likelihood | Root Mean Square Error | 0.281766535304062
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | True Positive | 5591
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | False Positive | 3106
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | True Negative | 6246
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | False Negative | 3541
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Likelihood | Log Score | -0.673703697378885
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Likelihood | Lift | 0.019372650496705
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Likelihood | Root Mean Square Error | 0.295231719425458
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | True Positive | 6165
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | False Positive | 2739
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | True Negative | 6613
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | False Negative | 2967
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Likelihood | Log Score | -0.601339200639234
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Likelihood | Lift | 0.091737147236361
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Likelihood | Root Mean Square Error | 0.350182211614771

 

A brief explanation of the four classification outcomes:

 

 | Null hypothesis (H0) is true | Null hypothesis (H0) is false
--- | --- | ---
Reject null hypothesis | Type I error (false positive) | Correct outcome (true positive)
Fail to reject null hypothesis | Correct outcome (true negative) | Type II error (false negative)

 

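If needed, sensitivity (the true positive rate) and specificity (the true negative rate) can be computed from these counts.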
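As a minimal sketch of that calculation, the Python below derives sensitivity, specificity, and overall accuracy from the four classification counts of each model. The counts are copied by hand from the SystemGetAccuracyResults output above. The dictionary layout and function names are illustrative assumptions, not part of SQL Server or DMX.

```python
# Confusion-matrix counts copied from the SystemGetAccuracyResults output.
# The structure and names here are illustrative, not an SSAS API.
COUNTS = {
    "TM Decision Tree": {"tp": 6828, "fp": 2355, "tn": 6997, "fn": 2304},
    "TM Naive Bayes": {"tp": 5591, "fp": 3106, "tn": 6246, "fn": 3541},
    "TM Neural Net": {"tp": 6165, "fp": 2739, "tn": 6613, "fn": 2967},
}


def sensitivity(c):
    """True positive rate: TP / (TP + FN)."""
    return c["tp"] / (c["tp"] + c["fn"])


def specificity(c):
    """True negative rate: TN / (TN + FP)."""
    return c["tn"] / (c["tn"] + c["fp"])


def accuracy(c):
    """Share of all cases classified correctly: (TP + TN) / total."""
    return (c["tp"] + c["tn"]) / sum(c.values())


if __name__ == "__main__":
    for name, c in COUNTS.items():
        print(
            f"{name}: sensitivity={sensitivity(c):.4f} "
            f"specificity={specificity(c):.4f} accuracy={accuracy(c):.4f}"
        )
```

For each model the four counts add up to the partition size of 18484, which is a quick check that the numbers were copied correctly.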

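A positive lift is good (the higher the better), and a log score closer to 0 is better. Comparing the three models above on these measures, the ranking from best to worst is: decision tree > neural network > Naive Bayes.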
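The ranking can be reproduced with a few lines of Python. The sketch below sorts the models by log score (closest to 0 first). It also checks an observation that is not stated in the source: for these particular results, each lift value matches the model's log score minus the log score of a baseline that always predicts the overall proportion of buyers. The values are rounded copies from the results table, and all names are illustrative assumptions.

```python
import math

# Likelihood measures copied (rounded) from the accuracy results table.
LOG_SCORE = {
    "TM Decision Tree": -0.515976,
    "TM Naive Bayes": -0.673704,
    "TM Neural Net": -0.601339,
}
LIFT = {
    "TM Decision Tree": 0.177100,
    "TM Naive Bayes": 0.019373,
    "TM Neural Net": 0.091737,
}

POSITIVES = 6828 + 2304  # actual bike buyers = true positives + false negatives
TOTAL = 18484            # partition size from the results table


def baseline_log_score(positives, total):
    """Mean log-likelihood of always predicting the marginal buyer rate."""
    p = positives / total
    return p * math.log(p) + (1 - p) * math.log(1 - p)


def rank_by_log_score(scores):
    """Model names ordered from best (closest to 0) to worst."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    base = baseline_log_score(POSITIVES, TOTAL)
    print("ranking:", rank_by_log_score(LOG_SCORE))
    for name in LOG_SCORE:
        # Assumed relationship: lift = log score - baseline log score.
        print(name, round(LOG_SCORE[name] - base, 6), "reported lift:", LIFT[name])
```

Because the baseline is the same for all three models here, ranking by lift and ranking by log score give the same order.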

 
