Getting Started with Logistic Regression (LR) in Machine Learning

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 02:52
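Supervised learning focuses on predicting the unknown behavior of things, and generally covers classification and regression problems. Unsupervised learning leans toward analyzing the intrinsic properties of the data itself;
common techniques include dimensionality reduction and clustering.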
The code below is for "benign/malignant breast cancer tumor prediction":
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Load the training and test sets (raw strings keep the backslashes literal).
df_train = pd.read_csv(r'E:\Datasets\Breast-Cancer/breast-cancer-train.csv')
df_test = pd.read_csv(r'E:\Datasets\Breast-Cancer/breast-cancer-test.csv')

# Test rows with Type == 0 and Type == 1, restricted to the two plotted features.
df_test_positive = df_test.loc[df_test['Type'] == 0][['Clump Thickness', 'Cell Size']]
df_test_negative = df_test.loc[df_test['Type'] == 1][['Clump Thickness', 'Cell Size']]

# Fit logistic regression on the two features and report accuracy on the test set.
lr = LogisticRegression()
lr.fit(df_train[['Clump Thickness', 'Cell Size']], df_train['Type'])
print('Score:', lr.score(df_test[['Clump Thickness', 'Cell Size']], df_test['Type']))

intercept = lr.intercept_
# coef_ comes back nested as [[...]], so take row 0 to get a flat vector.
coef = lr.coef_[0, :]

# Decision boundary: coef[0]*x + coef[1]*y + intercept = 0, solved for y.
lx = np.arange(0, 12)
ly = (-intercept - lx * coef[0]) / coef[1]

# Draw the boundary over the test samples.
plt.plot(lx, ly, c='blue')
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=150, c='black')
plt.show()
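The expression `(-intercept - lx*coef[0]) / coef[1]` used for the blue line comes from the logistic regression decision rule. The model outputs probability 0.5 exactly where `coef[0]*x + coef[1]*y + intercept = 0`, and solving that equation for `y` gives the line. The sketch below checks this using only the standard library. The parameter values and the helper names `sigmoid`, `predict_proba` and `boundary_y` are made up for illustration; they are not the values fitted on the breast cancer data.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, y, coef, intercept):
    """Probability of class 1 for the point (x, y) under a two-feature logistic model."""
    return sigmoid(coef[0] * x + coef[1] * y + intercept)

def boundary_y(x, coef, intercept):
    """Solve coef[0]*x + coef[1]*y + intercept = 0 for y (requires coef[1] != 0)."""
    return (-intercept - x * coef[0]) / coef[1]

# Hypothetical parameters, chosen only for illustration.
coef = [0.5, 0.75]
intercept = -4.0

# Every point on the boundary line should get a probability of (almost exactly) 0.5.
boundary_probs = [
    predict_proba(x, boundary_y(x, coef, intercept), coef, intercept)
    for x in range(0, 12)
]
print(boundary_probs)
```

With these hypothetical values `coef[1]` is positive, so points above the line get a probability greater than 0.5 and are assigned to class 1, while points below it are assigned to class 0.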