Python Machine Learning in Action: A Comprehensive Python Basics Exercise
Source: Internet  Editor: 程序博客网  Date: 2024/06/07 04:50
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Load the data
df_train = pd.read_csv('breast-cancer-train.csv')
df_test = pd.read_csv('breast-cancer-test.csv')
# print(df_train.info())
# print(df_test.info())

# From the rows whose Type equals 0, take the Clump Thickness and Cell Size columns (a bit of a mouthful)
df_test_negative = df_test.loc[df_test['Type'] == 0][['Clump Thickness', 'Cell Size']]
# From the rows whose Type equals 1, take the Clump Thickness and Cell Size columns
df_test_positive = df_test.loc[df_test['Type'] == 1][['Clump Thickness', 'Cell Size']]
# print(df_test_negative)
# print(df_test_positive)

# Figure 1: scatter plot of the negative test samples
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()

# Figure 2: a straight line in the plane with random intercept and coefficients
intercept = np.random.random([1])
coef = np.random.random([2])
lx = np.arange(0, 12)
ly = (-intercept - lx * coef[0]) / coef[1]
plt.plot(lx, ly, c='yellow')
plt.show()

# Figure 3: scatter plot of the positive and negative test samples
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=150, c='black')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()

# Figure 4: linear classifier trained on the first ten training rows
lr = LogisticRegression()
lr.fit(df_train[['Clump Thickness', 'Cell Size']][:10], df_train['Type'][:10])
# print(df_train[['Clump Thickness', 'Cell Size']][:10])
# print(df_train['Type'][:10])
print('Testing accuracy (10 training samples):', lr.score(df_test[['Clump Thickness', 'Cell Size']], df_test['Type']))
intercept = lr.intercept_
coef = lr.coef_[0, :]
ly = (-intercept - lx * coef[0]) / coef[1]
plt.plot(lx, ly, c='green')
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=200, c='black')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()

# Figure 5: linear classifier trained on all training rows
lr = LogisticRegression()
lr.fit(df_train[['Clump Thickness', 'Cell Size']], df_train['Type'])
print('Testing accuracy (all training samples):', lr.score(df_test[['Clump Thickness', 'Cell Size']], df_test['Type']))
intercept = lr.intercept_
coef = lr.coef_[0, :]
ly = (-intercept - lx * coef[0]) / coef[1]
plt.plot(lx, ly, c='blue')
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=200, c='black')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()
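The expression used for `ly` in the script can be read as the decision boundary of a two-feature linear classifier. A logistic regression with intercept b and coefficients (w1, w2) predicts class 1 when b + w1*x1 + w2*x2 > 0, so the boundary is the set of points where that score equals zero; solving for x2 gives x2 = (-b - w1*x1) / w2. The sketch below checks this on made-up parameters. The helper name `boundary_y` and the numbers are assumptions for illustration, not part of the original post.

```python
import numpy as np

def boundary_y(intercept, coef, x):
    """Return x2 such that intercept + coef[0]*x + coef[1]*x2 == 0."""
    b = float(np.ravel(intercept)[0])  # accepts a scalar or a length-1 array like lr.intercept_
    w1, w2 = float(coef[0]), float(coef[1])
    if w2 == 0.0:
        raise ValueError("coef[1] is zero: the boundary is vertical and has no x2 = f(x1) form")
    return (-b - np.asarray(x, dtype=float) * w1) / w2

# Made-up parameters, for illustration only.
b = -4.0
w = np.array([0.5, 1.0])
xs = np.arange(0, 12)
ys = boundary_y(b, w, xs)

# Every (x, y) pair on the line should have a linear score of zero.
scores = b + w[0] * xs + w[1] * ys
print(np.allclose(scores, 0.0))
```

With a fitted model, the same helper could be called as `boundary_y(lr.intercept_, lr.coef_[0], lx)`, which mirrors what the script computes inline.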
Output:
Testing accuracy (10 training samples): 0.868571428571
Testing accuracy (all training samples): 0.937142857143
Result plots: Figures 1 to 5, shown one after another by the plt.show() calls.
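For a scikit-learn classifier, `score` returns the mean accuracy, that is, the fraction of test rows whose predicted label equals the true label. The sketch below spells that out with a hypothetical helper `accuracy` on made-up labels. It also includes an arithmetic check: the two reported values are consistent with 152 and 164 correct predictions out of 175 test rows. The figure of 175 rows is an assumption inferred from the printed numbers, not something stated in the post.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(y_true == y_pred))

# Made-up labels, for illustration only: 4 of the 5 predictions match.
toy_true = [0, 0, 1, 1, 0]
toy_pred = [0, 1, 1, 1, 0]
toy_acc = accuracy(toy_true, toy_pred)
print(toy_acc)

# Arithmetic check against the reported values, ASSUMING a 175-row test set.
acc_10 = 152 / 175
acc_all = 164 / 175
print(round(acc_10, 12), round(acc_all, 12))
```

Under that assumption, training on all rows instead of the first ten corrects 12 additional test predictions.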
Training and test data used in this post: http://pan.baidu.com/s/1c30cDS (password: h8a0)
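Without the CSV files at hand, the row and column selection that the script applies to `df_test` can still be tried on a small stand-in frame. The rows below are hand-made for illustration and are not values from the real data; only the three column names are taken from the script. The sketch also shows that the chained form used in the post and a single `.loc[rows, columns]` call select the same data.

```python
import pandas as pd

# Hand-made rows for illustration only; NOT values from the real CSV files.
toy = pd.DataFrame({
    'Clump Thickness': [1, 8, 3, 10],
    'Cell Size': [1, 7, 2, 9],
    'Type': [0, 1, 0, 1],
})
cols = ['Clump Thickness', 'Cell Size']

# Chained form, as in the script: filter rows first, then pick columns.
chained = toy.loc[toy['Type'] == 0][cols]

# Single .loc call: boolean row mask and column list together.
single = toy.loc[toy['Type'] == 0, cols]

print(chained.equals(single))
print(single)
```

The single-call form avoids building an intermediate frame and is usually easier to read, which may help with the selection the post describes as a mouthful.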