"Python Machine Learning and Practice: From Zero to Kaggle Competitions" (code for Python 3.6), chapter1.1
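The code in this blog reimplements the examples from the book "Python Machine Learning and Practice: From Zero to Kaggle Competitions" (《python机器学习及实践-从零开始通往kaggle竞赛之路》) in Python 3.6, using the latest versions of the required libraries as of 2017/12/8.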
chapter1_1
import pandas as pd  # import the pandas library

# Read the data. The paths are relative to the script; adjust them if your files live elsewhere.
df_train = pd.read_csv('../Datasets/Breast-Cancer/breast-cancer-train.csv')
df_test = pd.read_csv('../Datasets/Breast-Cancer/breast-cancer-test.csv')

print(df_train.head(5))  # show the first 5 rows of df_train to get a feel for the data
print(df_test.head(5))

# Filter the test set on the 'Type' column, then keep only the two feature columns.
df_test_negative = df_test.loc[df_test['Type'] == 0][['Clump Thickness', 'Cell Size']]
df_test_positive = df_test.loc[df_test['Type'] == 1][['Clump Thickness', 'Cell Size']]
print(df_test_negative.head())
print(df_test_positive.head())

import matplotlib.pyplot as plt

# Plot 1: the test samples only (Type 0 as circles, Type 1 as crosses).
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=20, c='green')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=10, c='red')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()

import numpy as np

# Plot 2: a line with randomly initialised intercept and coefficients.
intercept = np.random.random([1])
coef = np.random.random([2])
lx = np.arange(0, 12)
ly = (-intercept - lx * coef[0]) / coef[1]
plt.plot(lx, ly, c='yellow')
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=150, c='black')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()

from sklearn.linear_model import LogisticRegression

# Plot 3: logistic regression trained on the first 10 training samples only.
lr = LogisticRegression()
lr.fit(df_train[['Clump Thickness', 'Cell Size']][:10], df_train['Type'][:10])
print('Testing accuracy (10 training samples):', lr.score(df_test[['Clump Thickness', 'Cell Size']], df_test['Type']))
intercept = lr.intercept_
coef = lr.coef_[0, :]
ly = (-intercept - lx * coef[0]) / coef[1]
plt.plot(lx, ly, c='green')
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=150, c='black')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()

# Plot 4: logistic regression trained on all training samples.
lr = LogisticRegression()
lr.fit(df_train[['Clump Thickness', 'Cell Size']], df_train['Type'])
print('Testing accuracy (all training samples):', lr.score(df_test[['Clump Thickness', 'Cell Size']], df_test['Type']))
intercept = lr.intercept_
coef = lr.coef_[0, :]
ly = (-intercept - lx * coef[0]) / coef[1]
plt.plot(lx, ly, c='blue')
plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=150, c='black')
plt.xlabel('Clump Thickness')
plt.ylabel('Cell Size')
plt.show()
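The code above draws each line with ly = (-intercept - lx * coef[0]) / coef[1]. One way to read that expression: a two-feature linear classifier scores a point as intercept + coef[0]*x1 + coef[1]*x2, and the decision boundary is where that score equals zero, so solving for x2 gives the formula. The sketch below checks this with plain Python. The intercept and weights are made-up illustrative values (not fitted to the breast-cancer data), and the helper names `decision_value` and `boundary_x2` are hypothetical.

```python
# Sketch: why ly = (-intercept - lx * coef[0]) / coef[1] traces the decision boundary.
# The numbers below are made-up illustrative values, not fitted to the real dataset.


def decision_value(intercept, coef, x1, x2):
    """Linear score of a two-feature model: intercept + coef[0]*x1 + coef[1]*x2."""
    return intercept + coef[0] * x1 + coef[1] * x2


def boundary_x2(intercept, coef, x1):
    """Solve intercept + coef[0]*x1 + coef[1]*x2 = 0 for x2."""
    if coef[1] == 0:
        # A zero second weight means a vertical boundary, which has no x2 = f(x1) form.
        raise ValueError("coef[1] is zero; the boundary cannot be written as x2 = f(x1)")
    return (-intercept - coef[0] * x1) / coef[1]


intercept = -4.0      # hypothetical intercept
coef = [0.5, 0.25]    # hypothetical weights for (Clump Thickness, Cell Size)

xs = list(range(0, 12))  # same x range as lx = np.arange(0, 12)
ys = [boundary_x2(intercept, coef, x) for x in xs]

# Every point on the line should score zero, i.e. a predicted probability of 0.5.
scores = [decision_value(intercept, coef, x, y) for x, y in zip(xs, ys)]
print("largest |score| on the line:", max(abs(s) for s in scores))

# With a positive coef[1], a point above the line scores positive and one below scores negative.
print("score above the line:", decision_value(intercept, coef, 0, ys[0] + 1))
print("score below the line:", decision_value(intercept, coef, 0, ys[0] - 1))
```

With a fitted scikit-learn model, `lr.intercept_` and `lr.coef_[0, :]` play the roles of `intercept` and `coef` here, which is how the plots above obtain their lines.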
The modified code is published with the author's permission. If you have any questions, leave me a comment.