针对python机器学习与实战代码在python3上运行出现的错误分析和warning的修改代码34—38
来源:互联网 发布:编程课 编辑:程序博客网 时间:2024/05/23 02:23
#导入model_selection进行数据分割from sklearn.model_selection import train_test_splitimport numpy as npx = boston.datay = boston.targetx_train, x_test, y_train, y_test = train_test_split(x, y , test_size=0.25, random_state=33)print("The max target value is", np.max(boston.target))print("The min target value is", np.min(boston.target))print("The average target value is", np.mean(boston.target))from sklearn.preprocessing import StandardScalerss_x = StandardScaler()ss_y = StandardScaler()x_train = ss_x.fit_transform(x_train)x_test = ss_x.transform(x_test)y_train = ss_y.fit_transform(y_train)y_test = ss_y.transform(y_test)from sklearn.linear_model import LinearRegressionlr = LinearRegression()lr.fit(x_train, y_train)lr_y_pred = lr.predict(x_test)from sklearn.linear_model import SGDRegressorsgdr = SGDRegressor()sgdr.fit(x_train, y_train)sgdr_y_pred = sgdr.predict(x_test)print('The value of default measurement of LinearRegression is', lr.score(x_train, y_train))from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_errorprint('The value of R-sqaured of LinearRegression is', r2_score(y_test, lr_y_pred))print('The mean squared error of LinearRegression is', mean_squared_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(lr_y_pred)))print('The mean absolute error of LinearRegression is', mean_absolute_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(lr_y_pred)))print('The value of default of SDGRegressor is', sgdr.score(x_test, y_test))print('The vlue of R-squared of SGDRegressor is', r2_score(y_test, sgdr_y_pred))print('The mean squared error of SGDRegressor is', mean_squared_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(sgdr_y_pred)))print('The mean absolute error of SGDRegressor is', mean_absolute_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(sgdr_y_pred)))
运行python机器学习及实践中利用线性回归对Boston房价进行预测的时候,会出现如下错误:
Traceback (most recent call last): File "D:\Python362\a_机器学习及实战\LinearRegression.py", line 22, in <module> y_train = ss_y.fit_transform(y_train) File "D:\Python362\lib\site-packages\sklearn\base.py", line 518, in fit_transform return self.fit(X, **fit_params).transform(X) File "D:\Python362\lib\site-packages\sklearn\preprocessing\data.py", line 590, in fit return self.partial_fit(X, y) File "D:\Python362\lib\site-packages\sklearn\preprocessing\data.py", line 612, in partial_fit warn_on_dtype=True, estimator=self, dtype=FLOAT_DTYPES) File "D:\Python362\lib\site-packages\sklearn\utils\validation.py", line 410, in check_array "if it contains a single sample.".format(array))ValueError: Expected 2D array, got 1D array instead:array=[ 33.8 20.3 10.2 22. 21.2 24.2 29. 22.7 21.8 34.9 25.2 20.9 19.4 20. 14. 30.1 33.1 20.6 22.6 33.4 20.1 10.5 15.6 16.8 22.6 34.6 19.8 17.8 22. 17.4 15.4 16.7 22.6 15.1 21.4 15.3 7.4 13.9 17.6 25. 46.7 17.1 23.1 18.7 21.9 18.9 26.7 22.3 25. 14.6 42.8 17.3 22.2 36.5 22.8 19.9 36.2 50. 25. 22.2 17.5 23.9 19.6 24.7 28.4 8.7 21.7 20. 19.9 24.5 15. 7. 15.2 20.4 8.5 17.1 30.1 15. 19.4 23.2 17. 18.9 50. 25. 46. 7.2 17.8 35.1 24.3 5. 16.6 21.8 28.5 22. 20.3 21.7 26.4 30.7 50. 17.2 26.6 21. 23.4 19.5 20.7 23.3 48.8 15.6 19.6 17.4 21.7 14.6 37.9 9.7 17.8 12.1 20.1 29.9 26.4 18.8 32.5 15.7 13.4 21.7 23.6 11.9 13.8 22.2 13. 33.2 50. 22.3 22.4 23.8 29.1 20.8 23.7 19.8 13.9 28.4 45.4 23.7 50. 18. 17.1 18.9 10.4 24.7 23.9 23. 20.2 8.5 14.2 20.3 18.5 12. 19.3 20.6 16.1 12.3 23.1 22.7 20.3 16.7 27.9 21.4 8.1 37.6 15.6 29.6 22.9 24.8 24.4 50. 28.7 50. 16.5 18.2 50. 16.2 14.1 21.2 18.4 25. 50. 21.2 20.4 15.2 22. 19.8 22.1 23.9 24.6 23.9 21.7 44.8 7.2 18.5 20.1 23.3 19.2 29.1 31. 22.9 27.5 39.8 22. 22.8 22.9 14.3 14.5 22.4 19.3 32. 20.1 18.3 24.5 18.4 23.1 22.6 20.2 17.8 31.6 43.5 36.4 11.3 20.5 23.2 29.8 20.6 24.3 18.1 19.1 21.4 31.5 19.2 14.3 24.8 21.1 18.2 48.3 19.4 21.2 10.9 27.5 34.7 14.4 22.8 17.8 50. 24.4 12.8 30.8 28.2 25. 33.1 27.5 12.7 43.1 13.4 21.5 33.4 23.8 21. 26.6 18.5 23. 24.1 20.5 32.2 14.4 11.8 19.5 23.7 13.2 29. 18.2 18.6 23. 42.3 17.2 16.2 20. 30.3 20.9 20.4 24.8 18.7 16.8 22.5 18.8 23.7 23.8 19.6 20.4 16.1 44. 19.3 17.4 10.2 11.7 37.2 11. 23.6 22.8 15. 34.9 17.9 24.4 24.5 6.3 29.4 10.4 38.7 20. 19.4 37. 50. 18.7 48.5 35.4 23.4 7. 50. 20.7 35.4 9.6 25.1 16.1 27. 16.6 13.3 25. 24. 19.6 29.6 21.7 19.1 22. 13.3 27.1 22.9 33.2 13.5 14.5 8.3 41.7 31.2 23.9 23.1 24.3 18.3 20.8 28. 19.5 21.5 13.1 12.5 31.7 13.1 23.1 14.5 22.2 13.1 37.3 22. 10.2 5. 19.3 16. 18.6 50. 31.6 24.1 15.6 19.4 23.3 23.2 13.6].Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.主要是工具包版本更新造成的,面对上面问题,我们根据上面错误的提示相应的找出出错的两行代码:
y_train = ss_y.fit_transform(y_train)
y_test = ss_y.transform(y_test)
主要问题出现在上面的两行代码中,例如数据格式为[1, 2, 3, 4]就会出错,如果把这行数据转换成[[1], [2], [3], [4]]就不会出错了,所以要对上面导致出错的两行代码做出修改:
y_train = ss_y.fit_transform(y_train.reshape(-1, 1))
y_test = ss_y.transform(y_test.reshape(-1,1))
再运行程序,错误消除。但是代码还会出现一些提醒,如下:
Warning (from warnings module): File "D:\Python362\lib\site-packages\sklearn\linear_model\stochastic_gradient.py", line 84 "and default tol will be 1e-3." % type(self), FutureWarning)FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.Warning (from warnings module): File "D:\Python362\lib\site-packages\sklearn\utils\validation.py", line 547 y = column_or_1d(y, warn=True)DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().第一个提醒是说版本不同,SGDRegressor所默认设置的迭代次数也不同,既然这样为了消除这个提醒,我们可以在sgdr = SGDRegressor()中把迭代次数重新设置一下sgdr = SGDRegressor(max_iter=5)本文设置最大迭代次数为5,也可以设置成1000,这个根据自己的需求设置,第一个warning消除。
第二个提醒是要对预测输出y做出ravel()转换,例如下面操作:
>>> x = np.array([[1, 2, 3], [4, 5, 6]])>>> print(np.ravel(x))[1 2 3 4 5 6]也就是说我的y是2D的形式(shapes, 1),要把二维的形式改成1D的形式(shapes, ),,这就可以对fit输入的y_train作y_train.ravel()这样的转换,即把sgdr.fit(x_train, y_train)代码修改为sgdr.fit(x_train, y_train.ravel()),warning就会消失了
最终修改后的完整代码为:
from sklearn.datasets import load_bostonboston = load_boston()print(boston.DESCR)#导入model_selection进行数据分割from sklearn.model_selection import train_test_splitimport numpy as npx = boston.datay = boston.targetx_train, x_test, y_train, y_test = train_test_split(x, y , test_size=0.25, random_state=33)print("The max target value is", np.max(boston.target))print("The min target value is", np.min(boston.target))print("The average target value is", np.mean(boston.target))from sklearn.preprocessing import StandardScalerss_x = StandardScaler()ss_y = StandardScaler()x_train = ss_x.fit_transform(x_train)x_test = ss_x.transform(x_test)y_train = ss_y.fit_transform(y_train.reshape(-1, 1))#修改1y_test = ss_y.transform(y_test.reshape(-1,1))#修改2from sklearn.linear_model import LinearRegressionlr = LinearRegression()lr.fit(x_train, y_train)lr_y_pred = lr.predict(x_test)from sklearn.linear_model import SGDRegressorsgdr = SGDRegressor(max_iter=5)#修改3sgdr.fit(x_train, y_train.ravel())#修改4sgdr_y_pred= sgdr.predict(x_test)print('The value of default measurement of LinearRegression is', lr.score(x_train, y_train))from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_errorprint('The value of R-sqaured of LinearRegression is', r2_score(y_test, lr_y_pred))print('The mean squared error of LinearRegression is', mean_squared_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(lr_y_pred)))print('The mean absolute error of LinearRegression is', mean_absolute_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(lr_y_pred)))print('The value of default of SDGRegressor is', sgdr.score(x_test, y_test))print('The vlue of R-squared of SGDRegressor is', r2_score(y_test, sgdr_y_pred))print('The mean squared error of SGDRegressor is', mean_squared_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(sgdr_y_pred)))print('The mean absolute error of SGDRegressor is', mean_absolute_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(sgdr_y_pred)))最终运行结果:
Boston House Prices dataset===========================Notes------Data Set Characteristics: :Number of Instances: 506 :Number of Attributes: 13 numeric/categorical predictive :Median Value (attribute 14) is usually the target :Attribute Information (in order): - CRIM per capita crime rate by town - ZN proportion of residential land zoned for lots over 25,000 sq.ft. - INDUS proportion of non-retail business acres per town - CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise) - NOX nitric oxides concentration (parts per 10 million) - RM average number of rooms per dwelling - AGE proportion of owner-occupied units built prior to 1940 - DIS weighted distances to five Boston employment centres - RAD index of accessibility to radial highways - TAX full-value property-tax rate per $10,000 - PTRATIO pupil-teacher ratio by town - B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town - LSTAT % lower status of the population - MEDV Median value of owner-occupied homes in $1000's :Missing Attribute Values: None :Creator: Harrison, D. and Rubinfeld, D.L.This is a copy of UCI ML housing dataset.http://archive.ics.uci.edu/ml/datasets/HousingThis dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonicprices and the demand for clean air', J. Environ. Economics & Management,vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch, 'Regression diagnostics...', Wiley, 1980. N.B. Various transformations are used in the table onpages 244-261 of the latter.The Boston house-price data has been used in many machine learning papers that address regressionproblems. **References** - Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261. - Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann. - many more! (see http://archive.ics.uci.edu/ml/datasets/Housing)The max target value is 50.0The min target value is 5.0The average target value is 22.5328063241The value of default measurement of LinearRegression is 0.754922212481The value of R-sqaured of LinearRegression is 0.6763403831The mean squared error of LinearRegression is 25.0969856921The mean absolute error of LinearRegression is 3.5261239964The value of default of SDGRegressor is 0.655430817284The vlue of R-squared of SGDRegressor is 0.655430817284The mean squared error of SGDRegressor is 26.7183404942The mean absolute error of SGDRegressor is 3.53570681806
阅读全文
6 0
- 针对python机器学习与实战代码在python3上运行出现的错误分析和warning的修改代码34—38
- python机器学习与实战中运行代码出现警告的处理
- python机器学习-预测分析核心算法3-2代码在python3下运行遇到的一些问题
- python机器学习及实战代码13-16,程序运行时出现提醒及修改
- Python3下机器学习实战KNN代码出现AttributeError: ‘dict’ object has no attribute错误
- apriori算法的代码,python实现,参考《机器学习实战》
- 机器学习实战python版本matplotlib安装遇到的各种问题和代码演示
- <机器学习系统设计>第十章(上)----代码与错误分析
- 机器学习实战--第三版代码修改
- 读书笔记:机器学习实战(1)——章2的knn代码和个人改进与注释
- 读书笔记:机器学习实战(2)——章3的决策树代码和个人理解与注释
- 读书笔记:机器学习实战(3)——章4的朴素贝叶斯分类代码和个人理解与注释
- 读书笔记:机器学习实战(5)——章6的支持向量机代码和个人理解与注释
- 爬虫实战13—网页分类与针对文本的机器学习应用
- 机器学习实战:KNN 代码注释(python)
- 机器学习实战:python算法代码汇总
- 机器学习实战-决策树ID3-python代码
- 机器学习实战-AdaBoost-python代码
- PhpStorm 8.0.3中文版
- JSP
- iOS 多线程概念
- 你绝对不能错过的7款开源硬件架构
- ReactRouter升级 v2 to v4
- 针对python机器学习与实战代码在python3上运行出现的错误分析和warning的修改代码34—38
- 提升PPT逼格的利器!只需1招,让PPT页面化腐朽为神奇~
- Python异常:Django1.11--context must be a dict rather than RequestContext.
- 解决MySQL每隔8小时自动断开连接的问题
- JDBC中DAO+service设计思想
- #if _MSC_VER > 1000 #pragma once #endif 作用解释
- Python3.6简单的操作Mysql数据库
- js中数组原型Array、自定义原型函数Array.prototype
- 综合案例【session、context、filter、listener...】