Calling Python's sklearn to Run the Logistic Regression Algorithm
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 00:27
Reposted from: http://www.itnose.NET/detail/6197189.html
First, the implementation. I used to be unclear about how importing the library relates to its classes and methods; now I understand it.
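    from sklearn.datasets import load_iris  # import datasets
    # import the LogisticRegression class
    from sklearn.linear_model import LogisticRegression

    # load the dataset: iris
    iris = load_iris()
    samples = iris.data
    # print(samples)
    target = iris.target

    classifier = LogisticRegression()  # instantiate the class; all parameters are left at their defaults
    classifier.fit(samples, target)  # learn from the training data; the return value is not needed
    x = classifier.predict([[5, 3, 5, 2.5]])  # test data must be 2D (one row per sample); returns the class labels
    print(x)

    # What gets imported is a class of sklearn.linear_model: LogisticRegression, which has many methods.
    # The commonly used ones are fit (train the classification model) and predict (predict the labels of test samples).
    # No method returns the weight vector w learned by the LR model;
    # after fit it is available as the attribute classifier.coef_ (with the bias in classifier.intercept_).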
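The line used above,
    classifier = LogisticRegression()  # instantiate the class; all parameters are left at their defaults
leaves every parameter at its default value. Many of them can in fact be set explicitly, which requires the official parameter documentation, quoted below: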
sklearn.linear_model.LogisticRegression
class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None)
Logistic Regression (aka logit, MaxEnt) classifier.
In the multiclass case, the training algorithm uses a one-vs.-all (OvA) scheme, rather than the “true” multinomial LR.
This class implements L1 and L2 regularized logistic regression using the liblinear library. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied).
Parameters:
penalty : string, ‘l1’ or ‘l2’ (the type of penalty term)
Used to specify the norm used in the penalization.
dual : boolean
Dual or primal formulation. Dual formulation is only implemented for l2 penalty. Prefer dual=False when n_samples > n_features.
C : float, optional (default=1.0)
Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.
fit_intercept : bool, default: True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
intercept_scaling : float, default: 1
When self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight. Note! The synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.
class_weight : {dict, ‘auto’}, optional (accounts for class imbalance, similar to cost-sensitive learning)
Over-/undersamples the samples of each class according to the given weights. If not given, all classes are supposed to have weight one. The ‘auto’ mode selects weights inversely proportional to class frequencies in the training set.
random_state : int seed, RandomState instance, or None (default)
The seed of the pseudo random number generator to use when shuffling the data.
tol : float, optional
Tolerance for stopping criteria.
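To make the effect of these parameters concrete, here is a minimal sketch that fits two classifiers on iris with different values of C and compares the size of the learned coefficients. The two C values are arbitrary choices for illustration, and max_iter=1000 is an assumption: it is not in the parameter list quoted above, but it exists in current scikit-learn releases and helps avoid convergence warnings.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

# C is the inverse of the regularization strength:
# a small C penalizes large weights heavily, a large C barely at all.
strong = LogisticRegression(C=0.01, max_iter=1000).fit(X, y)
weak = LogisticRegression(C=100.0, max_iter=1000).fit(X, y)

# Compare the overall magnitude of the learned coefficient matrices.
strong_norm = np.linalg.norm(strong.coef_)
weak_norm = np.linalg.norm(weak.coef_)
print("coefficient norm with C=0.01:", strong_norm)
print("coefficient norm with C=100 :", weak_norm)
```

The strongly regularized model should end up with a noticeably smaller coefficient norm, which matches the description of C above.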
Attributes:
`coef_` : array, shape = [n_classes, n_features]
Coefficient of the features in the decision function.
coef_ is a read-only property derived from raw_coef_ that follows the internal memory layout of liblinear.
`intercept_` : array, shape = [n_classes]
Intercept (a.k.a. bias) added to the decision function. If fit_intercept is set to False, the intercept is set to zero.
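These two attributes are how the learned weights can be read back after training. The sketch below, assuming the iris data from earlier and max_iter=1000 (an assumption, added only to help convergence), prints their shapes and checks that the decision function is just the linear score built from them.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
clf = LogisticRegression(max_iter=1000)
clf.fit(iris.data, iris.target)

weights = clf.coef_      # one row of feature weights per class
bias = clf.intercept_    # one bias term per class
print(weights.shape)     # (3, 4): 3 classes, 4 features
print(bias.shape)        # (3,)

# The confidence scores are the linear function X @ w.T + b.
manual_scores = iris.data @ weights.T + bias
print(np.allclose(manual_scores, clf.decision_function(iris.data)))
```

For a multiclass problem such as iris there is one weight vector per class, each with as many entries as there are features.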
The LogisticRegression class has the following methods; the ones we use most often are fit and predict.
Methods
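decision_function(X): Predict confidence scores for samples.
densify(): Convert coefficient matrix to dense array format.
fit(X, y): Fit the model according to the given training data. Used to train the LR classifier; X holds the training samples and y is the corresponding label vector.
fit_transform(X[, y]): Fit to data, then transform it.
get_params([deep]): Get parameters for this estimator.
predict(X): Predict class labels for samples in X. Used to predict the labels of test samples, i.e. to classify them; X is the set of test samples.
predict_log_proba(X): Log of probability estimates.
predict_proba(X): Probability estimates.
score(X, y[, sample_weight]): Returns the mean accuracy on the given test data and labels.
set_params(**params): Set the parameters of this estimator.
sparsify(): Convert coefficient matrix to sparse format.
transform(X[, threshold]): Reduce X to its most important features.

predict returns the label vector of the test samples. Personally, I felt there should also be a way to get the key quantity learned by the LR classifier: the weight vector, whose size should equal the number of features. No method returns it; the weights are exposed instead through the fitted attributes coef_ and intercept_ listed in the documentation above. Not finding such a method is what gave me the idea of implementing the LR algorithm myself, so that the weight vector could be output directly.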
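The most commonly used methods from this list can be sketched together as follows. The train/test split, its test_size and random_state values, and max_iter=1000 are all assumptions made for illustration; they do not come from the original post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
# Hold out 30% of the samples so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)              # train the classifier

labels = clf.predict(X_test)           # predicted class label per test sample
proba = clf.predict_proba(X_test)      # per-class probability estimates
accuracy = clf.score(X_test, y_test)   # mean accuracy on the test split

print(labels[:5])
print(proba[:2])
print(accuracy)
```

Each row of the predict_proba output sums to 1, and the label returned by predict is the class with the highest probability in that row.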
Reference links:
http://www.cnblogs.com/xupeizhi/archive/2013/07/05/3174703.html
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression