Classification and logistic regression
Logistic regression
1. Problem:
In the regression problems discussed previously, the outputs were all continuous. What if we are asked to do classification instead, i.e., the output is a discrete value?
2. Solution:
Hypothesis:
\[ h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}} \]
where \(g(z) = \frac{1}{1+e^{-z}}\) is the sigmoid function. Its graph (the original figure is missing) is an S-shaped curve rising from 0 to 1, crossing 0.5 at z = 0.
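As a stand-in for the missing figure, this minimal Octave/MATLAB sketch reproduces the S-curve (the sigmoid is computed inline here; a sigmoid function is defined later in the post):

z = -10:0.1:10;
g = 1.0 ./ (1.0 + exp(-z));   % element-wise sigmoid
plot(z, g);
xlabel('z'); ylabel('g(z)');
title('Sigmoid function');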
From this, when \(h_\theta(x) < 0.5\) we classify the example as 0, and otherwise as 1, which turns the output into discrete values. Deriving the update rule:
- Using probability theory, determine the distribution the samples follow, then apply maximum likelihood to solve for the corresponding θ.
Result: the update rule reconstructed below (the original formula images are lost).
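This is the standard maximum-likelihood derivation for logistic regression, reconstructed to match the code later in the post:
\[ P(y=1 \mid x;\theta) = h_\theta(x), \qquad P(y=0 \mid x;\theta) = 1 - h_\theta(x), \]
or compactly \(p(y \mid x;\theta) = h_\theta(x)^y \,(1 - h_\theta(x))^{1-y}\). The log-likelihood over m samples is
\[ l(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\!\big(1 - h_\theta(x^{(i)})\big) \right], \]
and maximizing it by gradient ascent yields the update
\[ \theta_j := \theta_j + \alpha \big( y^{(i)} - h_\theta(x^{(i)}) \big) x_j^{(i)}. \]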
Note: the update rule here is the incremental (stochastic) form; the gradientDecent code below accumulates the gradient over all m samples before each update, i.e., the batch form of the same rule.
Newton's method:
1. Problem:
The iterative method above converges slowly. When solving by maximum likelihood we can use Newton's method instead, i.e., iterate \(\theta := \theta - \frac{f(\theta)}{f'(\theta)}\) to find a root of f.
2. Solution:
Derivation:
- Newton's method finds a θ such that \(f(\theta) = 0\); here we want exactly \(l'(\theta) = 0\), so Newton's method can be rewritten as
\[ \theta := \theta - \frac{l'(\theta)}{l''(\theta)}. \]
- Definition (vector case): the Hessian matrix
\[ H_{ij} = \frac{\partial^2 l(\theta)}{\partial \theta_i \, \partial \theta_j}, \]
where \(l'(\theta)\) becomes the gradient \(\nabla_\theta l(\theta)\). The H matrix plays the role of \(l''(\theta)\), i.e., \(H^{-1}\) generalizes \(1/l''(\theta)\). So:
\[ \theta := \theta - H^{-1} \nabla_\theta l(\theta). \]
When to apply it:
- When the number of features is small; otherwise computing \(H^{-1}\) is very expensive.
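The post does not include a Newton's-method implementation. Here is a minimal sketch under the same setup (x with a leading column of ones, labels y in {0,1}, and the sigmoid function defined below); newtonLogistic is a hypothetical name, and the gradient and Hessian are the standard closed forms for the logistic log-likelihood:

function theta = newtonLogistic(x, y, theta, iters)
% Newton's method for maximizing the logistic log-likelihood l(theta)
m = size(x,1);
for k = 1:iters
    hx = sigmoid(x * theta);        % m-by-1 predictions
    grad = (1/m) * x' * (y - hx);   % gradient of l(theta)
    S = diag(hx .* (1 - hx));       % diagonal weight matrix
    H = -(1/m) * x' * S * x;        % Hessian of l(theta), negative definite
    theta = theta - H \ grad;       % Newton step: theta - H^{-1} * grad
end
end

On a problem like the one below, something on the order of ten iterations typically suffices, versus the 150,000 gradient steps used later.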
Logistic 0/1 classification:
1. Setting the iteration count yourself
Write the corresponding loop yourself, supplying the iteration count and the step size alpha, and run incremental gradient descent.
Main functions and their roles:
- Logistic_Regression: acts as the main script
- gradientDecent: performs the gradient updates of θ
- computeCost: computes the cost J
Logistic_Regression
%% part0: prepare the data
data = load('ex2data1.txt');
x = data(:,[1,2]);
y = data(:,3);
pos = find(y==1);
neg = find(y==0);
x1 = x(:,1);
x2 = x(:,2);
plot(x(pos,1),x(pos,2),'r*',x(neg,1),x(neg,2),'co');
pause;

%% part1: gradient descent and computing the cost J
[m,n] = size(x);
x = [ones(m,1),x];
theta = zeros(3,1);
J = computeCost(x,y,theta);
theta = gradientDecent(x, y, theta);
% decision boundary: theta(1) + theta(2)*x1 + theta(3)*x2 = 0
X = 25:100;
Y = (-theta(1,1) - theta(2,1)*X)/theta(3,1);
plot(x(pos,2),x(pos,3),'r*',x(neg,2),x(neg,3),'co', X, Y, 'b');
pause;
gradientDecent
function theta = gradientDecent(x, y, theta)
%% update theta by gradient ascent on the log-likelihood,
%% accumulating over all m samples before each update
m = size(x,1);
alph = 0.001;
for iter = 1:150000
    for j = 1:3
        dec = 0;
        for i = 1:m
            dec = dec + (y(i) - sigmoid(x(i,:)*theta))*x(i,j);
        end
        theta(j,1) = theta(j,1) + dec*alph/m;
    end
end
end
sigmoid
function g = sigmoid(z)
%% SIGMOID Compute the sigmoid function, element-wise
g = 1.0 ./ (1.0 + exp(-z));
end
computeCost
function J = computeCost(x, y, theta)
%% compute the cost J (negative average log-likelihood)
m = size(x,1);
J = 0;
for i = 1:m
    J = J + y(i)*log(sigmoid(x(i,:)*theta)) + (1 - y(i))*log(1 - sigmoid(x(i,:)*theta));
end
J = (-1/m)*J;
end
The results are as follows:
2. Using the fminunc function:
Supply the cost J and its gradient, and let fminunc perform the optimization.
Main functions and their roles:
- Logistics_Regression: acts as the main script
- computeCost: returns the cost J and the gradient used to update θ
- sigmoid: the sigmoid function
Logistics_Regression
%% part0: prepare the data
data = load('ex2data1.txt');
x = data(:,[1,2]);
y = data(:,3);
pos = find(y==1);
neg = find(y==0);
x1 = x(:,1);
x2 = x(:,2);
plot(x(pos,1),x(pos,2),'r*',x(neg,1),x(neg,2),'co');
pause;

%% part1: minimize the cost J with fminunc
[m,n] = size(x);
x = [ones(m,1),x];
theta = zeros(3,1);
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(computeCost(x,y,t)), theta, options);
% decision boundary: theta(1) + theta(2)*x1 + theta(3)*x2 = 0
X = 25:100;
Y = (-theta(1,1) - theta(2,1)*X)/theta(3,1);
plot(x(pos,2),x(pos,3),'r*',x(neg,2),x(neg,3),'co', X, Y, 'b');
pause;
sigmoid
function g = sigmoid(z)
%% SIGMOID Compute the sigmoid function, element-wise
g = 1.0 ./ (1.0 + exp(-z));
end
computeCost
function [J,grad] = computeCost(x, y, theta)
%% compute the cost J and its gradient, vectorized
m = size(x,1);
hx = sigmoid(x * theta);
J = (1.0/m) * sum(-y .* log(hx) - (1.0 - y) .* log(1.0 - hx));
grad = (1.0/m) .* x' * (hx - y);
end
Result:
Logistic multi_class
1. Setup
- Hand-made data (reconstructed below as one sample per row: x1, x2, label):
1,5,1
1,6,1
1.5,3.5,1
2.5,3.5,1
2,6,1
3,7,1
4,6,1
3.5,4.5,1
2,4,1
2,5,1
4,4,1
5,5,1
6,4,1
5,3,1
4,2,1
4,3,2
5,3,2
5,2,2
5,1.5,2
7,1.5,2
5,2.5,2
6,2.5,2
5.5,2.5,2
5,1,2
6,2,2
6,3,2
5,4,2
7,5,2
7,2,2
8,1,2
8,3,2
7,4,3
7,5,3
8.5,5.5,3
9,4,3
8,5.5,3
8,4.5,3
9.5,5.5,3
8,4.5,3
8.5,4.5,3
7,6,3
6,5,3
9,5,3
9,6,3
8,6,3
8,7,3
10,6,3
10,4,3
Scatter plot of the data:
2. Algorithm derivation
The cost J and the θ update are the same cross-entropy cost and gradient as in the two-class case above (the original formula images are lost). Algorithm idea (this algorithm is also called one_vs_all):
If the samples fall into K classes, we train K sets of θ. Consider each class in turn and treat all remaining samples as a single class; this separates that class from the rest. Set y to 1 for the class under consideration and 0 for everything else. This yields K sets of θ values. (A prediction sketch follows below.)
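The post trains the K classifiers but never shows how to label a new point. Here is a minimal sketch, assuming thetas is the (n+1)-by-K matrix produced by one_vs_all below; picking the class with the largest \(h_\theta(x)\) is one common convention, not something the original post specifies:

% classify a single query point [x1, x2] with K one-vs-all classifiers
xq = [1, 6, 5];                % query point, with leading 1 for the intercept
h = sigmoid(xq * thetas);      % 1-by-K vector of per-class probabilities
[~, label] = max(h);           % the most confident classifier wins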
3. Code implementation:
Implemented here with the fminunc function.
1. Function overview:
- Logistic_Regression: acts as the main script
- one_vs_all: a loop that computes the K sets of θ in turn, calling fminunc on the cost function
- computeCost: returns the cost J and the gradient used to update θ
2. Code:
- Logistic_Regression:
%% part0: prepare the data
data = load('data.txt');
x = data(:,[1,2]);
y = data(:,3);
y1 = find(y==1);
y2 = find(y==2);
y3 = find(y==3);
plot(x(y1,1),x(y1,2),'r*',x(y2,1),x(y2,2),'c+',x(y3,1),x(y3,2),'bo');
pause;

%% part1: train one-vs-all and plot the three decision boundaries
[m,n] = size(x);
x = [ones(m,1),x];
theta = zeros(3,3);
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[thetas,cost] = one_vs_all(x,y,theta);
X = 1:10;
Y1 = -(thetas(1,1) + thetas(2,1)*X)/thetas(3,1);
Y2 = -(thetas(1,2) + thetas(2,2)*X)/thetas(3,2);
Y3 = -(thetas(1,3) + thetas(2,3)*X)/thetas(3,3);
plot(x(y1,2),x(y1,3),'r*',x(y2,2),x(y2,3),'c+',x(y3,2),x(y3,3),'bo');
hold on
plot(X,Y1,'r',X,Y2,'g',X,Y3,'c');
- one_vs_all:
function [theta,cost] = one_vs_all(x, y, theta)
%% train one logistic regression classifier per class with fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);
num_labels = 3;
cost = zeros(num_labels,1);
for i = 1:num_labels
    L = logical(y==i);   % 1 for class i, 0 for all other samples
    [theta(:,i), cost(i,1)] = ...
        fminunc(@(t)(computeCost(x,L,t)), theta(:,i), options);
end
end
- computeCost:
function [J,grad] = computeCost(x, y, thetas)
%% compute the cost J and its gradient, vectorized
m = size(x,1);
hx = sigmoid(x * thetas);
J = (1.0/m) * sum(-y .* log(hx) - (1 - y) .* log(1 - hx));
grad = (1.0/m) .* x' * (hx - y);
end
3. Results:
θ and the cost J:
thetas =

    6.3988    5.1407  -24.4266
   -2.0773    0.2173    2.1641
    0.9857   -1.9490    2.2038

>> cost

cost =

    0.1715
    0.2876
    0.1031
Plot:
Note the triangle formed by the three lines: points in that region are not claimed by any of the three classifiers. (Predicting with the largest \(h_\theta(x)\), as sketched earlier, assigns every point to some class.)
Additional notes:
1. Regularized logistic regression
- The regularized version differs little from plain logistic regression; a regularization term on θ is simply added to the computation of J and to the θ update, as sketched below.
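The post does not show the regularized version; this is a minimal sketch of the standard formulation, patterned on the computeCost above. lambda (the regularization strength) is an added parameter, and by convention the intercept term θ(1) is not penalized:

function [J,grad] = computeCostReg(x, y, theta, lambda)
%% regularized logistic regression cost and gradient
m = size(x,1);
hx = sigmoid(x * theta);
temp = theta;
temp(1) = 0;   % do not penalize the intercept term
J = (1.0/m) * sum(-y .* log(hx) - (1 - y) .* log(1 - hx)) ...
    + (lambda/(2*m)) * sum(temp.^2);
grad = (1.0/m) .* x' * (hx - y) + (lambda/m) .* temp;
end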
2.one_vs_all:
1. Overview:
There is another formulation of one_vs_all: treat θ as a single-hidden-layer feedforward neural network. If we have K classes of samples, the first class can be encoded as [1,0,0,0,...] (K numbers in total), and in general a 1 in the i-th position denotes the i-th class. The computation is the same as in the multi_class case above. The feedforward network model is shown in the original post's figure.
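As a small illustration of this encoding (a sketch with hypothetical variable names), the one-hot target matrix for labels y in 1..K can be built in one vectorized line:

K = 3;                  % number of classes
y = [1; 3; 2; 1];       % example label vector
Y = double(y == 1:K);   % m-by-K one-hot matrix; row i has a 1 in column y(i)
                        % (implicit expansion: MATLAB R2016b+ or Octave)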
2. Code:
Function overview:
- oneVsAll: acts as the main function
- lrCostFunction: the cost J and the θ gradient
- myPredict: computes the training accuracy
The data and the trained θ:
The training results can be downloaded via the link in the original post. The fminunc output:
Local minimum found.

Optimization completed because the size of the gradient is less than
the default value of the function tolerance.

<stopping criteria details>

Local minimum found.

Optimization completed because the size of the gradient is less than
the default value of the function tolerance.

<stopping criteria details>

Training Set Accuracy: 100.000000
- oneVsAll:
function [all_theta,cost] = oneVsAll(X, y, num_labels)
%ONEVSALL trains multiple logistic regression classifiers and returns all
%the classifiers in a matrix all_theta, where the i-th column of all_theta
%corresponds to the classifier for label i.
%   (The course template recommends fmincg, which works like fminunc but is
%   more efficient for large numbers of parameters; fminunc is used here.)

% Some useful variables
m = size(X, 1);
n = size(X, 2);

all_theta = zeros(n+1,num_labels);

% Add ones to the X data matrix
X = [ones(m, 1),X];

cost = zeros(num_labels,1);
options = optimset('GradObj', 'on', 'MaxIter', 50);
for i = 1:num_labels
    L = logical(y==i);   % y == i gives 1 for class i, 0 otherwise
    [all_theta(:,i),cost(i,1)] = ...
        fminunc(@(t)(lrCostFunction(t, X, L)), all_theta(:,i), options);
end
myPredict(all_theta,X,y);

end
- lrCostFunction:
function [J,grad] = lrCostFunction(thetas, x, y)
%LRCOSTFUNCTION Compute cost and gradient for logistic regression
%   J = LRCOSTFUNCTION(theta, X, y) computes the cost of using theta as
%   the parameter for logistic regression and the gradient of the cost
%   w.r.t. the parameters. (The course template also takes a
%   regularization parameter lambda, which is not used here.)

% Initialize some useful values
m = length(y); % number of training examples

% Code used when debugging this function on its own:
%x = [ones(m,1),x];
%theta = zeros(size(x,2),1);
%y = logical(y==1);

hx = sigmoid(x * thetas);
J = (1.0/m) * sum(-y .* log(hx) - (1 - y) .* log(1 - hx));
grad = (1.0/m) .* x' * (hx - y);

end
- myPredict:
function p = myPredict(Theta1, X, y)
%MYPREDICT Predict the label of each input given the trained one-vs-all
%weights Theta1, and print the training-set accuracy.

% Useful values
m = size(X, 1);
num_labels = 10;

p = zeros(size(X, 1), 1);

z_2 = X*Theta1;
a_2 = sigmoid(z_2);
for i = 1:m
    for j = 1:num_labels
        if a_2(i,j) >= 0.5
            % takes the first class whose activation reaches 0.5;
            % max(a_2, [], 2) would pick the most confident class instead
            p(i,1) = j;
            break;
        end
    end
end

fprintf('\nTraining Set Accuracy: %f\n', mean(double(p == y)) * 100);

end
Related topics from this blog:
- From the one_vs_all algorithm in multi-class logistic regression → two-hidden-layer feedforward networks: BP neural networks
- From logistic regression → SVM: feature-space mappings
- The theoretical explanation of logistic regression: the probabilistic interpretation