Logistic Regression Example (with fminunc and Gradient Descent)



This example is adapted from programming exercise ex2 of the Stanford Machine Learning course.

I. Binary classification

In this example we use machine learning to build a model from known inputs (two exam scores) and outputs (admitted or not), and then use that model to predict whether a student will be admitted given his or her scores.

1. Visualizing the data

data = load('ex2data1.txt');
X = data(:, 1:2);
y = data(:, 3);

%% Step 1. Visualizing
pos = find(y == 1);   % indices of admitted examples
neg = find(y == 0);   % indices of not-admitted examples
plot(X(pos,1), X(pos,2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
hold on;
plot(X(neg,1), X(neg,2), 'ko', 'MarkerSize', 7, 'MarkerFaceColor', 'y');
xlabel('Exam 1 score');
ylabel('Exam 2 score');
legend('Admitted', 'Not admitted');
hold off;

[Figure: scatter plot of the training data, admitted vs. not admitted]

2. Hypothesis function and cost function

The hypothesis function has the following form:

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}, \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$

First, we implement the function g(z) in MATLAB:

% sigmoid.m
function g = sigmoid(z)
% Computes the sigmoid of z, element-wise for vectors and matrices
g = zeros(size(z));
Denominator = 1 + exp(-z);
g = 1 ./ Denominator;
end
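As a quick sanity check (my own addition, not part of the original post), sigmoid should return 0.5 at z = 0 and values close to 1 and 0 for large positive and negative inputs:

% Quick check in the command window (assumes sigmoid.m is on the path)
sigmoid(0)           % expected: 0.5000
sigmoid([-10 0 10])  % expected: approximately 0.0000  0.5000  1.0000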

The cost function has the following form:

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right)\right]$$

Next we write a function that computes J(θ), named costFunction(); its return value is the cost.

% costFunction.m
function J = costFunction(X, y, theta)
m = length(y);                  % number of training examples
h_theta = sigmoid(X * theta);   % hypothesis for all examples
J = (1/m) * sum(-y .* log(h_theta) - (1 - y) .* log(1 - h_theta));
end
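As another quick check (my own addition; the names X1 and initial_theta are assumptions): with theta set to all zeros, h_θ(x) = 0.5 for every example, so the cost is -log(0.5) ≈ 0.693 regardless of the data. X must first be given a leading column of ones for the intercept term:

m = length(y);
X1 = [ones(m, 1), X];                    % add intercept term
initial_theta = zeros(size(X1, 2), 1);   % theta = [0; 0; 0]
J0 = costFunction(X1, y, initial_theta);
fprintf('Cost at theta = zeros: %f\n', J0);   % prints approximately 0.693147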

3. Fitting the parameters

Once the hypothesis and cost function are fixed, we need a method for choosing the parameters θ, i.e., minimizing J(θ) so that it reaches its minimum (or a local minimum).

The methods we have so far are gradient descent and advanced optimization algorithms. Below we solve the problem with each of them in turn.


Advanced Optimization:

We will use fminunc, a function built into Octave/MATLAB, to find the optimum. With it we no longer need to pick α by hand; we only have to supply the cost function and the gradient, so we add the partial-derivative computation for each parameter to costFunction.m.
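For reference, the partial derivative computed for each grad(i) below is the standard logistic-regression gradient:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$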

% costFunction.m (extended to also return the gradient)
function [J, grad] = costFunction(X, y, theta)
m = length(y);
grad = zeros(size(theta));
h_theta = sigmoid(X * theta);
J = (1/m) * sum(-y .* log(h_theta) - (1 - y) .* log(1 - h_theta));
for i = 1:length(theta)      % one partial derivative per parameter
    grad(i) = (1/m) * sum((h_theta - y) .* X(:,i));
end
end

Below we solve for θ with fminunc:

% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(X, y, t)), initial_theta, options);

Here is a description of how fminunc is used:

In this code snippet, we first defined the options to be used with fminunc.
Specifically, we set the GradObj option to on, which tells fminunc that our
function returns both the cost and the gradient. This allows fminunc to
use the gradient when minimizing the function. Furthermore, we set the
MaxIter option to 400, so that fminunc will run for at most 400 steps before
it terminates.

To specify the actual function we are minimizing, we use a “short-hand”
for specifying functions with @(t)(costFunction(X, y, t)). This
creates a function, with argument t, which calls your costFunction. This
allows us to wrap the costFunction for use with fminunc.

If you have completed the costFunction correctly, fminunc will converge on the right optimization parameters and return the final values of the cost and θ. Notice that by using fminunc, you did not have to write any loops yourself, or set a learning rate like you did for gradient descent. This is all done by fminunc: you only needed to provide a function calculating the cost and the gradient.
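The post does not show the setup that precedes this call; a minimal sketch of it (the variable names are my own) might look like this:

% Load the data, add the intercept column, and choose a starting point
data = load('ex2data1.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);
X = [ones(m, 1), X];                     % add intercept term
initial_theta = zeros(size(X, 2), 1);    % start from all zeros

% Minimize with fminunc, supplying both cost and gradient
options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = fminunc(@(t)(costFunction(X, y, t)), initial_theta, options);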

%% Results of the computation
% cost J = 0.203506
% theta =
%   -24.932770
%     0.204406
%     0.199616

Plot the decision boundary:

plotDecisionBoundary(theta,X,y);

[Figure: training data with the decision boundary found by fminunc]

Prediction:

chance = sigmoid([1, 63, 63] * theta);   % admission probability for exam scores 63 and 63
fprintf('chance = %f\n', chance);
%% Prediction result
% chance = 0.627294
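Going one step further (my own addition, not in the original post), the whole training set can be classified by thresholding the hypothesis at 0.5, which also gives the training accuracy; this assumes X still carries the leading column of ones:

p = sigmoid(X * theta) >= 0.5;             % predicted labels (0 or 1)
accuracy = mean(double(p == y)) * 100;     % percentage of correct predictions
fprintf('Train accuracy: %f\n', accuracy);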




Gradient Descent:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$

Now we write a gradient descent function named gradientDescent(), where theta is the final parameter vector and J_history holds the cost after each iteration.

% gradientDescent.m
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);                        % number of training examples
J_history = zeros(num_iters, 1);
temp_matrix = zeros(size(X,2), 1);
for iter = 1:num_iters
    prediction = sigmoid(X * theta) - y;    % error term h_theta(x) - y
    for i = 1:size(X,2)
        temp_matrix(i) = sum(prediction .* X(:,i));
    end
    theta = theta - alpha * (1/m) * temp_matrix;
    % Save the cost J in every iteration
    J_history(iter) = costFunction(X, y, theta);
    %fprintf('j=%f\n',J_history);
end
end
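As a side note (my own suggestion, equivalent in behavior), the inner loop over the features in gradientDescent.m can be replaced by a single vectorized update:

% Vectorized form of the parameter update inside the iteration loop
theta = theta - alpha * (1/m) * X' * (sigmoid(X * theta) - y);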
% test.m
data = load('ex2data1.txt');
X = data(:, 1:2);
y = data(:, 3);

%% Step 1. Visualizing
plotData(X, y);
xlabel('Exam 1 score');
ylabel('Exam 2 score');
legend('Admitted', 'Not admitted');

%% Step 2. Hypothesis and cost function (sigmoid.m, costFunction.m)
m = length(y);
X = [ones(m,1), X];                     % add intercept term
theta = -5*ones(size(X,2),1) + 1E-10;   % initial theta
%theta = zeros(size(X,2),1) - 1E-6;

%% Step 3. Fit parameters
[theta, J_history] = gradientDescent(X, y, theta, 0.0013, 100)
figure;
plot(J_history, '-r', 'LineWidth', 2);
axis([87 100 0.385 0.42]);
hold on;
figure;
plotDecisionBoundary(theta, X, y);

[Figure: cost J_history over the final iterations of gradient descent]

We can see that the cost settles near 0.38, which is still somewhat larger than the 0.2 obtained earlier, so the resulting fit is naturally not as good as before, and the plot below confirms this.

[Figure: decision boundary obtained with gradient descent]

