[Machine Learning] Logistic Regression (MATLAB Implementation)


Based on Andrew Ng's machine learning course.

1 Loading the Data

  Dataset download:
  Link: http://pan.baidu.com/s/1kUUGiP5  Password: vvsa

  Load the data. It has three columns: the first two are x1 and x2, and the third is y. Picture it as two interviewers scoring an applicant: the scores are x1 and x2, and y is 0 or 1, indicating whether the applicant is hired.

data = load('ex2data1.txt');   % (100, 3)
X = data(:, [1, 2]);
y = data(:, 3);                % y is 0 or 1

  Visualize the data:

plotData(X, y);
% Put some labels
hold on;
xlabel('x1')
ylabel('x2')
legend('positive', 'negative')
hold off;

  The implementation of plotData() is:

function plotData(X, y)
figure; hold on;
pos = find(y == 1);
neg = find(y == 0);
% Plot the positive examples
% LineWidth: line width; MarkerSize: marker size; 'k' = black
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 10);
% Plot the negative examples
% MarkerFaceColor: fill color of the marker; 'y' = yellow
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 10);
hold off;
end

  The horizontal and vertical axes are x1 and x2, respectively.

[Figure: scatter plot of the training data]

2 Implementation

2.1 Initialization

[m, n] = size(X);                 % (100, 2)
X = [ones(m, 1) X];               % (100, 3); the first column is all ones
initial_theta = zeros(n + 1, 1);  % (3, 1)
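
  The column of ones gives each example an intercept feature, so that for this dataset

$$\theta^T x = \theta_1 + \theta_2 x_1 + \theta_3 x_2.$$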

2.2 The Sigmoid Function

  The sigmoid function is:

$$g(z) = \frac{1}{1+e^{-z}}$$

  Differentiating gives:

$$g'(z) = \frac{d}{dz}\frac{1}{1+e^{-z}} = \frac{1}{(1+e^{-z})^2}\left(e^{-z}\right)$$

$$= \frac{1}{1+e^{-z}}\left(1 - \frac{1}{1+e^{-z}}\right)$$

$$= g(z)\left(1 - g(z)\right)$$

  This is a very convenient property.

  The hypothesis takes the form of the sigmoid function:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1+e^{-\theta^T x}}$$

  The implementation is:

function g = sigmoid(z)
g = zeros(size(z));
g = 1 ./ (1 + exp(-z));
end
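
  A couple of quick sanity checks; the test points below are arbitrary values chosen to be easy to verify by hand:

sigmoid(0)    % 0.5
sigmoid(10)   % ~0.9999546
sigmoid(-10)  % ~0.0000454
% Numerically confirm g'(z) = g(z)(1 - g(z)) at z = 0.7
z = 0.7; h = 1e-6;
(sigmoid(z + h) - sigmoid(z - h)) / (2*h)   % ~0.2217 (central difference)
sigmoid(z) * (1 - sigmoid(z))               % ~0.2217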

2.3 Computing the Cost Function and Gradient

  The cross-entropy cost function for logistic regression is as follows (log here denotes the natural logarithm ln):

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log h_\theta(x^{(i)}) - (1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right]$$
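
  As a quick worked case: at the initial $\theta = 0$, $h_\theta(x^{(i)}) = g(0) = 0.5$ for every example, so $J(0) = \frac{1}{m}\sum_{i=1}^{m}\log 2 = \log 2 \approx 0.693147$; this is exactly the initial cost printed below.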

  The gradient is derived as follows:

$$\frac{\partial}{\partial \theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\frac{1}{g(\theta^T x^{(i)})} - (1-y^{(i)})\frac{1}{1-g(\theta^T x^{(i)})}\right]\frac{\partial}{\partial \theta_j} g(\theta^T x^{(i)})$$

  Substituting the sigmoid derivative $g'(z) = g(z)(1-g(z))$:

$$= -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\frac{1}{g(\theta^T x^{(i)})} - (1-y^{(i)})\frac{1}{1-g(\theta^T x^{(i)})}\right] g(\theta^T x^{(i)})\left(1-g(\theta^T x^{(i)})\right)\frac{\partial}{\partial \theta_j}\theta^T x^{(i)}$$

  Multiplying in and expanding the brackets:

$$= -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\left(1-g(\theta^T x^{(i)})\right) - (1-y^{(i)})\,g(\theta^T x^{(i)})\right]x_j^{(i)}$$

$$= \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
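
  Stacking the components over all $j$, the gradient has the compact matrix form

$$\nabla_\theta J(\theta) = \frac{1}{m} X^T\left(g(X\theta) - y\right),$$

  which is convenient for a vectorized implementation (a sketch follows the costFunction code below).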

[cost, grad] = costFunction(initial_theta, X, y);

  This calls the costFunction() function, implemented as follows:

function [J, grad] = costFunction(theta, X, y)
m = length(y);
J = 0;
grad = zeros(size(theta));
H = sigmoid(X*theta);
T = -y.*log(H) - (1 - y).*log(1 - H);
% In MATLAB, log() is the natural logarithm ln()
% log(exp(1)) = 1
J = 1/m*sum(T);   % cross-entropy cost
for i = 1 : m
    grad = grad + (H(i) - y(i)) * X(i,:)';
end
grad = 1/m*grad;
end
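
  As an aside, the for-loop over the m examples can be replaced by the vectorized gradient derived above. A minimal equivalent sketch (costFunctionVec is a hypothetical name; same inputs and outputs as costFunction):

function [J, grad] = costFunctionVec(theta, X, y)
% Vectorized variant: no explicit loop over the examples
m = length(y);
H = sigmoid(X*theta);                               % (m,1) predictions
J = (1/m) * sum(-y.*log(H) - (1 - y).*log(1 - H));  % cross-entropy cost
grad = (1/m) * (X' * (H - y));                      % same result as the loop
end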

  Display the results:

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);

  Cost at initial theta (zeros): 0.693147
  Gradient at initial theta (zeros):
   -0.100000
   -12.009217
   -11.262842

  For the optimization we simply call MATLAB's built-in fminunc() function, so there is no need to choose a learning rate or write the iteration loop ourselves; just substitute your own cost function in costFunction(t, X, y).

options = optimset('GradObj', 'on', 'MaxIter', 400);
%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
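
  For comparison, this is roughly the loop that fminunc spares us from writing. A minimal gradient-descent sketch; the learning rate alpha and the iteration count are assumptions that would need tuning, and on unscaled features like these convergence would be slow:

alpha = 0.001;                % learning rate: a guess, not tuned
theta_gd = zeros(n + 1, 1);   % same zero initialization as before
for iter = 1:400
    [J_gd, grad] = costFunction(theta_gd, X, y);
    theta_gd = theta_gd - alpha * grad;   % step opposite the gradient
end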

  Inspect the result:

fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('theta: \n');
fprintf(' %f \n', theta);

  Cost at theta found by fminunc: 0.203506
  theta:
   -24.932760
   0.204406
   0.199616

2.4 Plotting the Decision Boundary

  Call the plotDecisionBoundary() function. The boundary is the line where $\theta^T x = 0$, i.e. $x_2 = -(\theta_1 + \theta_2 x_1)/\theta_3$:

% Plot Boundary
plotDecisionBoundary(theta, X, y);
% Put some labels
hold on;
% Labels and Legend
xlabel('x1')
ylabel('x2')
% Specified in plot order
legend('positive', 'negative')
hold off;

  The plotDecisionBoundary() function is implemented as:

function plotDecisionBoundary(theta, X, y)
% Plots the decision boundary defined by theta
%   PLOTDECISIONBOUNDARY(theta, X, y) plots the data points with + for the
%   positive examples and o for the negative examples. X is assumed to be
%   either
%   1) Mx3 matrix, where the first column is an all-ones column for the
%      intercept.
%   2) MxN, N>3 matrix, where the first column is all-ones

% Plot Data
plotData(X(:,2:3), y);
hold on

if size(X, 2) <= 3
    % Only need 2 points to define a line, so choose two endpoints
    plot_x = [min(X(:,2))-2,  max(X(:,2))+2];

    % Calculate the decision boundary line
    plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));

    % Plot, and adjust axes for better viewing
    plot(plot_x, plot_y)

    % Legend, specific for the exercise
    legend('Admitted', 'Not admitted', 'Decision Boundary')
    axis([30, 100, 30, 100])
else
    % Here is the grid range
    u = linspace(-1, 1.5, 50);
    v = linspace(-1, 1.5, 50);

    z = zeros(length(u), length(v));
    % Evaluate z = theta*x over the grid
    for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = mapFeature(u(i), v(j))*theta;
        end
    end
    z = z'; % important to transpose z before calling contour

    % Plot z = 0
    % Notice you need to specify the range [0, 0]
    contour(u, v, z, [0, 0], 'LineWidth', 2)
end
hold off
end

  The result is as follows:

[Figure: training data with the learned linear decision boundary]

2.5 Prediction and Evaluation

2.5.1 Prediction

  For x1 = 45 and x2 = 85, we predict the probability that y = 1; the output is a value between 0 and 1.

y = sigmoid([1 45 85] * theta);
fprintf(['For scores 45 and 85, we predict a ' ...
         'probability of %f\n\n'], y);

  The result is 0.774321.
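
  This can be checked by hand from the theta above: $z = \theta^T x = -24.932760 + 45 \times 0.204406 + 85 \times 0.199616 \approx 1.2329$, and $g(1.2329) = \frac{1}{1+e^{-1.2329}} \approx 0.7743$, matching the printed value.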

2.5.2 Evaluating Accuracy

  This calls the predict() function:

% Compute accuracy on our training set
p = predict(theta, X);
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('\nProgram paused. Press enter to continue.\n');

  Implemented as:

function p = predict(theta, X)
m = size(X, 1);
p = zeros(m, 1);
p = (sigmoid(X*theta) >= 0.5);
end
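
  A quick usage check, reusing the applicant from section 2.5.1 (assuming theta from fminunc is still in the workspace):

predict(theta, [1 45 85])   % returns 1, since 0.7743 >= 0.5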

  If the sigmoid output is greater than or equal to 0.5, we predict that the applicant is hired; otherwise not.

  The final accuracy on this dataset (89 of the 100 training examples classified correctly):

  Train Accuracy: 89.000000
