[Machine Learning] Logistic Regression (MATLAB Implementation)
Following Andrew Ng's machine learning course.
1 Loading the Data
Dataset download address:
Link: http://pan.baidu.com/s/1kUUGiP5  password: vvsa
Load the data. It has three columns: the first two are the features, and the third is the 0/1 label.

```matlab
data = load('ex2data1.txt');   % (100, 3)
X = data(:, [1, 2]);           % first two columns: features
y = data(:, 3);                % third column: labels, 0 or 1
```
Visualize the data:

```matlab
plotData(X, y);
% Put some labels
hold on;
xlabel('x1')
ylabel('x2')
legend('positive', 'negative')
hold off;
```
plotData() is implemented as:

```matlab
function plotData(X, y)
figure; hold on;
pos = find(y == 1);
neg = find(y == 0);
% Plot the positive examples
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 10);
% LineWidth: line width; MarkerSize: marker size; 'k': black
% MarkerFaceColor: fill color of the marker; 'y': yellow
% Plot the negative examples
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 10);
hold off;
end
```
The horizontal and vertical axes are the two features x1 and x2.
2 Implementation
2.1 Initialization

```matlab
[m, n] = size(X);                 % (100, 2)
X = [ones(m, 1) X];               % (100, 3), first column is all ones (intercept)
initial_theta = zeros(n + 1, 1);  % (3, 1)
```
2.2 The Sigmoid Function

The sigmoid function is:

$$g(z) = \frac{1}{1 + e^{-z}}$$

Differentiating shows that:

$$g'(z) = g(z)\,\big(1 - g(z)\big)$$

This property is very convenient for deriving the gradient below. The hypothesis takes the sigmoid form:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
Implemented as:

```matlab
function g = sigmoid(z)
g = zeros(size(z));
g = 1 ./ (1 + exp(-z));   % element-wise, so z may be a scalar, vector, or matrix
end
```
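As a quick sanity check, an equivalent NumPy version (my translation, not part of the original exercise) reproduces the key facts: g(0) = 0.5, and the derivative identity g'(z) = g(z)(1 − g(z)) holds numerically.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function, mirroring the MATLAB sigmoid()."""
    return 1.0 / (1.0 + np.exp(-z))

# g(0) = 0.5, and g saturates toward 0 / 1 at the tails
print(sigmoid(0.0))   # 0.5

# Numerically confirm the derivative identity g'(z) = g(z) * (1 - g(z))
z = 1.7
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # central difference
analytic = sigmoid(z) * (1 - sigmoid(z))
print(abs(numeric - analytic) < 1e-8)   # True
```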
2.3 Cost Function and Gradient

The logistic-regression cross-entropy cost is:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big)\Big]$$

Here log means the natural logarithm ln. Differentiating with respect to $\theta_j$ and using $g'(z)=g(z)(1-g(z))$:

$$\frac{\partial J}{\partial \theta_j} = -\frac{1}{m}\sum_{i=1}^{m}\left[\frac{y^{(i)}}{h_\theta(x^{(i)})} - \frac{1-y^{(i)}}{1-h_\theta(x^{(i)})}\right] h_\theta(x^{(i)})\big(1-h_\theta(x^{(i)})\big)\,x_j^{(i)}$$

Multiplying the right-hand factor into the brackets and expanding:

$$\frac{\partial J}{\partial \theta_j} = -\frac{1}{m}\sum_{i=1}^{m}\big(y^{(i)} - h_\theta(x^{(i)})\big)\,x_j^{(i)} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\,x_j^{(i)}$$
```matlab
[cost, grad] = costFunction(initial_theta, X, y);
```
This calls costFunction(), implemented as:

```matlab
function [J, grad] = costFunction(theta, X, y)
m = length(y);
J = 0;
grad = zeros(size(theta));
H = sigmoid(X*theta);
T = -y.*log(H) - (1 - y).*log(1 - H);
% MATLAB's log() is the natural logarithm: log(exp(1)) = 1
J = 1/m*sum(T);   % cross-entropy cost
for i = 1:m
    grad = grad + (H(i) - y(i)) * X(i,:)';
end
grad = 1/m*grad;
end
```
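Note that the cost at θ = 0 is data-independent: h = 0.5 for every example, so J = ln 2 ≈ 0.693147 no matter what the labels are. A NumPy sketch (my translation of costFunction, run on hypothetical random data since the dataset itself is not bundled here) confirms this:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y):
    """Cross-entropy cost and gradient, mirroring the MATLAB costFunction()."""
    m = len(y)
    h = sigmoid(X @ theta)
    J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    grad = X.T @ (h - y) / m   # vectorized form of the per-example loop
    return J, grad

# With theta = 0 every h(x) = 0.5, so J = ln 2 regardless of the data.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # hypothetical data
y = rng.integers(0, 2, size=100).astype(float)
J, grad = cost_function(np.zeros(3), X, y)
print(round(J, 6))   # 0.693147
```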
Printing the result:

```matlab
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
```
```
Cost at initial theta (zeros): 0.693147
Gradient at initial theta (zeros):
 -0.100000
 -12.009217
 -11.262842
```
For optimization we call MATLAB's built-in fminunc() directly, so there is no need to choose a learning rate or write the gradient-descent loop by hand; just swap in your own cost function as the handle (costFunction(t, X, y)).

```matlab
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
```
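fminunc is a quasi-Newton solver; if MATLAB is unavailable, the same fit can be sketched with plain batch gradient descent in NumPy (my illustration on hypothetical synthetic data, not the article's method; unlike this sketch, fminunc needs no learning rate):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, iters=5000):
    """Batch gradient descent on the cross-entropy cost."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)
        theta -= lr * grad
    return theta

# Hypothetical, well-separated two-class data, just to exercise the solver
rng = np.random.default_rng(1)
x1 = rng.normal(size=(50, 2)) + [2, 2]   # positives
x0 = rng.normal(size=(50, 2)) - [2, 2]   # negatives
X = np.hstack([np.ones((100, 1)), np.vstack([x1, x0])])  # prepend intercept column
y = np.concatenate([np.ones(50), np.zeros(50)])

theta = fit_logistic(X, y)
acc = np.mean((sigmoid(X @ theta) >= 0.5) == y)
print(acc)   # near-perfect on well-separated data
```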
Inspect the result:

```matlab
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('theta: \n');
fprintf(' %f \n', theta);
```
```
Cost at theta found by fminunc: 0.203506
theta:
 -24.932760
 0.204406
 0.199616
```
2.4 Plotting the Decision Boundary

Call the plotDecisionBoundary() function:

```matlab
% Plot Boundary
plotDecisionBoundary(theta, X, y);
% Put some labels
hold on;
% Labels and Legend
xlabel('x1')
ylabel('x2')
% Specified in plot order
legend('positive', 'negative')
hold off;
```
plotDecisionBoundary() is implemented as:

```matlab
function plotDecisionBoundary(theta, X, y)
%PLOTDECISIONBOUNDARY Plots the data points with + for the positive
%   examples and o for the negative examples, plus the decision boundary
%   defined by theta. X is assumed to be either:
%   1) an Mx3 matrix, where the first column is an all-ones column for
%      the intercept, or
%   2) an MxN matrix, N > 3, where the first column is all ones.

% Plot Data
plotData(X(:,2:3), y);
hold on

if size(X, 2) <= 3
    % Only need 2 points to define a line, so choose two endpoints
    plot_x = [min(X(:,2))-2, max(X(:,2))+2];
    % Calculate the decision boundary line
    plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));
    % Plot, and adjust axes for better viewing
    plot(plot_x, plot_y)
    % Legend, specific for the exercise
    legend('Admitted', 'Not admitted', 'Decision Boundary')
    axis([30, 100, 30, 100])
else
    % Here is the grid range
    u = linspace(-1, 1.5, 50);
    v = linspace(-1, 1.5, 50);
    z = zeros(length(u), length(v));
    % Evaluate z = theta*x over the grid
    for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = mapFeature(u(i), v(j))*theta;
        end
    end
    z = z'; % important to transpose z before calling contour
    % Plot z = 0; notice you need to specify the level range [0, 0]
    contour(u, v, z, [0, 0], 'LineWidth', 2)
end
hold off
end
```
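In the linear case, the boundary is the set of points where the hypothesis equals 0.5, i.e. where $\theta^T x = 0$:

$$\theta_1 + \theta_2 x_1 + \theta_3 x_2 = 0 \;\Rightarrow\; x_2 = -\frac{\theta_2 x_1 + \theta_1}{\theta_3}$$

which is exactly the `plot_y` expression in the code above.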
The result is shown below:
2.5 Prediction and Evaluation
2.5.1 Prediction

```matlab
y = sigmoid([1 45 85] * theta);
fprintf(['For scores 45 and 85, we predict an admission ' ...
         'probability of %f\n\n'], y);
```
Result: 0.774321
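A quick NumPy cross-check (my addition, using the θ values printed by fminunc above) reproduces this probability:

```python
import numpy as np

theta = np.array([-24.932760, 0.204406, 0.199616])  # theta reported by fminunc
x = np.array([1.0, 45.0, 85.0])                     # intercept term + two scores

prob = 1.0 / (1.0 + np.exp(-x @ theta))
print(round(prob, 4))   # approximately 0.7743, matching the article
```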
2.5.2 Evaluating Accuracy

This uses the predict() function:

```matlab
% Compute accuracy on our training set
p = predict(theta, X);
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('\nProgram paused. Press enter to continue.\n');
```
Implemented as:

```matlab
function p = predict(theta, X)
m = size(X, 1);
p = zeros(m, 1);
p = (sigmoid(X*theta) >= 0.5);
end
```
If the sigmoid output is at least 0.5, we predict the applicant is admitted; otherwise not.
The final training accuracy on this dataset is:
```
Train Accuracy: 89.000000
```