Linear Regression Exercise

Source: Internet. Editor: 程序博客网. Date: 2024/05/21 13:57

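Linear regression models the relationship between one or more independent variables and a dependent variable by least squares, that is, by minimizing the sum of squared errors.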
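For the single-variable case treated in this post, the model and the least-squares cost can be written out as below. The symbols (h_θ for the hypothesis, J for the cost, m for the number of samples) follow the usual convention and are an assumption about notation; the cost matches the `1 / 2 / m * sum(...)` expression used in the script.

```latex
h_\theta(x) = \theta_0 + \theta_1 x,
\qquad
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^2
```

Here x^{(i)} is the age and y^{(i)} the height of the i-th child; fitting means choosing θ_0 and θ_1 so that J is as small as possible.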

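This post works through the simplest case: linear regression on two-dimensional data (one input variable and one output variable). For the theory, see http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearning&doc=exercises/ex2/ex2.html . The training data consists of two variables, the children's heights and the children's ages, with 50 samples. Linear regression is used to model the relationship between height and age, and then to estimate the heights corresponding to ages 3.5 and 7.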

% This file describes how linear regression works for two-dimensional
clc, clear, close all
x = load('ex2x.dat');
y = load('ex2y.dat');

% Add x0 = 1 intercept term to every example for term theta0
m = length(y);
x = [ones(m, 1), x];
theta = zeros(2, 1);
thetaTemp = zeros(2, 1);
alpha = 0.07;
iter = 1500;
theta0 = zeros(iter, 1);
theta1 = zeros(iter, 1);
J_value = zeros(iter, 1);

% Repeat
temp = 1;
while (temp <= iter)
    for i = 1 : m
        thetaTemp = thetaTemp - alpha / m * (theta' * x(i, :)' - y(i)) * x(i, :)';
    end
%     if (abs(thetaTemp (1) - theta(1)) < 0.0000001 && abs(thetaTemp (2) - theta(2)) < 0.0000001)
%         break;
%     end
    theta = thetaTemp;
    theta0(temp) = theta(1);
    theta1(temp) = theta(2);
    J_value(temp) = 1 / 2 / m * sum((theta' * x' - y').^2);
    temp = temp + 1;
end

% Plot the Trend line
% Mark the top ten points for direction judgement
figure, plot3(theta0, theta1, J_value, '-v');
for i = 1 : 10
    text(theta0(i), theta1(i), J_value(i) + 0.02, num2str(i));
end
xlabel('\theta_0'); ylabel('\theta_1'); zlabel('J-value')
title('Decrease Trend')
grid on;
axis square

% Plot the original dataset and trained data by using linear regression
y1 = x * theta;
figure; plot(x(:, 2), y, 'o', x(:, 2), y1 ,'-');
legend('Training data','Linear regression');
ylabel('Height in meters');
xlabel('Age in years');
title(['RMS = ' num2str(sqrt(sum((y1 - y).^2) / m))])

%%
J_vals = zeros(100, 100);   % initialize Jvals to 100x100 matrix of 0's
theta0_vals = linspace(-3, 3, 100);
theta1_vals = linspace(-1, 1, 100);
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = 1 / 2 / m * sum((t' * x' - y').^2);
    end
end

% Plot the surface plot
% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1')

figure;
% Plot the cost function with 15 contours spaced logarithmically
% between 0.01 and 100
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 2, 15))
xlabel('\theta_0'); ylabel('\theta_1')
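The inner `for i = 1 : m` loop in the script accumulates the batch gradient one sample at a time. The same update can be sketched as a single matrix product and checked against the closed-form least-squares solution. This is only an illustration under stated assumptions: the toy data below is made up (it is not `ex2x.dat` / `ex2y.dat`), and the names `age`, `X`, `grad` and `theta_exact` are hypothetical.

```matlab
% Toy data (hypothetical, NOT the ex2x.dat / ex2y.dat files): points lying
% exactly on the line y = 0.75 + 0.064 * age, so the answer is known.
age = [2; 3; 4; 5; 6];
y   = 0.75 + 0.064 * age;

m = length(y);
X = [ones(m, 1), age];      % prepend the intercept column x0 = 1

% Batch gradient descent, vectorized: one matrix product replaces the
% per-sample accumulation loop.
alpha = 0.07;
iter  = 1500;
theta = zeros(2, 1);
for k = 1 : iter
    grad  = X' * (X * theta - y) / m;   % gradient of J at the current theta
    theta = theta - alpha * grad;
end

% Least-squares solution from the backslash operator, for comparison
theta_exact = X \ y;

disp([theta, theta_exact]);   % the two columns should nearly agree
```

On this toy data the two columns agree to several decimal places. A learning rate that works for one dataset can diverge on another, so `alpha` may need to be lowered if the ages span a wider range.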

Experimental results:

theta =
    0.7502
    0.0639
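The script itself stops at the plots and does not compute the height estimates for ages 3.5 and 7. A minimal sketch of that last step, assuming the rounded `theta` printed above (the variable names `ages` and `heights` are hypothetical):

```matlab
% theta as printed above (rounded to four decimals)
theta = [0.7502; 0.0639];

% Each query row is [1, age]; the leading 1 multiplies the intercept theta(1)
ages    = [3.5; 7];
heights = [ones(size(ages)), ages] * theta;

fprintf('Age %.1f -> predicted height %.4f m\n', [ages, heights]');
```

With these rounded parameters the estimates come out at roughly 0.97 m for age 3.5 and 1.20 m for age 7.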

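The figure shows how the cost function changes with the parameters during training; the numeric labels mark the iteration number (first ten iterations).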


The figure shows the fitted regression line against the training samples:


Surface plot of the cost function over the parameters:


Contour plot of the cost function:


