Machine Learning Notes (1): Linear Regression

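As I understand it, machine learning means getting a computer to imitate the way people behave by learning from data, which makes it an early form of artificial intelligence (I am not sure this understanding is right). It is usually split into supervised learning and unsupervised learning. Supervised learning covers two main kinds of problem: regression analysis, where the output is continuous, and classification, where the output is discrete. Unsupervised learning is mostly about clustering algorithms. The difference between the two is whether the training data comes with known outputs (labels); I am not sure this is the best way to put it.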


The main topic here is linear regression, which is one kind of regression analysis.






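Earlier we assumed a function h that we use for prediction. Since there is only one input here, we can write it as h(x) = theta0 + theta1*x, where theta0 and theta1 are two unknown parameters. Plotted on coordinate axes, this function is a straight line. We need to fit this line so that its predictions come as close as possible to the observed values. How do we decide which line is best? We introduce a cost function J, shown in the figure below.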
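Written out, the cost that the computeCost code below evaluates is the usual squared-error cost. The notation is an assumption taken from the code rather than from the text: m is the number of training examples, (x^(i), y^(i)) is the i-th example, and X is the matrix of inputs with a leading column of ones.

```latex
J(\theta_0, \theta_1)
  = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2
  = \frac{1}{2m} \, (X\theta - y)^{T} (X\theta - y),
\qquad h(x) = \theta_0 + \theta_1 x .
```

The smaller J is, the closer the line is to the training data, so the best theta0 and theta1 are the ones that minimize J.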


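function J = computeCost(X, y, theta)
% COMPUTECOST Compute the cost of linear regression for a particular
%   choice of theta, using the squared-error cost.

m = length(y);            % number of training examples

% Vectorized form: J = 1/(2m) * (X*theta - y)' * (X*theta - y)
err = X * theta - y;      % prediction error for every example
J = (err' * err) / (2 * m);
end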





function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
% GRADIENTDESCENT Run num_iters gradient steps with learning rate alpha.

m = length(y);                      % number of training examples
J_history = zeros(num_iters, 1);    % cost after each iteration

for iter = 1:num_iters
    % Perform a single gradient step on the parameter vector theta.
    % All components of theta are updated simultaneously.
    theta = theta - (alpha / m) * (X' * (X * theta - y));

    % Hint: while debugging, it can be useful to print out the cost here.
    J_history(iter) = computeCost(X, y, theta);
end
end
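For reference, the update line in gradientDescent above follows from differentiating the squared-error cost J that computeCost evaluates. A short derivation, where α is the learning rate (`alpha` in the code), m is the number of training examples, and x_j^(i) is feature j of example i (symbols assumed from the code, since the text does not define them):

```latex
\frac{\partial J}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\quad\Longrightarrow\quad
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_j^{(i)} .

\text{Stacking all } j \text{ into one vector: }\quad
\theta := \theta - \frac{\alpha}{m} \, X^{T} (X\theta - y) .
```

The vector form is exactly what the single line of code computes, which is why no explicit loop over j is needed.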

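The other method is more concise. It is called the normal equation, and it solves for theta directly with matrix operations. (My linear algebra is weak, so I will not explain much and will just paste the code.)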


function [theta] = normalEqn(X, y)
% NORMALEQN Compute the closed-form solution to linear regression
%   and put the result in theta.

theta = pinv(X' * X) * X' * y;
end
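The one-line solution above can be motivated by setting the gradient of J to zero. This is the standard least-squares argument, sketched here for context and not something these notes derive; it assumes X^T X is invertible:

```latex
\nabla_\theta J = \frac{1}{m} X^{T} (X\theta - y) = 0
\;\Longrightarrow\;
X^{T} X \, \theta = X^{T} y
\;\Longrightarrow\;
\theta = (X^{T} X)^{-1} X^{T} y .
```

The code calls pinv rather than inv, so it still returns a least-squares solution when X^T X is singular, for example when two features are redundant.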




function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.
%
%   X is a matrix where each column is a feature and each row is an
%   example, so the normalization is performed separately for each column.

X_norm = X;
mu = mean(X, 1);   % mean of each feature, stored for later use
sigma = std(X);    % standard deviation of each feature, stored for later use

% Subtract the mean and divide by the standard deviation, column by column
for i = 1:size(X, 2)
    X_norm(:, i) = (X(:, i) - mu(1, i)) / sigma(1, i);
end
end
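featureNormalize returns mu and sigma as well as X_norm because the same shift and scale have to be applied to any new input before predicting with a theta that was trained on normalized features. A minimal sketch of that idea follows. The numbers are made up for illustration, and the normalization is written inline with broadcasting (supported in Octave and in MATLAB R2016b or later) so that the snippet runs on its own:

```octave
% Synthetic data, invented for illustration only: 3 examples, 2 features
X = [1 10; 2 20; 3 60];

mu = mean(X, 1);              % per-feature mean
sigma = std(X, 0, 1);         % per-feature standard deviation
X_norm = (X - mu) ./ sigma;   % same computation as featureNormalize

% A new example must be normalized with the TRAINING mu and sigma,
% not with statistics computed from the new example itself.
x_new = [2 30];
x_new_norm = (x_new - mu) ./ sigma;   % equals [0 0], since x_new equals mu

disp(mean(X_norm, 1));        % close to 0 for every feature
disp(std(X_norm, 0, 1));      % close to 1 for every feature
disp(x_new_norm);
```

After normalizing, a prediction would be computed as [1, x_new_norm] * theta, with the leading 1 added after normalization so that the intercept column is left unscaled.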



%% Machine Learning Online Class - Exercise 1: Linear Regression

%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear exercise. You will need to complete the following functions
%  in this exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%
% x refers to the population size in 10,000s
% y refers to the profit in $10,000s
%

%% Initialization
clear ; close all; clc

%% ==================== Part 1: Basic Function ====================
% Complete warmUpExercise.m
fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Gradient descent ===================
fprintf('Running Gradient Descent ...\n')

X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

% compute and display initial cost
computeCost(X, y, theta)

% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent: ');
fprintf('%f %f \n', theta(1), theta(2));

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';

% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);

