Coursera Machine Learning Notes (Weeks 1 and 2)

Source: Internet. Editor: 程序博客网. Time: 2024/05/16 08:17

I. Types of Machine Learning
[Figure: types of machine learning (image not preserved)]
II. Linear Regression Model
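Hypothesis: $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n \quad (x_0 = 1)$
Parameters: $\theta_0, \theta_1, \dots, \theta_n$
Cost function: $J(\theta_0, \theta_1, \dots, \theta_n) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$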
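Gradient descent:

$$\text{Repeat } \left\{\; \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1, \dots, \theta_n) \;\right\}$$

where

$$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1, \dots, \theta_n) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$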
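As a brief check, the partial derivative can be derived from the cost function with the chain rule, using the fact that $\frac{\partial}{\partial \theta_j} h_\theta(x^{(i)}) = x_j^{(i)}$ for the linear hypothesis:

```latex
\begin{aligned}
\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1, \dots, \theta_n)
&= \frac{\partial}{\partial \theta_j} \, \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
&= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) \frac{\partial}{\partial \theta_j} h_\theta(x^{(i)}) \\
&= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\end{aligned}
```

The factor $\frac{1}{2}$ in the cost function is there so that it cancels the 2 produced by differentiating the square.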
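Notes:
1. The parameters $\theta_j$ must be updated simultaneously for every $j = 0, \dots, n$.
2. On choosing $\alpha$:
① If $\alpha$ is too small, gradient descent converges very slowly.
② If $\alpha$ is too large, gradient descent may fail to converge, or may even diverge.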
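The effect of the learning rate can be seen on a small example. The following is a minimal sketch, assuming a tiny synthetic data set (not from the course) generated exactly by y = 1 + 2x. The three values in `alphas` are illustrative choices for this data only; which values count as "too small" or "too large" depends on the data.

```octave
% Hypothetical toy data: y = 1 + 2*x with no noise
x = [1; 2; 3; 4];
y = 1 + 2 * x;
m = length(y);
X = [ones(m, 1), x];           % prepend the intercept column x0 = 1

alphas = [0.001, 0.1, 0.5];    % too small, reasonable, too large (for this data)
num_iters = 200;
final_cost = zeros(size(alphas));

for k = 1:numel(alphas)
    theta = zeros(2, 1);
    for iter = 1:num_iters
        % One vectorized step updates every theta_j simultaneously
        theta = theta - alphas(k) / m * (X' * (X * theta - y));
    end
    final_cost(k) = sum((X * theta - y) .^ 2) / (2 * m);
    fprintf('alpha = %.3f, final cost = %g\n', alphas(k), final_cost(k));
end
```

With the smallest rate the cost is still far from zero after 200 iterations, with the middle rate it is close to zero, and with the largest rate it grows well beyond the starting cost of 20.5.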
III. Programming Assignment
This assignment consists of the following 8 files:
warmUpExercise.m
plotData.m
gradientDescent.m
computeCost.m
gradientDescentMulti.m
computeCostMulti.m
featureNormalize.m
normalEqn.m
1.warmUpExercise.m

function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
%   A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix

A = [];
% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
%               In octave, we return values by defining which variables
%               represent the return values (at the top of the file)
%               and then set them accordingly.

A = eye(5);

% ===========================================

end

2.plotData.m

function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

figure; % open a new figure window

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

plot(x, y, 'rx', 'MarkerSize', 10);
ylabel('Profit in $10,000s');
xlabel('Population of City in 10,000s');

% ============================================================

end

3.gradientDescent.m

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    tempTheta = theta;
    theta(1) = tempTheta(1) - alpha / m * sum(X * tempTheta - y);
    theta(2) = tempTheta(2) - alpha / m * sum((X * tempTheta - y) .* X(:,2));

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end

4.computeCost.m

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

temp = sum(((X * theta - y).^2));
J = 1 / (2*m) * temp;

% ============================================================

end
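The cost formula can be checked by hand on a very small data set. This is only a sketch: the data and the anonymous function `cost` are hypothetical and are not part of the assignment files. It applies the same expression as computeCost.m.

```octave
% Hypothetical toy data: three examples lying exactly on y = x
X = [1 1; 1 2; 1 3];      % first column is the intercept term x0 = 1
y = [1; 2; 3];
m = length(y);

% Same expression as in computeCost.m, wrapped in an anonymous function
cost = @(theta) sum((X * theta - y) .^ 2) / (2 * m);

J_zero = cost([0; 0]);    % predictions are all 0: (1 + 4 + 9) / (2 * 3)
J_fit  = cost([0; 1]);    % predictions equal y, so the cost is 0

fprintf('J(0,0) = %.4f, J(0,1) = %.4f\n', J_zero, J_fit);
```

With theta = [0; 0] the squared errors are 1, 4 and 9, so the cost is 14/6; with theta = [0; 1] the line passes through every point and the cost is zero.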

5.gradientDescentMulti.m

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %

    tempTheta = theta;
    for i = 1 : size(X,2)
        theta(i) = tempTheta(i) - alpha / m * sum((X * tempTheta - y) .* X(:,i));
    end

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);

end

end
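The per-parameter loop above can also be written as a single matrix expression, which updates all parameters simultaneously without a temporary copy. The sketch below compares one step of both forms on hypothetical toy data; the variable names `theta_loop` and `theta_vec` are made up for this illustration.

```octave
% Hypothetical toy data: 3 examples, intercept column plus 2 features
X = [1 0.5 -1; 1 -0.5 0; 1 0 1];
y = [1; 2; 3];
m = length(y);
alpha = 0.1;
theta0 = zeros(size(X, 2), 1);

% One step using the per-parameter loop, as in gradientDescentMulti.m
theta_loop = theta0;
for i = 1:size(X, 2)
    theta_loop(i) = theta0(i) - alpha / m * sum((X * theta0 - y) .* X(:, i));
end

% The same step written as one vectorized update
theta_vec = theta0 - alpha / m * (X' * (X * theta0 - y));

fprintf('largest difference: %g\n', max(abs(theta_loop - theta_vec)));
```

Row j of `X' * (X * theta0 - y)` is the sum over the examples of the error times feature j, which is exactly what the loop body computes for each `i`.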

6.computeCostMulti.m

function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

J = 1 / (2*m) * sum(((X * theta - y).^2));

% =========================================================================

end

7.featureNormalize.m

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%

for i = 1 : size(X,2)
    mu(i) = mean(X(:,i));
    sigma(i) = std(X(:,i));
    X_norm(:,i) = (X(:,i) - mu(i)) / sigma(i);
end

% ============================================================

end
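The returned `mu` and `sigma` matter because any new example must be scaled with the same values before a prediction is made. The following sketch uses a loop-free form of the normalization and then applies it to a new input. The data and `x_new` are hypothetical, and `bsxfun` is used instead of implicit broadcasting only so that the code also runs on older Octave and MATLAB versions.

```octave
% Hypothetical data: each row is an example, each column a feature
X = [2104 3; 1600 3; 2400 4; 1416 2];

mu = mean(X, 1);         % 1 x n row vector of column means
sigma = std(X, 0, 1);    % 1 x n row vector of column standard deviations

% Subtract the mean and divide by the standard deviation, column by column
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% A new example has to be scaled with the training mu and sigma
x_new = [1650 3];
x_new_norm = (x_new - mu) ./ sigma;

disp(mean(X_norm, 1));   % close to 0 for every column
disp(std(X_norm, 0, 1)); % 1 for every column
```

This form assumes that no feature is constant; a column with zero standard deviation would cause a division by zero, just as in the loop version.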

8.normalEqn.m

function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
%   NORMALEQN(X,y) computes the closed-form solution to linear
%   regression using the normal equations.

theta = zeros(size(X, 2), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%

% ---------------------- Sample Solution ----------------------

theta = (X' * X) \ X' * y;

% -------------------------------------------------------------

% ============================================================

end
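As a quick sanity check, the normal equation should recover the generating parameters exactly when the data contain no noise. The sketch below assumes hypothetical data generated by y = 1 + 2x and shows two equivalent ways of writing the solution. The variable names `theta_ne` and `theta_pinv` are made up for this illustration.

```octave
% Hypothetical noiseless data: y = 1 + 2*x
x = [1; 2; 3; 4];
y = 1 + 2 * x;
X = [ones(length(y), 1), x];    % prepend the intercept column x0 = 1

% Solve (X'X) theta = X'y; the parentheses make the solver work on a vector
theta_ne = (X' * X) \ (X' * y);

% Pseudo-inverse form, which still returns a result when X'X is singular
theta_pinv = pinv(X' * X) * X' * y;

disp(theta_ne');
disp(theta_pinv');
```

Both results should be close to [1; 2]. Unlike gradient descent, this approach needs neither a learning rate nor feature normalization.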
