Linear Regression

Source: Internet  Editor: 程序博客网  Time: 2024/05/21 09:33

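Andrew Ng has three deep learning (DL) tutorials online:

1  The one on the wiki

2  The text tutorial that this post follows, along with Ng's Machine Learning course (NetEase, Chinese version)

3  Ng's openclassroom videos, with exercises in Chinese and in English. This is the one used most often!

Rachel Zhang's notes, part 1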


The code used in this post is here. The video is here.


The cost function J is also called the objective function.
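For reference, the post does not write J out. Assuming the exercise uses the usual squared-error objective from Ng's tutorial, with x^{(j)} denoting the j-th column of X and y^{(j)} its target, the cost and its gradient would be:

```latex
J(\theta) = \frac{1}{2} \sum_{j=1}^{m} \left( \theta^{\top} x^{(j)} - y^{(j)} \right)^{2},
\qquad
\nabla_{\theta} J(\theta) = \sum_{j=1}^{m} \left( \theta^{\top} x^{(j)} - y^{(j)} \right) x^{(j)}
```

Here m is the number of training examples. The factor 1/2 only simplifies the gradient and does not change the minimizing theta.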


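The file ex1_linreg.m in the ex1/ directory is the script for the linear regression exercise. Its main flow is as follows:

1  Load the dataset from housing.data (this file is also in the ex1/ directory).

2  Split the dataset into a training set and a test set. The inputs to the learning algorithm are stored in train.X and test.X; each column holds the features of one house (for example, its area) plus an intercept feature of 1. The target values are the house prices, stored in train.y and test.y for the training and test sets respectively. You use the training set to find the theta that best predicts house prices, then verify its performance on the test set.

3  Call minFunc. minFunc is the optimizer from the minFunc package; it looks for the best theta by minimizing the cost function J implemented in linear_regression.m. That objective is what we need to implement, and it must compute both the cost and the gradient.

4  Print the training and test errors. Optionally, plot the predicted and actual prices for the test examples.

ex1_linreg.m passes the linear_regression function (defined in linear_regression.m) to minFunc, so you must implement that function. Its arguments are the parameters theta, the training set X, and the training targets y.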
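The cost and gradient computation that linear_regression has to provide can be sketched as follows. This is one possible loop-based completion, not the official solution, and it assumes the squared-error objective (half the sum of squared prediction errors) with one example per column of X:

```matlab
function [f,g] = linear_regression(theta, X, y)
  % Sketch of one possible completion; not the official solution.
  %   theta - n-by-1 parameter vector
  %   X     - n-by-m matrix, one example per column
  %   y     - 1-by-m row vector of targets
  m = size(X,2);
  f = 0;
  g = zeros(size(theta));
  for j = 1:m
    err = theta' * X(:,j) - y(j);  % prediction error for example j
    f = f + 0.5 * err^2;           % accumulate the squared-error cost
    g = g + err * X(:,j);          % accumulate the gradient contribution
  end
end
```

As a quick sanity check, calling [f,g] = linear_regression([1;2], [1 1; 0 1], [0 1]) gives errors of 1 and 2 for the two examples, so f is 2.5 and g is [3;2].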



Before the exercise is completed, the output looks like the figure below:




After it is completed, the output looks like the figure below:


The training and test RMS errors are between 4.5 and 5.
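For reference, the RMS error printed by the script is computed as sqrt(mean((predicted_prices - actual_prices).^2)). Written as a formula, with x^{(j)} the j-th example in the set being evaluated and m the number of examples in that set, this corresponds to:

```latex
\mathrm{RMS} = \sqrt{ \frac{1}{m} \sum_{j=1}^{m} \left( \theta^{\top} x^{(j)} - y^{(j)} \right)^{2} }
```

Since the prices are in units of $1000s (see the plot's y-axis label), an RMS of 4.5 to 5 means the predictions are off by roughly $4,500 to $5,000 in the root-mean-square sense.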


The file linear_regression.m is shown below (the required code is not yet implemented):

function [f,g] = linear_regression(theta, X,y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The target value for each example.  y(j) is the target for example j.
  %

  m=size(X,2);
  n=size(X,1);

  f=0;
  g=zeros(size(theta));

  %
  % TODO:  Compute the linear regression objective by looping over the examples in X.
  %        Store the objective function value in 'f'.
  %
  % TODO:  Compute the gradient of the objective with respect to theta by looping over
  %        the examples in X and adding up the gradient for each example.  Store the
  %        computed gradient in 'g'.

%%% YOUR CODE HERE %%%

-------------------------------------------------------------------------------------------

The file ex1_linreg.m (before the exercise is implemented):

%
%This exercise uses a data from the UCI repository:
% Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
% http://archive.ics.uci.edu/ml
% Irvine, CA: University of California, School of Information and Computer Science.
%
%Data created by:
% Harrison, D. and Rubinfeld, D.L.
% ''Hedonic prices and the demand for clean air''
% J. Environ. Economics & Management, vol.5, 81-102, 1978.
%
addpath ../common
addpath ../common/minFunc_2012/minFunc
addpath ../common/minFunc_2012/minFunc/compiled

% Load housing data from file.
data = load('housing.data');
data=data'; % put examples in columns

% Include a row of 1s as an additional intercept feature.
data = [ ones(1,size(data,2)); data ];

% Shuffle examples.
data = data(:, randperm(size(data,2)));

% Split into train and test sets
% The last row of 'data' is the median home price.
train.X = data(1:end-1,1:400);
train.y = data(end,1:400);

test.X = data(1:end-1,401:end);
test.y = data(end,401:end);

m=size(train.X,2);
n=size(train.X,1);

% Initialize the coefficient vector theta to random values.
theta = rand(n,1);

% Run the minFunc optimizer with linear_regression.m as the objective.
%
% TODO:  Implement the linear regression objective and gradient computations
% in linear_regression.m
%
tic;
options = struct('MaxIter', 200);
theta = minFunc(@linear_regression, theta, options, train.X, train.y);
fprintf('Optimization took %f seconds.\n', toc);

% Run minFunc with linear_regression_vec.m as the objective.
%
% TODO:  Implement linear regression in linear_regression_vec.m
% using MATLAB's vectorization features to speed up your code.
% Compare the running time for your linear_regression.m and
% linear_regression_vec.m implementations.
%
% Uncomment the lines below to run your vectorized code.
%Re-initialize parameters
%theta = rand(n,1);
%tic;
%theta = minFunc(@linear_regression_vec, theta, options, train.X, train.y);
%fprintf('Optimization took %f seconds.\n', toc);

% Plot predicted prices and actual prices from training set.
actual_prices = train.y;
predicted_prices = theta'*train.X;

% Print out root-mean-squared (RMS) training error.
train_rms=sqrt(mean((predicted_prices - actual_prices).^2));
fprintf('RMS training error: %f\n', train_rms);

% Print out test RMS error
actual_prices = test.y;
predicted_prices = theta'*test.X;
test_rms=sqrt(mean((predicted_prices - actual_prices).^2));
fprintf('RMS testing error: %f\n', test_rms);

% Plot predictions on test data.
plot_prices=true;
if (plot_prices)
  [actual_prices,I] = sort(actual_prices);
  predicted_prices=predicted_prices(I);
  plot(actual_prices, 'rx');
  hold on;
  plot(predicted_prices,'bx');
  legend('Actual Price', 'Predicted Price');
  xlabel('House #');
  ylabel('House price ($1000s)');
end
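The script's second TODO asks for a vectorized linear_regression_vec.m. A minimal sketch of what that file might contain is shown below. It assumes the same squared-error objective (half the sum of squared prediction errors) and the same layout as the loop version, with X holding one example per column and y a row vector. It is an illustration, not the official solution:

```matlab
function [f,g] = linear_regression_vec(theta, X, y)
  % Vectorized sketch; not the official solution.
  %   theta - n-by-1 parameter vector
  %   X     - n-by-m matrix, one example per column
  %   y     - 1-by-m row vector of targets
  err = theta' * X - y;     % 1-by-m row of prediction errors
  f = 0.5 * (err * err');   % half the sum of squared errors
  g = X * err';             % n-by-1 gradient of f with respect to theta
end
```

Replacing the per-example loop with matrix products lets MATLAB evaluate all m examples in a single operation, which is usually much faster. The tic/toc timings in the script make it easy to compare the two versions.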

