UFLDL softmax_regression.m

Reached Maximum Number of Iterations
Optimization took 69.236617 seconds.
Training accuracy: 94.4%
Test accuracy: 92.1%

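function [f,g] = softmax_regression(theta, X,y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %       In minFunc, theta is reshaped to a long vector.  So we need to
  %       resize it to an n-by-(num_classes-1) matrix.
  %       Recall that we assume theta(:,num_classes) = 0.
  %
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The label for each example.  y(j) is the j'th example's label.
  %
  m=size(X,2);
  n=size(X,1);

  % theta is a vector;  need to reshape to n x (num_classes-1).
  theta=reshape(theta, n, []);
  num_classes=size(theta,2)+1;

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

  %
  % TODO:  Compute the softmax objective function and gradient using vectorized code.
  %        Store the objective function value in 'f', and the gradient in 'g'.
  %        Before returning g, make sure you form it back into a vector with g=g(:);
  %
  %%%% YOUR CODE HERE %%%

  % Append the fixed zero column for the last class (n rows, not a hard-coded 785).
  theta = [theta, zeros(n,1)];

  % Class scores, num_classes x m. Subtracting each column's maximum leaves the
  % softmax unchanged but keeps exp() from overflowing.
  scores = theta' * X;
  scores = bsxfun(@minus, scores, max(scores, [], 1));

  % Column-normalize so that predict(k,j) = P(y(j) = k | X(:,j)).
  predict = exp(scores);
  predict = bsxfun(@rdivide, predict, sum(predict, 1));

  % Objective: negative log-likelihood of the true labels.
  I = sub2ind(size(predict), y, 1:m);
  f = f - sum(log(predict(I)));

  % Gradient: indicator matrix minus predicted probabilities, projected onto X.
  % The explicit size keeps the indicator num_classes x m even if the last
  % label does not occur in y.
  delta = full(sparse(y, 1:m, 1, num_classes, m)) - predict;
  g = -X * delta';

  % Drop the column of the fixed last class (num_classes-1, not a hard-coded 9).
  g = g(:, 1:num_classes-1);

  g=g(:); % make gradient a vector for minFunc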
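
For reference, the objective and gradient that the code above computes can be written out as follows. This is a sketch in my own notation, not taken from the exercise text: K stands for num_classes, x^{(j)} for the j'th column of X, and \theta_k for the k'th column of theta, with \theta_K fixed at zero.

```latex
% Predicted probability (the entry predict(k,j) in the code)
P\big(y^{(j)} = k \mid x^{(j)}; \theta\big)
  = \frac{\exp\big(\theta_k^{\top} x^{(j)}\big)}
         {\sum_{l=1}^{K} \exp\big(\theta_l^{\top} x^{(j)}\big)},
  \qquad \theta_K = 0

% Objective f: negative log-likelihood over the m examples
J(\theta) = - \sum_{j=1}^{m} \sum_{k=1}^{K}
  1\{y^{(j)} = k\} \, \log P\big(y^{(j)} = k \mid x^{(j)}; \theta\big)

% Gradient g, one column per free class k = 1, ..., K-1
\nabla_{\theta_k} J(\theta) = - \sum_{j=1}^{m} x^{(j)}
  \Big( 1\{y^{(j)} = k\} - P\big(y^{(j)} = k \mid x^{(j)}; \theta\big) \Big)
```

The indicator 1{y^{(j)} = k} corresponds to the sparse label matrix in the code, so the gradient for all classes at once is -X * delta', and only the first K-1 columns are returned because \theta_K is not a free parameter.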
