Deep Learning by Andrew Ng --- Softmax regression


This is a programming exercise from UFLDL.

Cost function and gradient with weight decay added (softmax regression has an unusual property: its parameter set is "redundant"):

Cost function: $J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right] + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^2$
Gradient:

$\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ x^{(i)} \left( 1\{y^{(i)} = j\} - p(y^{(i)} = j \mid x^{(i)}; \theta) \right) \right] + \lambda \theta_j$

$p(y^{(i)} = j \mid x^{(i)}; \theta)$ is the h computed in Step 2 of the UFLDL exercise.
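The "redundant" parameter set can be made concrete with a short derivation. Subtracting any fixed vector from every $\theta_j$ leaves the predicted probabilities unchanged. The vector $\psi$ below is a symbol introduced only for this illustration.

```latex
p(y^{(i)} = j \mid x^{(i)}; \theta)
  = \frac{e^{(\theta_j - \psi)^T x^{(i)}}}{\sum_{l=1}^{k} e^{(\theta_l - \psi)^T x^{(i)}}}
  = \frac{e^{\theta_j^T x^{(i)}} \, e^{-\psi^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}} \, e^{-\psi^T x^{(i)}}}
  = \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}
```

Without the weight decay term, the minimizer of $J(\theta)$ is therefore not unique. With $\lambda > 0$ the cost becomes strictly convex, so the minimizer is unique.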

Using the bsxfun function:

To prevent overflow, simply subtract some large constant value from each of the $\theta_j^T x^{(i)}$ terms before computing the exponential:
% M is the matrix as described in the text
M = bsxfun(@minus, M, max(M, [], 1));
Use the following code to compute the hypothesis:
% M is the matrix as described in the text
M = bsxfun(@rdivide, M, sum(M));
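The two bsxfun steps above can be tried on a small matrix. The following is a minimal sketch, not part of the exercise: the score matrix `M` and the names `naive`, `S` and `stable` are made up for illustration. Subtracting the column maximum is safe because a constant added to every score in a column cancels between numerator and denominator.

```matlab
% Hypothetical toy scores: 3 classes (rows) x 2 examples (columns).
M = [1000 1; 1001 2; 1002 3];

% Naive softmax: exp(1000) overflows to Inf, and Inf/Inf yields NaN.
naive = exp(M);
naive = bsxfun(@rdivide, naive, sum(naive, 1));

% Stable softmax: shift each column so that its largest entry is 0.
S = bsxfun(@minus, M, max(M, [], 1));
stable = exp(S);
stable = bsxfun(@rdivide, stable, sum(stable, 1));

disp(any(isnan(naive(:))));   % does the naive version contain NaN?
disp(any(isnan(stable(:))));  % does the stable version contain NaN?
disp(sum(stable, 1));         % each column should sum to 1
```

Both columns of `M` differ only by a constant shift, so the stable version gives them the same probabilities.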

Exercise solutions (try it yourself first, then use these as a reference):

softmaxCost.m:

M = theta * data;                        % M(j, i) = theta(j, :) * x(i)
M = bsxfun(@minus, M, max(M, [], 1));    % subtract each column's max to prevent overflow
h = exp(M);
h = bsxfun(@rdivide, h, sum(h));         % h(j, i) = p(y(i) = j | x(i); theta)
cost = -1/numCases * sum(sum(groundTruth .* log(h))) + lambda/2 * sum(sum(theta.^2));
thetagrad = -1/numCases * ((groundTruth - h) * data') + lambda * theta;
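One way to gain confidence in the gradient expression is to compare it with central finite differences on a tiny problem. The script below is a sketch under stated assumptions: the toy sizes, the deterministic `data` and `theta`, and the helper names `shift`, `normalize`, `softmaxProb` and `costFn` are all hypothetical. `groundTruth` is assumed to be a one-hot matrix with one column per example.

```matlab
% Hypothetical toy problem: 3 classes, 4 features, 5 examples.
numClasses = 3; inputSize = 4; numCases = 5;
lambda = 1e-4;
data   = reshape(sin(1:inputSize*numCases), inputSize, numCases);
labels = [1 2 3 1 2];
theta  = 0.01 * reshape(cos(1:numClasses*inputSize), numClasses, inputSize);

% One-hot indicator matrix: groundTruth(j, i) = 1 iff labels(i) == j.
groundTruth = full(sparse(labels, 1:numCases, 1, numClasses, numCases));

% Same steps as in softmaxCost.m, wrapped as anonymous functions.
shift       = @(M) bsxfun(@minus, M, max(M, [], 1));
normalize   = @(E) bsxfun(@rdivide, E, sum(E, 1));
softmaxProb = @(t) normalize(exp(shift(t * data)));
costFn      = @(t) -1/numCases * sum(sum(groundTruth .* log(softmaxProb(t)))) ...
                   + lambda/2 * sum(sum(t.^2));

% Analytic gradient, as in the solution above.
h = softmaxProb(theta);
thetagrad = -1/numCases * ((groundTruth - h) * data') + lambda * theta;

% Central finite differences, one parameter at a time.
epsilon = 1e-4;
numgrad = zeros(size(theta));
for p = 1:numel(theta)
    step = zeros(size(theta));
    step(p) = epsilon;
    numgrad(p) = (costFn(theta + step) - costFn(theta - step)) / (2 * epsilon);
end

relErr = norm(numgrad(:) - thetagrad(:)) / norm(numgrad(:) + thetagrad(:));
fprintf('relative difference: %g\n', relErr);
```

A very small relative difference suggests the analytic gradient matches the cost. A value near 1 usually points to a sign or transpose error.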

softmaxPredict.m:

[~, pred] = max(theta * data, [], 1);    % pred(i) = row index (class) of the largest score for example i
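Prediction needs neither the exponential nor the normalization. Softmax is monotonic in the scores, so the class with the largest $\theta_j^T x$ also has the largest probability. Here is a small sketch with made-up values for `theta` and `data`:

```matlab
% Hypothetical parameters: 3 classes x 2 features.
theta = [ 1  0;
          0  1;
         -1 -1];
% Hypothetical inputs: 2 features x 3 examples (one column per example).
data  = [ 2 -1 0.1;
          0 -3 5  ];

scores = theta * data;               % numClasses x numCases
[~, pred] = max(scores, [], 1);      % second output of max is the row index
disp(pred);
```

The first output of `max` is the largest score itself. Only the second output, the index, is the predicted class label.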

0 0