UFLDL Exercise: Learning color features with Sparse Autoencoders
This is the UFLDL Linear Decoders exercise.
Building on the code from the previous exercises, modify sparseAutoencoderLinearCost.m.
sparseAutoencoderLinearCost.m
```matlab
function [cost,grad] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
                                                   lambda, sparsityParam, beta, data)
% visibleSize:   the number of input units
% hiddenSize:    the number of hidden units
% lambda:        weight decay parameter
% sparsityParam: the desired average activation for the hidden units (denoted in the
%                lecture notes by the Greek letter rho, which looks like a lower-case "p")
% beta:          weight of the sparsity penalty term
% data:          our visibleSize x 10000 matrix containing the training data,
%                so data(:,i) is the i-th training example

% The input theta is a vector (because minFunc expects the parameters to be a vector).
% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that it
% follows the notation convention of the lecture notes.
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

% Cost and gradient variables (your code needs to compute these values).
% Here, we initialize them to zeros.
W1grad = zeros(size(W1));
W2grad = zeros(size(W2));
b1grad = zeros(size(b1));
b2grad = zeros(size(b2));

%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute the cost/optimization objective J_sparse(W,b) for the
% sparse autoencoder, and the corresponding gradients W1grad, W2grad, b1grad,
% b2grad, using backpropagation.  W1grad should have the same dimensions as W1
% and hold the partial derivatives of J_sparse(W,b) with respect to W1; that is,
% W1grad should equal the term [(1/m) \Delta W^{(1)} + \lambda W^{(1)}] in the
% last block of pseudo-code in Section 2.2 of the lecture notes (and similarly
% for W2grad, b1grad, b2grad).  Stated differently, if we were using batch
% gradient descent, the update to W1 would be W1 := W1 - alpha * W1grad.

n_data = size(data,2);

% 1. Forward propagation to get the activations.  Note the linear decoder:
%    no sigmoid is applied to the output layer.
hidden_activations = sigmoid(W1 * data + repmat(b1,1,n_data));
out_activations = W2 * hidden_activations + repmat(b2,1,n_data);

% 2. Backpropagation to get the residuals (deltas).  Because the output units
%    are linear, the output residual has no f'(z) factor.
out_residual = -(data - out_activations);
avg_activations = sum(hidden_activations,2) ./ n_data;
KL = beta*(-sparsityParam./avg_activations + (1-sparsityParam)./(1-avg_activations));
KL = repmat(KL,1,n_data);
hidden_residual = (W2'*out_residual + KL).*(hidden_activations.*(1-hidden_activations));

% 3. Partial derivatives: accumulate delta W and delta b, average over the
%    batch, and add the weight decay term.
W2grad = W2grad + out_residual * hidden_activations';
b2grad = b2grad + sum(out_residual,2);
W1grad = W1grad + hidden_residual * data';
b1grad = b1grad + sum(hidden_residual,2);
W1grad = W1grad/n_data + lambda*W1;
W2grad = W2grad/n_data + lambda*W2;
b1grad = b1grad/n_data;
b2grad = b2grad/n_data;

% 4. No explicit update of W1, W2, b1, b2 is needed here; minFunc performs the
%    parameter updates from the returned gradient.
% alpha = 0.01;
% W1 = W1 - alpha * W1grad;
% W2 = W2 - alpha * W2grad;
% b1 = b1 - alpha * b1grad;
% b2 = b2 - alpha * b2grad;

% 5. Cost: average squared reconstruction error + weight decay + KL sparsity
%    penalty.
cost = out_activations - data;
cost = sum(cost(:).^2)/2/n_data + (lambda/2)*(sum(W1(:).^2) + sum(W2(:).^2)) + ...
       beta*sum(sparsityParam .* log(sparsityParam./avg_activations) + ...
                (1-sparsityParam) .* log((1-sparsityParam)./(1-avg_activations)));

%-------------------------------------------------------------------
% After computing the cost and gradient, we convert the gradients back to a
% vector format (suitable for minFunc) by unrolling the gradient matrices.
grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];

end

%-------------------------------------------------------------------
% Implementation of the sigmoid function, applied element-wise: given a
% vector (z1, z2, z3) it returns (f(z1), f(z2), f(z3)).
function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
```
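As a cross-check of the derivation above, here is a minimal NumPy sketch of the same cost/gradient computation (the function and variable names are my own, not part of the exercise code), verified against a numerical gradient on a tiny random problem. The key point it demonstrates is the linear decoder: `delta3 = a3 - data` with no `f'(z3)` factor.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_ae_linear_cost(theta, visible, hidden, lam, rho, beta, data):
    """Sparse-autoencoder cost/gradient with a linear decoder (NumPy port)."""
    hv = hidden * visible
    m = data.shape[1]
    # Unpack parameters in the same column-major order as the MATLAB code.
    W1 = theta[:hv].reshape(hidden, visible, order='F')
    W2 = theta[hv:2*hv].reshape(visible, hidden, order='F')
    b1 = theta[2*hv:2*hv + hidden]
    b2 = theta[2*hv + hidden:]

    # Forward pass: sigmoid hidden layer, *linear* output layer.
    a2 = sigmoid(W1 @ data + b1[:, None])
    a3 = W2 @ a2 + b2[:, None]

    # Backward pass; the linear output means delta3 carries no f'(z3) factor.
    rho_hat = a2.mean(axis=1)
    kl_grad = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
    d3 = a3 - data
    d2 = (W2.T @ d3 + kl_grad[:, None]) * a2 * (1 - a2)

    W1g = d2 @ data.T / m + lam * W1
    W2g = d3 @ a2.T / m + lam * W2
    b1g = d2.sum(axis=1) / m
    b2g = d3.sum(axis=1) / m

    cost = (np.sum(d3 ** 2) / (2 * m)
            + lam / 2 * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
            + beta * np.sum(rho * np.log(rho / rho_hat)
                            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))))
    grad = np.concatenate([W1g.ravel(order='F'), W2g.ravel(order='F'), b1g, b2g])
    return cost, grad

# Numerical gradient check on a tiny random problem.
rng = np.random.default_rng(0)
visible, hidden, m = 8, 5, 20
args = (1e-4, 0.05, 3.0)   # lambda, sparsityParam, beta (illustrative values)
theta = 0.1 * rng.standard_normal(2 * visible * hidden + visible + hidden)
data = rng.standard_normal((visible, m))

cost, grad = sparse_ae_linear_cost(theta, visible, hidden, *args, data)
eps = 1e-5
num_grad = np.zeros_like(theta)
for i in range(theta.size):
    tp, tm = theta.copy(), theta.copy()
    tp[i] += eps
    tm[i] -= eps
    num_grad[i] = (sparse_ae_linear_cost(tp, visible, hidden, *args, data)[0]
                   - sparse_ae_linear_cost(tm, visible, hidden, *args, data)[0]) / (2 * eps)

max_err = np.max(np.abs(num_grad - grad))
print(max_err)  # tiny if backprop matches the numerical gradient
```

If the analytic gradient is wrong anywhere (for example, if the `f'(z3)` factor is mistakenly kept on the output layer), `max_err` jumps by many orders of magnitude, which is why the UFLDL exercises recommend this kind of check before training.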
Result: (figure of the learned color features omitted)