matrix/vector derivatives
Source: Internet  Published: 软件质量定义  Editor: 程序博客网  Time: 2024/06/05 10:22
Gradients for vectorized operations
The above sections were concerned with single variables, but all concepts extend in a straightforward manner to matrix and vector operations. However, one must pay closer attention to dimensions and transpose operations.
Matrix-Matrix multiply gradient. Possibly the trickiest operation is matrix-matrix multiplication (which generalizes all matrix-vector and vector-vector multiply operations):
    import numpy as np

    # forward pass
    W = np.random.randn(5, 10)
    X = np.random.randn(10, 3)
    D = W.dot(X)

    # now suppose we had the gradient on D from above in the circuit
    dD = np.random.randn(*D.shape)  # same shape as D
    dW = dD.dot(X.T)  # .T gives the transpose of the matrix
    dX = W.T.dot(dD)
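To see where the two gradient expressions in the snippet come from, one can write the product element by element and apply the chain rule. The sketch below assumes a scalar loss L (a symbol not named in the original) and reads dD, dW and dX as the gradients of L with respect to D, W and X.

```latex
D_{ij} = \sum_{k} W_{ik} X_{kj}

\frac{\partial L}{\partial W_{ik}}
  = \sum_{j} \frac{\partial L}{\partial D_{ij}} \frac{\partial D_{ij}}{\partial W_{ik}}
  = \sum_{j} dD_{ij} X_{kj}
  = (dD \, X^{T})_{ik}

\frac{\partial L}{\partial X_{kj}}
  = \sum_{i} \frac{\partial L}{\partial D_{ij}} \frac{\partial D_{ij}}{\partial X_{kj}}
  = \sum_{i} W_{ik} \, dD_{ij}
  = (W^{T} dD)_{kj}
```

The index that is summed away in each line (j for dW, i for dX) is the one that does not appear in the shape of the result, which is why the transposes land where they do.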
Tip: use dimension analysis! Note that you do not need to remember the expressions for dW and dX because they are easy to re-derive based on dimensions. For instance, we know that the gradient on the weights dW must be of the same size as W after it is computed, and that it must depend on the matrix multiplication of X and dD (as is the case when both X and W are single numbers and not matrices). As long as the dimensions involved are distinct, there is exactly one way of achieving this so that the dimensions work out. For example, X is of size [10 x 3] and dD is of size [5 x 3], so if we want dW to have the same shape as W, which is [5 x 10], then the only way of achieving this is with dD.dot(X.T), as shown above.
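The dimension argument can also be checked mechanically. The following is a minimal sketch, not part of the original notes: it lists every product of dD and X (with or without transposes) and keeps those that are both well defined and shaped like W. The helper name matching_products and the candidates dictionary are hypothetical, introduced only for illustration.

```python
import numpy as np

W = np.random.randn(5, 10)
X = np.random.randn(10, 3)
dD = np.random.randn(5, 3)  # same shape as D = W.dot(X)

# Every ordering and transpose combination of the two factors.
candidates = {
    "dD @ X": (dD, X),
    "dD @ X.T": (dD, X.T),
    "dD.T @ X": (dD.T, X),
    "dD.T @ X.T": (dD.T, X.T),
    "X @ dD": (X, dD),
    "X @ dD.T": (X, dD.T),
    "X.T @ dD": (X.T, dD),
    "X.T @ dD.T": (X.T, dD.T),
}


def matching_products(candidates, target_shape):
    """Return the names of products that are defined and have target_shape."""
    matches = []
    for name, (a, b) in candidates.items():
        inner_dims_agree = a.shape[1] == b.shape[0]
        if inner_dims_agree and (a.shape[0], b.shape[1]) == target_shape:
            matches.append(name)
    return matches


print(matching_products(candidates, W.shape))  # ['dD @ X.T']
```

With these particular sizes only one candidate survives. If two of the dimensions happened to be equal (for example square matrices), more than one product could have the right shape, and the shapes alone would no longer settle the question.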
Work with small, explicit examples. Some people may find it difficult at first to derive the gradient updates for some vectorized expressions. Our recommendation is to explicitly write out a minimal vectorized example, derive the gradient on paper and then generalize the pattern to its efficient, vectorized form.
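As one possible illustration of this advice, the sketch below writes the matrix-multiply gradient out with explicit loops on tiny matrices and compares the result with the vectorized expressions. The sizes, the seed and the names dW_loops and dX_loops are assumptions chosen for the example, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))
X = rng.standard_normal((3, 2))
dD = rng.standard_normal((2, 2))  # upstream gradient, same shape as W.dot(X)

# Explicit version: D[i, j] = sum_k W[i, k] * X[k, j], so every term
# W[i, k] * X[k, j] passes dD[i, j] back to both of its factors.
dW_loops = np.zeros_like(W)
dX_loops = np.zeros_like(X)
for i in range(W.shape[0]):
    for k in range(W.shape[1]):
        for j in range(X.shape[1]):
            dW_loops[i, k] += dD[i, j] * X[k, j]
            dX_loops[k, j] += dD[i, j] * W[i, k]

# Vectorized version of the same computation.
dW = dD.dot(X.T)
dX = W.T.dot(dD)

print(np.allclose(dW_loops, dW), np.allclose(dX_loops, dX))  # True True
```

Once the loop version agrees with the vectorized one on a small case, the loops can be dropped and the vectorized form used at full size.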
Erik Learned-Miller has also written up a longer related document on taking matrix/vector derivatives which you might find helpful. Find it here.