CS224d Assignment 1 Solutions, Part (3/4)
Source: Internet  Editor: 程序博客网  Time: 2024/06/16 07:08
I split my Assignment 1 solutions into 4 parts, covering questions 1, 2, 3, and 4 respectively. This part contains the solution to question 3.
3. word2vec (40 points + 5 bonus)
(a). (3 points) Assume you are given a predicted word vector
where
Hint: It will be helpful to use notation from question 2. For instance, letting
where
Solution: Let the dimension of the word vectors be
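For reference, a sketch of the standard derivation for part (a). The notation is an assumption and is not taken from the text above: $v_c \in \mathbb{R}^{d}$ is the predicted vector, $U = [u_1, \dots, u_W] \in \mathbb{R}^{d \times W}$ holds the "output" vectors as columns, $y$ is the one-hot label for the expected word $o$, and $\hat{y}$ is the softmax prediction.

```latex
\begin{aligned}
z &= U^{\top} v_c, \qquad \hat{y} = \mathrm{softmax}(z), \qquad
J = CE(y, \hat{y}) = -\log \hat{y}_o \\
\frac{\partial J}{\partial z} &= \hat{y} - y \\
\frac{\partial J}{\partial v_c} &= U(\hat{y} - y)
  = -u_o + \sum_{w=1}^{W} \hat{y}_w \, u_w
\end{aligned}
```

Under this orientation the gradient is a $d$-dimensional column vector, the same shape as $v_c$.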
(b). (3 points) As in the previous part, derive gradients for the “output” word vectors
Solution: As in (a), let
where
Therefore:
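The gradients for parts (a) and (b) can be checked numerically. The sketch below is not the author's code: the function names are hypothetical, the output vectors are stored one per row (so `U_rows[w]` plays the role of $u_w$), and it assumes the usual results $\partial J/\partial v_c = \sum_w (\hat{y}_w - y_w)\,u_w$ and $\partial J/\partial u_w = (\hat{y}_w - y_w)\,v_c$. It uses only the standard library to stay self-contained; the assignment code itself would normally use NumPy.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of scores."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_ce_cost_and_grads(v_c, U_rows, o):
    """Softmax cross-entropy cost and gradients for one (center, target) pair.

    v_c    : predicted vector, list of d floats
    U_rows : list of W output vectors, U_rows[w] is u_w (assumed layout)
    o      : index of the expected word
    """
    d = len(v_c)
    scores = [sum(u_i * v_i for u_i, v_i in zip(u, v_c)) for u in U_rows]
    y_hat = softmax(scores)
    cost = -math.log(y_hat[o])

    # delta = y_hat - y, where y is one-hot at index o
    delta = [p - (1.0 if w == o else 0.0) for w, p in enumerate(y_hat)]

    # dJ/dv_c = sum_w delta_w * u_w
    grad_v = [sum(delta[w] * U_rows[w][i] for w in range(len(U_rows)))
              for i in range(d)]
    # dJ/du_w = delta_w * v_c
    grad_U = [[delta[w] * v_c[i] for i in range(d)]
              for w in range(len(U_rows))]
    return cost, grad_v, grad_U
```

Comparing `grad_v` and `grad_U` against central finite differences of `cost` is a quick way to confirm a hand derivation before writing the vectorized version.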
(c). (6 points) Repeat part (a) and (b) assuming we are using the negative sampling loss for the predicted vector
where
After you’ve done this, describe with one sentence why this cost function is much more efficient to compute than the softmax-CE loss (you could provide a speed-up ratio, i.e. the runtime of the softmax-CE loss divided by the runtime of the negative sampling loss).
Note: the cost function here is the negative of what Mikolov et al. had in their original paper, because we are doing a minimization instead of maximization in our code.
Solution: Let the sampled
The reason (6) is faster to compute than (5) is:
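A sketch of the negative sampling case, under assumptions not confirmed by the text above: the loss is taken to be $J = -\log\sigma(u_o^{\top} v_c) - \sum_{k=1}^{K}\log\sigma(-u_k^{\top} v_c)$ with $K$ sampled negative words, which gives $\partial J/\partial v_c = (\sigma(u_o^{\top} v_c) - 1)\,u_o + \sum_k \sigma(u_k^{\top} v_c)\,u_k$, $\partial J/\partial u_o = (\sigma(u_o^{\top} v_c) - 1)\,v_c$, and $\partial J/\partial u_k = \sigma(u_k^{\top} v_c)\,v_c$. The function names are hypothetical. The efficiency argument is visible in the loop: only $K + 1$ output vectors are touched, whereas the softmax-CE loss needs all $W$ of them for the normalizer, so the speed-up is roughly $W / (K + 1)$.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def neg_sampling_cost_and_grads(v_c, u_o, neg_vectors):
    """Negative sampling cost and gradients for one (center, target) pair.

    v_c         : predicted vector, list of d floats
    u_o         : output vector of the expected word
    neg_vectors : list of K output vectors of the sampled negative words
    """
    s_o = sigmoid(dot(u_o, v_c))
    cost = -math.log(s_o)
    grad_v = [(s_o - 1.0) * x for x in u_o]      # contribution of the target
    grad_u_o = [(s_o - 1.0) * x for x in v_c]

    grad_negs = []
    for u_k in neg_vectors:                      # only K vectors, not all W
        s_k = sigmoid(dot(u_k, v_c))
        cost -= math.log(1.0 - s_k)              # log sigma(-x) = log(1 - sigma(x))
        grad_v = [g + s_k * x for g, x in zip(grad_v, u_k)]
        grad_negs.append([s_k * x for x in v_c])
    return cost, grad_v, grad_u_o, grad_negs
```

If the same word is sampled more than once, its entries in `grad_negs` have to be summed when they are written back into the gradient of the full output matrix.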
(d). (8 points) Derive gradients for all of the word vectors for skip-gram and CBOW given the previous parts and given a set of context words
Hint: feel free to use
Recall that for skip-gram, the cost for a context centered around
where
CBOW is slightly different. Instead of using
then the CBOW cost is
Note: To be consistent with the
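Solution: Let
Answer for skip-gram:
where
Answer for CBOW:
P.S. This answer seems very simple, so why is it worth 8 points?
(e)(f)(g)(h). See the code; omitted here.
Attached is a plot from training, i.e. the figure that appears after I finish running q3_run.py. There is a discussion on reddit about how to judge whether this plot is reasonable: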
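For reference, a sketch of how the two sets of gradients are usually written. The notation is an assumption rather than something recovered from the text above: $F(o, v)$ stands for either per-pair cost (softmax-CE or negative sampling), $m$ is the context half-width, $c$ indexes the center word, $v_k$ are the "input" vectors, $U$ is the matrix of "output" vectors, and the CBOW input is taken to be the sum (not the average) of the context vectors.

```latex
\begin{aligned}
&\textbf{skip-gram:}\quad
J_{\text{skip-gram}} = \sum_{-m \le j \le m,\; j \ne 0} F(w_{c+j}, v_c) \\
&\frac{\partial J_{\text{skip-gram}}}{\partial U}
  = \sum_{-m \le j \le m,\; j \ne 0} \frac{\partial F(w_{c+j}, v_c)}{\partial U},
\qquad
\frac{\partial J_{\text{skip-gram}}}{\partial v_c}
  = \sum_{-m \le j \le m,\; j \ne 0} \frac{\partial F(w_{c+j}, v_c)}{\partial v_c},
\qquad
\frac{\partial J_{\text{skip-gram}}}{\partial v_k} = 0 \;\; (k \ne c) \\[6pt]
&\textbf{CBOW:}\quad
\hat{v} = \sum_{-m \le j \le m,\; j \ne 0} v_{c+j},
\qquad
J_{\text{CBOW}} = F(w_c, \hat{v}) \\
&\frac{\partial J_{\text{CBOW}}}{\partial U}
  = \frac{\partial F(w_c, \hat{v})}{\partial U},
\qquad
\frac{\partial J_{\text{CBOW}}}{\partial v_{c+j}}
  = \frac{\partial F(w_c, \hat{v})}{\partial \hat{v}}
  \;\; (-m \le j \le m,\; j \ne 0),
\qquad
\frac{\partial J_{\text{CBOW}}}{\partial v_k} = 0 \;\; \text{otherwise}
\end{aligned}
```

Both cases reduce to the per-pair gradients from parts (a) to (c): skip-gram sums them over the context words, while CBOW evaluates them once at $\hat{v}$ and passes the same gradient to every context vector.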