Word Vectors Explained (2)
3.3 Skip-Gram Model
Another approach is to create a model that uses the center word to predict the surrounding context words.
Let's discuss this Skip-Gram model now. The setup is largely the same as CBOW, but we essentially swap our inputs and outputs: the input $x$ is now the one-hot vector of the center word, and the outputs $y^{(j)}$ are the one-hot vectors of the context words.
How does it work:

1. Generate the one-hot input vector $x \in \mathbb{R}^{|V|}$ for the center word.
2. Get our embedded word vector for the center word: $v_c = \mathcal{V}x$.
3. Generate a score vector $z = \mathcal{U}v_c$.
4. Turn the score vector into probabilities: $\hat{y} = \mathrm{softmax}(z)$.
5. We desire the probabilities generated, $\hat{y}$, to match the true probabilities, i.e. the one-hot vectors of the actual context words.
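The steps above can be sketched in NumPy. This is a minimal toy example, not a trained model: the vocabulary size, embedding dimension, word indices, and randomly initialized matrices `V` and `U` are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical): vocabulary of 10 words, 5-dimensional embeddings.
V_SIZE, D = 10, 5
V = rng.normal(scale=0.1, size=(D, V_SIZE))  # input (center-word) vectors, one column per word
U = rng.normal(scale=0.1, size=(V_SIZE, D))  # output (context-word) vectors, one row per word

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

center = 3                       # index of the center word (assumed)
x = np.zeros(V_SIZE)
x[center] = 1.0                  # step 1: one-hot input vector
v_c = V @ x                      # step 2: embedded center-word vector v_c
z = U @ v_c                      # step 3: score vector z = U v_c
y_hat = softmax(z)               # step 4: probabilities over the vocabulary

# step 5: for one true context word, the cross-entropy loss is -log of its probability
context = 7                      # index of an observed context word (assumed)
loss = -np.log(y_hat[context])
```

In training, this per-context-word loss is summed over all $2m$ words in the window and minimized with gradient descent over $\mathcal{V}$ and $\mathcal{U}$.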
Note that, as in CBOW, we invoke a Naive Bayes assumption: given the center word, all the output (context) words are conditionally independent. We then minimize the negative log likelihood:

$$J = -\log P(w_{c-m},\dots,w_{c-1},w_{c+1},\dots,w_{c+m}\mid w_c) = -\sum_{j=0,\,j\neq m}^{2m}\log P(u_{c-m+j}\mid v_c) = -\sum_{j=0,\,j\neq m}^{2m}\log\frac{\exp(u_{c-m+j}^{\top}v_c)}{\sum_{k=1}^{|V|}\exp(u_k^{\top}v_c)}$$

where $v_c$ is the input ("center") vector of the center word and the $u_j$ are the output ("context") vectors.
Skip-gram treats each context word equally: the model computes the probability of each word appearing in the context independently of its distance to the center word.
Shortcoming

The summation over $|V|$ in the softmax denominator makes every evaluation or update of the objective $O(|V|)$, which is enormous for real vocabularies with millions of words.
Loss functions
To solve this problem, a simple idea is to approximate the softmax normalization instead of computing it exactly. One such method is called Negative Sampling.
Negative Sampling
For every training step, instead of looping over the entire vocabulary, we can just sample several negative examples! We "sample" from a noise distribution $P_n(w)$ whose probabilities match the ordering of the word frequencies in the vocabulary.
While negative sampling is based on the Skip-Gram model (or CBOW), it in fact optimizes a different objective.
Consider a pair $(w, c)$ of a word and a context. Did this pair come from the training data? Denote by $P(D=1\mid w,c)$ the probability that $(w,c)$ came from the corpus, and by $P(D=0\mid w,c) = 1 - P(D=1\mid w,c)$ the probability that it did not. We model $P(D=1\mid w,c)$ with the sigmoid function:

$$P(D=1\mid w,c,\theta) = \sigma(v_c^{\top}v_w) = \frac{1}{1+e^{-v_c^{\top}v_w}}$$
Now, we build a new objective function that tries to maximize the probability of a word and context being in the corpus data if it indeed is, and to maximize the probability of a word and context not being in the corpus data if it indeed is not. We take a simple maximum likelihood approach over these two probabilities. (Here $\theta$ denotes the parameters of the model; in our case they are $\mathcal{V}$ and $\mathcal{U}$.)

$$\theta = \arg\max_{\theta}\ \prod_{(w,c)\in D} P(D=1\mid w,c,\theta)\ \prod_{(w,c)\in \tilde{D}} P(D=0\mid w,c,\theta)$$
Note that maximizing the likelihood is the same as minimizing the negative log likelihood:

$$J = -\sum_{(w,c)\in D}\log\frac{1}{1+\exp(-u_w^{\top}v_c)} - \sum_{(w,c)\in \tilde{D}}\log\frac{1}{1+\exp(u_w^{\top}v_c)}$$
Note that $\tilde{D}$ is a "false" (negative) corpus of unnatural word-context pairs, which we can generate on the fly by randomly sampling words from the vocabulary.
For skip-gram, our new objective function for observing the context word $c-m+j$ given the center word $c$ would be:

$$-\log\sigma(u_{c-m+j}^{\top}v_c) - \sum_{k=1}^{K}\log\sigma(-\tilde{u}_k^{\top}v_c)$$
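This per-pair loss is cheap to compute: it touches only the one context word and $K$ negatives, not the whole vocabulary. A minimal NumPy sketch, with toy sizes and random vectors assumed (real negatives would be drawn from $P_n(w)$, here approximated by uniform sampling for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy setup (hypothetical): 10-word vocabulary, 5-dim vectors, K=3 negatives.
V_SIZE, D, K = 10, 5, 3
V = rng.normal(scale=0.1, size=(V_SIZE, D))  # input vectors v_w, one row per word
U = rng.normal(scale=0.1, size=(V_SIZE, D))  # output vectors u_w, one row per word

center, context = 3, 7                       # assumed word indices
neg = rng.integers(0, V_SIZE, size=K)        # K sampled negative indices (stand-in for P_n(w))

v_c = V[center]
# -log σ(u_{c-m+j}·v_c) - Σ_k log σ(-ũ_k·v_c)
loss = -np.log(sigmoid(U[context] @ v_c)) - np.sum(np.log(sigmoid(-U[neg] @ v_c)))
```

The cost per training pair is $O(K \cdot D)$ instead of $O(|V| \cdot D)$ for the full softmax.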
For CBOW, our new objective function for observing the center word $u_c$ given the context vector $\hat{v} = \frac{v_{c-m}+\dots+v_{c-1}+v_{c+1}+\dots+v_{c+m}}{2m}$ would be:

$$-\log\sigma(u_c^{\top}\hat{v}) - \sum_{k=1}^{K}\log\sigma(-\tilde{u}_k^{\top}\hat{v})$$
In the above formulation, $\{\tilde{u}_k \mid k=1,\dots,K\}$ are sampled from the noise distribution $P_n(w)$. What seems to work best in practice is the unigram distribution raised to the 3/4 power. Why 3/4? An example helps build intuition:

is: $0.9^{3/4} = 0.92$
constitution: $0.09^{3/4} = 0.16$
bombastic: $0.01^{3/4} = 0.032$

"bombastic" is now 3x more likely to be sampled, while "is" only went up marginally.