word2vec

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 23:27

Main idea of word2vec

  • Instead of capturing co-occurrence counts directly,
  • predict the surrounding words of every word.
  • This is faster, and it can easily incorporate a new sentence/document or add a word to the vocabulary.

Details

  • Predict the surrounding words within a window of c positions on each side of every word.
  • Objective function: Maximize the log probability of any context word given the current center word:
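    $$J(\theta) = \frac{1}{T} \sum_{t=1}^{T} \sum_{-c \le j \le c,\ j \ne 0} \log p(w_{t+j} \mid w_t)$$

    $$p(w_{t+j} \mid w_t) = p(w_o \mid w_i) = \frac{\exp({v'_{w_o}}^{\top} v_{w_i})}{\sum_{w=1}^{W} \exp({v'_w}^{\top} v_{w_i})}$$

    where $v$ and $v'$ are the input and output vector representations of $w$ (so every word has two vectors)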
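
The objective and the softmax above can be evaluated directly on a toy corpus. The following is a minimal sketch in plain Python, not the reference word2vec implementation: the function names, the toy corpus of integer word ids, the vector dimension and the random initialization are all assumptions made for illustration. Context positions that fall outside the corpus are simply skipped, which is also an assumption, since the notes do not say how boundaries are handled.

```python
import math
import random

def softmax_prob(v_in, v_out, center, context):
    """p(w_o | w_i) = exp(v'_o . v_i) / sum_w exp(v'_w . v_i)."""
    v_i = v_in[center]
    # Score of every vocabulary word as an output word for this center word.
    scores = [sum(a * b for a, b in zip(v_w, v_i)) for v_w in v_out]
    m = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return exps[context] / sum(exps)

def skipgram_objective(corpus, v_in, v_out, c):
    """J = (1/T) * sum_t sum_{-c<=j<=c, j!=0} log p(w_{t+j} | w_t)."""
    T = len(corpus)
    total = 0.0
    for t, center in enumerate(corpus):
        for j in range(-c, c + 1):
            # Skip the center word itself and positions outside the corpus
            # (boundary handling is an assumption of this sketch).
            if j == 0 or not (0 <= t + j < T):
                continue
            total += math.log(softmax_prob(v_in, v_out, center, corpus[t + j]))
    return total / T

# Hypothetical toy setup: W = 5 word ids, d = 3 dimensions.
random.seed(0)
W, d = 5, 3
v_in = [[random.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(W)]   # v
v_out = [[random.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(W)]  # v'
corpus = [0, 1, 2, 3, 4, 1, 2, 0]

J = skipgram_objective(corpus, v_in, v_out, c=2)
print(J)  # a negative number; training would adjust v and v' to increase it
```

Note that the denominator sums over the whole vocabulary for every (center, context) pair, so each evaluation costs O(W) dot products.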

GloVe

$$J = \frac{1}{2} \sum_{i,j} f(P_{ij}) \left( w_i \cdot \tilde{w}_j - \log P_{ij} \right)^2$$
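
As a rough illustration of the GloVe objective, the sketch below evaluates J for a small hand-made co-occurrence matrix. Several things here are assumptions not stated in these notes: the weighting function f is taken from the GloVe paper, f(x) = (x / x_max)^α for x < x_max and 1 otherwise, with x_max = 100 and α = 0.75; P_ij is treated as a raw co-occurrence count; and the matrix, the vectors and the function names are hypothetical. Bias terms are left out, matching the formula as written here.

```python
import math

def weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f (assumed form from the GloVe paper)."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_loss(P, w, w_tilde):
    """J = 1/2 * sum_ij f(P_ij) * (w_i . w~_j - log P_ij)^2."""
    loss = 0.0
    for i, row in enumerate(P):
        for j, p_ij in enumerate(row):
            if p_ij <= 0:
                # f(0) = 0, so zero counts contribute nothing
                # (and log 0 is undefined), so they are skipped.
                continue
            dot = sum(a * b for a, b in zip(w[i], w_tilde[j]))
            loss += weight(p_ij) * (dot - math.log(p_ij)) ** 2
    return 0.5 * loss

# Hypothetical 3-word vocabulary with 2-dimensional vectors.
P = [[0, 2, 1],
     [2, 0, 3],
     [1, 3, 0]]
w = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]        # word vectors w_i
w_tilde = [[0.2, 0.0], [0.1, 0.1], [-0.2, 0.3]]  # context vectors w~_j

loss = glove_loss(P, w, w_tilde)
print(loss)
```

Unlike the word2vec softmax, each term depends only on one pair (i, j), so the loss can be computed over the nonzero entries of the co-occurrence matrix alone.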
