Lecture 2 - Simple Word Vector representations: word2vec, GloVe


How to represent meaning in a computer?

Discrete Representation:
One-Hot Representation

But the one-hot representation has a problem: any two distinct words get orthogonal vectors, so it is hard to compute similarity between them.
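To make the similarity problem concrete, here is a minimal sketch in plain Python. The three-word vocabulary and the helper names `one_hot` and `dot` are assumptions chosen for illustration, not part of the lecture material.

```python
def one_hot(word, vocab):
    """Return a list with a 1 at the word's index and 0 everywhere else."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy vocabulary, for illustration only.
vocab = ["hotel", "motel", "banana"]
hotel = one_hot("hotel", vocab)
motel = one_hot("motel", vocab)
banana = one_hot("banana", vocab)

print(dot(hotel, motel))   # 0: related words look unrelated
print(dot(hotel, banana))  # 0: same score as for an unrelated word
print(dot(hotel, hotel))   # 1: a word only matches itself
```

The dot product cannot tell "hotel"/"motel" apart from "hotel"/"banana", which is the limitation the notes point out.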

Distributional Representation:
Full Document & Window Based
Full document: a word-document co-occurrence matrix, which captures general topics (e.g. via LDA) and is suitable for text classification.
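Window based: count how often each word co-occurs with other words inside a fixed-size window around it, giving a word-word co-occurrence matrix.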
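A minimal sketch of building a window-based co-occurrence matrix follows. The three-sentence corpus, the window size of 1, and the function name `cooccurrence` are assumptions made for this example.

```python
def cooccurrence(sentences, window=1):
    """Count symmetric co-occurrences within `window` tokens on each side."""
    vocab = sorted({w for sent in sentences for w in sent})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for sent in sentences:
        for i, word in enumerate(sent):
            lo = max(0, i - window)
            hi = min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a word is not its own neighbour
                    matrix[index[word]][index[sent[j]]] += 1
    return vocab, matrix


# Hypothetical toy corpus, already tokenized and lowercased.
corpus = [
    ["i", "like", "deep", "learning"],
    ["i", "like", "nlp"],
    ["i", "enjoy", "flying"],
]
vocab, X = cooccurrence(corpus, window=1)

for word, row in zip(vocab, X):
    print(f"{word:>8}", row)
```

Each row is a word's count vector. The matrix has one row and one column per vocabulary word, so it grows quadratically with vocabulary size.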

But the window-based method also has a problem: the matrix dimension is too high, since it grows with the vocabulary size.

Solution: SVD (singular value decomposition)
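As a rough illustration of the dimensionality reduction step, the sketch below applies `numpy.linalg.svd` to a small count matrix and keeps the top `k` singular directions. The 4x4 matrix, the word list, and `k = 2` are assumptions for illustration. Scaling the left singular vectors by the singular values is one common choice for word vectors, not necessarily the one used in the lecture.

```python
import numpy as np

# Hypothetical co-occurrence counts for a toy four-word vocabulary.
words = ["i", "like", "enjoy", "nlp"]
X = np.array([
    [0, 2, 1, 0],
    [2, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
], dtype=float)

# Thin SVD: X = U @ diag(s) @ Vt, singular values sorted in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2  # number of dimensions to keep (assumed for this example)
word_vectors = U[:, :k] * s[:k]            # one k-dimensional row per word
X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]       # rank-k approximation of X

for word, vec in zip(words, word_vectors):
    print(word, np.round(vec, 3))
```

The rows of `word_vectors` are dense, low-dimensional replacements for the original count rows.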

But SVD is computationally expensive, especially as the vocabulary grows!
So the idea is to learn low-dimensional word vectors directly.


Warm Up: Gradient Descent
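A minimal sketch of gradient descent on a one-dimensional objective is given below. The objective f(x) = (x - 3)^2, the learning rate, the step count, and the function names are all assumptions picked for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    """Gradient of the illustrative objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)  # close to 3, the minimizer of f
```

Each update moves `x` a small step in the direction that decreases the objective. The same update rule applies when `x` is a vector of word-vector parameters.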
