Deeplearning4j's Introduction to Word2vec
Source: Internet. Editor: 程序博客网. Time: 2024/06/05 23:58
Word2Vec
Contents
- Introduction
- Neural Word Embeddings
- Amusing Word2vec Results
- Just Give Me the Code
- Anatomy of Word2Vec
- Setup, Load and Train
- A Code Example
- Troubleshooting & Tuning Word2Vec
- Word2vec Use Cases
- Foreign Languages
- GloVe (Global Vectors) & Doc2Vec
Word2vec is a two-layer neural net that processes text. Its input is a text corpus and its output is a set of vectors: feature vectors for the words in that corpus. While Word2vec is not a deep neural network, it turns text into a numerical form that deep nets can understand. Deeplearning4j implements a distributed form of Word2vec for Java and Scala, which works on Spark with GPUs.
Word2vec's applications extend beyond parsing sentences in the wild. It can be applied just as well to genes, code, likes, playlists, social media graphs and other verbal or symbolic series in which patterns may be discerned.
Why? Because words are simply discrete states, like the other data mentioned above, and we are simply looking for the transitional probabilities between those states: the likelihood that they will co-occur. So gene2vec, like2vec and follower2vec are all possible. With that in mind, the tutorial below will help you understand how to create neural embeddings for any group of discrete and co-occurring states.
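To make "the likelihood that they will co-occur" concrete, here is a minimal sketch that estimates co-occurrence probabilities by plain counting over a sliding window. It is only an illustration of the statistic: Word2vec learns its vectors by training a network rather than by tabulating counts, and the playlist data, the `window` parameter and the function name are assumptions made up for this example.

```python
from collections import Counter, defaultdict

def cooccurrence_probabilities(sequences, window=2):
    """Estimate P(context | center) from co-occurrence counts within a window."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        for i, center in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a state is not its own context
                    pair_counts[center][seq[j]] += 1
    probs = {}
    for center, counts in pair_counts.items():
        total = sum(counts.values())
        probs[center] = {ctx: n / total for ctx, n in counts.items()}
    return probs

# Hypothetical playlists: each song ID is treated as a discrete state,
# exactly as a word would be in a sentence.
playlists = [
    ["song_a", "song_b", "song_c"],
    ["song_a", "song_b", "song_d"],
]
probs = cooccurrence_probabilities(playlists, window=1)
print(probs["song_a"])  # {'song_b': 1.0}
print(probs["song_b"])  # song_a: 0.5, song_c: 0.25, song_d: 0.25
```

Nothing in the function depends on the items being words, which is why the same idea carries over to genes, likes or followers.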
The purpose and usefulness of Word2vec is to group the vectors of similar words together in vector space. That is, it detects similarities mathematically. Word2vec creates vectors that are distributed numerical representations of word features, such as the context of individual words. It does so without human intervention.
Given enough data, usage and contexts, Word2vec can make highly accurate guesses about a word's meaning based on past appearances. Those guesses can be used to establish a word's association with other words (e.g. "man" is to "boy" what "woman" is to "girl"), or to cluster documents and classify them by topic. Those clusters can form the basis of search, sentiment analysis and recommendations in such diverse fields as scientific research, legal discovery, e-commerce and customer relationship management.
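One common way to read an association like "man is to boy what woman is to girl" out of word vectors is the offset method: take the vector for "boy", subtract "man", add "woman", and look for the nearest remaining word. The sketch below assumes hand-made two-dimensional vectors (they are not Word2vec output), and the `analogy` helper is a hypothetical name introduced for illustration.

```python
# Hand-made toy vectors (assumption): axis 0 loosely encodes gender,
# axis 1 loosely encodes age. Real Word2vec vectors have hundreds of
# dimensions and no such readable axes.
vectors = {
    "man":   [1.0, 1.0],
    "woman": [-1.0, 1.0],
    "boy":   [1.0, -1.0],
    "girl":  [-1.0, -1.0],
    "child": [0.0, -1.0],
}

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def analogy(a, b, c, vectors):
    """Return the word d such that a is to b what c is to d (offset method)."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three query words so the answer is a new word.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=lambda w: squared_distance(vectors[w], target))

print(analogy("man", "boy", "woman", vectors))  # girl
```

With trained vectors the match is rarely exact, so the nearest neighbor of the target is taken as the answer.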
The output of the Word2vec neural net is a vocabulary in which each item has a vector attached to it, which can be fed into a deep-learning net or simply queried to detect relationships between words.
When measuring cosine similarity, no similarity corresponds to a 90-degree angle (a cosine of 0), while total similarity of 1 corresponds to a 0-degree angle, i.e. complete overlap. For example, Sweden equals Sweden, while Norway has a cosine similarity of 0.760124 with Sweden, the highest of any other country.
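The relationship between cosine similarity and angle can be checked with a few lines of plain Python. This is a minimal sketch using made-up vectors; the function names are assumptions for this example and not part of Deeplearning4j.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def angle_degrees(u, v):
    """Angle between two vectors, in degrees."""
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.degrees(math.acos(cos))

identical = ([3.0, 4.0], [3.0, 4.0])      # same direction: total similarity
orthogonal = ([1.0, 0.0], [0.0, 1.0])     # perpendicular: no similarity

print(cosine_similarity(*identical), angle_degrees(*identical))
print(cosine_similarity(*orthogonal), angle_degrees(*orthogonal))
```

The first pair should give a similarity of 1 at an angle of 0 degrees, and the second a similarity of 0 at 90 degrees, matching the description above.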
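discrete (adj.): separate, distinct
co-occur (v.): to occur together
detect (vt.): to discover, to find
mathematically (adv.): in mathematical terms
numerical (adj.): expressed in numbers
representation (n.): a depiction or expression of something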
Reference:
http://deeplearning4j.org/word2vec