Terminology Extraction
Source: Internet  Editor: 程序博客网  Time: 2024/05/21 14:48
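My teacher recently asked me to look into this, so I hurried to search online. There is very little material, sob......
I looked at how the Translated website implements this technology. It currently supports only English, Italian, and French, with no Chinese....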
Introduction
Terminology is the set of terms that identify a specific topic. Terminology extraction is the process of identifying those terms in a text.
The idea is to compare the frequency of words in a given document with their frequency in the language. Words which appear very frequently in the document but rarely in the language are probably terms.
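The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration of the idea, not Translated's implementation: the function names, the `floor` value for unseen words, and the reference frequencies in `REFERENCE_FREQ` are all invented for the example. A real system would take its language frequencies from a large general corpus.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_candidates(document, reference_freq, top_n=5, floor=1e-7):
    """Rank words by document frequency divided by language frequency.

    reference_freq maps a word to its relative frequency in a general corpus.
    Words missing from it get the small `floor` value (an assumption of this
    sketch), so unseen words score very high.
    """
    counts = Counter(tokenize(document))
    total = sum(counts.values())
    scores = {
        word: (count / total) / reference_freq.get(word, floor)
        for word, count in counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical language frequencies, invented for illustration only.
REFERENCE_FREQ = {
    "the": 0.05, "of": 0.03, "is": 0.02, "a": 0.02,
    "sets": 1e-4, "key": 2e-4, "codec": 1e-6, "bitrate": 5e-7,
}

doc = "The codec sets the bitrate. The bitrate of a codec is the key."
print(term_candidates(doc, REFERENCE_FREQ, top_n=2))
```

Common words such as "the" appear often in the document but are just as common in the language, so their ratio stays low. Words that are frequent in the document but rare in the language rise to the top.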
Technology
It uses Poisson statistics, Maximum Likelihood Estimation and Inverse Document Frequency (Latent Semantic Analysis) between the frequency of words in a given document and a generic corpus of 100 million words per language. It uses a probabilistic part-of-speech tagger to take into account the probability that a particular sequence could be a term. It creates n-grams of words by minimising the relative entropy.
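The description does not say how these pieces are combined, so the following is only a guess at how Poisson statistics might be applied here. If a word has relative frequency `p` in the language, its count in an `n`-word document is modelled as Poisson with mean `n * p`. The score is the negative log10 of the probability of seeing at least the observed count `k`. The function name `poisson_surprise` and the sample numbers are assumptions made for this sketch.

```python
import math


def poisson_surprise(k, n, p):
    """Return -log10 P(X >= k) for X ~ Poisson(n * p), assuming p > 0.

    k: observed count of the word in the document
    n: document length in words
    p: relative frequency of the word in the language
    Returns 0.0 when k is not above the expected count (a simplification of
    this sketch: such words are treated as unsurprising).
    """
    lam = n * p
    if k <= lam:
        return 0.0
    # log P(X = k), kept in log space to avoid underflow for rare events.
    log_pmf_k = -lam + k * math.log(lam) - math.lgamma(k + 1)
    # P(X >= k) = P(X = k) * (1 + lam/(k+1) + lam^2/((k+1)(k+2)) + ...)
    series, ratio, j = 1.0, 1.0, 1
    while ratio > 1e-17:
        ratio *= lam / (k + j)
        series += ratio
        j += 1
    return -(log_pmf_k + math.log(series)) / math.log(10)


# Hypothetical numbers: a 1,000-word document and two words that each occur 12 times.
print(poisson_surprise(12, 1000, 0.0001))  # word that is rare in the language
print(poisson_surprise(12, 1000, 0.01))    # word that is common in the language
```

A higher score means the observed count is harder to explain by the word's general frequency, which makes the word a stronger term candidate.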
Why have we developed this?
Translated has developed this technology to help its translators to be aware of the difficulties in a document and to simplify the process of creating glossaries.
We also use it to improve search results in traditional search engines (e.g. Google) by giving a better estimate of how relevant a keyword is to a document.