Detecting Part of Speech--POS
Source: Internet  Published by: 卓有成效的管理者知乎  Editor: 程序博客网  Date: 2024/05/29 18:27
- POS
- The tagging process
- Importance Of POS
POS
The context of a word is an important factor in determining its part of speech (POS). For example, "book" is a noun in "read the book" but a verb in "book a flight".
The tagging process
Tagging is the process of assigning a description, called a tag, to a token or a portion of text. POS tagging assigns a part-of-speech tag, such as noun, verb, or adjective, to each token.
Process
- Tokenizing the text
- Determining/Identifying possible tags
- Resolving ambiguous tags
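The three steps above can be sketched in a few lines of Java. This is only a toy illustration, not how OpenNLP or the Stanford tagger work internally: the `ToyTagger` class, its tiny lexicon, and the single "after a determiner, prefer the noun reading" rule are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy illustration of the three tagging steps. The lexicon and the single
// disambiguation rule are invented for this example.
class ToyTagger {
    // Hypothetical lexicon: each known word maps to its possible tags.
    static final Map<String, List<String>> LEXICON = Map.of(
        "i", List.of("PRP"),
        "the", List.of("DT"),
        "a", List.of("DT"),
        "book", List.of("NN", "VB"),
        "flight", List.of("NN"),
        "read", List.of("VB"));

    // Step 1: tokenize by lowercasing and splitting on whitespace
    // (real tokenizers handle punctuation, contractions, and more).
    static String[] tokenize(String text) {
        return text.trim().toLowerCase().split("\\s+");
    }

    // Steps 2 and 3: look up the candidate tags, then resolve ambiguity
    // using the tag chosen for the previous token.
    static List<String> tag(String[] tokens) {
        List<String> tags = new ArrayList<>();
        for (String token : tokens) {
            // Assumed fallback: unknown words are treated as nouns.
            List<String> candidates = LEXICON.getOrDefault(token, List.of("NN"));
            String chosen = candidates.get(0);
            if (candidates.size() > 1) {
                String previous = tags.isEmpty() ? "" : tags.get(tags.size() - 1);
                // Assumed rule: after a determiner prefer NN, otherwise prefer VB.
                String preferred = previous.equals("DT") ? "NN" : "VB";
                if (candidates.contains(preferred)) {
                    chosen = preferred;
                }
            }
            tags.add(chosen);
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(tag(tokenize("I book a flight"))); // [PRP, VB, DT, NN]
        System.out.println(tag(tokenize("read the book")));   // [VB, DT, NN]
    }
}
```

The same word, "book", receives a different tag in each sentence because the choice depends on the preceding tag.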
Methods
- Rule-based: Rule-based taggers use a set of rules and a dictionary of words and their possible tags. The rules are applied when a word has multiple possible tags, and they often use the previous and/or following words to select one.
- Stochastic: Stochastic taggers are either based on a Markov model or are cue-based, in which case they use decision trees or maximum entropy. Markov models are finite state machines in which each state has two probability distributions. The tagger's objective is to find the most probable sequence of tags for a sentence. Hidden Markov Models (HMMs) are also used. In these models the states, and therefore the transitions between them, are not directly visible; only the outputs they produce are observed.
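To make the stochastic approach more concrete, here is a minimal sketch of Viterbi decoding over a toy HMM, which searches for the most probable tag sequence for a sentence. The `ToyHmmTagger` class, the three-tag tag set, the three-word vocabulary, and every probability in it are assumptions invented for illustration; a real tagger estimates these values from an annotated corpus.

```java
import java.util.Arrays;

// Minimal Viterbi decoder for a toy HMM tagger. All names and numbers are
// invented for illustration.
class ToyHmmTagger {
    static final String[] TAGS = {"DT", "NN", "VB"};
    static final String[] WORDS = {"the", "book", "flights"};

    // START[t]: probability that a sentence begins with tag t.
    static final double[] START = {0.6, 0.2, 0.2};
    // TRANS[p][t]: transition distribution, probability of tag t following tag p.
    static final double[][] TRANS = {
        {0.05, 0.90, 0.05},  // after DT
        {0.10, 0.40, 0.50},  // after NN
        {0.50, 0.40, 0.10}   // after VB
    };
    // EMIT[t][w]: emission distribution, probability of tag t producing word w.
    static final double[][] EMIT = {
        {1.0, 0.0, 0.0},     // DT
        {0.0, 0.5, 0.5},     // NN
        {0.0, 0.8, 0.2}      // VB
    };

    static int wordIndex(String word) {
        int index = Arrays.asList(WORDS).indexOf(word);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown word: " + word);
        }
        return index;
    }

    static String[] viterbi(String[] sentence) {
        int n = sentence.length;
        int k = TAGS.length;
        if (n == 0) {
            return new String[0];
        }
        // best[i][t]: probability of the best tag path ending in tag t at position i.
        double[][] best = new double[n][k];
        // back[i][t]: previous tag on that best path.
        int[][] back = new int[n][k];

        for (int t = 0; t < k; t++) {
            best[0][t] = START[t] * EMIT[t][wordIndex(sentence[0])];
        }
        for (int i = 1; i < n; i++) {
            int w = wordIndex(sentence[i]);
            for (int t = 0; t < k; t++) {
                for (int p = 0; p < k; p++) {
                    double score = best[i - 1][p] * TRANS[p][t] * EMIT[t][w];
                    if (score > best[i][t]) {
                        best[i][t] = score;
                        back[i][t] = p;
                    }
                }
            }
        }

        // Pick the most probable final tag, then follow the back-pointers.
        int last = 0;
        for (int t = 1; t < k; t++) {
            if (best[n - 1][t] > best[n - 1][last]) {
                last = t;
            }
        }
        String[] tags = new String[n];
        for (int i = n - 1; i >= 0; i--) {
            tags[i] = TAGS[last];
            last = back[i][last];
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(viterbi(new String[] {"the", "book"})));     // [DT, NN]
        System.out.println(Arrays.toString(viterbi(new String[] {"book", "flights"}))); // [VB, NN]
    }
}
```

With these made-up numbers, "book" is decoded as a noun after "the" and as a verb at the start of "book flights", because the decoder scores whole tag sequences rather than individual words.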
Importance Of POS
Proper tagging of a sentence can enhance the quality of downstream processing tasks.
Determining the POS, phrases, clauses, and any relationships between them is called parsing.
POS tagging is used for many downstream processes, such as question analysis and sentiment analysis.
Text indexing will frequently use POS data.
Speech processing can use tags to help decide how to pronounce words.
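// OpenNLP
// Fragment: assumes `sentence` is a String[] of tokens and that the
// getModelDir() helper returns the directory containing the model file.
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.util.Sequence;

try (InputStream modelIn = new FileInputStream(
        new File(getModelDir(), "en-pos-maxent.bin"))) {
    POSModel model = new POSModel(modelIn);
    POSTaggerME tagger = new POSTaggerME(model);

    // Tag each token and print it as word/TAG.
    String[] tags = tagger.tag(sentence);
    for (int i = 0; i < sentence.length; i++) {
        System.out.print(sentence[i] + "/" + tags[i] + " ");
    }
    System.out.println();

    // Print the highest-ranked alternative tag sequences.
    Sequence[] topSequences = tagger.topKSequences(sentence);
    for (Sequence sequence : topSequences) {
        System.out.println(sequence);
        double[] probabilities = sequence.getProbs(); // per-token probabilities
    }
} catch (IOException e) {
    e.printStackTrace();
}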
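// Stanford NLP
// Fragment: assumes the getModelDir() helper returns the model directory.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.List;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Sentence;
import edu.stanford.nlp.ling.TaggedWord;
import edu.stanford.nlp.tagger.maxent.MaxentTagger;

MaxentTagger tagger = new MaxentTagger(
        getModelDir() + "//wsj-0-18-bidirectional-distsim.tagger");
// Split the file into sentences, each a list of tokens.
List<List<HasWord>> sentences = MaxentTagger.tokenizeText(
        new BufferedReader(new FileReader("sentences.txt")));

// Print each tagged sentence as a list of word/TAG pairs.
for (List<HasWord> sentence : sentences) {
    List<TaggedWord> taggedSentence = tagger.tagSentence(sentence);
    System.out.println(taggedSentence);
}

// Alternatively, print each tagged sentence as a single string.
// Note: newer CoreNLP releases may provide this helper as SentenceUtils.listToString.
for (List<HasWord> sentence : sentences) {
    List<TaggedWord> taggedSentence = tagger.tagSentence(sentence);
    System.out.println(Sentence.listToString(taggedSentence, false));
}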