Week 4-4: Earley Parser
Source: Internet | Editor: 程序博客网 | Time: 2024/06/05 06:38
Background
- Developed by Jay Earley in 1970
- No need to convert grammar to CNF
- Left to right
Complexity
- Faster than O(n^3) in many cases; O(n^3) in the worst case
Earley Parser
- Looks for both full and partial constituents
- When reading word k, it has already identified all hypotheses that are consistent with words 1 to k-1
Data structure
- It uses a dynamic programming table (the chart), just like CKY
- Example entry in column 1:
- [0:1] VP -> VP . PP
- Created when processing word 1
- Spans positions 0 to 1. The part to the left of the dot is what has already been found (here a VP); if a PP is found later, the whole VP constituent is complete
- The dot (.) separates the completed (known) part from the incomplete (possibly unattainable) part
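To make the entry format concrete, here is a minimal sketch of one chart entry as a Python data structure. The class name `State` and its field names are my own choices for illustration, not something from the lecture.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    """One chart entry: a dotted rule plus the span it covers (hypothetical layout)."""
    lhs: str               # left-hand side of the rule, e.g. "VP"
    rhs: Tuple[str, ...]   # right-hand side symbols, e.g. ("VP", "PP")
    dot: int               # number of right-hand side symbols already found
    start: int             # position where the entry starts
    end: int               # position reached so far (the chart column)

    def is_complete(self) -> bool:
        # The dot is at the far right: every symbol has been found.
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        # The symbol right after the dot, i.e. what we still hope to find.
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"[{self.start}:{self.end}] {self.lhs} -> {' '.join(symbols)}"


entry = State("VP", ("VP", "PP"), dot=1, start=0, end=1)
print(entry)                # [0:1] VP -> VP . PP
print(entry.next_symbol())  # PP
```

The entry is frozen (immutable) so that it can be stored in a set or compared cheaply, which is convenient when the chart must avoid duplicates.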
3 types of entries
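- ‘scan’: the symbol after the dot is a word (or part-of-speech tag); match it against the next input word
- ‘predict’: the symbol after the dot is a non-terminal; add its expansions to the current column
- ‘complete’: otherwise, i.e. the dot is at the end of the rule; advance every entry that was waiting for this constituent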
Example
Take this book.
At the end of parsing, we find that the whole input can be analyzed either as a verb phrase or as a sentence.
Problems with CFGs
Agreement
- Number: Chen is / People are
- Person: I am / Chen is
- Tense: was / is / will be
- Case
- Gender
Combinatorial explosion
- Many combinations of rules are needed to express agreement
- S -> NP VP has to be replaced by:
- S -> 1sgNP 1sgVP
- S -> 2sgNP 2sgVP
- …
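A short sketch of how quickly the rule set grows when agreement is encoded in the non-terminal names. The feature inventories and the helper `split_rule` are assumptions for illustration; the gender values are hypothetical and only there to show the multiplication.

```python
from itertools import product

PERSONS = ["1", "2", "3"]
NUMBERS = ["sg", "pl"]
GENDERS = ["masc", "fem"]  # hypothetical extra feature


def split_rule(lhs, rhs, features):
    """Copy one rule once per combination of feature values."""
    rules = []
    for combo in product(*features):
        tag = "".join(combo)  # e.g. "1sg"
        rules.append(f"{lhs} -> " + " ".join(tag + symbol for symbol in rhs))
    return rules


rules = split_rule("S", ["NP", "VP"], [PERSONS, NUMBERS])
for rule in rules:
    print(rule)

print(len(rules))                                                       # 6
print(len(split_rule("S", ["NP", "VP"], [PERSONS, NUMBERS, GENDERS])))  # 12
```

Every additional feature multiplies the number of copies of each rule by the number of values that feature can take, and this happens for every rule in the grammar that is subject to agreement.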
Subcategorization frames
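Different words (verbs in particular) allow different kinds of complements, so the grammar needs different rules for each class of word. Possible complements include: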
- direct object
- prepositional phrase
- predicative adjective
- bare infinitive
- to-infinitive
- participial phrase
- that-clause
- question-form clause
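One way to picture subcategorization frames is as a lexicon that maps each verb to the complements it allows. The sketch below is illustrative only: the verbs, their frame assignments, and the helper names `allows` and `rules_for` are my own assumptions, not data from the lecture.

```python
# Hypothetical mini-lexicon: verbs and frames chosen only for illustration.
SUBCAT = {
    "eat": {"direct object"},
    "look": {"predicative adjective"},
    "help": {"direct object", "bare infinitive"},
    "want": {"direct object", "to-infinitive"},
    "believe": {"direct object", "that-clause"},
    "wonder": {"question-form clause"},
}


def allows(verb, frame):
    """True if the verb is listed with this complement type."""
    return frame in SUBCAT.get(verb, set())


def rules_for(verb):
    """One VP rule per (verb, frame) pair, which is why rules multiply."""
    return [f"VP -> {verb} [{frame}]" for frame in sorted(SUBCAT.get(verb, ()))]


print(allows("want", "to-infinitive"))  # True
print(allows("eat", "that-clause"))     # False
print(rules_for("help"))
```

In a plain CFG each of these verb classes needs its own category and its own VP rules, so the rule count grows with the number of distinct frame combinations.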
CFG independence assumption
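A CFG assumes that each non-terminal is expanded independently of its context. In practice this does not hold: the probabilities of the expansions of a non-terminal depend on where it appears in the tree.
Remark: one solution is a lexicalized grammar (lexicalized PCFG).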
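The independence assumption can be seen in how a probabilistic CFG scores a tree: it multiplies the probabilities of the rules used, wherever they occur. The rule probabilities below are made-up numbers and the function name `tree_probability` is an assumption; the point is only that an NP expansion gets the same score in subject and object position.

```python
from math import prod

# Made-up rule probabilities for illustration.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 1.0,
    ("NP", ("Pronoun",)): 0.4,
    ("NP", ("Det", "N")): 0.6,
}


def tree_probability(rules_used):
    """Probability of a tree: the product of the probabilities of its rules."""
    return prod(RULE_PROB[rule] for rule in rules_used)


# Pronoun as subject, Det N as object.
pronoun_subject = [
    ("S", ("NP", "VP")),
    ("NP", ("Pronoun",)),
    ("VP", ("V", "NP")),
    ("NP", ("Det", "N")),
]
# Det N as subject, pronoun as object.
pronoun_object = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("NP", ("Pronoun",)),
]

print(tree_probability(pronoun_subject) == tree_probability(pronoun_object))  # True
```

Both trees receive the same probability because the model cannot tell a subject NP from an object NP, even if real text prefers different expansions in the two positions.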
Conclusion
Because rules can combine in many ways, the number of parses of a sentence can be exponential in its length, so finding all the parses can take exponential time.
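To get a feeling for the growth, consider the extreme case of a grammar that allows every binary bracketing of the words (for example a single rule X -> X X). The number of binary trees over n words is then the Catalan number C(n-1). The sketch below only counts these bracketings; the function names are my own, and real grammars usually license far fewer parses.

```python
from math import comb


def catalan(n):
    """n-th Catalan number: C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def binary_bracketings(num_words):
    """Number of distinct binary trees over num_words leaves."""
    return catalan(num_words - 1)


for n in (3, 5, 10, 20):
    print(n, binary_bracketings(n))
# 3 2
# 5 14
# 10 4862
# 20 1767263190
```

The chart itself stays polynomial in size because it stores shared sub-analyses; the exponential cost only appears when every individual parse is enumerated.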