Features of WordNet

Source: Internet  Editor: 程序博客网  Time: 2024/05/01 16:26

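WordNet is an internationally influential lexical knowledge base for English.

Compared with general-purpose knowledge representation methods, WordNet is better placed to help natural language processing practitioners at the semantic level.

Its main features can be summarized as follows: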


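1. In WordNet, the synset is the most basic unit. A synset, as the name suggests, is a synonym set: each synset corresponds to one distinct meaning and may contain a single entry or a group of entries. Conversely, a single entry may belong to several different synsets.

For example, the entry "car" is stored with the following five synsets: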

1. (598) car, auto, automobile, machine, motorcar -- (a motor vehicle with four wheels; usually propelled by an internal combustion engine; "he needs a car to get to work")
2. (24) car, railcar, railway car, railroad car -- (a wheeled vehicle adapted to the rails of railroad; "three cars had jumped the rails")
3. (1) cable car, car -- (a conveyance for passengers or freight on a cable railway; "they took a cable car to the top of the mountain")
4. car, gondola -- (the compartment that is suspended from an airship and that carries personnel and the cargo and the power plant)
5. car, elevator car -- (where passengers ride up and down; "the car was on the top floor")

As the listing shows, each of these synsets contains a group of entries.
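The many-to-many relation between entries and synsets can be sketched in a few lines of plain Python. This is only a toy model built from the "car" listing above: the ids such as `car#1` and the helper `build_word_index` are names made up for this sketch, not WordNet's own identifiers or API.

```python
from collections import defaultdict

# Toy data: synset id -> member entries, copied from the "car" listing.
# The ids (car#1 ... car#5) are hypothetical labels for this sketch.
SYNSETS = {
    "car#1": ["car", "auto", "automobile", "machine", "motorcar"],
    "car#2": ["car", "railcar", "railway car", "railroad car"],
    "car#3": ["cable car", "car"],
    "car#4": ["car", "gondola"],
    "car#5": ["car", "elevator car"],
}

def build_word_index(synsets):
    """Invert synset -> entries into entry -> list of synset ids."""
    index = defaultdict(list)
    for synset_id, words in synsets.items():
        for word in words:
            index[word].append(synset_id)
    return dict(index)

WORD_INDEX = build_word_index(SYNSETS)

print(len(WORD_INDEX["car"]))   # 5: "car" belongs to all five synsets
print(WORD_INDEX["gondola"])    # ['car#4']: one synset in this toy data
print(SYNSETS["car#1"])         # one synset, several entries
```

With the real database the lookup would go through a WordNet interface (for example the `nltk.corpus.wordnet` module in NLTK) rather than a hand-written dictionary.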


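2. Besides marking synonymy between words, WordNet also records antonymy and hypernym/hyponym (superordinate/subordinate) relations between words.

Antonymy is easy to understand. The hypernym/hyponym relation mainly expresses which parent class a word belongs to and which subclasses it contains.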
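A minimal sketch of the hypernym/hyponym idea, assuming a tiny hand-written hierarchy in which every concept has at most one parent. The table `HYPERNYM` and the helpers `hypernym_path` and `hyponyms` are hypothetical names for illustration; the real WordNet hierarchy is much larger, and a synset there can have more than one hypernym.

```python
# Toy hypernym table: each concept points to its parent (hypernym).
# Hand-written and simplified for illustration; not the real WordNet data.
HYPERNYM = {
    "vehicle": "entity",
    "motor vehicle": "vehicle",
    "bicycle": "vehicle",
    "car": "motor vehicle",
    "truck": "motor vehicle",
    "taxi": "car",
    "ambulance": "car",
}

def hypernym_path(concept):
    """Return the chain from concept up to the root, concept first."""
    path = [concept]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def hyponyms(concept):
    """Return the direct children (hyponyms) of concept, sorted by name."""
    return sorted(child for child, parent in HYPERNYM.items() if parent == concept)

print(hypernym_path("car"))  # ['car', 'motor vehicle', 'vehicle', 'entity']
print(hyponyms("car"))       # ['ambulance', 'taxi']
```

Walking upward answers "which parent class does this word belong to", and scanning for children answers "which subclasses does it contain".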


3. Thanks to the hierarchical structure of its hypernym/hyponym relations, WordNet makes it possible to compute the distance between two words.
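One simple way to turn such a hierarchy into a distance is to count the edges on the shortest path between two concepts through a shared ancestor. The sketch below does this over a toy single-parent hierarchy; the table and the functions `steps_to_ancestors`, `path_distance` and `path_similarity` are assumptions made for this example, and the `1 / (distance + 1)` score is just one common convention, not necessarily the measure any particular tool uses.

```python
# Toy hypernym table (child -> parent), hand-written for illustration only.
HYPERNYM = {
    "vehicle": "entity",
    "motor vehicle": "vehicle",
    "bicycle": "vehicle",
    "car": "motor vehicle",
    "truck": "motor vehicle",
    "taxi": "car",
    "ambulance": "car",
}

def steps_to_ancestors(concept):
    """Map concept and each of its ancestors to the number of edges above concept."""
    steps = {}
    node, dist = concept, 0
    while True:
        steps[node] = dist
        if node not in HYPERNYM:
            break
        node = HYPERNYM[node]
        dist += 1
    return steps

def path_distance(a, b):
    """Edges on the shortest path from a to b via a common ancestor, or None."""
    up_a = steps_to_ancestors(a)
    up_b = steps_to_ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    return min(up_a[c] + up_b[c] for c in common)

def path_similarity(a, b):
    """Score in (0, 1]: 1.0 for identical concepts, smaller when farther apart."""
    dist = path_distance(a, b)
    return None if dist is None else 1 / (dist + 1)

print(path_distance("taxi", "ambulance"))  # 2 (both are direct children of "car")
print(path_distance("taxi", "truck"))      # 3
print(path_distance("taxi", "bicycle"))    # 4
print(path_similarity("taxi", "bicycle"))  # 0.2
```

Concepts that share a nearby ancestor come out closer than concepts that only meet near the root, which is the intuition behind hierarchy-based word distance.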


For details, see:

Python自然语言处理 (Natural Language Processing with Python)

统计自然语言处理 (Statistical Natural Language Processing)

