My solution to CS224n assignment 3


My solution

A primer on NER

NER (named entity recognition) is a sequence labeling problem: the input is a sentence, and the output is a sequence of labels.

The label types are:
- Person (PER) (pronouns such as "he" or "she" are not considered named entities)
- Organization (ORG)
- Location (LOC)
- Miscellaneous (MISC)
- O (not a named entity)

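Evaluation metrics for the labels:
Recall, precision, and F1-score are computed over the non-null labels. (Computing plain accuracy over all labels would be heavily skewed, because O is by far the most frequent label.)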
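To make the effect of the O label concrete, here is a minimal sketch of token-level precision, recall, and F1 over non-null labels. The function name `token_prf` and the toy label sequences are made up for illustration. The definition assumed here is: precision = correct non-null predictions / all non-null predictions, and recall = correct non-null predictions / all non-null gold labels.

```python
def token_prf(gold, pred, null_label="O"):
    """Token-level precision/recall/F1 computed over non-null labels only."""
    correct = sum(1 for g, p in zip(gold, pred) if p != null_label and g == p)
    n_pred = sum(1 for p in pred if p != null_label)   # non-null predictions
    n_gold = sum(1 for g in gold if g != null_label)   # non-null gold labels
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: a tagger that predicts O everywhere.
gold = ["O"] * 8 + ["PER", "PER"]
pred = ["O"] * 10

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(accuracy)               # 0.8, which looks good but is misleading
print(token_prf(gold, pred))  # (0.0, 0.0, 0.0)
```

The tagger never finds an entity, yet plain accuracy is 0.8; the non-null metrics expose this.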

There are also entity-level metrics:
Here recall, precision, and F1-score are computed over entities, so an entity counts as correct only when every token of the phrase is labeled correctly.
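A rough sketch of the entity-level computation follows. It assumes that an entity is a maximal run of consecutive tokens sharing the same non-O label (no BIO scheme). The helper names `get_chunks` and `entity_prf` are chosen here for illustration and are not claimed to match the assignment's starter code.

```python
def get_chunks(labels, null_label="O"):
    """Return entities as (label, start, end) tuples, end exclusive."""
    chunks = []
    start, current = None, null_label
    # A trailing null label acts as a sentinel that closes the last entity.
    for i, lab in enumerate(list(labels) + [null_label]):
        if lab != current:
            if current != null_label:
                chunks.append((current, start, i))
            start, current = i, lab
    return chunks


def entity_prf(gold, pred):
    """Entity-level precision/recall/F1: an entity must match exactly."""
    gold_chunks, pred_chunks = set(get_chunks(gold)), set(get_chunks(pred))
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["PER", "PER", "O", "ORG", "ORG", "O"]
pred = ["PER", "O",   "O", "ORG", "ORG", "O"]
print(get_chunks(gold))        # [('PER', 0, 2), ('ORG', 3, 5)]
print(entity_prf(gold, pred))  # (0.5, 0.5, 0.5)
```

In the example, the two-token PER entity is only half labeled, so it earns no credit at the entity level even though one of its tokens is correct.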

A confusion matrix gives a more complete picture, for example:

gold/guess    PER    ORG    LOC   MISC      O
PER          2973     59     41     14     62
ORG           152   1648     94     62    136
LOC            57    104   1868     25     40
MISC           47     58     45   1012    106
O              46     49     12     33  42619
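As an illustration, per-label precision and recall can be read off this matrix directly (rows are gold labels, columns are guesses). The sketch below uses only the numbers from the table; the variable names are arbitrary.

```python
labels = ["PER", "ORG", "LOC", "MISC", "O"]
cm = [  # cm[i][j] = number of tokens with gold label i that were guessed as j
    [2973, 59, 41, 14, 62],
    [152, 1648, 94, 62, 136],
    [57, 104, 1868, 25, 40],
    [47, 58, 45, 1012, 106],
    [46, 49, 12, 33, 42619],
]
n = len(labels)

gold_totals = [sum(row) for row in cm]                               # row sums
guess_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # column sums

recall = {labels[i]: cm[i][i] / gold_totals[i] for i in range(n)}
precision = {labels[j]: cm[j][j] / guess_totals[j] for j in range(n)}

# Largest off-diagonal cell: the most frequent single confusion.
worst = max((cm[i][j], labels[i], labels[j])
            for i in range(n) for j in range(n) if i != j)

for lab in labels:
    print(f"{lab:5s} precision={precision[lab]:.3f} recall={recall[lab]:.3f}")
print(worst)  # (152, 'ORG', 'PER')
```

The largest off-diagonal entry is gold ORG guessed as PER.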

Window into NER

　　
The simplest approach is to predict each word's label directly from the inputs x in a window around it, using a neural network.
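The following is a minimal NumPy sketch of such a window model, not the assignment's implementation. It assumes a one-hidden-layer network with ReLU and a softmax output; the padding ids, the layer sizes, and the names `make_windows` and `forward` are hypothetical.

```python
import numpy as np


def make_windows(token_ids, window_size, start_id, end_id):
    """Pad the sentence and return one window of 2*window_size+1 ids per token."""
    padded = [start_id] * window_size + list(token_ids) + [end_id] * window_size
    width = 2 * window_size + 1
    return [padded[i:i + width] for i in range(len(token_ids))]


def forward(windows, E, W, b1, U, b2):
    """Concatenate the window's embeddings, then hidden ReLU layer, then softmax."""
    x = E[np.array(windows)].reshape(len(windows), -1)  # (n, (2w+1)*d)
    h = np.maximum(0.0, x @ W + b1)                     # (n, hidden)
    scores = h @ U + b2                                 # (n, n_classes)
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum(axis=1, keepdims=True)


# Toy sizes (assumptions): vocabulary 10, embedding dim 4, hidden 8, 5 labels.
rng = np.random.default_rng(0)
V, d, hidden, n_classes, w = 10, 4, 8, 5, 1
E = rng.normal(size=(V, d))
W = rng.normal(size=((2 * w + 1) * d, hidden))
b1 = np.zeros(hidden)
U = rng.normal(size=(hidden, n_classes))
b2 = np.zeros(n_classes)

windows = make_windows([5, 6, 7], w, start_id=0, end_id=1)
print(windows)  # [[0, 5, 6], [5, 6, 7], [6, 7, 1]]
probs = forward(windows, E, W, b1, U, b2)
print(probs.shape)  # (3, 5)
```

Each row of `probs` is a distribution over the five labels for one token, computed only from that token's window.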


Some of the questions and answers follow:

(a) Ambiguity

Q: Provide 2 examples of sentences containing a named entity with an ambiguous type
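A: This question only asks for examples and is fairly easy. It shows that named entity recognition involves ambiguity. The sentences given in the answer are: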

“Spokesperson for Levis, Bill Murray, said …”, where it is ambiguous whether Levis is a person or an organization.
“Heartbreak is a new virus,” where Heartbreak could either be a MISC named entity (it’s actually the name of a virus), or simply a noun.

Q: Why might it be important to use features apart from the word itself to predict named entity labels?
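A: A common noun can often be the name of an organization, and a person's name can often be an organization name as well. The word alone therefore does not carry all the information needed; it has to be judged together with its surrounding context, which is the motivation for the window.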

Q: Describe at least two features (apart from the word) that would help in predicting whether a word is part of a named entity or not.
A: For example, the word's capitalization, its part of speech, and its prefixes and suffixes.
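As a small illustration, capitalization and affix features can be extracted with plain string operations. The function `word_features` and the three-character affix length are arbitrary choices for this sketch; part-of-speech tags would need an external tagger and are left out.

```python
def word_features(word):
    """Surface features that do not depend on the word's identity alone."""
    return {
        "is_capitalized": word[:1].isupper(),  # first letter is uppercase
        "is_all_caps": word.isupper(),         # e.g. acronyms
        "prefix3": word[:3].lower(),
        "suffix3": word[-3:].lower(),
    }


print(word_features("Levis"))
# {'is_capitalized': True, 'is_all_caps': False, 'prefix3': 'lev', 'suffix3': 'vis'}
print(word_features("virus")["is_capitalized"])  # False
```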

(b) Dimension and time complexity analysis

Fairly simple; omitted.

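(c)(d) Code and analysis

The code is omitted. If you run it from an IDE, remember to set the command-line arguments.
The analysis consists of producing the confusion matrix; ORG is easily misclassified as PER.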
　　

2.3.RNN/GRU for NER

RNN and GRU models are commonly applied to sequence labeling problems; a few points to note follow.

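F1-score cannot be used directly as the loss function

F1-score depends on the whole corpus, and it is discrete and therefore not differentiable, so it cannot be used directly as the loss function; cross-entropy loss is used instead.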
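A tiny sketch of why this matters: nudging the scores slightly leaves the predicted label (and hence any F1 computed from predictions) unchanged, so F1 gives no gradient signal, while cross-entropy responds smoothly. The scores below are invented for illustration.

```python
import math


def softmax(scores):
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, gold_index):
    """Negative log-probability assigned to the gold label."""
    return -math.log(softmax(scores)[gold_index])


def argmax(scores):
    return max(range(len(scores)), key=lambda i: scores[i])


before = [2.0, 1.0, 0.0]
after = [2.1, 1.0, 0.0]   # slightly higher score for the gold label (index 0)

# The hard prediction is identical, so a prediction-based metric cannot tell
# the two apart...
print(argmax(before) == argmax(after))  # True
# ...but the cross-entropy loss decreases.
print(cross_entropy(before, 0) > cross_entropy(after, 0))  # True
```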

Gradient checking and methods for pretraining RNNs

These are covered in the lecture slides.
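For reference, a generic numerical gradient check can be sketched with central differences. This is the standard technique rather than a reproduction of the slides; the function names, step size, and tolerance are assumptions.

```python
def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of df/dx_i for each coordinate of list x."""
    grad = []
    for i in range(len(x)):
        old = x[i]
        x[i] = old + eps
        f_plus = f(x)
        x[i] = old - eps
        f_minus = f(x)
        x[i] = old                      # restore the original value
        grad.append((f_plus - f_minus) / (2 * eps))
    return grad


def gradient_check(f, analytic_grad, x, tol=1e-6):
    """Return True if the analytic gradient matches the numerical estimate."""
    numeric = numerical_gradient(f, x)
    analytic = analytic_grad(x)
    for n, a in zip(numeric, analytic):
        rel_error = abs(n - a) / max(1.0, abs(n), abs(a))
        if rel_error > tol:
            return False
    return True


# Example: f(x) = sum of squares, whose gradient is 2x.
f = lambda x: sum(v * v for v in x)
good_grad = lambda x: [2 * v for v in x]
bad_grad = lambda x: [3 * v for v in x]   # deliberately wrong

point = [1.0, -2.0, 0.5]
print(gradient_check(f, good_grad, point))  # True
print(gradient_check(f, bad_grad, point))   # False
```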

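Exercise on simulating an automaton with a GRU

The assignment includes an exercise on simulating an automaton with a GRU. It mainly involves manipulating inequalities and is fairly simple. It also suggests an approach to pretraining: first sketch the rough automaton model, then set the pretrained parameters randomly, subject to those inequalities.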
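To show the flavor of the inequality reasoning, here is a sketch of a one-dimensional GRU with hard-threshold activations that behaves like a latch: the state stays 0 until the first input 1 and then stays 1. The update convention h = z*h_prev + (1-z)*h_tilde, the threshold activation, and the specific weights are all assumptions made for this illustration, not the assignment's reference solution.

```python
def step(v):
    """Hard threshold standing in for sigmoid/tanh: 1 if v > 0, else 0."""
    return 1.0 if v > 0 else 0.0


def gru_cell(h_prev, x, p):
    z = step(p["U_z"] * h_prev + p["W_z"] * x + p["b_z"])            # update gate
    r = step(p["U_r"] * h_prev + p["W_r"] * x + p["b_r"])            # reset gate
    h_tilde = step(p["U_h"] * (r * h_prev) + p["W_h"] * x + p["b_h"])
    return z * h_prev + (1 - z) * h_tilde


# Illustrative weights chosen to satisfy these inequalities:
#   b_z > 0         (x = 0: z = 1, keep the old state)
#   W_z + b_z <= 0  (x = 1: z = 0, take the candidate state)
#   W_h + b_h > 0   (x = 1: candidate state is 1)
latch = dict(U_z=0.0, W_z=-1.0, b_z=0.5,
             U_r=0.0, W_r=0.0, b_r=0.0,
             U_h=0.0, W_h=1.0, b_h=0.0)


def run(xs, p, h0=0.0):
    h, states = h0, []
    for x in xs:
        h = gru_cell(h, x, p)
        states.append(h)
    return states


print(run([0, 0, 1, 0, 0], latch))  # [0.0, 0.0, 1.0, 1.0, 1.0]
```

Any other weights that satisfy the same inequalities produce the same behavior, which is what makes drawing the parameters randomly within those constraints possible.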
　　
　　
