


The curve itself doesn’t matter. The reviewers even suggested that I use DET curves, as they show the results better.

For detection, we work at the sentence level. For example, if your sentence contains the keyword and your detection decision is YES, you get a correct detection. We didn’t check the alignment explicitly (i.e., whether the boundary of the detected keyword exactly matches the keyword in the sentence), but we only have short sentences, so I guess that’s OK. We make one YES/NO decision for each sentence.

False alarm rate = # of false alarms / # of sentences
False rejection rate = # of false rejections / # of sentences (this is different from the traditional ROC curve)
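
The two rates above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' evaluation script: the names `sentence_level_rates` and `sweep_thresholds`, the toy data, and the idea of turning a per-sentence score into a YES/NO decision with a threshold are all hypothetical. The one thing taken directly from the text is that both rates are divided by the total number of sentences.

```python
from typing import List, Sequence, Tuple


def sentence_level_rates(
    has_keyword: Sequence[bool], detected: Sequence[bool]
) -> Tuple[float, float]:
    """Return (false_alarm_rate, false_rejection_rate).

    Both rates use the total number of sentences as the denominator,
    following the definitions in the text.
    """
    if len(has_keyword) != len(detected):
        raise ValueError("need one label and one decision per sentence")
    n_sentences = len(has_keyword)
    if n_sentences == 0:
        raise ValueError("need at least one sentence")
    pairs = list(zip(has_keyword, detected))
    # False alarm: decision is YES but the sentence has no keyword.
    false_alarms = sum(1 for ref, hyp in pairs if hyp and not ref)
    # False rejection: sentence has the keyword but decision is NO.
    false_rejections = sum(1 for ref, hyp in pairs if ref and not hyp)
    return false_alarms / n_sentences, false_rejections / n_sentences


def sweep_thresholds(
    has_keyword: Sequence[bool],
    scores: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[float, float]]:
    """One (false alarm, false rejection) point per threshold.

    Assumption: each sentence has a single detection score, and the
    decision is YES when the score reaches the threshold.
    """
    points = []
    for threshold in thresholds:
        detected = [score >= threshold for score in scores]
        points.append(sentence_level_rates(has_keyword, detected))
    return points


if __name__ == "__main__":
    # Toy data: five sentences, three of which contain the keyword.
    labels = [True, True, False, False, True]
    scores = [0.9, 0.4, 0.7, 0.1, 0.8]
    for point in sweep_thresholds(labels, scores, [0.3, 0.5, 0.75]):
        print(point)
```

Plotting the resulting points, one per threshold, gives the operating curve. A conventional miss rate would instead divide the false rejections by the number of sentences that contain the keyword, which is the difference from the traditional ROC curve noted above.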


Yes, I’d suggest using Librispeech, as we have 1000 hours for that. You’ll have to do forced alignment to generate the timing information. If you train your system on Librispeech, I’m sure you’ll have the training data alignments somewhere, and you can use those. For the evaluation data, you can use your trained model and the reference to do the alignment, and that should be sufficient, I think (that’s what we did for the paper, though we worked on another dataset that is not publicly available).

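Since the last weekly meeting, I have been segmenting words from the Librispeech test/training set (cutting out templates), trying to pick several of the keywords used by the original author for the experiments, and at the same time modifying the Python script that plots the evaluation curves. The results will be written up in the next post.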
