Setting Up and Testing ROUGE, the Text Summarization Evaluation Tool

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 01:48

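  The most widely used evaluation method for text summarization is ROUGE (Recall-Oriented Understudy for Gisting Evaluation). ROUGE was inspired by BLEU, the automatic evaluation method for machine translation; the difference is that ROUGE uses recall as its primary measure. The basic idea is to score a model-generated summary by the n-gram co-occurrence statistics it shares with the reference summaries.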
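To make the n-gram co-occurrence idea concrete, here is a minimal Python sketch of ROUGE-N recall for a single candidate and a single reference. It is an illustration only: the function names are made up for this example, tokenization is plain whitespace splitting, and the Perl tool applies its own preprocessing (and averages over multiple references), so the numbers will not necessarily match ROUGE-1.5.5.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is never credited more often than it appears
    # in the reference.
    overlap = sum((cand & ref).values())
    return overlap / total


candidate = "the cat was found under the bed"
reference = "the cat was under the bed"
print(rouge_n_recall(candidate, reference, n=1))  # 6 of 6 reference unigrams
print(rouge_n_recall(candidate, reference, n=2))  # 4 of 5 reference bigrams
```

Dividing by the candidate's n-gram count instead of the reference's would give the corresponding precision, which is what the Average_P lines in the tool's output report.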

  The most commonly used implementation today is the one written in Perl, available at:

https://github.com/andersjo/pyrouge/tree/master/tools/ROUGE-1.5.5


  Setting this tool up is still somewhat troublesome, however, so the whole process is recorded here:


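(1) Install Perl. Most Ubuntu environments already include it.

(2) Install the required Perl libraries, chiefly an XML parser.

(3) Prepare the data, mainly the WordNet data. The database file shipped with the release may fail to open, producing the following error: Cannot open exception db file for reading: data/WordNet-2.0.exc.db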


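To fix this, rebuild the exception database as follows (adjust the first path to wherever your ROUGE release is located):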

cd pythonrouge/RELEASE-1.5.5/data/
rm WordNet-2.0.exc.db
./WordNet-2.0-Exceptions/buildExeptionDB.pl ./WordNet-2.0-Exceptions ./smart_common_words.txt ./WordNet-2.0.exc.db
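Before running the test, steps (1) and (2) can be sanity-checked with a small script. The sketch below is not part of the original setup: it assumes the XML parser in question is the Perl module XML::Parser (the original does not name the module), and the helper name is invented for this example. On Ubuntu that module is usually provided by the libxml-parser-perl package, which is likewise an assumption about your system.

```python
import shutil
import subprocess


def perl_module_available(module="XML::Parser"):
    """Return True if `perl` is on PATH and can load the given module."""
    perl = shutil.which("perl")
    if perl is None:
        return False
    # `perl -MModule -e 1` exits with status 0 only if the module loads.
    result = subprocess.run(
        [perl, f"-M{module}", "-e", "1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("perl found:", shutil.which("perl") is not None)
    print("XML::Parser loadable:", perl_module_available())
```

If the second line prints False, install the missing module before moving on; otherwise the ROUGE script will most likely stop with a "Can't locate ..." error from Perl.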


Then run the test:

./ROUGE-1.5.5.pl -e data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a ROUGE-test.xml

 The test file can be downloaded from: ROUGE-test.xml (https://raw.githubusercontent.com/summanlp/evaluation/master/ROUGE-RELEASE-1.5.5/sample-test/ROUGE-test.xml)
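The test command packs many options into one line. The following Python sketch assembles the same argument list with a comment on each flag. The flag descriptions reflect my reading of the script's usage text rather than anything stated in this post, so treat them as assumptions and defer to the script's own usage message; the function names are hypothetical.

```python
import shlex
import subprocess


def build_rouge_command(script="./ROUGE-1.5.5.pl", data_dir="data",
                        config="ROUGE-test.xml"):
    """Assemble the argument list for the test run described in this post."""
    return [
        script,
        "-e", data_dir,  # data directory (WordNet exception DB, stop words)
        "-c", "95",      # confidence level for the reported intervals, in %
        "-2", "-1",      # skip-bigram scoring; -1 means no gap limit
        "-U",            # add unigrams to skip-bigrams (ROUGE-SU)
        "-r", "1000",    # number of bootstrap resampling rounds
        "-n", "4",       # compute ROUGE-1 through ROUGE-4
        "-w", "1.2",     # weight factor for weighted LCS (ROUGE-W)
        "-a",            # evaluate every system listed in the config file
        config,
    ]


def run_rouge(cwd):
    """Run the command inside the ROUGE directory and return its stdout."""
    result = subprocess.run(build_rouge_command(), cwd=cwd,
                            capture_output=True, text=True, check=True)
    return result.stdout


# shlex.join needs Python 3.8 or newer.
print(shlex.join(build_rouge_command()))
```

Calling run_rouge with the path of your ROUGE-1.5.5 directory would execute the test; it is left uncalled here so the sketch can be read and run without the tool installed.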


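The test produces the output below. Note that the run captured here additionally passed -b 75 (score only the first 75 bytes of each system summary) and -m (apply Porter stemming), so the numbers may differ from a run of the shorter command: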

omnisky@omnisky:~/software/ROUGE-1.5.5$ ./ROUGE-1.5.5.pl -e data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -b 75 -m -a ROUGE-test.xml
---------------------------------------------
11 ROUGE-1 Average_R: 0.22536 (95%-conf.int. 0.18124 - 0.27016)
11 ROUGE-1 Average_P: 0.20359 (95%-conf.int. 0.17004 - 0.23980)
11 ROUGE-1 Average_F: 0.21027 (95%-conf.int. 0.17278 - 0.24804)
---------------------------------------------
11 ROUGE-2 Average_R: 0.03522 (95%-conf.int. 0.01812 - 0.05479)
11 ROUGE-2 Average_P: 0.02964 (95%-conf.int. 0.01698 - 0.04433)
11 ROUGE-2 Average_F: 0.03109 (95%-conf.int. 0.01669 - 0.04702)
---------------------------------------------
11 ROUGE-3 Average_R: 0.00243 (95%-conf.int. 0.00000 - 0.00774)
11 ROUGE-3 Average_P: 0.00171 (95%-conf.int. 0.00000 - 0.00545)
11 ROUGE-3 Average_F: 0.00201 (95%-conf.int. 0.00000 - 0.00640)
---------------------------------------------
11 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
11 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
11 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
11 ROUGE-L Average_R: 0.17144 (95%-conf.int. 0.14160 - 0.19972)
11 ROUGE-L Average_P: 0.15459 (95%-conf.int. 0.13442 - 0.17638)
11 ROUGE-L Average_F: 0.15979 (95%-conf.int. 0.13524 - 0.18293)
---------------------------------------------
11 ROUGE-W-1.2 Average_R: 0.10366 (95%-conf.int. 0.08657 - 0.12032)
11 ROUGE-W-1.2 Average_P: 0.14874 (95%-conf.int. 0.13072 - 0.16891)
11 ROUGE-W-1.2 Average_F: 0.12004 (95%-conf.int. 0.10260 - 0.13592)
---------------------------------------------
11 ROUGE-S* Average_R: 0.02919 (95%-conf.int. 0.01857 - 0.04092)
11 ROUGE-S* Average_P: 0.02341 (95%-conf.int. 0.01557 - 0.03158)
11 ROUGE-S* Average_F: 0.02400 (95%-conf.int. 0.01604 - 0.03186)
---------------------------------------------
11 ROUGE-SU* Average_R: 0.06207 (95%-conf.int. 0.04583 - 0.07847)
11 ROUGE-SU* Average_P: 0.05303 (95%-conf.int. 0.04009 - 0.06763)
11 ROUGE-SU* Average_F: 0.05310 (95%-conf.int. 0.04099 - 0.06553)
---------------------------------------------
12 ROUGE-1 Average_R: 0.24238 (95%-conf.int. 0.18967 - 0.29513)
12 ROUGE-1 Average_P: 0.25533 (95%-conf.int. 0.19308 - 0.31825)
12 ROUGE-1 Average_F: 0.24485 (95%-conf.int. 0.19150 - 0.30122)
---------------------------------------------
12 ROUGE-2 Average_R: 0.05210 (95%-conf.int. 0.02453 - 0.08236)
12 ROUGE-2 Average_P: 0.05569 (95%-conf.int. 0.02581 - 0.08922)
12 ROUGE-2 Average_F: 0.05265 (95%-conf.int. 0.02501 - 0.08296)
---------------------------------------------
12 ROUGE-3 Average_R: 0.01023 (95%-conf.int. 0.00114 - 0.02271)
12 ROUGE-3 Average_P: 0.01027 (95%-conf.int. 0.00125 - 0.02146)
12 ROUGE-3 Average_F: 0.00995 (95%-conf.int. 0.00119 - 0.02145)
---------------------------------------------
12 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
12 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
12 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
12 ROUGE-L Average_R: 0.18008 (95%-conf.int. 0.13709 - 0.22455)
12 ROUGE-L Average_P: 0.18728 (95%-conf.int. 0.14248 - 0.23337)
12 ROUGE-L Average_F: 0.18062 (95%-conf.int. 0.13810 - 0.22318)
---------------------------------------------
12 ROUGE-W-1.2 Average_R: 0.10847 (95%-conf.int. 0.08398 - 0.13339)
12 ROUGE-W-1.2 Average_P: 0.17875 (95%-conf.int. 0.13756 - 0.22048)
12 ROUGE-W-1.2 Average_F: 0.13289 (95%-conf.int. 0.10403 - 0.16220)
---------------------------------------------
12 ROUGE-S* Average_R: 0.03833 (95%-conf.int. 0.02085 - 0.05926)
12 ROUGE-S* Average_P: 0.04319 (95%-conf.int. 0.02107 - 0.06921)
12 ROUGE-S* Average_F: 0.03788 (95%-conf.int. 0.01997 - 0.05816)
---------------------------------------------
12 ROUGE-SU* Average_R: 0.07160 (95%-conf.int. 0.04882 - 0.09699)
12 ROUGE-SU* Average_P: 0.08071 (95%-conf.int. 0.05108 - 0.11638)
12 ROUGE-SU* Average_F: 0.07160 (95%-conf.int. 0.04794 - 0.09681)
---------------------------------------------
13 ROUGE-1 Average_R: 0.20161 (95%-conf.int. 0.15184 - 0.25908)
13 ROUGE-1 Average_P: 0.19956 (95%-conf.int. 0.14511 - 0.25978)
13 ROUGE-1 Average_F: 0.20030 (95%-conf.int. 0.14833 - 0.25923)
---------------------------------------------
13 ROUGE-2 Average_R: 0.04886 (95%-conf.int. 0.02609 - 0.07824)
13 ROUGE-2 Average_P: 0.04829 (95%-conf.int. 0.02445 - 0.07861)
13 ROUGE-2 Average_F: 0.04846 (95%-conf.int. 0.02523 - 0.07828)
---------------------------------------------
13 ROUGE-3 Average_R: 0.00887 (95%-conf.int. 0.00250 - 0.01758)
13 ROUGE-3 Average_P: 0.00909 (95%-conf.int. 0.00250 - 0.01804)
13 ROUGE-3 Average_F: 0.00897 (95%-conf.int. 0.00250 - 0.01758)
---------------------------------------------
13 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
13 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
13 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
13 ROUGE-L Average_R: 0.17044 (95%-conf.int. 0.12873 - 0.21975)
13 ROUGE-L Average_P: 0.16849 (95%-conf.int. 0.12400 - 0.22144)
13 ROUGE-L Average_F: 0.16919 (95%-conf.int. 0.12604 - 0.21969)
---------------------------------------------
13 ROUGE-W-1.2 Average_R: 0.10327 (95%-conf.int. 0.08048 - 0.12969)
13 ROUGE-W-1.2 Average_P: 0.16067 (95%-conf.int. 0.12237 - 0.20421)
13 ROUGE-W-1.2 Average_F: 0.12550 (95%-conf.int. 0.09682 - 0.15816)
---------------------------------------------
13 ROUGE-S* Average_R: 0.03974 (95%-conf.int. 0.02107 - 0.06491)
13 ROUGE-S* Average_P: 0.04116 (95%-conf.int. 0.01983 - 0.07039)
13 ROUGE-S* Average_F: 0.04018 (95%-conf.int. 0.02016 - 0.06705)
---------------------------------------------
13 ROUGE-SU* Average_R: 0.06653 (95%-conf.int. 0.04305 - 0.09595)
13 ROUGE-SU* Average_P: 0.06719 (95%-conf.int. 0.04110 - 0.10081)
13 ROUGE-SU* Average_F: 0.06650 (95%-conf.int. 0.04165 - 0.09775)
---------------------------------------------
14 ROUGE-1 Average_R: 0.23816 (95%-conf.int. 0.18633 - 0.28642)
14 ROUGE-1 Average_P: 0.20187 (95%-conf.int. 0.15801 - 0.24672)
14 ROUGE-1 Average_F: 0.21741 (95%-conf.int. 0.16959 - 0.26309)
---------------------------------------------
14 ROUGE-2 Average_R: 0.04832 (95%-conf.int. 0.02575 - 0.07404)
14 ROUGE-2 Average_P: 0.04008 (95%-conf.int. 0.02100 - 0.06148)
14 ROUGE-2 Average_F: 0.04350 (95%-conf.int. 0.02320 - 0.06550)
---------------------------------------------
14 ROUGE-3 Average_R: 0.00626 (95%-conf.int. 0.00129 - 0.01275)
14 ROUGE-3 Average_P: 0.00551 (95%-conf.int. 0.00125 - 0.01106)
14 ROUGE-3 Average_F: 0.00583 (95%-conf.int. 0.00129 - 0.01172)
---------------------------------------------
14 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
14 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
14 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
14 ROUGE-L Average_R: 0.18917 (95%-conf.int. 0.15208 - 0.22554)
14 ROUGE-L Average_P: 0.16072 (95%-conf.int. 0.12931 - 0.19636)
14 ROUGE-L Average_F: 0.17285 (95%-conf.int. 0.13969 - 0.20706)
---------------------------------------------
14 ROUGE-W-1.2 Average_R: 0.11376 (95%-conf.int. 0.09309 - 0.13418)
14 ROUGE-W-1.2 Average_P: 0.15239 (95%-conf.int. 0.12539 - 0.18210)
14 ROUGE-W-1.2 Average_F: 0.12955 (95%-conf.int. 0.10691 - 0.15301)
---------------------------------------------
14 ROUGE-S* Average_R: 0.04052 (95%-conf.int. 0.02377 - 0.05965)
14 ROUGE-S* Average_P: 0.03102 (95%-conf.int. 0.01668 - 0.05095)
14 ROUGE-S* Average_F: 0.03427 (95%-conf.int. 0.01952 - 0.05277)
---------------------------------------------
14 ROUGE-SU* Average_R: 0.07426 (95%-conf.int. 0.05233 - 0.09732)
14 ROUGE-SU* Average_P: 0.05627 (95%-conf.int. 0.03818 - 0.07912)
14 ROUGE-SU* Average_F: 0.06277 (95%-conf.int. 0.04369 - 0.08486)
---------------------------------------------
21 ROUGE-1 Average_R: 0.12268 (95%-conf.int. 0.09798 - 0.14879)
21 ROUGE-1 Average_P: 0.15320 (95%-conf.int. 0.12216 - 0.18730)
21 ROUGE-1 Average_F: 0.13279 (95%-conf.int. 0.10711 - 0.15971)
---------------------------------------------
21 ROUGE-2 Average_R: 0.01529 (95%-conf.int. 0.00592 - 0.02711)
21 ROUGE-2 Average_P: 0.02223 (95%-conf.int. 0.00779 - 0.04143)
21 ROUGE-2 Average_F: 0.01766 (95%-conf.int. 0.00648 - 0.03171)
---------------------------------------------
21 ROUGE-3 Average_R: 0.00146 (95%-conf.int. 0.00000 - 0.00387)
21 ROUGE-3 Average_P: 0.00189 (95%-conf.int. 0.00000 - 0.00500)
21 ROUGE-3 Average_F: 0.00165 (95%-conf.int. 0.00000 - 0.00436)
---------------------------------------------
21 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
21 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
21 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
21 ROUGE-L Average_R: 0.11136 (95%-conf.int. 0.08935 - 0.13337)
21 ROUGE-L Average_P: 0.14091 (95%-conf.int. 0.11120 - 0.17367)
21 ROUGE-L Average_F: 0.12123 (95%-conf.int. 0.09858 - 0.14549)
---------------------------------------------
21 ROUGE-W-1.2 Average_R: 0.07130 (95%-conf.int. 0.05835 - 0.08458)
21 ROUGE-W-1.2 Average_P: 0.14244 (95%-conf.int. 0.11389 - 0.17316)
21 ROUGE-W-1.2 Average_F: 0.09280 (95%-conf.int. 0.07623 - 0.10966)
---------------------------------------------
21 ROUGE-S* Average_R: 0.00977 (95%-conf.int. 0.00502 - 0.01464)
21 ROUGE-S* Average_P: 0.01637 (95%-conf.int. 0.00837 - 0.02539)
21 ROUGE-S* Average_F: 0.01103 (95%-conf.int. 0.00590 - 0.01624)
---------------------------------------------
21 ROUGE-SU* Average_R: 0.03016 (95%-conf.int. 0.02244 - 0.03836)
21 ROUGE-SU* Average_P: 0.05010 (95%-conf.int. 0.03553 - 0.06724)
21 ROUGE-SU* Average_F: 0.03441 (95%-conf.int. 0.02570 - 0.04326)
---------------------------------------------
22 ROUGE-1 Average_R: 0.16619 (95%-conf.int. 0.13350 - 0.20500)
22 ROUGE-1 Average_P: 0.15684 (95%-conf.int. 0.12675 - 0.19079)
22 ROUGE-1 Average_F: 0.15540 (95%-conf.int. 0.12640 - 0.18731)
---------------------------------------------
22 ROUGE-2 Average_R: 0.01970 (95%-conf.int. 0.00940 - 0.03235)
22 ROUGE-2 Average_P: 0.02285 (95%-conf.int. 0.00885 - 0.04183)
22 ROUGE-2 Average_F: 0.01963 (95%-conf.int. 0.00867 - 0.03326)
---------------------------------------------
22 ROUGE-3 Average_R: 0.00267 (95%-conf.int. 0.00000 - 0.00645)
22 ROUGE-3 Average_P: 0.00179 (95%-conf.int. 0.00000 - 0.00439)
22 ROUGE-3 Average_F: 0.00214 (95%-conf.int. 0.00000 - 0.00523)
---------------------------------------------
22 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
22 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
22 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
22 ROUGE-L Average_R: 0.14274 (95%-conf.int. 0.11681 - 0.17154)
22 ROUGE-L Average_P: 0.13564 (95%-conf.int. 0.11091 - 0.16389)
22 ROUGE-L Average_F: 0.13356 (95%-conf.int. 0.11087 - 0.15692)
---------------------------------------------
22 ROUGE-W-1.2 Average_R: 0.08851 (95%-conf.int. 0.07349 - 0.10502)
22 ROUGE-W-1.2 Average_P: 0.13443 (95%-conf.int. 0.10985 - 0.16234)
22 ROUGE-W-1.2 Average_F: 0.10269 (95%-conf.int. 0.08630 - 0.11951)
---------------------------------------------
22 ROUGE-S* Average_R: 0.02048 (95%-conf.int. 0.01204 - 0.03044)
22 ROUGE-S* Average_P: 0.01755 (95%-conf.int. 0.00929 - 0.02714)
22 ROUGE-S* Average_F: 0.01595 (95%-conf.int. 0.00957 - 0.02308)
---------------------------------------------
22 ROUGE-SU* Average_R: 0.04477 (95%-conf.int. 0.03270 - 0.05895)
22 ROUGE-SU* Average_P: 0.04262 (95%-conf.int. 0.02754 - 0.05872)
22 ROUGE-SU* Average_F: 0.03765 (95%-conf.int. 0.02812 - 0.04741)
---------------------------------------------
23 ROUGE-1 Average_R: 0.12235 (95%-conf.int. 0.08927 - 0.15829)
23 ROUGE-1 Average_P: 0.11503 (95%-conf.int. 0.08510 - 0.14914)
23 ROUGE-1 Average_F: 0.11823 (95%-conf.int. 0.08752 - 0.15313)
---------------------------------------------
23 ROUGE-2 Average_R: 0.00681 (95%-conf.int. 0.00000 - 0.01641)
23 ROUGE-2 Average_P: 0.00607 (95%-conf.int. 0.00000 - 0.01473)
23 ROUGE-2 Average_F: 0.00641 (95%-conf.int. 0.00000 - 0.01550)
---------------------------------------------
23 ROUGE-3 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
23 ROUGE-3 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
23 ROUGE-3 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
23 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
23 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
23 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
23 ROUGE-L Average_R: 0.10965 (95%-conf.int. 0.08185 - 0.14105)
23 ROUGE-L Average_P: 0.10383 (95%-conf.int. 0.07819 - 0.13277)
23 ROUGE-L Average_F: 0.10635 (95%-conf.int. 0.07990 - 0.13597)
---------------------------------------------
23 ROUGE-W-1.2 Average_R: 0.06674 (95%-conf.int. 0.05082 - 0.08413)
23 ROUGE-W-1.2 Average_P: 0.10003 (95%-conf.int. 0.07684 - 0.12576)
23 ROUGE-W-1.2 Average_F: 0.07981 (95%-conf.int. 0.06101 - 0.10034)
---------------------------------------------
23 ROUGE-S* Average_R: 0.01001 (95%-conf.int. 0.00430 - 0.01689)
23 ROUGE-S* Average_P: 0.00899 (95%-conf.int. 0.00360 - 0.01568)
23 ROUGE-S* Average_F: 0.00939 (95%-conf.int. 0.00387 - 0.01613)
---------------------------------------------
23 ROUGE-SU* Average_R: 0.02865 (95%-conf.int. 0.01887 - 0.04049)
23 ROUGE-SU* Average_P: 0.02590 (95%-conf.int. 0.01739 - 0.03617)
23 ROUGE-SU* Average_F: 0.02692 (95%-conf.int. 0.01793 - 0.03771)
---------------------------------------------
24 ROUGE-1 Average_R: 0.28540 (95%-conf.int. 0.23134 - 0.34089)
24 ROUGE-1 Average_P: 0.27811 (95%-conf.int. 0.20830 - 0.35585)
24 ROUGE-1 Average_F: 0.26995 (95%-conf.int. 0.21567 - 0.32693)
---------------------------------------------
24 ROUGE-2 Average_R: 0.09309 (95%-conf.int. 0.05629 - 0.13592)
24 ROUGE-2 Average_P: 0.11162 (95%-conf.int. 0.05591 - 0.18063)
24 ROUGE-2 Average_F: 0.09350 (95%-conf.int. 0.05490 - 0.13769)
---------------------------------------------
24 ROUGE-3 Average_R: 0.03075 (95%-conf.int. 0.01081 - 0.05737)
24 ROUGE-3 Average_P: 0.04022 (95%-conf.int. 0.00900 - 0.08575)
24 ROUGE-3 Average_F: 0.03122 (95%-conf.int. 0.00986 - 0.05847)
---------------------------------------------
24 ROUGE-4 Average_R: 0.00860 (95%-conf.int. 0.00000 - 0.02009)
24 ROUGE-4 Average_P: 0.00703 (95%-conf.int. 0.00000 - 0.01639)
24 ROUGE-4 Average_F: 0.00774 (95%-conf.int. 0.00000 - 0.01805)
---------------------------------------------
24 ROUGE-L Average_R: 0.24161 (95%-conf.int. 0.19599 - 0.29178)
24 ROUGE-L Average_P: 0.24108 (95%-conf.int. 0.17423 - 0.31614)
24 ROUGE-L Average_F: 0.23010 (95%-conf.int. 0.18351 - 0.28161)
---------------------------------------------
24 ROUGE-W-1.2 Average_R: 0.14412 (95%-conf.int. 0.11828 - 0.17208)
24 ROUGE-W-1.2 Average_P: 0.22825 (95%-conf.int. 0.16623 - 0.30239)
24 ROUGE-W-1.2 Average_F: 0.16917 (95%-conf.int. 0.13670 - 0.20511)
---------------------------------------------
24 ROUGE-S* Average_R: 0.06608 (95%-conf.int. 0.04152 - 0.09971)
24 ROUGE-S* Average_P: 0.08502 (95%-conf.int. 0.04014 - 0.13988)
24 ROUGE-S* Average_F: 0.05949 (95%-conf.int. 0.03719 - 0.08882)
---------------------------------------------
24 ROUGE-SU* Average_R: 0.10433 (95%-conf.int. 0.07587 - 0.14102)
24 ROUGE-SU* Average_P: 0.12555 (95%-conf.int. 0.06607 - 0.19873)
24 ROUGE-SU* Average_F: 0.09434 (95%-conf.int. 0.06742 - 0.12844)
omnisky@omnisky:~/software/ROUGE-1.5.5$
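Each result line has the form "system metric Average_X: score (confidence interval)", where X is R (recall), P (precision) or F (F-measure). For further processing, the output can be parsed into a dictionary. The sketch below is one possible way to do that; the function name and the regular expression are my own and assume one result per line, as the tool prints it.

```python
import re

# One result line: system ID, metric name, R/P/F, score, interval bounds.
LINE_RE = re.compile(
    r"^(?P<system>\S+)\s+(?P<metric>ROUGE-\S+)\s+Average_(?P<kind>[RPF]):\s+"
    r"(?P<score>\d+\.\d+)\s+\(\d+%-conf\.int\.\s+"
    r"(?P<low>\d+\.\d+)\s+-\s+(?P<high>\d+\.\d+)\)"
)


def parse_rouge_output(text):
    """Map (system, metric, kind) to a (score, low, high) tuple of floats."""
    scores = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # separator lines, shell prompts, blank lines
        key = (m["system"], m["metric"], m["kind"])
        scores[key] = (float(m["score"]), float(m["low"]), float(m["high"]))
    return scores


sample = """\
---------------------------------------------
11 ROUGE-1 Average_R: 0.22536 (95%-conf.int. 0.18124 - 0.27016)
11 ROUGE-1 Average_P: 0.20359 (95%-conf.int. 0.17004 - 0.23980)
11 ROUGE-1 Average_F: 0.21027 (95%-conf.int. 0.17278 - 0.24804)
"""
scores = parse_rouge_output(sample)
print(scores[("11", "ROUGE-1", "F")])
```

With the full output parsed this way, comparing systems on a single metric is a matter of filtering the dictionary keys, for example on ("ROUGE-2", "F").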

