#Paper Reading# Manifold-Ranking Based Topic-Focused Multi-Document Summarization
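Paper title: Manifold-Ranking Based Topic-Focused Multi-Document Summarization
Paper URL: http://www.aaai.org/Papers/IJCAI/2007/IJCAI07-467.pdf
Published at: IJCAI 2007 (CCF Class A)
Overview of the paper:
This paper applies manifold-ranking to extractive multi-document summarization, and its experiments show that the approach performs well.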
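1. Manifold-ranking rests on two assumptions:
① Nearby points are likely to have the same score;
② Points on the same structure (typically a cluster or manifold) are likely to have the same score;
2. Manifold-ranking builds a weighted network in which every pair of nodes is joined by a weighted edge. Scores are propagated between nodes along these edges and updated iteratively until they converge to a stable ranking (similar to PageRank);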
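3. The authors consider two aspects when generating an extractive summary: information richness (how strongly a sentence relates to the topic T) and information novelty (how much it differs from the sentences already in the summary);
4. Information richness
① Each document is split into sentences, which become the nodes of the weighted network. TF-ISF weights are computed to give a matrix D (of size M*N, where M is the vocabulary size and N is the number of sentences). The cosine similarity between every pair of sentences then gives the matrix W, which is symmetrically normalized: S=diag(W*1)^(-1/2)*W*diag(W*1)^(-1/2), where 1 is the all-ones vector, so diag(W*1) holds the row sums of W;
② The sentence scores form a vector f, updated as f(t+1)=α*S*f(t)+(1-α)*y, where α is a hyperparameter and y is 1 for the topic description and 0 for every other sentence. f is iterated until it stabilizes, which gives the final score of each sentence;
③ To account for the difference between sentence pairs within the same document and pairs across different documents, the authors set W=λ1*W(same document)+λ2*W(different documents);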
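The scoring steps in item 4 can be sketched in Python with NumPy. This is a minimal sketch, not the authors' implementation. The whitespace tokenizer, the exact ISF form (1 + log(N/sf)), zeroing the diagonal of W, treating the topic description as node 0 with its own document id, and the values of alpha, lam_intra and lam_inter are all assumptions made for illustration.

```python
import math
from collections import Counter

import numpy as np


def tf_isf_matrix(sentences):
    """Return an M x N term-by-sentence TF-ISF matrix (M terms, N sentences)."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(sentences)
    # Sentence frequency: number of sentences that contain each term.
    sf = Counter(w for toks in tokenized for w in set(toks))
    D = np.zeros((len(vocab), n))
    for j, toks in enumerate(tokenized):
        for w, tf in Counter(toks).items():
            # Assumed ISF variant; the notes do not specify the exact formula.
            D[index[w], j] = tf * (1.0 + math.log(n / sf[w]))
    return D


def affinity(D, doc_ids, lam_intra=1.0, lam_inter=1.0):
    """Cosine similarity between sentences, weighted by intra/inter-document links."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0
    X = D / norms
    W = X.T @ X
    np.fill_diagonal(W, 0.0)  # assumption: no self-loops
    doc_ids = np.asarray(doc_ids)
    same = doc_ids[:, None] == doc_ids[None, :]
    return lam_intra * W * same + lam_inter * W * (~same)


def symmetric_normalize(W):
    """S = diag(W*1)^(-1/2) * W * diag(W*1)^(-1/2)."""
    d = W.sum(axis=1)
    d[d == 0] = 1.0  # isolated nodes keep an all-zero row
    inv_sqrt = 1.0 / np.sqrt(d)
    return inv_sqrt[:, None] * W * inv_sqrt[None, :]


def manifold_rank(S, y, alpha=0.6, tol=1e-8, max_iter=1000):
    """Iterate f(t+1) = alpha*S*f(t) + (1-alpha)*y until the scores stop changing."""
    f = y.astype(float).copy()
    for _ in range(max_iter):
        f_next = alpha * (S @ f) + (1 - alpha) * y
        if np.abs(f_next - f).max() < tol:
            return f_next
        f = f_next
    return f


# Toy data: node 0 is the topic description, nodes 1-4 are document sentences.
sentences = [
    "effects of global warming on polar ice",
    "global warming is melting polar ice caps",
    "polar bears depend on sea ice to hunt",
    "the stock market closed higher today",
    "rising temperatures accelerate ice loss in the arctic",
]
doc_ids = [-1, 0, 0, 1, 1]  # -1 marks the topic node (hypothetical convention)

D = tf_isf_matrix(sentences)
W = affinity(D, doc_ids, lam_intra=0.5, lam_inter=1.0)
S = symmetric_normalize(W)
y = np.zeros(len(sentences))
y[0] = 1.0
f = manifold_rank(S, y, alpha=0.6)
```

Because the update is linear, for 0 ≤ α < 1 the iteration converges to the closed form f* = (1-α)(I-αS)^(-1)y, which makes a convenient check on the loop. In this toy example the sentence that shares the most vocabulary with the topic description ends up ranked highest, and the off-topic sentence lowest.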
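5. Information novelty (diversity)
① When building the summary, the highest-scoring sentence among the remaining ones is extracted at each step;
② After a sentence i is extracted into the summary, the scores of the remaining sentences are reduced. The principle is that a sentence j closely related to i is penalized for its similarity: the more similar j is to i, the larger the reduction;
③ Extraction repeats until enough sentences have been selected;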
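The greedy selection with a similarity penalty can be sketched as follows. The notes do not give the exact penalty formula, so the update used here (subtract omega * sim[j, i] * score of i) is an assumption chosen only to match the stated principle. The names select_with_diversity and omega are hypothetical.

```python
import numpy as np


def select_with_diversity(scores, sim, k, omega=1.0):
    """Greedily pick k items, penalizing candidates similar to those already picked.

    scores : initial ranking scores, shape (n,)
    sim    : pairwise similarities in [0, 1], shape (n, n)
    omega  : hypothetical penalty strength (0 disables the penalty)
    """
    scores = np.asarray(scores, dtype=float)
    sim = np.asarray(sim, dtype=float)
    current = scores.copy()
    remaining = list(range(len(scores)))
    chosen = []
    while remaining and len(chosen) < k:
        # Take the best remaining candidate under the current (penalized) scores.
        i = max(remaining, key=lambda j: current[j])
        chosen.append(i)
        remaining.remove(i)
        for j in remaining:
            # Assumed penalty: grows with similarity to i and with i's own score.
            current[j] -= omega * sim[j, i] * scores[i]
    return chosen


# Sentences 0 and 1 are near-duplicates; sentence 2 is different but scores lower.
scores = [0.9, 0.85, 0.5]
sim = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
with_penalty = select_with_diversity(scores, sim, k=2, omega=1.0)
without_penalty = select_with_diversity(scores, sim, k=2, omega=0.0)
```

With the penalty, the near-duplicate sentence 1 drops below sentence 2 once sentence 0 has been picked. With omega set to 0 the selection falls back to plain top-k by score.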
Experiments
6. Datasets
① DUC2003
② DUC2005
7. Evaluation metric
ROUGE
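For readers unfamiliar with ROUGE, the core of ROUGE-N is n-gram recall against the reference summaries. The function below is a simplified sketch of that idea only. It is not the official ROUGE toolkit: it assumes whitespace tokenization and skips stemming, stopword handling and the toolkit's multi-reference options.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """Clipped n-gram matches divided by the total n-grams in the references."""
    cand = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        # An n-gram is credited at most as many times as it occurs in the candidate.
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


r1 = rouge_n_recall("the cat sat on the mat", ["the cat is on the mat"], n=1)
r2 = rouge_n_recall("the cat sat on the mat", ["the cat is on the mat"], n=2)
```

Here the candidate recovers 5 of the 6 reference unigrams and 3 of the 5 reference bigrams.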
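8. Baselines
① Similarity-Ranking1 simplifies the sentence-scoring step: each sentence is scored directly by its similarity to the topic sentence, and selection still uses the diversity step (i.e. the manifold-ranking step is removed);
② Similarity-Ranking2 is simpler still: it also drops the diversity-based selection used in Similarity-Ranking1 and directly picks the highest-scoring sentences (i.e. both the manifold-ranking and diversity steps are removed);
③ The Lead baseline takes the leading sentences, one by one, from the last document only;
④ The Coverage baseline takes the first sentence of each document;
⑤ The scores of the systems that participated in the corresponding DUC tasks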
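The two positional baselines are simple enough to sketch directly. The sketch assumes each document is a list of sentences, that documents are given in order, and that the summary length is capped by a sentence count (max_sentences is a hypothetical parameter; the actual tasks may cap length differently).

```python
def lead_baseline(documents, max_sentences):
    """Leading sentences of the last document, up to the length budget."""
    return documents[-1][:max_sentences] if documents else []


def coverage_baseline(documents, max_sentences):
    """First sentence of each document in order, up to the length budget."""
    firsts = [doc[0] for doc in documents if doc]
    return firsts[:max_sentences]


docs = [["a1", "a2"], ["b1", "b2"], ["c1", "c2", "c3"]]
lead = lead_baseline(docs, max_sentences=2)
coverage = coverage_baseline(docs, max_sentences=2)
```

Neither baseline looks at the topic, which is why they serve as a lower reference point for the topic-focused methods.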
9. Experimental results
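The above is my personal understanding of the paper. My expertise is limited, so if you spot any mistakes or omissions, please point them out. Thank you!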