TREC Real-Time Summarization Track

Source: Internet | Editor: 程序博客网 | Time: 2024/06/08 17:42

I have been following the TREC RTS task recently, so here are some notes along the way.

Text Retrieval Conference (TREC)

I. Introduction

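RTS (Real-Time Summarization) is a track of the Text Retrieval Conference.

The Text Retrieval Conference (TREC) started in 1992 and is co-sponsored by the U.S. National Institute of Standards and Technology (NIST) and the U.S. Defense Advanced Research Projects Agency (DARPA).

TREC aims to support and encourage research in the information retrieval community by providing the infrastructure needed for large-scale evaluation of text retrieval methods (for example corpora, evaluation methods, and question sets), and to speed up lab-to-product technology transfer.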

[TREC homepage]
[TREC 2017 timeline and tracks]

II. Main tracks in 2017

TREC 2017 includes 8 tracks:

1. Common Core Track (new in 2017)
2. Complex Answer Retrieval Track (new in 2017)
3. Dynamic Domain Track
4. Live QA Track
5. OpenSearch Track
6. Precision Medicine Track
7. Real-Time Summarization Track (new in 2016)
8. Tasks Track

Real-Time Summarization Track

I. Introduction

RTS is a new track introduced at TREC 2016. It can be viewed as a merger of the Microblog (MB, 2010-2015) track and the Temporal Summarization (TS, 2013-2015) track.

For details, see the RTS homepage: [RTS Homepage]

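During the evaluation period, every participating system uses the Twitter streaming API to listen to the Twitter sample stream for content matching the "interest profiles" and performs the evaluation tasks in real time. (The Twitter streaming API provides a sample of roughly 1% of all tweets, free of charge to all registered users.) Note that evaluation times are given in UTC; participants are responsible for converting UTC to their local time so that their systems match the evaluation start and end times.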
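Since the evaluation window is stated in UTC, a small helper for converting to local time can avoid off-by-hours mistakes. The following is a minimal sketch using only the Python standard library; the function name `utc_to_local`, the fixed-offset approach, and the timestamp are illustrative assumptions, not values taken from the track guidelines.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(utc_string: str, offset_hours: float) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' UTC timestamp and shift it to a fixed UTC offset."""
    naive = datetime.strptime(utc_string, "%Y-%m-%d %H:%M:%S")
    aware_utc = naive.replace(tzinfo=timezone.utc)
    local_tz = timezone(timedelta(hours=offset_hours))
    return aware_utc.astimezone(local_tz)


# Hypothetical evaluation start time, for illustration only.
start_local = utc_to_local("2017-07-29 00:00:00", 8)  # UTC+8
print(start_local.isoformat())  # 2017-07-29T08:00:00+08:00
```

A fixed offset ignores daylight saving time; in a region that observes it, a named time zone (for example via the standard `zoneinfo` module) would be the safer choice.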

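Interest profiles are similar to topics in ad hoc retrieval tasks: they represent a user's information need. They can be thought of as the topics a user subscribes to or is interested in.

Since I care more about the data, here are the interest profiles for 2015 and 2016.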

[Interest profiles that were assessed from TREC 2015]

[Additional interest profiles culled from TREC 2015]

[New interest profiles developed for TREC 2016]

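II. What problems is RTS trying to solve?

Taking Twitter as an example, the main task of RTS is to deliver tweets relevant to the topics a user cares about to that user in some way. Here, a topic corresponds to the content of an interest profile, and "some way" corresponds to the two scenarios introduced below. The example on the official site: a user may be interested in poll results for the 2016 U.S. presidential election and want to be notified whenever new results are published. We can imagine two ways of delivering updates: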

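Scenario A: Push notifications
As soon as the system identifies a relevant post, it should immediately send it to the user's mobile phone as a push notification. A pushed notification should satisfy the following three requirements:

1. Relevant (on topic)
2. Timely (provide updates as soon after the actual event occurrence as possible)
3. Novel (users should not be pushed multiple notifications that say the same thing)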

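Scenario B: Email digest
Alternatively, the user may want to receive an email digest about an interest profile that summarizes what happened that day. The results should satisfy the following two requirements:

1. Relevant
2. Novel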
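Both scenarios require novelty, meaning a system should not return several tweets that say the same thing. One simple way to approximate this, shown here purely as an illustrative sketch, is to compare each candidate tweet against those already selected using Jaccard similarity over word sets. The class name `NoveltyFilter`, the whitespace tokenizer, and the 0.6 threshold are all assumptions made for this example; the track does not prescribe any particular method.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase the text and split on whitespace into a set of tokens."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class NoveltyFilter:
    """Remembers accepted tweets and rejects near-duplicates of them."""

    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold  # assumed value, would need tuning
        self.accepted: list[set[str]] = []

    def is_novel(self, text: str) -> bool:
        tokens = tokenize(text)
        for seen in self.accepted:
            if jaccard(tokens, seen) >= self.threshold:
                return False  # too similar to something already pushed
        self.accepted.append(tokens)
        return True


novelty = NoveltyFilter()
print(novelty.is_novel("Candidate A wins Ohio primary"))            # True
print(novelty.is_novel("BREAKING: Candidate A wins Ohio primary"))  # False
print(novelty.is_novel("Polls close in California tonight"))        # True
```

For Scenario A the same filter would run on each tweet as it arrives; for Scenario B it could run once over the day's relevant candidates before the digest is assembled.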

The format of an interest profile is as follows:

{
  "topid" : "MB246",
  "title" : "Greek international debt crisis",
  "description" : "Find information related to the crisis surrounding the Greek debt to international creditors, and the consequences of their possible withdrawal from the European Union.",
  "narrative" : "Given the continuing crisis over the Greek debt to international creditors, such as the International Monetary Fund (IMF), European Central Bank (ECB), and the European Commission, the user is interested in information on how this debt is being handled, including the possible withdrawal of Greece from the euro zone, and the consequences of such a move."
}

title: a short description of the information need
description: a sentence-long elaboration of the information need
narrative: a paragraph-long elaboration of the information need
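Because a profile is plain JSON, it can be loaded with the standard `json` module. The sketch below parses an abbreviated copy of the profile above (the long fields are shortened for space) and extracts lowercase title terms as a naive starting query. The helper names `load_profile` and `title_terms` are hypothetical, and using only the title is a simplifying assumption.

```python
import json

# Abbreviated copy of the MB246 profile; description and narrative are shortened.
PROFILE_TEXT = """
{
  "topid": "MB246",
  "title": "Greek international debt crisis",
  "description": "Find information related to the crisis surrounding the Greek debt to international creditors.",
  "narrative": "The user is interested in information on how this debt is being handled."
}
"""

REQUIRED_FIELDS = ("topid", "title", "description", "narrative")


def load_profile(text: str) -> dict:
    """Parse one interest profile and check that the four expected fields exist."""
    profile = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in profile]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    return profile


def title_terms(profile: dict) -> list[str]:
    """Return the lowercase, whitespace-separated words of the title."""
    return profile["title"].lower().split()


profile = load_profile(PROFILE_TEXT)
print(profile["topid"], title_terms(profile))
# MB246 ['greek', 'international', 'debt', 'crisis']
```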

III. General Evaluation Setup


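For Scenario A (push notifications), the system pushes content that it identifies in real time as relevant to a user's interest profile to the TREC RTS evaluation broker (via a REST API). These notifications are immediately delivered to the mobile phones of a group of assessors.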
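To make the push step concrete, the sketch below builds (but does not send) an HTTP POST request for one tweet using the standard library. The broker host, the `/tweet/<topid>/<tweetid>/<clientid>` path layout, and the client ID are assumptions for illustration only; the actual endpoint and registration procedure should be taken from the official broker documentation.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical broker address; the real one is provided by the track organizers.
BROKER_BASE = "http://broker.example.org"


def build_push_request(topid: str, tweet_id: str, client_id: str,
                       base: str = BROKER_BASE) -> Request:
    """Build a POST request announcing that tweet_id is relevant to profile topid.

    The URL layout used here is an assumed example, not a confirmed API.
    """
    url = f"{base}/tweet/{quote(topid)}/{quote(tweet_id)}/{quote(client_id)}"
    return Request(url, method="POST")


request = build_push_request("MB246", "123456789", "my-client-id")
print(request.get_method(), request.full_url)
# POST http://broker.example.org/tweet/MB246/123456789/my-client-id

# Sending would be done with urllib.request.urlopen(request); it is left out
# here because the host above is only a placeholder.
```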

For Scenario B (email digest), participants upload a list of tweets to the NIST server for evaluation after the evaluation period ends.

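Scenario A is evaluated in two ways: live user-in-the-loop assessments and traditional post hoc batch evaluations. Scenario B is evaluated only with traditional post hoc batch evaluations.

In live user-in-the-loop assessments, assessors label each pushed notification as relevant, relevant but redundant, or not relevant. Traditional post hoc batch evaluations, in addition to labeling, group the relevant tweets into semantic clusters and then apply the evaluation methodology of the TREC 2015 Microblog track.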
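As a rough illustration of how the three live labels could be tallied, the sketch below counts judgments and computes two simple ratios: a strict one that credits only relevant pushes, and a lenient one that also credits relevant-but-redundant pushes. The label strings, the function name `summarize`, and both ratios are assumptions chosen for this example; they are not presented as the official track metrics.

```python
from collections import Counter

# Assumed label strings for the three judgment categories.
LABELS = ("relevant", "redundant", "not_relevant")


def summarize(judgments: list[str]) -> dict[str, float]:
    """Count labels and return strict and lenient precision over all pushes."""
    counts = Counter(judgments)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {"strict": 0.0, "lenient": 0.0}
    strict = counts["relevant"] / total
    lenient = (counts["relevant"] + counts["redundant"]) / total
    return {"strict": strict, "lenient": lenient}


scores = summarize(["relevant", "redundant", "not_relevant", "relevant"])
print(scores)  # {'strict': 0.5, 'lenient': 0.75}
```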

1 0