Introduction to Structured Text Summarization

Source: Internet. Editor: 程序博客网. Time: 2024/06/11 04:07

    Given the dramatic growth of digital content, new solutions are needed to give readers a quick overview of pertinent information without inundating them with irrelevant details. While there has been ample research on automatic summarization methods, the resulting summaries may still be convoluted and hard to absorb. We propose the novel task of structured text summarization, which we address by combining ranking techniques with open information extraction. This method yields an uncluttered, more easily digestible overview of key insights from a text.

    In light of the staggering growth of digital content now vying for our attention at any given point in time, reading every long article that we come across in full detail is no longer practical. For instance, a person who has assumed a leadership role at a multinational company will likely lack the time for an in-depth reading of a long, detailed article recounting, say, the political deliberations surrounding the possible introduction of new labor laws in Thailand. Instead, such a person may just wish to receive a very concise overview of the proposals being discussed. While executives may be able to rely on consultants or staff to provide brief executive summaries, new technological solutions to this problem would be helpful. These could enable us to more quickly get an overview of the key information in a long article without being inundated by irrelevant details.

    Past research along these lines has focused on the task of automatic summarization, which compresses a given text into a shorter version. While this may go a long way, the resulting summaries may still be poorly organized and convoluted.
    We propose the novel task of structured text summarization, which seeks to produce structured lists of textual items that are less cluttered and constitute more easily digestible overviews of key insights from a text. We address this task by combining salience-based ranking techniques, which identify important content, with methods based on open-domain information extraction (Open IE), which convert sentences to a structured form from which the main thoughts are more easily discernible.
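The two-stage idea described above (rank content by salience, then convert the top sentences to a structured form) can be illustrated with a toy sketch. This is not the authors' system: the word-frequency salience score, the copula-based pattern standing in for a real Open IE extractor, and all function names (`rank_sentences`, `extract_triple`, `structured_summary`) are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the source).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were",
             "of", "in", "on", "to", "and", "it", "its", "for"}


def split_sentences(text):
    """Split text into sentences at whitespace following ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def rank_sentences(sentences):
    """Order sentences by a crude salience score: the average
    document-wide frequency of their content words."""
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return sorted(sentences, key=score, reverse=True)


def extract_triple(sentence):
    """Naive stand-in for Open IE: split a sentence at its first copula
    into (subject, relation, object). Returns None if no copula is found."""
    m = re.match(r"^(.*?)\s+(is|are|was|were)\s+(.*?)[.!?]?$", sentence)
    return (m.group(1), m.group(2), m.group(3)) if m else None


def structured_summary(text, top_k=2):
    """Keep the top_k most salient sentences and convert each to a triple."""
    ranked = rank_sentences(split_sentences(text))[:top_k]
    return [t for t in map(extract_triple, ranked) if t is not None]


if __name__ == "__main__":
    demo = ("The cat is a small animal. Dogs bark loudly at night. "
            "The cat is asleep.")
    for subject, relation, obj in structured_summary(demo):
        print(f"{subject} | {relation} | {obj}")
```

A real pipeline would replace the frequency score with a stronger salience model and the regular expression with a proper Open IE extractor. The sketch only shows how the two stages compose into a list of structured items.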


Consider the following input sentences:
      Harvard University is a private Ivy League research university in Cambridge, Massachusetts. Harvard is the United States’ oldest institution of higher learning. 


Our system combines and converts these two sentences into a structured form as follows:

