Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources

Darina Benikova; Margot Mieskes; Christian M. Meyer; Iryna Gurevych

Abstract

Coherent extracts are a novel type of summary combining the advantages of manually created abstractive summaries, which are fluent but difficult to evaluate, and low-quality automatically created extractive summaries, which lack coherence and structure. We use a corpus of heterogeneous documents to address the issue that information seekers usually face – a variety of different types of information sources. We directly extract information from these, but minimally redact and meaningfully order it to form a coherent text. Our qualitative and quantitative evaluations show that quantitative results are not sufficient to judge the quality of a summary and that other quality criteria, such as coherence, should also be taken into account. We find that our manually created corpus is of high quality and that it has the potential to bridge the gap between reference corpora of abstracts and automatic methods producing extracts. Our corpus is available to the research community for further development.

Anthology ID: C16-1099
Volume: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month: December
Year: 2016
Address: Osaka, Japan
Editors: Yuji Matsumoto, Rashmi Prasad
Venue: COLING
Publisher: The COLING 2016 Organizing Committee
Pages: 1039–1050
URL: https://aclanthology.org/C16-1099
Cite (ACL): Darina Benikova, Margot Mieskes, Christian M. Meyer, and Iryna Gurevych. 2016. Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1039–1050, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal): Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources (Benikova et al., COLING 2016)
PDF: https://aclanthology.org/C16-1099.pdf
Code: AIPHES/DBS
