Abstract
Currently the abstractive method and extractive method are two main approaches for automatic document summarization. To fully integrate the relatedness and advantages of both approaches, we propose in this paper a general framework for abstractive summarization which incorporates extractive summarization as an auxiliary task. In particular, our framework is composed of a shared hierarchical document encoder, an attention-based decoder for abstractive summarization, and an extractor for sentence-level extractive summarization. Learning these two tasks jointly with the shared encoder allows us to better capture the semantics in the document. Moreover, we constrain the attention learned in the abstractive task by the salience estimated in the extractive task to strengthen their consistency. Experiments on the CNN/DailyMail dataset demonstrate that both the auxiliary task and the attention constraint contribute to improve the performance significantly, and our model is comparable to the state-of-the-art abstractive models.
Original language | English |
---|---|
Title of host publication | Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data |
Pages | 3-15 |
DOIs | |
Publication status | Published - 2018 |
Externally published | Yes |