Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning

doi:10.18653/v1/2022.emnlp-main.53

Xingwei He, Yeyun Gong^*, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, Siu-Ming Yiu, Nan Duan

^*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

10 Citations (Scopus)

Abstract

Commonsense generation aims to generate a realistic sentence describing a daily scene under the given concepts, which is very challenging, since it requires models to have relational reasoning and compositional generalization capabilities. Previous work focuses on retrieving prototype sentences for the provided concepts to assist generation. They first use a sparse retriever to retrieve candidate sentences, then re-rank the candidates with a ranker. However, the candidates returned by their ranker may not be the most relevant sentences, since the ranker treats all candidates equally without considering their relevance to the reference sentences of the given concepts. Another problem is that re-ranking is very expensive, but only using retrievers will seriously degrade the performance of their generation models. To solve these problems, we propose the metric distillation rule to distill knowledge from the metric (e.g., BLEU) to the ranker. We further transfer the critical knowledge summarized by the distilled ranker to the retriever. In this way, the relevance scores of candidate sentences predicted by the ranker and retriever will be more consistent with their quality measured by the metric. Experimental results on the CommonGen benchmark verify the effectiveness of our proposed method: (1) Our generation model with the distilled ranker achieves a new state-of-the-art result. (2) Our generation model with the distilled retriever even surpasses the previous SOTA.
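The abstract does not spell out the metric distillation rule, so the following is only a minimal sketch of the general idea under stated assumptions: candidate sentences are scored against the reference sentences with a metric, the metric scores are turned into a target distribution, and the ranker is penalized when its own score distribution diverges from that target. The clipped unigram precision used here is a simplified stand-in for BLEU, and the function names, the softmax with a `temperature` parameter, and the KL-divergence loss are illustrative assumptions rather than the authors' formulation.

```python
import math
from collections import Counter


def ngram_precision(candidate, references, n=1):
    """Clipped n-gram precision of a candidate against the references.

    Assumption: a simplified stand-in for BLEU, used only for illustration.
    """
    cand = candidate.lower().split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    if not cand_ngrams:
        return 0.0
    # For each n-gram, keep the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        toks = ref.lower().split()
        ref_ngrams = Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
        for gram, count in ref_ngrams.items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_ngrams.items())
    return clipped / sum(cand_ngrams.values())


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """Hypothetical loss: how far the student's ranking distribution is
    from the distribution implied by the teacher's scores."""
    target = softmax(teacher_scores, temperature)
    predicted = softmax(student_scores)
    return kl_divergence(target, predicted)


references = ["a dog catches a frisbee in the park"]
candidates = [
    "a dog catches a frisbee",
    "the stock market fell today",
    "a dog runs in the park",
]
metric_scores = [ngram_precision(c, references) for c in candidates]

# A ranker that agrees with the metric incurs no loss; one that prefers
# the irrelevant candidate is penalized.
aligned_loss = distillation_loss(metric_scores, metric_scores)
misaligned_loss = distillation_loss(metric_scores, [0.2, 1.0, 0.5])
```

Under the same assumptions, the second stage described in the abstract (transferring knowledge from the distilled ranker to the retriever) could be sketched by calling `distillation_loss` with the ranker's scores as `teacher_scores` and the retriever's scores as `student_scores`.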

Original language	English
Title of host publication	2022 Conference on Empirical Methods in Natural Language Processing
Editors	Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher	Association for Computational Linguistics (ACL)
Pages	839-852
Number of pages	14
ISBN (Electronic)	9781959429401
DOIs	https://doi.org/10.18653/v1/2022.emnlp-main.53
Publication status	Published - 2022
Externally published	Yes
Event	2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration	7 Dec 2022 → 11 Dec 2022

Publication series

Name	Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Conference

Conference	2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/Territory	United Arab Emirates
City	Abu Dhabi
Period	7/12/22 → 11/12/22

Cite this

He, X., Gong, Y., Jin, A.-L., Qi, W., Zhang, H., Jiao, J., Zhou, B., Cheng, B., Yiu, S.-M., & Duan, N. (2022). Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), 2022 Conference on Empirical Methods in Natural Language Processing (pp. 839-852). (Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.53

He, Xingwei ; Gong, Yeyun ; Jin, A-Long et al. / Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning. 2022 Conference on Empirical Methods in Natural Language Processing. editor / Yoav Goldberg ; Zornitsa Kozareva ; Yue Zhang. Association for Computational Linguistics (ACL), 2022. pp. 839-852 (Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022).

@inproceedings{a7594b0ed19947ad9ebe87238759986c,
  title = "Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning",
  author = "Xingwei He and Yeyun Gong and A-Long Jin and Weizhen Qi and Hang Zhang and Jian Jiao and Bartuer Zhou and Biao Cheng and Siu-Ming Yiu and Nan Duan",
  note = "Publisher Copyright: {\textcopyright} 2022 Association for Computational Linguistics.; 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 ; Conference date: 07-12-2022 Through 11-12-2022",
  year = "2022",
  doi = "10.18653/v1/2022.emnlp-main.53",
  language = "English",
  series = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022",
  publisher = "Association for Computational Linguistics (ACL)",
  pages = "839--852",
  editor = "Yoav Goldberg and Zornitsa Kozareva and Yue Zhang",
  booktitle = "2022 Conference on Empirical Methods in Natural Language Processing",
}

He, X, Gong, Y, Jin, A-L, Qi, W, Zhang, H, Jiao, J, Zhou, B, Cheng, B, Yiu, S-M & Duan, N 2022, Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning. in Y Goldberg, Z Kozareva & Y Zhang (eds), 2022 Conference on Empirical Methods in Natural Language Processing. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Association for Computational Linguistics (ACL), pp. 839-852, 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, 7/12/22. https://doi.org/10.18653/v1/2022.emnlp-main.53

TY  - GEN
T1  - Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning
AU  - He, Xingwei
AU  - Gong, Yeyun
AU  - Jin, A-Long
AU  - Qi, Weizhen
AU  - Zhang, Hang
AU  - Jiao, Jian
AU  - Zhou, Bartuer
AU  - Cheng, Biao
AU  - Yiu, Siu-Ming
AU  - Duan, Nan
PY  - 2022
Y1  - 2022
UR  - http://www.scopus.com/inward/record.url?scp=85144754843&partnerID=8YFLogxK
U2  - 10.18653/v1/2022.emnlp-main.53
DO  - 10.18653/v1/2022.emnlp-main.53
M3  - Conference Proceeding
AN  - SCOPUS:85144754843
T3  - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
SP  - 839
EP  - 852
BT  - 2022 Conference on Empirical Methods in Natural Language Processing
A2  - Goldberg, Yoav
A2  - Kozareva, Zornitsa
A2  - Zhang, Yue
PB  - Association for Computational Linguistics (ACL)
T2  - 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Y2  - 7 December 2022 through 11 December 2022
ER  -

He X, Gong Y, Jin AL, Qi W, Zhang H, Jiao J et al. Metric-Guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning. In Goldberg Y, Kozareva Z, Zhang Y, editors, 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (ACL). 2022. p. 839-852. (Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022). doi: 10.18653/v1/2022.emnlp-main.53
