TY - GEN
T1 - Improving Biomedical Claim Detection using Prompt Learning Approaches
AU - Chen, Tong
AU - Stefanidis, Angelos
AU - Jiang, Zhengyong
AU - Su, Jionglong
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Biomedical claim detection is an effective method to uncover negative effects arising from the treatment of disease and to detect misinformation about medications on online platforms. Due to the power of pre-trained language models (PLMs), such as BERT, RoBERTa, and T5, fine-tuned PLMs perform exceptionally well in biomedical claim detection. However, in text classification a gap exists between the objective forms used in pre-training and in fine-tuning of PLMs, preventing these models from taking full advantage of the available information for biomedical claim detection. Motivated by the prompt learning approach, we propose a method in which the classification task is transformed into a masked language modeling task that fully utilizes the mask learning capability of PLMs for better biomedical claim detection. In our method, a template with a mask representing the label is first constructed, and the mask is then filled and mapped to the corresponding label. We use three PLMs as backbone models, i.e., BERT, RoBERTa, and T5, with both hard and mixed templates, which are fully and partially predefined templates, respectively. Experimental results on the BioClaim dataset demonstrate the superiority of the prompt learning methods over the BERT and RoBERTa classification baselines. Furthermore, the T5 model with the mixed template consistently outperforms the rest of the models tested and achieves state-of-the-art performance, with an increase of 5.3% in F1-score compared to previous research on this dataset.
AB - Biomedical claim detection is an effective method to uncover negative effects arising from the treatment of disease and to detect misinformation about medications on online platforms. Due to the power of pre-trained language models (PLMs), such as BERT, RoBERTa, and T5, fine-tuned PLMs perform exceptionally well in biomedical claim detection. However, in text classification a gap exists between the objective forms used in pre-training and in fine-tuning of PLMs, preventing these models from taking full advantage of the available information for biomedical claim detection. Motivated by the prompt learning approach, we propose a method in which the classification task is transformed into a masked language modeling task that fully utilizes the mask learning capability of PLMs for better biomedical claim detection. In our method, a template with a mask representing the label is first constructed, and the mask is then filled and mapped to the corresponding label. We use three PLMs as backbone models, i.e., BERT, RoBERTa, and T5, with both hard and mixed templates, which are fully and partially predefined templates, respectively. Experimental results on the BioClaim dataset demonstrate the superiority of the prompt learning methods over the BERT and RoBERTa classification baselines. Furthermore, the T5 model with the mixed template consistently outperforms the rest of the models tested and achieves state-of-the-art performance, with an increase of 5.3% in F1-score compared to previous research on this dataset.
KW - Claim detection
KW - Natural language processing
KW - Pre-trained language models
KW - Prompt learning
UR - http://www.scopus.com/inward/record.url?scp=85182017974&partnerID=8YFLogxK
U2 - 10.1109/PRML59573.2023.10348317
DO - 10.1109/PRML59573.2023.10348317
M3 - Conference Proceeding
AN - SCOPUS:85182017974
T3 - 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning, PRML 2023
SP - 369
EP - 376
BT - 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning, PRML 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Pattern Recognition and Machine Learning, PRML 2023
Y2 - 4 August 2023 through 6 August 2023
ER -