TY - GEN
T1 - Guardians of Discourse
T2 - 10th IEEE Smart World Congress, SWC 2024
AU - He, Jianfei
AU - Wang, Lilin
AU - Wang, Jiaying
AU - Liu, Zhenyu
AU - Na, Hongbin
AU - Wang, Zimu
AU - Wang, Wei
AU - Chen, Qi
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Identifying offensive language is essential for maintaining safety and sustainability in the social media era. Although large language models (LLMs) have shown encouraging potential in social media analytics, they have not been thoroughly evaluated on offensive language detection, particularly in multilingual environments. For the first time, we evaluate LLMs on multilingual offensive language detection in three languages (English, Spanish, and German) with three LLMs (GPT-3.5, Flan-T5, and Mistral), in both monolingual and multilingual settings. We further examine the impact of different prompt languages and of augmented translation data on the task in non-English contexts. Finally, we discuss how inherent bias in the LLMs and the datasets contributes to mispredictions related to sensitive topics.
KW - large language models
KW - multilingual
KW - offensive language detection
UR - http://www.scopus.com/inward/record.url?scp=105002244485&partnerID=8YFLogxK
U2 - 10.1109/SWC62898.2024.00246
DO - 10.1109/SWC62898.2024.00246
M3 - Conference Proceeding
AN - SCOPUS:105002244485
T3 - Proceedings - 2024 IEEE Smart World Congress, SWC 2024 - 2024 IEEE Ubiquitous Intelligence and Computing, Autonomous and Trusted Computing, Digital Twin, Metaverse, Privacy Computing and Data Security, Scalable Computing and Communications
SP - 1603
EP - 1608
BT - Proceedings - 2024 IEEE Smart World Congress, SWC 2024 - 2024 IEEE Ubiquitous Intelligence and Computing, Autonomous and Trusted Computing, Digital Twin, Metaverse, Privacy Computing and Data Security, Scalable Computing and Communications
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 December 2024 through 7 December 2024
ER -