TY - GEN
T1 - Leveraging Large Language Models for QA Dialogue Dataset Construction and Analysis in Public Services
AU - Wu, Chaomin
AU - Wu, Di
AU - Pan, Yushan
AU - Wang, Hao
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - This paper identifies the limitations of current AI datasets within the public service sector, specifically concerning the human-robot interaction (HRI) context. Existing datasets often lack the necessary interactive features for effective and efficient interactions, hindering the development of customized and emotionally responsive systems. As public service demands become more diverse and complex in HRI, traditional datasets fail to support high-quality interactions, necessitating significant improvements. To address this issue, we introduce a QA dialogue dataset specifically tailored for public service applications, comprising 1208 pairs generated by a large language model. This dataset integrates textual and emotional data, providing detailed annotations for interaction quality and emotional accuracy. Our method includes four stages: data generation, annotation, emotion analysis, and performance evaluation. During the data generation stage, GPT-4 is employed to create a diverse set of dialogues. In the annotation stage, these dialogues are meticulously labeled for quality and emotional content. The emotion analysis stage utilizes various recognition algorithms to process the data. Finally, the performance evaluation stage involves experiments to validate the dataset’s effectiveness. Comparative experiments demonstrate the dataset’s efficacy in enhancing the adaptability and performance of public service robots, underscoring its potential for training AI models to effectively handle real-world dialogues.
AB - This paper identifies the limitations of current AI datasets within the public service sector, specifically concerning the human-robot interaction (HRI) context. Existing datasets often lack the necessary interactive features for effective and efficient interactions, hindering the development of customized and emotionally responsive systems. As public service demands become more diverse and complex in HRI, traditional datasets fail to support high-quality interactions, necessitating significant improvements. To address this issue, we introduce a QA dialogue dataset specifically tailored for public service applications, comprising 1208 pairs generated by a large language model. This dataset integrates textual and emotional data, providing detailed annotations for interaction quality and emotional accuracy. Our method includes four stages: data generation, annotation, emotion analysis, and performance evaluation. During the data generation stage, GPT-4 is employed to create a diverse set of dialogues. In the annotation stage, these dialogues are meticulously labeled for quality and emotional content. The emotion analysis stage utilizes various recognition algorithms to process the data. Finally, the performance evaluation stage involves experiments to validate the dataset’s effectiveness. Comparative experiments demonstrate the dataset’s efficacy in enhancing the adaptability and performance of public service robots, underscoring its potential for training AI models to effectively handle real-world dialogues.
KW - Emotion Analysis
KW - Human-Robot Interaction
KW - Public Service
KW - QA Dialogue Datasets
UR - http://www.scopus.com/inward/record.url?scp=85209773197&partnerID=8YFLogxK
U2 - 10.1007/978-981-97-9431-7_5
DO - 10.1007/978-981-97-9431-7_5
M3 - Conference Proceeding
AN - SCOPUS:85209773197
SN - 9789819794300
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 56
EP - 68
BT - Natural Language Processing and Chinese Computing - 13th National CCF Conference, NLPCC 2024, Proceedings
A2 - Wong, Derek F.
A2 - Wei, Zhongyu
A2 - Yang, Muyun
PB - Springer Science and Business Media Deutschland GmbH
T2 - 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024
Y2 - 1 November 2024 through 3 November 2024
ER -