Leveraging Large Language Models for QA Dialogue Dataset Construction and Analysis in Public Services

Chaomin Wu, Di Wu, Yushan Pan, Hao Wang*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

This paper identifies the limitations of current AI datasets within the public service sector, specifically concerning the human-robot interaction (HRI) context. Existing datasets often lack the necessary interactive features for effective and efficient interactions, hindering the development of customized and emotionally responsive systems. As public service demands become more diverse and complex in HRI, traditional datasets fail to support high-quality interactions, necessitating significant improvements. To address this issue, we introduce a QA dialogue dataset specifically tailored for public service applications, comprising 1208 pairs generated by a large language model. This dataset integrates textual and emotional data, providing detailed annotations for interaction quality and emotional accuracy. Our method includes four stages: data generation, annotation, emotion analysis, and performance evaluation. During the data generation stage, GPT-4 is employed to create a diverse set of dialogues. In the annotation stage, these dialogues are meticulously labeled for quality and emotional content. The emotion analysis stage utilizes various recognition algorithms to process the data. Finally, the performance evaluation stage involves experiments to validate the dataset’s effectiveness. Comparative experiments demonstrate the dataset’s efficacy in enhancing the adaptability and performance of public service robots, underscoring its potential for training AI models to effectively handle real-world dialogues.

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 13th National CCF Conference, NLPCC 2024, Proceedings
Editors: Derek F. Wong, Zhongyu Wei, Muyun Yang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 56-68
Number of pages: 13
ISBN (Print): 9789819794300
Publication status: Published - 2025
Event: 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024 - Hangzhou, China
Duration: 1 Nov 2024 → 3 Nov 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15359 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024
Country/Territory: China
City: Hangzhou
Period: 1/11/24 → 3/11/24

Keywords

  • Emotion Analysis
  • Human-Robot Interaction
  • Public Service
  • QA Dialogue Datasets
