Abstract
Recent multimodal large language models (MLLMs) have demonstrated significant potential in open-ended conversation, generating more accurate and personalized responses. However, their ability to memorize, recall, and reason in sustained interactions within real-world scenarios remains underexplored. This paper introduces MMRC, a Multi-Modal Real-world Conversation benchmark for evaluating six core open-ended abilities of MLLMs: information extraction, multi-turn reasoning, information update, image management, memory recall, and answer refusal. With data collected from real-world scenarios, MMRC comprises 5,120 conversations and 28,720 corresponding manually labeled questions, posing a significant challenge to existing MLLMs. Evaluations of 20 MLLMs on MMRC indicate an accuracy drop during open-ended interactions. We identify four common failure patterns: long-term memory degradation, inadequacies in updating factual knowledge, accumulated assumption of error propagation, and reluctance to “say no.” To mitigate these issues, we propose a simple yet effective NOTE-TAKING strategy, which records key information from the conversation and reminds the model of it during its responses, enhancing conversational capabilities. Experiments across six MLLMs demonstrate significant performance improvements.
| Original language | English |
|---|---|
| Title of host publication | The Annual Meeting of the Association for Computational Linguistics |
| Subtitle of host publication | ACL 2025 |
| Editors | Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar |
| Place of Publication | Vienna, Austria |
| Publisher | Association for Computational Linguistics (ACL) |
| Chapter | 1 |
| Pages | 22477-22503 |
| Number of pages | 27 |
| Volume | 1 |
| Edition | 1 |
| ISBN (Electronic) | 979-8-89176-251-0 |
| ISBN (Print) | 979-8-89176-251-0 |
| DOIs | |
| Publication status | Published - 24 Jul 2025 |
| Event | 63rd Annual Meeting of the Association for Computational Linguistics: ACL 2025 - Vienna, Austria Duration: 27 Jul 2025 → 1 Aug 2025 https://2025.aclweb.org/ |
Conference
| Conference | 63rd Annual Meeting of the Association for Computational Linguistics |
|---|---|
| Country/Territory | Austria |
| City | Vienna |
| Period | 27/07/25 → 1/08/25 |
| Internet address | https://2025.aclweb.org/ |