Abstract
Multimodal Large Language Models (MLLMs) in healthcare suffer from severe confirmation bias, often hallucinating visual details to support initial, potentially erroneous diagnostic hypotheses. Existing Chain-of-Thought (CoT) approaches lack intrinsic correction mechanisms, rendering them vulnerable to error propagation. To bridge this gap, we propose Dialectic-Med, a multi-agent framework that enforces diagnostic rigor through adversarial dialectics. Unlike static consensus models, Dialectic-Med orchestrates a dynamic interplay between three role-specialized agents: a proponent that formulates diagnostic hypotheses; an opponent equipped with a novel visual falsification module that actively retrieves contradictory visual evidence to challenge the proponent; and a mediator that resolves conflicts via a weighted consensus graph. By explicitly modeling the cognitive process of falsification, our framework ensures that diagnostic reasoning is tightly grounded in verified visual regions. Empirical evaluations on MIMIC-CXR-VQA, VQA-RAD, and PathVQA demonstrate that Dialectic-Med achieves state-of-the-art performance while making the reasoning process more trustworthy: beyond accuracy, it significantly improves explanation faithfulness and mitigates hallucinations relative to single-agent baselines.
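As an illustration only, and not the authors' implementation, the proponent/opponent/mediator interplay described in the abstract can be sketched as a toy weighted graph. Every name here (`mediate`, the edge tuples, the weights, and the placeholder labels `diagnosis_A` and `diagnosis_B`) is a hypothetical assumption; the abstract does not specify how the consensus graph is built or scored. In this sketch, positive edges stand for supporting evidence from the proponent, negative edges for contradictory evidence from the opponent, and the mediator simply sums edge weights per hypothesis.

```python
from collections import defaultdict


def mediate(edges):
    """Toy mediator (an assumption, not the paper's method).

    `edges` is a list of (evidence, hypothesis, weight) tuples.
    Positive weights model supporting evidence from the proponent,
    negative weights model contradictory evidence from the opponent.
    Returns the hypothesis with the highest net weight and all scores.
    """
    scores = defaultdict(float)
    for _evidence, hypothesis, weight in edges:
        scores[hypothesis] += weight
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Hypothetical evidence edges; labels and weights are placeholders.
edges = [
    ("proponent: finding in region_1", "diagnosis_A", 0.8),
    ("opponent: contradictory finding in region_1", "diagnosis_A", -0.5),
    ("proponent: finding in region_2", "diagnosis_B", 0.6),
]

best, scores = mediate(edges)
print(best, scores)
```

In this toy run the opponent's counter-evidence lowers the net score of `diagnosis_A` below that of `diagnosis_B`, so the mediator prefers the hypothesis that survived the challenge rather than the one proposed first.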
| Original language | English |
|---|---|
| Title of host publication | The 64th Annual Meeting of the Association for Computational Linguistics |
| Subtitle of host publication | ACL 2026 |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 1-18 |
| Number of pages | 18 |
| Publication status | Accepted/In press - 7 Apr 2026 |
| Event | The 64th Annual Meeting of the Association for Computational Linguistics: ACL 2026, San Diego, California, United States. Duration: 2 Jul 2026 → 7 Jul 2026. https://2026.aclweb.org/ |
Conference
| Conference | The 64th Annual Meeting of the Association for Computational Linguistics |
|---|---|
| Country/Territory | United States |
| City | San Diego |
| Period | 2/07/26 → 7/07/26 |
| Internet address | https://2026.aclweb.org/ |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Medical Multimodal LLMs
- Multi-Agent Systems
- Hallucination Mitigation