Multi-modal Contextual Prompt Learning for Multi-label Classification with Partial Labels

Rui Wang, Zhengxin Pan, Fangyu Wu, Yifan Lv, Bailing Zhang

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

Abstract

Multi-label classification has diverse applications, but current algorithms rely heavily on accurately labeled data, which makes data collection time-consuming and labor-intensive. Learning from partial labels eases this burden, but it presents significant challenges of its own. In this study, we propose Multi-modal Contextual Prompt Learning (MCPL), a novel approach that leverages large-scale visual-language models and exploits the strong image-text alignment in CLIP to address the scarcity of label annotations. We pre-train the visual-language model's encoder on a large number of image-text pairs. We introduce multi-modal contextual prompt learning on both the images and the label text to better utilize the image-label correspondence within CLIP, which improves multi-label classification performance even with partial labels. We also use a coupling function to connect the prompts of the two modalities so that they interact. Extensive experiments on the MS-COCO and VOC2007 datasets demonstrate the superiority of MCPL, which achieves competitive performance.

Original language: English
Title of host publication: Proceedings of the 2024 16th International Conference on Machine Learning and Computing, ICMLC 2024
Publisher: Association for Computing Machinery
Pages: 517-524
Number of pages: 8
ISBN (Electronic): 9798400709234
DOIs
Publication status: Published - 2 Feb 2024
Event: 16th International Conference on Machine Learning and Computing, ICMLC 2024 - Shenzhen, China
Duration: 2 Feb 2024 to 5 Feb 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 16th International Conference on Machine Learning and Computing, ICMLC 2024
Country/Territory: China
City: Shenzhen
Period: 2/02/24 to 5/02/24

Keywords

  • Multi-label classification
  • Partial label
  • Prompt learning
