Abstract
Few-shot classification aims to recognize query samples from novel classes given scarce labeled data, and it remains a challenging problem in machine learning. This paper proposes a Discriminative Feature Enhancement Network (DFENet) that identifies the discriminative features of a novel category. It has three characteristics: (1) a Cross-Modal Guidance Module (CMG-Module) enriches the query vectors by leveraging label information from an additional modality; (2) a Neural-Decoding based Attention Module (NDA-Module) further explores the relationship between the query and support samples, with attention weights that depend on the contribution of each support sample to the reconstructed query. The main intuition behind the NDA-Module is to interpret the self-attention mechanism from a neural decoding perspective, emphasizing that the best reconstruction can serve as a universal guideline; (3) a flexible triplet loss incorporates the semantic context among all classes and separates samples from different classes. Experimental results on two few-shot classification datasets show that DFENet progressively learns a more discriminative representation for novel classes. We also test the proposed method on image retrieval and facial expression recognition tasks, where it shows consistent improvement.
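For readers unfamiliar with triplet losses, the following is a minimal sketch of the standard margin-based form that such losses build on. It is not the flexible triplet loss proposed in the paper, whose exact formulation the abstract does not give. The function names, the Euclidean distance, and the `margin` value are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (an assumption, not the paper's variant).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (hypothetical values, chosen only for illustration).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # same class as the anchor, distance 1
negative = [3.0, 4.0]   # different class, distance 5

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss is clamped to zero.
easy = triplet_loss(anchor, positive, negative)

# Swapping the roles gives a violated triplet: 5 - 1 + 1 = 5.
hard = triplet_loss(anchor, negative, positive)
```

In this form only triplets that violate the margin contribute to the loss, which pushes samples of different classes apart in the embedding space.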
| Original language | English |
|---|---|
| Article number | 124811 |
| Journal | Expert Systems with Applications |
| Volume | 255 |
| DOIs | |
| Publication status | Published - 1 Dec 2024 |
Keywords
- Attention mechanism
- Few-shot learning
- Multi-modal learning
- Neural decoding principle