TY - JOUR
T1 - Discriminative Feature Enhancement Network for Few-Shot Classification and Beyond
AU - Wu, Fangyu
AU - Wang, Qiu-Feng
AU - Liu, Xuan
AU - Chen, Qi
AU - Zhao, Yuxuan
AU - Zhang, Bailing
AU - Lim, Eng Gee
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/12/1
Y1 - 2024/12/1
AB - Few-shot classification aims to recognize query samples from novel classes given scarce labeled data, and it remains a challenging problem in machine learning. This paper proposes a Discriminative Feature Enhancement Network (DFENet) that extracts the discriminative features of a novel category through three components: (1) a Cross-Modal Guidance Module (CMG-Module) enriches the query vectors by leveraging label information from an additional modality; (2) a Neural-Decoding based Attention Module (NDA-Module) further explores the relationship between the query and support samples, with attention weights that depend on the contribution of the support samples to the reconstructed query; (3) a flexible triplet loss incorporates the semantic context among all classes and separates samples from different classes. The main intuition behind the NDA-Module is to interpret the self-attention mechanism from a neural decoding perspective, in which the best reconstruction serves as a universal guideline. Experimental results on two few-shot classification datasets show that DFENet progressively learns a more discriminative representation for novel classes. We also test the proposed method on image retrieval and facial expression recognition tasks, where it shows consistent improvement.
KW - Attention mechanism
KW - Few-shot learning
KW - Multi-modal learning
KW - Neural decoding principle
UR - http://www.scopus.com/inward/record.url?scp=85199889423&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.124811
DO - 10.1016/j.eswa.2024.124811
M3 - Article
SN - 0957-4174
VL - 255
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 124811
ER -