TY - JOUR
T1 - TOAN
T2 - Target-Oriented Alignment Network for Fine-Grained Image Categorization with Few Labeled Samples
AU - Huang, Huaxi
AU - Zhang, Junjie
AU - Yu, Litao
AU - Zhang, Jian
AU - Wu, Qiang
AU - Xu, Chang
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2022/2/1
Y1 - 2022/2/1
AB - In this paper, we study the fine-grained categorization problem under the few-shot setting, i.e., each fine-grained class contains only a few labeled examples, termed Fine-Grained Few-Shot classification (FGFS). The core predicament in FGFS is the high intra-class variance yet low inter-class variance in the dataset. In traditional fine-grained classification, the high intra-class variance can be somewhat relieved by supervised training on abundant labeled samples. However, with few labeled examples, it is hard for an FGFS model to learn a robust class representation under the significantly higher intra-class variance. Moreover, the inter- and intra-class variances are closely related: the significant intra-class variance in FGFS often aggravates the low inter-class variance issue. To address these challenges, we propose a Target-Oriented Alignment Network (TOAN) to tackle the FGFS problem from both intra- and inter-class perspectives. To reduce the intra-class variance, we propose a target-oriented matching mechanism that reformulates the spatial features of each support image to match those of the query in the embedding space. To enhance inter-class discrimination, we devise discriminative fine-grained features by integrating local compositional concept representations with global second-order pooling. We conducted extensive experiments on four public datasets for fine-grained categorization, and the results show that the proposed TOAN achieves state-of-the-art performance.
KW - Few-shot setting
KW - Fine-grained image classification
KW - Second-order relation extraction
UR - http://www.scopus.com/inward/record.url?scp=85102700818&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2021.3065693
DO - 10.1109/TCSVT.2021.3065693
M3 - Article
AN - SCOPUS:85102700818
SN - 1051-8215
VL - 32
SP - 853
EP - 866
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 2
ER -