TY - JOUR
T1 - Coarse-grained generalized zero-shot learning with efficient self-focus mechanism
AU - Yang, Guanyu
AU - Huang, Kaizhu
AU - Zhang, Rui
AU - Goulermas, John Y.
AU - Hussain, Amir
N1 - Funding Information:
The work was partially supported by the following: National Natural Science Foundation of China under No.61876155; Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under No. BE2020006-4B, K20181189, BK20181190; Key Program Special Fund in XJTLU under No. KSF-T-06, and KSF-E-26.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/11/6
Y1 - 2021/11/6
N2 - For image classification in computer vision, the performance of conventional deep neural networks (DNNs) usually drops when labeled training samples are limited. In this case, few-shot learning (FSL), or particularly zero-shot learning (ZSL), i.e. classification of target classes with few or zero labeled training samples, was proposed to imitate the strong learning ability of humans. However, recent investigations show that most existing ZSL models easily overfit and tend to misclassify target instances as one of the classes seen in the training set. To alleviate this problem, we propose an embedding-based ZSL method that introduces a self-focus mechanism, i.e. a focus-ratio quantifying the importance of each dimension, into the model optimization process. The objective function is reconstructed according to these focus-ratios, encouraging the embedding model to focus exclusively on important dimensions in the target space. As the self-focus module only takes part in the training process, the over-fitting knowledge is apportioned to it, and hence the rest of the embedding model becomes more generalized to new classes during testing. Experimental results on four benchmarks, including AwA1, AwA2, aPY and CUB, show that our method outperforms state-of-the-art methods on coarse-grained ZSL tasks without affecting performance on fine-grained ZSL. Additionally, several comparisons demonstrate the superiority of the proposed mechanism.
AB - For image classification in computer vision, the performance of conventional deep neural networks (DNNs) usually drops when labeled training samples are limited. In this case, few-shot learning (FSL), or particularly zero-shot learning (ZSL), i.e. classification of target classes with few or zero labeled training samples, was proposed to imitate the strong learning ability of humans. However, recent investigations show that most existing ZSL models easily overfit and tend to misclassify target instances as one of the classes seen in the training set. To alleviate this problem, we propose an embedding-based ZSL method that introduces a self-focus mechanism, i.e. a focus-ratio quantifying the importance of each dimension, into the model optimization process. The objective function is reconstructed according to these focus-ratios, encouraging the embedding model to focus exclusively on important dimensions in the target space. As the self-focus module only takes part in the training process, the over-fitting knowledge is apportioned to it, and hence the rest of the embedding model becomes more generalized to new classes during testing. Experimental results on four benchmarks, including AwA1, AwA2, aPY and CUB, show that our method outperforms state-of-the-art methods on coarse-grained ZSL tasks without affecting performance on fine-grained ZSL. Additionally, several comparisons demonstrate the superiority of the proposed mechanism.
KW - Coarse-grained
KW - Inductive
KW - Weighted loss
KW - Zero-shot learning
UR - http://www.scopus.com/inward/record.url?scp=85113626620&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2021.08.027
DO - 10.1016/j.neucom.2021.08.027
M3 - Article
AN - SCOPUS:85113626620
SN - 0925-2312
VL - 463
SP - 400
EP - 410
JO - Neurocomputing
JF - Neurocomputing
ER -