TY - JOUR
T1 - Class-aware domain adaptation for improving adversarial robustness
AU - Hou, Xianxu
AU - Liu, Jingxin
AU - Xu, Bolei
AU - Wang, Xiaolong
AU - Liu, Bozhi
AU - Qiu, Guoping
N1 - Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/7
Y1 - 2020/7
N2 - Recent works have demonstrated that convolutional neural networks are vulnerable to adversarial examples, i.e., inputs to machine learning models that an attacker has intentionally designed to cause the models to make a mistake. To improve the adversarial robustness of neural networks, adversarial training has been proposed, in which networks are trained with adversarial examples injected into the training data. However, adversarial training can overfit to a specific type of adversarial attack and can also cause a drop in standard accuracy on clean images. To this end, we propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense that does not directly apply adversarial training. Specifically, we propose to learn domain-invariant features for adversarial examples and clean images via a domain discriminator. Furthermore, we introduce a class-aware component into the discriminator to increase the discriminative power of the network for adversarial examples. We evaluate our newly proposed approach on multiple benchmark datasets. The results demonstrate that our method significantly improves state-of-the-art adversarial robustness against various attacks while maintaining high performance on clean images.
AB - Recent works have demonstrated that convolutional neural networks are vulnerable to adversarial examples, i.e., inputs to machine learning models that an attacker has intentionally designed to cause the models to make a mistake. To improve the adversarial robustness of neural networks, adversarial training has been proposed, in which networks are trained with adversarial examples injected into the training data. However, adversarial training can overfit to a specific type of adversarial attack and can also cause a drop in standard accuracy on clean images. To this end, we propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense that does not directly apply adversarial training. Specifically, we propose to learn domain-invariant features for adversarial examples and clean images via a domain discriminator. Furthermore, we introduce a class-aware component into the discriminator to increase the discriminative power of the network for adversarial examples. We evaluate our newly proposed approach on multiple benchmark datasets. The results demonstrate that our method significantly improves state-of-the-art adversarial robustness against various attacks while maintaining high performance on clean images.
KW - Adversarial robustness
KW - Domain adaptation
UR - http://www.scopus.com/inward/record.url?scp=85084337140&partnerID=8YFLogxK
U2 - 10.1016/j.imavis.2020.103926
DO - 10.1016/j.imavis.2020.103926
M3 - Article
AN - SCOPUS:85084337140
SN - 0262-8856
VL - 99
JO - Image and Vision Computing
JF - Image and Vision Computing
M1 - 103926
ER -