TY - JOUR
T1 - Class-aware domain adaptation for improving adversarial robustness
AU - Hou, Xianxu
AU - Liu, Jingxin
AU - Xu, Bolei
AU - Wang, Xiaolong
AU - Liu, Bozhi
AU - Qiu, Guoping
N1 - Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/7
Y1 - 2020/7
N2 - Recent works have demonstrated that convolutional neural networks are vulnerable to adversarial examples, i.e., inputs to machine learning models that an attacker has intentionally designed to cause the models to make a mistake. To improve the adversarial robustness of neural networks, adversarial training has been proposed, in which networks are trained with adversarial examples injected into the training data. However, adversarial training can overfit to a specific type of adversarial attack and can also cause a drop in standard accuracy on clean images. To this end, we propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense that does not directly apply adversarial training. Specifically, we propose to learn domain-invariant features for adversarial examples and clean images via a domain discriminator. Furthermore, we introduce a class-aware component into the discriminator to increase the discriminative power of the network for adversarial examples. We evaluate our newly proposed approach on multiple benchmark datasets. The results demonstrate that our method significantly improves state-of-the-art adversarial robustness against various attacks while maintaining high performance on clean images.
AB - Recent works have demonstrated that convolutional neural networks are vulnerable to adversarial examples, i.e., inputs to machine learning models that an attacker has intentionally designed to cause the models to make a mistake. To improve the adversarial robustness of neural networks, adversarial training has been proposed, in which networks are trained with adversarial examples injected into the training data. However, adversarial training can overfit to a specific type of adversarial attack and can also cause a drop in standard accuracy on clean images. To this end, we propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense that does not directly apply adversarial training. Specifically, we propose to learn domain-invariant features for adversarial examples and clean images via a domain discriminator. Furthermore, we introduce a class-aware component into the discriminator to increase the discriminative power of the network for adversarial examples. We evaluate our newly proposed approach on multiple benchmark datasets. The results demonstrate that our method significantly improves state-of-the-art adversarial robustness against various attacks while maintaining high performance on clean images.
KW - Adversarial robustness
KW - Domain adaptation
UR - http://www.scopus.com/inward/record.url?scp=85084337140&partnerID=8YFLogxK
U2 - 10.1016/j.imavis.2020.103926
DO - 10.1016/j.imavis.2020.103926
M3 - Article
AN - SCOPUS:85084337140
SN - 0262-8856
VL - 99
JO - Image and Vision Computing
JF - Image and Vision Computing
M1 - 103926
ER -