Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts

Sidong Jiang; Siyuan Wang; Rui Zhang; Xi Yang; Kaizhu Huang

doi:10.1007/978-981-97-1417-9_30

Sidong Jiang^*, Siyuan Wang, Rui Zhang, Xi Yang, Kaizhu Huang

^*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Security and privacy concerns associated with large generative models have recently attracted significant attention. In particular, there is a pressing need to address potential negative issues resulting from the generation of inappropriate images, including explicit, violent, or politically sensitive content. In this work, we propose a lightweight approach to learn cryptographic prompts, named Cipher-prompt, to prevent diffusion models from generating undesirable images that are semantically related to protected prompts. Cipher-prompt utilizes an untargeted attack objective to optimize a black-box model and generate perturbations that maximize the semantic distance between the protected class and the generated images. Therefore, Cipher-prompt does not require retraining or fine-tuning of the generative model or images as the training dataset. To evaluate the effectiveness of our proposed Cipher-prompt, we conduct thorough qualitative and quantitative experiments, measuring the protection failure rate and collateral impact rate. Experimental results show the efficacy of the proposed Cipher-prompt in balancing risk mitigation with the utility of diffusion-based image generation models.
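The abstract describes the core idea only at a high level: query a black-box model and search for a perturbation that pushes its output away from the protected class, with no target class and no retraining. The sketch below illustrates that general pattern with a simple random search. It is not the authors' implementation. The optimizer, the cosine distance, the perturbation budget `eps`, and the function `toy_black_box` (a stand-in for "generate an image and embed it") are all assumptions made for illustration.

```python
import math
import random


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def toy_black_box(prompt_vec):
    """Hypothetical stand-in for the generative model plus an embedder.

    Only its outputs are observed; no gradients are used anywhere.
    """
    return [math.tanh(x) for x in prompt_vec]


def learn_perturbation(prompt_vec, query, steps=200, eps=0.3, sigma=0.05, seed=0):
    """Random search for a bounded perturbation that maximizes distance.

    Untargeted objective: move away from the protected output, toward nothing
    in particular. All hyperparameters here are assumed values.
    """
    rng = random.Random(seed)
    protected = query(prompt_vec)  # reference output for the protected prompt
    delta = [0.0] * len(prompt_vec)
    best = cosine_distance(protected, query(prompt_vec))
    for _ in range(steps):
        # Propose a small random change, clipped to the budget [-eps, eps].
        candidate = [max(-eps, min(eps, d + rng.gauss(0.0, sigma))) for d in delta]
        perturbed = [p + c for p, c in zip(prompt_vec, candidate)]
        score = cosine_distance(protected, query(perturbed))
        if score > best:  # keep the proposal only if the distance grows
            delta, best = candidate, score
    return delta, best


prompt = [0.5, -0.2, 0.8, 0.1]  # hypothetical embedding of a protected prompt
delta, distance = learn_perturbation(prompt, toy_black_box)
print(f"learned distance: {distance:.4f}")
```

Because the search only compares model outputs, it needs neither access to model weights nor a training set of images, which matches the property the abstract claims for the method.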

Original language	English
Title of host publication	Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings
Editors	Jinchang Ren, Amir Hussain, Iman Yi Liao, Rongjun Chen, Kaizhu Huang, Huimin Zhao, Xiaoyong Liu, Ping Ma, Thomas Maul
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	322-332
Number of pages	11
ISBN (Print)	9789819714162
DOIs	https://doi.org/10.1007/978-981-97-1417-9_30
Publication status	Published - 22 May 2024
Event	13th International Conference on Brain Inspired Cognitive Systems, BICS 2023 - Kuala Lumpur, Malaysia; Duration: 5 Aug 2023 → 6 Aug 2023

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	14374 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	13th International Conference on Brain Inspired Cognitive Systems, BICS 2023
Country/Territory	Malaysia
City	Kuala Lumpur
Period	5/08/23 → 6/08/23

Keywords

Black-box attack
Diffusion models
Image protection

Cite this

Jiang, S., Wang, S., Zhang, R., Yang, X., & Huang, K. (2024). Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts. In J. Ren, A. Hussain, I. Y. Liao, R. Chen, K. Huang, H. Zhao, X. Liu, P. Ma, & T. Maul (Eds.), Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings (pp. 322-332). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14374 LNAI). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-97-1417-9_30

Jiang, Sidong ; Wang, Siyuan ; Zhang, Rui et al. / Cipher-Prompt : Towards a Safe Diffusion Model via Learning Cryptographic Prompts. Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings. editor / Jinchang Ren ; Amir Hussain ; Iman Yi Liao ; Rongjun Chen ; Kaizhu Huang ; Huimin Zhao ; Xiaoyong Liu ; Ping Ma ; Thomas Maul. Springer Science and Business Media Deutschland GmbH, 2024. pp. 322-332 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{d53ca57a085a4c15921069fb8233c891,

title = "Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts",

abstract = "Security and privacy concerns associated with large generative models have recently attracted significant attention. In particular, there is a pressing need to address potential negative issues resulting from the generation of inappropriate images, including explicit, violent, or politically sensitive content. In this work, we propose a lightweight approach to learn cryptographic prompts, named Cipher-prompt, to prevent diffusion models from generating undesirable images that are semantically related to protected prompts. Cipher-prompt utilizes an untargeted attack objective to optimize a black-box model and generate perturbations that maximize the semantic distance between the protected class and the generated images. Therefore, Cipher-prompt does not require retraining or fine-tuning of the generative model or images as the training dataset. To evaluate the effectiveness of our proposed Cipher-prompt, we conduct thorough qualitative and quantitative experiments, measuring the protection failure rate and collateral impact rate. Experimental results show the efficacy of the proposed Cipher-prompt in balancing risk mitigation with the utility of diffusion-based image generation models.",

keywords = "Black-box attack, Diffusion models, Image protection",

author = "Sidong Jiang and Siyuan Wang and Rui Zhang and Xi Yang and Kaizhu Huang",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.; 13th International Conference on Brain Inspired Cognitive Systems, BICS 2023 ; Conference date: 05-08-2023 Through 06-08-2023",

year = "2024",

month = may,

day = "22",

doi = "10.1007/978-981-97-1417-9_30",

language = "English",

isbn = "9789819714162",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

volume = "14374",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "322--332",

editor = "Jinchang Ren and Amir Hussain and Liao, {Iman Yi} and Rongjun Chen and Kaizhu Huang and Huimin Zhao and Xiaoyong Liu and Ping Ma and Thomas Maul",

booktitle = "Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings",

}

Jiang, S, Wang, S, Zhang, R, Yang, X & Huang, K 2024, Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts. in J Ren, A Hussain, IY Liao, R Chen, K Huang, H Zhao, X Liu, P Ma & T Maul (eds), Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14374 LNAI, Springer Science and Business Media Deutschland GmbH, pp. 322-332, 13th International Conference on Brain Inspired Cognitive Systems, BICS 2023, Kuala Lumpur, Malaysia, 5/08/23. https://doi.org/10.1007/978-981-97-1417-9_30

Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts. / Jiang, Sidong; Wang, Siyuan; Zhang, Rui et al.
Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings. ed. / Jinchang Ren; Amir Hussain; Iman Yi Liao; Rongjun Chen; Kaizhu Huang; Huimin Zhao; Xiaoyong Liu; Ping Ma; Thomas Maul. Springer Science and Business Media Deutschland GmbH, 2024. p. 322-332 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14374 LNAI).

TY - GEN

T1 - Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts

T2 - 13th International Conference on Brain Inspired Cognitive Systems, BICS 2023

AU - Jiang, Sidong

AU - Wang, Siyuan

AU - Zhang, Rui

AU - Yang, Xi

AU - Huang, Kaizhu

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

PY - 2024/5/22

Y1 - 2024/5/22

AB - Security and privacy concerns associated with large generative models have recently attracted significant attention. In particular, there is a pressing need to address potential negative issues resulting from the generation of inappropriate images, including explicit, violent, or politically sensitive content. In this work, we propose a lightweight approach to learn cryptographic prompts, named Cipher-prompt, to prevent diffusion models from generating undesirable images that are semantically related to protected prompts. Cipher-prompt utilizes an untargeted attack objective to optimize a black-box model and generate perturbations that maximize the semantic distance between the protected class and the generated images. Therefore, Cipher-prompt does not require retraining or fine-tuning of the generative model or images as the training dataset. To evaluate the effectiveness of our proposed Cipher-prompt, we conduct thorough qualitative and quantitative experiments, measuring the protection failure rate and collateral impact rate. Experimental results show the efficacy of the proposed Cipher-prompt in balancing risk mitigation with the utility of diffusion-based image generation models.

KW - Black-box attack

KW - Diffusion models

KW - Image protection

UR - http://www.scopus.com/inward/record.url?scp=85195139314&partnerID=8YFLogxK

U2 - 10.1007/978-981-97-1417-9_30

DO - 10.1007/978-981-97-1417-9_30

M3 - Conference Proceeding

AN - SCOPUS:85195139314

SN - 9789819714162

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 322

EP - 332

BT - Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings

A2 - Ren, Jinchang

A2 - Hussain, Amir

A2 - Liao, Iman Yi

A2 - Chen, Rongjun

A2 - Huang, Kaizhu

A2 - Zhao, Huimin

A2 - Liu, Xiaoyong

A2 - Ma, Ping

A2 - Maul, Thomas

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 5 August 2023 through 6 August 2023

ER -

Jiang S, Wang S, Zhang R, Yang X, Huang K. Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts. In Ren J, Hussain A, Liao IY, Chen R, Huang K, Zhao H, Liu X, Ma P, Maul T, editors, Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings. Springer Science and Business Media Deutschland GmbH. 2024. p. 322-332. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-981-97-1417-9_30