Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts

Sidong Jiang*, Siyuan Wang, Rui Zhang, Xi Yang, Kaizhu Huang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Security and privacy concerns associated with large generative models have recently attracted significant attention. In particular, there is a pressing need to address the generation of inappropriate images, including explicit, violent, or politically sensitive content. In this work, we propose a lightweight approach to learn cryptographic prompts, named Cipher-prompt, to prevent diffusion models from generating undesirable images that are semantically related to protected prompts. Cipher-prompt utilizes an untargeted attack objective to optimize a black-box model and generate perturbations that maximize the semantic distance between the protected class and the generated images. Therefore, Cipher-prompt requires neither retraining or fine-tuning of the generative model nor images as a training dataset. To evaluate the effectiveness of the proposed Cipher-prompt, we conduct thorough qualitative and quantitative experiments, measuring the protection failure rate and the collateral impact rate. Experimental results show the efficacy of Cipher-prompt in balancing risk mitigation with the utility of diffusion-based image generation models.
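The abstract does not give the optimization procedure, so the following is only a minimal sketch of the general idea it describes: a query-only (black-box) search for a bounded prompt perturbation that maximizes the semantic distance between the generated output and a protected class. Everything here is an assumption made for illustration. `toy_black_box` is a hypothetical stand-in for "generate an image and embed it", cosine distance stands in for the unspecified semantic distance, and random-search hill climbing stands in for whatever optimizer the paper actually uses.

```python
import math
import random


def cosine_distance(u, v):
    """Return 1 - cosine similarity; larger means less semantically similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def toy_black_box(prompt_vec):
    """Hypothetical stand-in for a diffusion model followed by an image encoder.

    The search below only queries this function and never uses its gradients.
    """
    n = len(prompt_vec)
    return [math.tanh(prompt_vec[i] + 0.5 * prompt_vec[(i + 1) % n]) for i in range(n)]


def learn_perturbation(black_box, prompt_vec, protected_vec,
                       steps=200, step_size=0.1, budget=1.0, seed=0):
    """Random-search hill climbing on an untargeted objective (an assumption).

    A candidate perturbation is kept only if it moves the black-box output
    further from the protected embedding. Each coordinate is clipped to
    [-budget, budget] so the perturbation stays bounded.
    """
    rng = random.Random(seed)
    delta = [0.0] * len(prompt_vec)

    def score(d):
        output = black_box([p + x for p, x in zip(prompt_vec, d)])
        return cosine_distance(output, protected_vec)

    best = score(delta)
    for _ in range(steps):
        candidate = [max(-budget, min(budget, x + rng.gauss(0.0, step_size)))
                     for x in delta]
        candidate_score = score(candidate)
        if candidate_score > best:
            delta, best = candidate, candidate_score
    return delta, best


if __name__ == "__main__":
    prompt = [0.8, -0.3, 0.5, 0.1]         # toy embedding of a protected prompt
    protected = toy_black_box(prompt)       # toy embedding of the protected class
    delta, distance = learn_perturbation(toy_black_box, prompt, protected)
    print("distance after search:", round(distance, 4))
```

Because the search relies only on forward queries, it mirrors the abstract's point that no retraining or fine-tuning of the generative model is needed. A real system would replace the toy function with actual model queries and a learned image-text embedding.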

Original language: English
Title of host publication: Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings
Editors: Jinchang Ren, Amir Hussain, Iman Yi Liao, Rongjun Chen, Kaizhu Huang, Huimin Zhao, Xiaoyong Liu, Ping Ma, Thomas Maul
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 322-332
Number of pages: 11
ISBN (Print): 9789819714162
DOIs
Publication status: Published - 22 May 2024
Event: 13th International Conference on Brain Inspired Cognitive Systems, BICS 2023 - Kuala Lumpur, Malaysia
Duration: 5 Aug 2023 to 6 Aug 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14374 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Brain Inspired Cognitive Systems, BICS 2023
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 5/08/23 to 6/08/23

Keywords

  • Black-box attack
  • Diffusion models
  • Image protection
