Steering Large Text-to-Image Model for Kandinsky Synthesis Through Preference-Based Prompt Optimization

  • Aven-Le Zhou
  • Wei Wu
  • Yu-Ao Wang
  • Kang Zhang*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Chapter > peer-review

1 Citation (Scopus)

Abstract

With the advancement of neural generative capabilities, the art community has increasingly embraced GenAI (Generative Artificial Intelligence), particularly large text-to-image models, for producing aesthetically compelling results. However, generation is often non-deterministic and demands tedious trial and error, as users frequently struggle to devise prompts that achieve their desired outcomes. This paper introduces a prompting-free generative approach that applies a genetic algorithm and real-time iterative human feedback to optimize prompt generation, enabling the creation of user-preferred abstract art, e.g., in Kandinsky’s Bauhaus style. The proposed two-part approach begins by constructing an Artist Model capable of deterministically generating Kandinsky-style paintings. The second part integrates real-time user feedback to optimize prompt generation, yielding an “Optimized Prompting Model” that adapts to user preferences and automatically generates prompts. Combined with the Artist Model, this approach allows users to create Kandinsky-style paintings tailored to their preferences.
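
The general idea of preference-driven prompt optimization with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary, the genome encoding (a fixed-length list of prompt fragments), the selection scheme, and the `simulated_user` rating function are all assumptions made for illustration. In an interactive setting, the rating function would presumably be replaced by real-time human feedback on the generated images.

```python
import random

# Hypothetical vocabulary of prompt fragments; not taken from the paper.
VOCAB = ["circles", "triangles", "grid", "bold lines", "primary colors",
         "muted palette", "overlapping planes", "floating shapes"]


def random_prompt(rng, length=4):
    """Create a random genome: a fixed-length list of prompt fragments."""
    return [rng.choice(VOCAB) for _ in range(length)]


def crossover(a, b, rng):
    """Single-point crossover between two parent genomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(genome, rng, rate=0.2):
    """Replace each fragment with a random one with probability `rate`."""
    return [rng.choice(VOCAB) if rng.random() < rate else g for g in genome]


def evolve(rate_fn, generations=10, pop_size=8, seed=0):
    """Evolve prompts toward higher ratings; `rate_fn` stands in for the user."""
    rng = random.Random(seed)
    population = [random_prompt(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better-rated half unchanged (elitism), refill by breeding.
        ranked = sorted(population, key=rate_fn, reverse=True)
        parents = ranked[:pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=rate_fn)


def simulated_user(genome):
    """Assumed stand-in for human feedback: prefers circles and primary colors."""
    return genome.count("circles") + genome.count("primary colors")


best = evolve(simulated_user)
print(", ".join(best))
```

Because the better-rated half of each generation is carried over unchanged, the best rating in the population never decreases from one generation to the next, which is a convenient property when each rating costs a human judgment.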
Original language: English
Title of host publication: Artificial Intelligence in Music, Sound, Art and Design (EvoMUSART 2025)
Subtitle of host publication: International Conference on Computational Intelligence in Music, Sound, Art and Design (Part of EvoStar)
Publisher: Springer
Pages: 417-433
Number of pages: 17
Publication status: Published - 20 Apr 2025
Externally published: Yes
