• 11 Posts
  • 70 Comments
Joined 6 months ago
Cake day: April 19th, 2025



  • The order of your prompt tokens does matter. If you really want something, it should go first. As a rule of thumb, you can split the prompt into three parts, as proposed in the perchance t2i beginner guide: Description (who), Composition (where, doing what), Style. In your example, you prioritize the style, followed by an “empty” token (the woman’s name), followed by an okay description. So I’d suggest you use ChatGPT or a similar LLM and ask it to “create a detailed descriptive prompt, using natural language, in the format (description>composition>style) for a Chroma Unlocked text to image generator of a …<insert your description>”. You’ll get much better results. As a side note, Chroma Unlocked, like most Flux models, is trained on a large dataset, so the results vary widely from one generation to the next. You can only achieve character consistency locally on your own machine, using ComfyUI or similar tools.
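    To make the ordering concrete, here is a minimal Python sketch that assembles a prompt in the description > composition > style order. The function name `build_prompt` and the sample text are made up for illustration; this is not part of perchance or any generator's API, just a way to keep the three parts in the right sequence.

```python
def build_prompt(description: str, composition: str, style: str) -> str:
    """Join the three parts in priority order: description, composition, style."""
    parts = [description, composition, style]
    # Skip empty parts and trim whitespace and trailing punctuation
    # so the ", " separator stays clean.
    cleaned = [p.strip().rstrip(".,") for p in parts if p and p.strip()]
    return ", ".join(cleaned)


# Hypothetical example values; replace them with your own character and scene.
prompt = build_prompt(
    description="a woman in her thirties with short red hair and green eyes",
    composition="sitting at a cafe table, reading a book, soft morning light",
    style="35mm film photograph",
)
print(prompt)
```

    The point is only that the description always comes first and the style last, whatever you put in each part.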





  • The AIs have come a long way, especially the recent versions that catch a lot of details, but they are just not there yet for the task you’re describing if you work on the prompt alone. If you really want a detailed scene of three or more people, you will need to dive into the world of LoRAs, ControlNet and other tools, in ComfyUI or a similar program. Another approach would be to make a very detailed image of each of the three characters here on perchance, and then ask Flux.1 Kontext to put them all in one scene doing what you want without changing their appearance. I’ve had very good results with this approach, and it’s much easier, too.















  • Good idea, but it would be a hell of a job to track all the pieces across different lists, plus a specification for each piece saying where it goes. Since I’m lazy, I just did:

    Uniform
      https://user.uploads.dev/file/53321b33603b8f9f386fb870db73ad65.png
      top=batman mask
      mid=batman suit with batman logo on the chest, batman gloves, batman belt
      low=batman pants, batman boots
      rear=batman cape, mask straps on the back of the head
      noZoom=[this.top], [this.mid], [this.low], partially visible batman cape on the sides
    

    and then I just call it with [Uniform[camera]]. Now I can easily filter out different actor specifications.
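    For anyone who finds the lookup easier to read outside of perchance, here is a rough Python analogy of the same idea. It is not perchance syntax; the dictionary `UNIFORM` and the helper `uniform_for` are hypothetical names, and it assumes `camera` holds one of the keys (top, mid, low, rear, noZoom).

```python
# One entry per camera framing, mirroring the properties of the Uniform list.
UNIFORM = {
    "top": "batman mask",
    "mid": "batman suit with batman logo on the chest, batman gloves, batman belt",
    "low": "batman pants, batman boots",
    "rear": "batman cape, mask straps on the back of the head",
}
# noZoom reuses the other entries, like [this.top], [this.mid], [this.low] in the list.
UNIFORM["noZoom"] = ", ".join(
    [
        UNIFORM["top"],
        UNIFORM["mid"],
        UNIFORM["low"],
        "partially visible batman cape on the sides",
    ]
)


def uniform_for(camera: str) -> str:
    """Return the outfit description that matches the camera framing."""
    return UNIFORM[camera]


print(uniform_for("low"))
```

    Everything for one outfit lives in a single place, and the camera value alone decides which part ends up in the prompt.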