What modules and models is Perchance t2i based on?

Koto@lemmy.world · 27 days ago

The most amateur-looking model is Flux.Krea. It’s censored, though, compared with Chroma. Chroma Flash Heun is supposed to run with the Heun sampler at 10 steps. Try that. I’ve tried it, and the results are slightly different from Perchance. I’m pretty sure Perchance applies a model patch and LoRAs, so you’re unlikely to replicate it 100%.

Koto@lemmy.world · 1 month ago

What modules and models is Perchance t2i based on?

Koto@lemmy.world · 2 months ago

The order of your prompt tokens does matter. If you really want something, it should go first. As a rule of thumb, you could split the prompt into three parts, as proposed in the Perchance t2i beginner guide: Description (who), Composition (where, doing what), Style. In your example, you prioritize the style, followed by an “empty” token (the woman’s name), followed by an okay description. So I’d suggest you use ChatGPT or a similar LLM and ask it to “create a detailed descriptive prompt, using natural language, in the format (description>composition>style) for a Chroma Unlocked text to image generator of a …<insert your description>”. You’ll get much better results. As a side note, Chroma Unlocked, like most Flux models, is trained on a large dataset, so the results are very diverse. You can only achieve character consistency locally on your machine, using ComfyUI or similar tools.
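
To make the description > composition > style ordering concrete, here is a minimal Python sketch that assembles a prompt in that priority order. The function name `build_prompt` and the sample text are made up for illustration; this is not part of Perchance or of any guide.

```python
def build_prompt(description: str, composition: str, style: str) -> str:
    """Join the three parts in priority order: description, composition, style."""
    parts = [description, composition, style]
    # Trim whitespace and trailing periods, and drop any part left empty.
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    return ". ".join(cleaned) + "." if cleaned else ""


# Hypothetical example text; the most important part (who) goes first.
prompt = build_prompt(
    description="A woman in her thirties with short auburn hair and a green wool coat",
    composition="standing on a rainy street at night, holding a red umbrella",
    style="cinematic photo, shallow depth of field",
)
print(prompt)
```

The style goes last on purpose, so it no longer crowds out the description of the subject.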

Koto@lemmy.world · 2 months ago

They say the model is Chroma (https://huggingface.co/lodestones/Chroma/tree/main). I have my doubts, though. I’ve tried all kinds of sampler/scheduler combinations using an XY Plot, and nothing came close to the anime style here. The closest was euler_ancestral + beta/normal. There could be a set of LoRAs applied for anime styles; I’m not sure. I’d also like to know.

Koto@lemmy.world · 2 months ago

Most browsers have a grammar check option. You need to look into your browser settings and enable it.

Koto@lemmy.world · 2 months ago (edited)

I don’t think it’s the model. I tried Chroma v48 in my ComfyUI, and it barely ever gives me extra limbs. It could be the resolution. The model was trained at 1024x1024, and the options offered on Perchance don’t match that. But to compensate, Perchance offers great image quality, whether through the workflow (or LoRAs) or some sampler/scheduler combination… I’ve tried it all with the latest Chroma and couldn’t replicate it. It’s amazing here on Perchance.
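
If you want to test the resolution idea locally, a rough Python sketch like the one below picks a width and height for a given aspect ratio while keeping the pixel count close to 1024x1024. Snapping to multiples of 64 is my assumption (a common habit with these models), and `dims_for_aspect` is a made-up helper, not a ComfyUI function.

```python
import math


def dims_for_aspect(aspect_w: int, aspect_h: int,
                    target_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) near target_pixels for the given aspect ratio."""
    ratio = aspect_w / aspect_h
    # Solve width * height = target_pixels with width / height = ratio.
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple (assumed step of 64), never below one step.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


for aspect in [(1, 1), (2, 3), (16, 9)]:
    print(aspect, dims_for_aspect(*aspect))
```

Generating at these sizes and comparing against odd resolutions would show whether the extra limbs really come from the resolution.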

Koto@lemmy.world · 2 months ago (edited)

AIs have come a long way, especially the recent versions that catch a lot of details, but they are just not there yet for the task you’re describing if you work on the prompt alone. If you really want a detailed scene of three or more people, you will need to dive into the world of LoRAs, ControlNet and other tools in ComfyUI or a similar program. Another approach would be to make a very detailed image of each of the three characters here on Perchance and then ask Flux.1 Kontext to put them all in one scene, doing what you want, without changing their appearance. I’ve had very good results with this approach, and it’s much easier, too.

Koto@lemmy.world · 3 months ago

I’d also like to know. One thing is for sure: the base is flux.1-schnell with LoRAs. Except it still accepts negative prompts to some degree (weakly), which Flux shouldn’t do.

Koto@lemmy.world · 3 months ago (edited)

Nunchuks also has a negative prompt, but Chroma seems to deal with fat, ugly lip contouring a lot better. That’s my first impression.

Koto@lemmy.world · 3 months ago

The best ones are installed locally. If you have ComfyUI or a similar interface, you can install ForgeFlux Kontext, which is the best and latest model in the field of image-to-image, imho.

Koto@lemmy.world · 3 months ago

The same prompts lead to the same results, give or take. The OP said FACES and EVERY TIME in caps, suggesting he’s getting the same faces every time he uses the t2i, which is not true. I was just trying to point out that there are ways to describe many details and change that.

Koto@lemmy.world · 3 months ago

Check this, if that’s what you were looking for: https://perchance.org/mytestgen You could do it via dynamic lists, but that’s 100 more words than in the example. If it is indeed what you were looking for, let me know if you need any explanations.

Koto@lemmy.world · 3 months ago (edited)

Sure, camera work brings even an anime image to life. Just a front view every time is boring. Try describing how the camera is looking at the characters or the scene, and you’ll notice the difference in the dynamics. A simple example: this picture tells a story rather than just showing an anime character. And it’s only 3 short sentences about the camera work, angle and focus.

Koto@lemmy.world · 3 months ago

It is achieved by the sum of all descriptions and the word choice. But one phrase/sentence is usually not enough. For example, you want a 2 inch fairy climbing, um, a bottle of water. The word “climb” suggests that the fairy is small enough. If you have frame/camera angle descriptions then you can add phrases like “the subject is taking one-third of the frame”, to give it the perspective relative to the table surface. I suggest that you use https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one. Find a picture in a search, then ask joy-caption to describe it in relative and superlative terms.

Koto@lemmy.world · 3 months ago

Check your HTML part. It has <h1>[title]</h1>, where title is not defined anywhere, and update(), which gives you the [output] error. Both are part of the new-generator template. You can clear that completely and start with [darkroom].

Koto@lemmy.world · 3 months ago

You won’t lose the consumable behaviour on that created list, as you can call .consumableList on it again. Judging by your code, you figured it out!

Koto@lemmy.world · 3 months ago (edited)

I just hit “randomize” several times in my generator, and they all look like different people to me. Of course, you need to fill in a lot of details, which my generator does for you, so you don’t have to type anything.

Koto@lemmy.world · 3 months ago

Yes, it’s possible. The correct syntax would be [a = originalList.consumableList.selectMany(2)]. Then we’ll need to make a list out of that slice by using createPerchanceTree() with the correct escape characters. The function works like this: [newList = createPerchanceTree("slicedList\n\ta\n\tb").slicedList] [newList]
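
For anyone puzzled by the escape characters, here is a small Python sketch of the same idea: pick two items without repeats, then build the list text with a newline and a tab in front of every item. It only mimics the text format shown above. It is not Perchance code, and both helper names are invented for the illustration.

```python
import random


def select_many_without_repeats(items, count, rng=random):
    """Pick `count` distinct items, like drawing from a consumable list."""
    return rng.sample(list(items), count)


def to_list_text(list_name, items):
    """Build 'name\\n\\titem1\\n\\titem2': one newline + tab before each item."""
    return list_name + "".join("\n\t" + item for item in items)


original_list = ["apple", "banana", "cherry", "date"]  # hypothetical items
picked = select_many_without_repeats(original_list, 2)
print(repr(to_list_text("slicedList", picked)))
```

Each `\n\t` starts a new line and indents the item under the list name, which is what makes the text parse as a list with items.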

Koto@lemmy.world · 3 months ago

Flux Kontext is perfect for image-to-image. I recommend installing it locally, amazing stuff.

Koto@lemmy.world · 3 months ago (edited)

https://perchance.org/hm20ndlb7m Not sure if this is what you meant: a dropdown to choose from the list of genres. In Perchance lists, the correct syntax to choose one item is [list.selectOne]. You can assign it to a variable that you can use later, for example: [selectedGenre=genre.selectOne].

Koto@lemmy.world · 3 months ago (edited)

Good idea, but it would be a hell of a job to track all the pieces across different lists, plus a specification for each piece of where it goes. Since I’m lazy, I just did:

Uniform
  https://user.uploads.dev/file/53321b33603b8f9f386fb870db73ad65.png
  top=batman mask
  mid=batman suit with batman logo on the chest, batman gloves, batman belt
  low=batman pants, batman boots
  rear=batman cape, mask straps on the back of the head
  noZoom=[this.top], [this.mid], [this.low], partially visible batman cape on the sides

and then just call it like [Uniform[camera]]. I can easily filter out different actor specifications now.
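
For readers who don’t know Perchance, the lookup above behaves roughly like this Python sketch: one record per outfit, with a field per camera framing, and the current camera value picks the field. The dictionary layout and the `outfit_for` helper are only my illustration of the idea, not how Perchance works internally.

```python
# One outfit, keyed by which part of the body the camera shows.
uniform = {
    "top": "batman mask",
    "mid": "batman suit with batman logo on the chest, batman gloves, batman belt",
    "low": "batman pants, batman boots",
    "rear": "batman cape, mask straps on the back of the head",
}
# The full-body framing reuses the other fields, like [this.top] etc. above.
uniform["noZoom"] = (
    f"{uniform['top']}, {uniform['mid']}, {uniform['low']}, "
    "partially visible batman cape on the sides"
)


def outfit_for(camera: str) -> str:
    """Return the outfit description that matches the camera framing."""
    return uniform[camera]


print(outfit_for("top"))
print(outfit_for("noZoom"))
```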