Is there an open source alternative to Stable Diffusion that would allow describing an image in natural human language rather than as a set of tags? I would like to describe an image using sentences, verbs, and connecting words such as "with" and "and", for example: "A man is standing in a dark corridor, holding his hat in his left hand. Blue glowing liquid is dripping down his red coat." Describing an image with tags does not allow as much flexibility in customizing the desired result as natural language does.
We already have such a mechanism in popular closed-source neural networks. Why can't we get the same mechanism in open source? Or is there such an alternative?
Flux is the latest state-of-the-art open source image generator: https://blackforestlabs.ai/ The URL the other person gave is from the developers of Flux; what you linked is a third-party UI that has Flux support. There are a lot of UIs that have Flux support, and many online services offer Flux generation. There are three versions of Flux, each bigger than the previous, and even the smallest takes more resources to run than Stable Diffusion.
ComfyUI and Automatic1111 are popular UIs with Flux support. The easiest way to download the Flux checkpoint(s) is from https://civitai.com/models/618692/flux. Schnell is the smallest version, then dev, and it says you can't download pro. You can also generate Flux images on Civitai. Press the "create" button on the page I linked and the create UI will appear. Change the version to "draft"; that's actually schnell, and it doesn't cost 150 buzz to make an image.
Edit: Turns out Automatic1111 does not support Flux! Thanks /u/atakariax. I got confused because Forge and Automatic1111 have very similar UIs.
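For anyone who would rather skip the UIs entirely, here is a rough sketch of running schnell from Python with the sentence-style prompt from the question. None of this is from the thread: it assumes the Hugging Face diffusers library (with torch installed), the "black-forest-labs/FLUX.1-schnell" model id on the Hugging Face Hub, and enough RAM/VRAM to load the model. The helper names and the output filename are made up for illustration, and the sampler settings are the ones commonly suggested for schnell rather than anything definitive.

```python
# Sketch only: assumes `pip install torch diffusers transformers accelerate`
# and that the model weights can be downloaded from the Hugging Face Hub.

# Full-sentence prompt taken from the original question.
PROMPT = (
    "A man is standing in a dark corridor, holding his hat in his left hand. "
    "Blue glowing liquid is dripping down his red coat."
)


def schnell_settings(prompt):
    """Build call arguments for the pipeline (hypothetical helper).

    schnell is a few-step model, so a handful of inference steps and a
    guidance scale of 0.0 are the usual starting point.
    """
    return {
        "prompt": prompt,
        "guidance_scale": 0.0,
        "num_inference_steps": 4,
        "max_sequence_length": 256,
    }


def generate(prompt, out_path="flux_schnell.png"):
    """Load the pipeline, render one image, and save it (hypothetical helper)."""
    # Imported here so the settings helper works without the heavy dependencies.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell",  # assumed Hub model id
        torch_dtype=torch.bfloat16,
    )
    # Moves submodules to the GPU only while they are needed, to save VRAM.
    pipe.enable_model_cpu_offload()

    image = pipe(**schnell_settings(prompt)).images[0]
    image.save(out_path)
    return out_path


# generate(PROMPT)  # uncomment to run; the first call downloads the model
```

The prompt goes in as ordinary sentences, with no tag list needed. If memory is tight, the hosted options mentioned above are probably the easier route.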