StableDiffusion AI

About 2 months ago I came across DALL-E mini / Craiyon… an AI that generates pictures from a text prompt. A small toy: you enter some text, wait about a minute, and an image jumps out of the box. The image was supposed to be a graphical representation of the prompt. In reality it looked more like the result of teaching Picasso in primary school. In other words, it was hard to use it for anything other than some fun…

Then about a month ago I stumbled upon MidJourney and my jaw dropped through the floor. Same mechanism: enter text, maybe some additional parameters, and wait for pictures to emerge from the void. But this one generates fantastic, artistic images with a high level of detail and stunning colours, surprisingly faithful to the prompt and in many distinct variations. This service is a graphical revolution! So-called artificial intelligence, trained on millions of images (praise those artists and photographers!), spits out digital compilations that no one ever dreamed of. Yet MidJourney has some flaws and limitations, mainly two connected ones: all generated images are public, and there is a lot of censorship. So I admire those creations, but somehow I did not use it at all.

A bit later StableDiffusion appeared on the internets. It is an open-source AI with the same purpose, generating images from prompts, but free and generally not limited. And then, thanks to wonderful developers, GUI applications for StableDiffusion appeared, which one can install and use locally on a PC. At first the minimum requirement was an NVIDIA GPU with 8 GB of VRAM; then new optimized versions arrived that can run even on the CPU alone. I used a program created by GRisk (https://grisk.itch.io/): a simple user interface that lets you enter text, set a few parameters, and generate without limits.

And so I am having a lot of fun with this, and I definitely recommend this fun to everyone, especially those with artistic flair. The generated images are the effect of learning from the works of many talented artists and photographers, so they are kinda the art of humanity, and the one writing the prompt is kinda like a wizard, an explorer of varying imaginations. In addition, there are programs that let you do more with the images, in particular GFPGAN to correct faces and RealESRGAN to increase resolution; some GUIs have these built in. There are also plugins for popular digital art programs like Krita or Photoshop, which integrate generation and regeneration of images straight into the digital canvas and make creation even more exciting by letting you compose bigger pictures and add missing elements.