This is a diffusion-based text-to-image model (Playground v2.5) designed to generate highly aesthetic images at 1024x1024 pixels, including portraits and landscapes. It is the successor to Playground v2 and, in user studies focused on aesthetic quality, outperforms models such as SDXL, PixArt-α, DALL-E 3, and Midjourney 5.2.
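For illustration, a minimal text-to-image sketch using the diffusers library; the Hub repo id playgroundai/playground-v2.5-1024px-aesthetic and the fp16 variant are assumptions, not something this catalog specifies.

```python
# Minimal sketch: text-to-image with diffusers.
# The repo id below is an assumption (Playground v2.5 on the HF Hub).
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "playgroundai/playground-v2.5-1024px-aesthetic",  # assumed repo id
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = pipe(
    prompt="cinematic portrait of an astronaut at golden hour",
    width=1024,
    height=1024,
).images[0]
image.save("portrait.png")
```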
This is a text-to-image generation model built on the architecture of Stable Diffusion XL (SDXL); it is a fine-tuned version of the base model stabilityai/stable-diffusion-xl-base-1.0.
SDXL-Turbo is a distilled version of SDXL 1.0 that can sample a large-scale foundational image diffusion model in 1 to 4 steps while maintaining high image quality.
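A sketch of the few-step sampling this enables, assuming the stabilityai/sdxl-turbo checkpoint and the diffusers library; guidance is disabled because the distilled model was trained without classifier-free guidance.

```python
# Few-step sampling sketch for SDXL-Turbo (repo id assumed).
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = pipe(
    prompt="a watercolor fox in a snowy forest",
    num_inference_steps=1,  # 1-4 steps are enough for SDXL-Turbo
    guidance_scale=0.0,     # the distilled model does not use CFG
).images[0]
image.save("fox.png")
```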
Kandinsky-3 is a diffusion model for text-to-image generation, building on the earlier Kandinsky 2.x family. Its training data was expanded, including material related to Russian culture, so it can generate images reflecting that theme. Larger text-encoder and diffusion U-Net components also give it better text understanding and improved visual quality.
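A loading sketch, under the assumption that the model is published as kandinsky-community/kandinsky-3 and supported by a recent diffusers release:

```python
# Kandinsky 3 text-to-image sketch; repo id and fp16 variant assumed.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3",  # assumed repo id
    variant="fp16",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a cat painted in the Palekh folk-art style").images[0]
image.save("palekh_cat.png")
```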
This is a diffusion model developed by Stability AI for generating short video clips from a single static image (image-to-video). It produces videos up to 4 seconds long (25 frames at 576×1024 resolution), using the input image as the conditioning frame.
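An image-to-video sketch, assuming the stabilityai/stable-video-diffusion-img2vid-xt checkpoint; the input is resized to the 1024x576 training resolution and the 25 frames are written out at 7 fps.

```python
# Image-to-video sketch with Stable Video Diffusion (repo id assumed).
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",  # assumed repo id
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# The input image conditions the whole clip; match the training resolution.
image = load_image("input.jpg").resize((1024, 576))
frames = pipe(image, num_frames=25, decode_chunk_size=8).frames[0]
export_to_video(frames, "clip.mp4", fps=7)
```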
Stable Diffusion XL-base-1.0 is the base text-to-image generation model, improved over previous Stable Diffusion versions. It is designed to generate images at 1024x1024 pixels; image sizes below 512x512 pixels are not recommended.
The refiner model specializes in the final denoising steps and improves the visual fidelity of images produced by the base model, as in the sketch below.
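A sketch of the base-plus-refiner handoff: the base model stops denoising early and passes its latents to the refiner (the refiner repo id stabilityai/stable-diffusion-xl-refiner-1.0 is assumed).

```python
# Base + refiner handoff sketch: the base stops at 80% of the schedule
# and the refiner finishes the remaining denoising steps.
import torch
from diffusers import DiffusionPipeline

base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",  # assumed repo id
    text_encoder_2=base.text_encoder_2,  # share components to save memory
    vae=base.vae,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

prompt = "a lighthouse on a cliff at dawn, oil painting"
latents = base(prompt, denoising_end=0.8, output_type="latent").images
image = refiner(prompt, denoising_start=0.8, image=latents).images[0]
image.save("lighthouse.png")
```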
Blue pencil-XL is a text-to-image generation model designed to create anime-style images.
Kandinsky 2.2 is a free Russian text-to-image neural network developed by Sber AI. It is a diffusion model: during training, noise is progressively added to the training images, and the model learns to restore them through the reverse diffusion process, which at inference lets it create new, unique images.
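A usage sketch assuming the kandinsky-community/kandinsky-2-2-decoder repo; in diffusers the combined pipeline wraps both the image-embedding prior and the decoder behind one call.

```python
# Kandinsky 2.2 sketch; the combined pipeline runs the prior + decoder.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder",  # assumed repo id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a red cat in the style of a folk lubok print",
    negative_prompt="low quality, blurry",
).images[0]
image.save("lubok_cat.png")
```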
Anything V3 is a Stable Diffusion-based model specialized in generating high-quality anime-style images.
Stable Diffusion v2-1 is a diffusion-based text-to-image generation model, fine-tuned from the Stable Diffusion v2 checkpoint. It generates high-resolution images (up to 768x768 pixels) with text-guided generation.
Stable Diffusion x4 upscaler is a diffusion model for 4x image-resolution upscaling guided by text prompts. It takes a low-resolution image and a text prompt as input, along with a noise_level parameter that controls how much noise is added to the input image before upscaling.
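A sketch showing where noise_level fits, assuming the stabilityai/stable-diffusion-x4-upscaler checkpoint; higher values add more noise to the low-resolution input, giving the model more freedom to invent detail.

```python
# 4x upscaling sketch; noise_level controls noise added to the input.
import torch
from diffusers import StableDiffusionUpscalePipeline
from diffusers.utils import load_image

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",  # assumed repo id
    torch_dtype=torch.float16,
).to("cuda")

low_res = load_image("cat_128.png")  # e.g. 128x128 in -> 512x512 out
image = pipe(
    prompt="a white cat, sharp photo",
    image=low_res,
    noise_level=20,  # pipeline default; raise for more aggressive detail
).images[0]
image.save("cat_512.png")
```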
This model, based on Stable Diffusion, is designed to generate pixel-art sprite sheets of a character from four angles: front, right, back, and left.
Stable Diffusion v1.5 is a diffusion model for image generation based on textual prompts. It was initialized with the weights of the previous version, Stable Diffusion v1.2, and subsequently fine-tuned. It supports image generation at 512×512 pixels and text-guided image modification (img2img).
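To illustrate the text-guided modification path, an img2img sketch assuming the classic runwayml/stable-diffusion-v1-5 repo id; strength sets how far the output may drift from the input image.

```python
# img2img sketch with SD v1.5; strength balances input fidelity vs. prompt.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed repo id
    torch_dtype=torch.float16,
).to("cuda")

init = load_image("sketch.png").resize((512, 512))
image = pipe(
    prompt="a detailed fantasy castle, matte painting",
    image=init,
    strength=0.75,  # 0.0 keeps the input, 1.0 ignores it
).images[0]
image.save("castle.png")
```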