- Text-to-Image, Image-to-Image, Inpainting, and Outpainting pipelines. Our pipelines support the exact same parameters as the Stable Diffusion Web UI, so you can easily replicate Web UI creations with the SDK.
- Upscaling pipelines that can run inference for any ESRGAN or Real-ESRGAN upscaler in a few lines of code.
- An integration with CivitAI to download models directly from the website.
Join our Discord!!
We have a Colab demo where you can run many of the operations of Auto 1111 SDK. Check it out here!!
We recommend installing Auto 1111 SDK from PyPI in a virtual environment. We do not yet support conda environments.
```
pip3 install auto1111sdk
```
To install the latest version of Auto 1111 SDK (which now includes ControlNet), run:

```
pip3 install git+https://github.com/saketh12/Auto1111SDK.git
```
Generating images with Auto 1111 SDK is super easy. Text-to-Image, Image-to-Image, Inpainting, Outpainting, and Stable Diffusion Upscale all run through a single pipeline object, which saves a significant amount of RAM compared to solutions that require a separate pipeline object for each operation.
```python
from auto1111sdk import StableDiffusionPipeline

pipe = StableDiffusionPipeline("<Path to your local safetensors or checkpoint file>")

prompt = "a picture of a brown dog"
output = pipe.generate_txt2img(prompt=prompt, height=1024, width=768, steps=10)

output[0].save("image.png")
```
Right now, ControlNet only works with fp32. We are adding support for fp16 very soon.
```python
from auto1111sdk import StableDiffusionPipeline, ControlNetModel

model = ControlNetModel(model="<THE CONTROLNET MODEL FILE NAME (WITHOUT EXTENSION)>",
                        image="<PATH TO IMAGE>")
pipe = StableDiffusionPipeline("<Path to your local safetensors or checkpoint file>", controlnet=model)

prompt = "a picture of a brown dog"
output = pipe.generate_txt2img(prompt=prompt, height=1024, width=768, steps=10)

output[0].save("image.png")
```
Find the instructions here. Contributed by Marco Guardigli, [email protected]
We have more detailed examples and documentation of how you can use Auto 1111 SDK here. For a detailed comparison between Auto 1111 SDK and Hugging Face Diffusers, you can read this. For a detailed guide on how to use SDXL, we recommend reading this.
- Original txt2img and img2img modes
- Real-ESRGAN and ESRGAN upscaling (compatible with any .pth file)
- Outpainting
- Inpainting
- Stable Diffusion Upscale
- Attention, specify parts of text that the model should pay more attention to
  - a man in a `((tuxedo))` - will pay more attention to tuxedo
  - a man in a `(tuxedo:1.21)` - alternative syntax
  - select text and press `Ctrl+Up` or `Ctrl+Down` (or `Command+Up` or `Command+Down` if you're on macOS) to automatically adjust attention to selected text (code contributed by anonymous user)
- Composable Diffusion: a way to use multiple prompts at once
  - separate prompts using uppercase `AND`
  - also supports weights for prompts: `a cat :1.2 AND a dog AND a penguin :2.2`
- Works with a variety of samplers
- Download Stable Diffusion models and RealESRGAN checkpoints directly from CivitAI
- Set custom VAE: works for any model including SDXL
- Support for SDXL with Stable Diffusion XL Pipelines
- Pass in custom arguments to the models
- No 77 prompt token limit (unlike Huggingface Diffusers, which has this limit)
- Adding support for Hires Fix and Refiner parameters for inference
- Adding support for LoRAs
- Adding support for face restoration
- Adding support for the Dreambooth training script
- Adding support for custom extensions like ControlNet
We will be adding support for these features very soon. We also accept any contributions to work on these issues!
Auto1111 SDK is continuously evolving, and we appreciate community involvement. We welcome all forms of contributions - bug reports, feature requests, and code contributions.
Report bugs and request features by opening an issue on GitHub. Contribute to the project by forking/cloning the repository and submitting a pull request with your changes.
Licenses for borrowed code can be found in the `Settings -> Licenses` screen, and also in the `html/licenses.html` file.
- Automatic 1111 Stable Diffusion Web UI - https://github.com/AUTOMATIC1111/stable-diffusion-webui
- Stable Diffusion - https://github.com/Stability-AI/stablediffusion, https://github.com/CompVis/taming-transformers
- k-diffusion - https://github.com/crowsonkb/k-diffusion.git
- ESRGAN - https://github.com/xinntao/ESRGAN
- MiDaS - https://github.com/isl-org/MiDaS
- Ideas for optimizations - https://github.com/basujindal/stable-diffusion
- Cross Attention layer optimization - Doggettx - https://github.com/Doggettx/stable-diffusion, original idea for prompt editing.
- Cross Attention layer optimization - InvokeAI, lstein - https://github.com/invoke-ai/InvokeAI (originally http://github.com/lstein/stable-diffusion)
- Sub-quadratic Cross Attention layer optimization - Alex Birch (Birch-san/diffusers#1), Amin Rezaei (https://github.com/AminRezaei0x443/memory-efficient-attention)
- Textual Inversion - Rinon Gal - https://github.com/rinongal/textual_inversion (we're not using his code, but we are using his ideas).
- Idea for SD upscale - https://github.com/jquesnelle/txt2imghd
- Noise generation for outpainting mk2 - https://github.com/parlance-zz/g-diffuser-bot
- CLIP interrogator idea and borrowing some code - https://github.com/pharmapsychotic/clip-interrogator
- Idea for Composable Diffusion - https://github.com/energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch
- xformers - https://github.com/facebookresearch/xformers
- Sampling in float32 precision from a float16 UNet - marunine for the idea, Birch-san for the example Diffusers implementation (https://github.com/Birch-san/diffusers-play/tree/92feee6)