PILOT: Coherent and Multi-modality Image Inpainting via Latent Space Optimization

Official implementation of PILOT.

Lingzhi Pan, Tong Zhang, Bingyuan Chen, Qi Zhou, Wei Ke, Sabine Susstrunk, Mathieu Salzmann


Method Overview

[Method overview figures]

Getting Started

We recommend creating a Python virtual environment, for example with conda. Next, install the PyTorch build that matches your CUDA version, and install the required packages listed in requirements.txt.

git clone https://github.com/Lingzhi-Pan/PILOT.git
cd PILOT
conda create -n pilot python=3.9
conda activate pilot
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
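
To sanity-check the installation, you can optionally verify that PyTorch sees your CUDA device (nothing here is PILOT-specific):

import torch

# Confirm the installed PyTorch build and that CUDA is visible.
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())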

Download the stable-diffusion-v1-5 model from https://huggingface.co/runwayml/stable-diffusion-v1-5 and save it to a local path.
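
If you prefer to fetch the weights from a script, here is a minimal sketch using huggingface_hub (the local directory below is illustrative):

from huggingface_hub import snapshot_download

# Download the full stable-diffusion-v1-5 repository to a local directory.
local_path = snapshot_download(
    repo_id="runwayml/stable-diffusion-v1-5",
    local_dir="./checkpoints/stable-diffusion-v1-5",  # illustrative path
)
print(local_path)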

Run Examples

We provide three types of conditions to guide the inpainting process: text, spatial controls, and reference images. Each type of control corresponds to a configuration file in the configs/ directory.

Text-guided

Modify the model_path parameter in the config file to point to the directory where you saved the SD model, then run:

python run_example.py --config_file configs/t2i_step50.yaml
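
If you would rather set model_path from a script than edit the YAML by hand, here is a minimal sketch with PyYAML (this assumes model_path is a top-level key; the path below is a placeholder):

import yaml

# Point the text-guided config at the locally saved SD weights.
with open("configs/t2i_step50.yaml") as f:
    cfg = yaml.safe_load(f)
cfg["model_path"] = "./checkpoints/stable-diffusion-v1-5"  # placeholder path
with open("configs/t2i_step50.yaml", "w") as f:
    yaml.safe_dump(cfg, f)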


Text + Spatial Controls

To introduce spatial controls, we offer options for both ControlNet and T2I-Adapter, but we recommend ControlNet. First, download a ControlNet checkpoint published by Lvmin Zhang, such as the ControlNet conditioned on scribble images: https://huggingface.co/lllyasviel/sd-controlnet-scribble. Then run:

python run_example.py --config_file configs/controlnet_step30.yaml

You can also download other ControlNet models published by Lvmin Zhang to enable inpainting with other conditions, such as canny maps, segmentation maps, and normal maps.
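
Control images for these conditions can be produced with standard tools; for example, here is a minimal sketch deriving a canny map with OpenCV (paths and thresholds are illustrative):

import cv2

# Derive a canny edge map from an input image to use as a spatial control.
img = cv2.imread("examples/input.png")    # illustrative input path
edges = cv2.Canny(img, 100, 200)          # common default thresholds
cv2.imwrite("examples/canny.png", edges)  # illustrative output path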

Text + Reference Image

Download the IP-Adapter checkpoint from https://huggingface.co/h94/IP-Adapter, then run:

python run_example.py --config_file configs/ipa_step50.yaml
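
As with the base model, the IP-Adapter weights can be fetched from a script; here is a sketch with huggingface_hub (check the model page for the exact filename, which may change):

from huggingface_hub import hf_hub_download

# Fetch an SD 1.5 IP-Adapter checkpoint; verify the filename on the model page.
ckpt = hf_hub_download(
    repo_id="h94/IP-Adapter",
    filename="models/ip-adapter_sd15.bin",
)
print(ckpt)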


Text + Spatial Controls + Reference Image

You can also use ControlNet and IP-Adapter together to achieve multi-condition control:

python run_example.py --config_file configs/ipa_controlnet_step30.yaml


Personalized Image Inpainting

You can also integrate LoRA into the base model or replace the base model with another personalized text-to-image (T2I) model to achieve personalized image inpainting. For example, replacing the base model with a T2I model fine-tuned via DreamBooth on several photos of a cute dog can generate that dog inside the masked region while effectively preserving its identity.
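
PILOT itself is driven through run_example.py and the YAML configs, but as a general illustration of how LoRA weights attach to a diffusers-style base model (an assumption about the underlying stack; both paths are placeholders):

import torch
from diffusers import StableDiffusionPipeline

# Load a personalized (e.g., DreamBooth fine-tuned) base model, then merge
# LoRA weights into it.
pipe = StableDiffusionPipeline.from_pretrained(
    "./checkpoints/dreambooth-dog", torch_dtype=torch.float16
)  # placeholder model directory
pipe.load_lora_weights("./checkpoints/lora-dog")  # placeholder LoRA directory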

See our Paper for more information!

BibTeX

If you find this work helpful, please consider citing:

@article{pan2024coherent,
  title={Coherent and Multi-modality Image Inpainting via Latent Space Optimization},
  author={Pan, Lingzhi and Zhang, Tong and Chen, Bingyuan and Zhou, Qi and Ke, Wei and S{\"u}sstrunk, Sabine and Salzmann, Mathieu},
  journal={arXiv preprint arXiv:2407.08019},
  year={2024}
}
