Latent Consistency LoRA with Stable Diffusion XL 1.0
Author: Tim Dolan
Advanced Image Generation with Stable Diffusion and Gradio
Creating and modifying images using AI models has become increasingly accessible. In this tutorial, we'll dive into using Stable Diffusion, a powerful text-to-image model, alongside Gradio, an easy-to-use library for building interactive web interfaces for machine learning models. This tutorial covers not just basic image generation but also advanced techniques like image inpainting and adapter-based image generation using LoRA weights.
The paper that introduced the technique: LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
This space is available as a live demo here.
The repository is available on Hugging Face or on GitHub.

https://macadeliccc-lcm-papercut-demo.hf.space
Setting Up Your Environment
If you have already done these steps or are familiar with the process, you can skip this section.
The first step is to ensure that you have the appropriate NVIDIA GPU drivers installed on your local machine.
Before we can leverage GPU acceleration for the Stable Diffusion XL model, we need to ensure that we have the right tools installed. Conda is an open-source package and environment management system that runs on Windows, macOS, and Linux. It allows you to install, run, and update packages and their dependencies.
Installing Conda
If you don't already have Conda installed, you'll need to install it before proceeding. You can download and install Conda from the official Conda website. Choose the installer that's right for your system and follow the on-screen instructions.
Once you have Conda installed, open a terminal window and proceed with setting up your environment.
Setting up Conda Environment
Creating a dedicated Conda environment for your project is considered a best practice as it helps manage dependencies and avoid conflicts.
conda create --name torch python=3.10
conda activate torch
Make sure that you have followed the steps located here to install PyTorch with GPU support for your platform. If you have not done this, you will not be able to use the GPU.
I am on Linux, for example, so this would be my install command using Conda:
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch-nightly -c nvidia
Build the correct command for your machine on the PyTorch website and make sure it's installed properly by testing whether torch has access to the GPU.
You do not need to install CUDA or cuDNN locally; PyTorch ships with prebuilt binaries for each version. These steps alone are enough for torch to see your GPU.
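A quick sanity check from a Python shell confirms this:
import torch

print(torch.cuda.is_available())       # should print True on a working install
print(torch.cuda.get_device_name(0))   # should print your GPU's name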
First, ensure you have all necessary dependencies installed:
pip install --upgrade gradio diffusers pillow transformers peft
This command installs Gradio for the interface, the diffusers library (which houses the Stable Diffusion pipelines), Pillow for image processing, and transformers and peft, which diffusers needs to run the models and load LoRA weights.
Basic Image Generation
We start by setting up a function to generate images based on a text prompt using Stable Diffusion:
import torch
from diffusers import LCMScheduler, AutoPipelineForText2Image

def generate_image(prompt, num_inference_steps, guidance_scale):
    model_id = "stabilityai/stable-diffusion-xl-base-1.0"
    adapter_id = "latent-consistency/lcm-lora-sdxl"

    pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16")
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.to("cuda")

    # Load and fuse the LCM LoRA
    pipe.load_lora_weights(adapter_id)
    pipe.fuse_lora()

    # Generate the image
    image = pipe(prompt=prompt, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale).images[0]
    return image
This function loads Stable Diffusion XL with the LCM-LoRA weights fused in. The LCM-LoRA adapter turns the model into a latent consistency model, so it can produce good images in as few as 4 inference steps with a low guidance scale, rather than the dozens of steps the base model normally needs.
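For example, a typical call uses just a handful of steps and a guidance scale near 1.0, matching the slider defaults in the interface later in this tutorial:
image = generate_image(
    "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k",
    num_inference_steps=4,
    guidance_scale=1.0,
)
image.save("cyborg.png")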
Image Inpainting
Inpainting allows you to modify specific parts of an image, guided by a mask: white regions of the mask are repainted according to the prompt, while black regions are preserved. Here's how you can do it:
import io

import torch
from diffusers import AutoPipelineForInpainting, LCMScheduler
from PIL import Image

def inpaint_image(prompt, init_image, mask_image, num_inference_steps, guidance_scale):
    pipe = AutoPipelineForInpainting.from_pretrained(
        "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
    pipe.fuse_lora()

    if init_image is None or mask_image is None:
        return None  # both an initial image and a mask are required

    # The SDXL inpainting pipeline expects 1024x1024 inputs
    init_image = Image.open(io.BytesIO(init_image)).resize((1024, 1024))
    mask_image = Image.open(io.BytesIO(mask_image)).resize((1024, 1024))

    generator = torch.manual_seed(42)  # fixed seed for reproducible results
    image = pipe(
        prompt=prompt,
        image=init_image,
        mask_image=mask_image,
        generator=generator,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    ).images[0]
    return image
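Outside of Gradio, you can call the function directly with the raw bytes of an image and its mask; the file names below are placeholders:
# Hypothetical input files: an image and a black-and-white mask of the same size
with open("castle.png", "rb") as f, open("castle_mask.png", "rb") as f_mask:
    result = inpaint_image(
        "a castle on top of a mountain, highly detailed, 8k",
        f.read(),
        f_mask.read(),
        num_inference_steps=4,
        guidance_scale=1.0,
    )
result.save("castle_inpainted.png")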
Advanced Image Generation with Adapters
To further customize the model's output, we can stack a style LoRA (here, TheLastBen's Papercut adapter) on top of the LCM-LoRA:
import torch
from diffusers import DiffusionPipeline, LCMScheduler

def generate_image_with_adapter(prompt, num_inference_steps, guidance_scale):
    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        variant="fp16",
        torch_dtype=torch.float16
    ).to("cuda")
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

    # Load both LoRAs and blend them: the LCM adapter at full strength,
    # the Papercut style adapter at 0.8
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl", adapter_name="lcm")
    pipe.load_lora_weights("TheLastBen/Papercut_SDXL", weight_name="papercut.safetensors", adapter_name="papercut")
    pipe.set_adapters(["lcm", "papercut"], adapter_weights=[1.0, 0.8])
    pipe.fuse_lora()

    generator = torch.manual_seed(0)  # fixed seed for reproducible results
    image = pipe(prompt=prompt, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale, generator=generator).images[0]
    pipe.unfuse_lora()  # restore the base weights for subsequent calls
    return image
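The Papercut style is invoked by including "papercut" in the prompt, as the interface's placeholder text below suggests:
image = generate_image_with_adapter(
    "papercut, a cute fox",
    num_inference_steps=4,
    guidance_scale=1.0,
)
image.save("papercut_fox.png")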
Image Modification
We can also modify the brightness and contrast of generated images:
import io

from PIL import Image, ImageEnhance

def modify_image(image, brightness, contrast):
    # Gradio may pass the image as raw bytes or as a numpy array,
    # depending on the component configuration
    if isinstance(image, bytes):
        image = Image.open(io.BytesIO(image))
    else:
        image = Image.fromarray(image)

    # A factor of 1.0 leaves the image unchanged
    enhancer = ImageEnhance.Brightness(image)
    image = enhancer.enhance(brightness)
    enhancer = ImageEnhance.Contrast(image)
    image = enhancer.enhance(contrast)
    return image
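As a quick standalone sketch (the file name is a placeholder), factors above 1.0 brighten the image and increase contrast, while factors below 1.0 do the opposite:
with open("cyborg.png", "rb") as f:
    image_bytes = f.read()
edited = modify_image(image_bytes, brightness=1.2, contrast=1.1)
edited.save("cyborg_edited.png")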
Building the Gradio Interface
Gradio makes it easy to build interactive web interfaces. Here's how you can set up an interface for your models:
import gradio as gr

with gr.Blocks() as demo:
    with gr.Row():
        image_output = gr.Image(label="Generated Image")
    with gr.Row():
        with gr.Accordion(label="Configuration Options"):
            prompt_input = gr.Textbox(label="Prompt", placeholder="Self-portrait oil painting, a beautiful cyborg with golden hair, 8k")
            steps_input = gr.Slider(minimum=1, maximum=10, label="Inference Steps", value=4)
            guidance_input = gr.Slider(minimum=0, maximum=2, label="Guidance Scale", value=1)
            generate_button = gr.Button("Generate Image")
    with gr.Row():
        with gr.Accordion(label="Papercut Image Generation"):
            adapter_prompt_input = gr.Textbox(label="Prompt", placeholder="papercut, a cute fox")
            adapter_steps_input = gr.Slider(minimum=1, maximum=10, label="Inference Steps", value=4)
            adapter_guidance_input = gr.Slider(minimum=0, maximum=2, label="Guidance Scale", value=1)
            adapter_generate_button = gr.Button("Generate Image with Adapter")
    with gr.Row():
        with gr.Accordion(label="Inpainting"):
            inpaint_prompt_input = gr.Textbox(label="Prompt for Inpainting", placeholder="a castle on top of a mountain, highly detailed, 8k")
            init_image_input = gr.File(label="Initial Image")
            mask_image_input = gr.File(label="Mask Image")
            inpaint_steps_input = gr.Slider(minimum=1, maximum=10, label="Inference Steps", value=4)
            inpaint_guidance_input = gr.Slider(minimum=0, maximum=2, label="Guidance Scale", value=1)
            inpaint_button = gr.Button("Inpaint Image")
    with gr.Row():
        with gr.Accordion(label="Image Modification (Experimental)"):
            brightness_slider = gr.Slider(minimum=0.5, maximum=1.5, step=0.1, value=1, label="Brightness")
            contrast_slider = gr.Slider(minimum=0.5, maximum=1.5, step=0.1, value=1, label="Contrast")
            modify_button = gr.Button("Modify Image")

    generate_button.click(
        generate_image,
        inputs=[prompt_input, steps_input, guidance_input],
        outputs=image_output
    )
    modify_button.click(
        modify_image,
        inputs=[image_output, brightness_slider, contrast_slider],
        outputs=image_output
    )
    inpaint_button.click(
        inpaint_image,
        inputs=[inpaint_prompt_input, init_image_input, mask_image_input, inpaint_steps_input, inpaint_guidance_input],
        outputs=image_output
    )
    adapter_generate_button.click(
        generate_image_with_adapter,
        inputs=[adapter_prompt_input, adapter_steps_input, adapter_guidance_input],
        outputs=image_output
    )

demo.launch()
This Gradio app provides a user-friendly interface for generating and modifying images using the functions defined above.
Conclusion
In this tutorial, we explored advanced image generation techniques using Stable Diffusion and LoRA weights, coupled with Gradio for an interactive experience. This setup demonstrates the potential of AI in creative fields, providing tools for artists, designers, and hobbyists alike.
For the full code and more details, you can visit the GitHub repository linked in the references.
References
- Stable Diffusion: Stability AI
- Gradio: Gradio Docs
- LoRA Weights: Latent Consistency