Building a Stable Diffusion-like text-to-image pipeline using DreamShaper 7

November 22, 2023 by Chris

In text-to-image modelling, Stable Diffusion has increased the pace of development when it comes to generative models. However, it does not come without its problems, including slow convergence and difficulty handling high-dimensional data (S., S., n.d.). Some researchers have proposed a finetuned variant of the model instead, named DreamShaper.

In fact, DreamShaper already has 8 versions, of which the last is presumed to be the final one.

Read this article if you wish to know more about which Stable Diffusion problems it solves. Here, we'll focus on using it instead!

More specifically, we're going to build a diffusers pipeline with DreamShaper 7 (or rather, an LCM-LoRA-adapted version of it, which speeds up inference). See the header image for what it's capable of generating!

Required packages

In order to run the code you'll create, you need to install torch, diffusers, transformers and matplotlib. Recent diffusers versions also rely on peft for loading LoRA weights, and accelerate is recommended for faster model loading.
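A single command installs everything at once (a hedged suggestion; exact requirements can differ slightly between diffusers releases):

> pip install torch diffusers transformers accelerate peft matplotlib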

Imports and global settings

Let's create a file named dreamshaperpipeline.py. In it, we start with the imports and define some settings:

import torch
from diffusers import DiffusionPipeline, LCMScheduler
import matplotlib.pyplot as plt

size = 512 # 512x512 pixels
num_inference_steps = 4 # number of diffusion steps
guidance_scale = 0.0 # no guidance

Torch is needed because diffusers depends on it; we'll visualize the images with Matplotlib.

As you can see, in this example you're generating 512 x 512 pixel images (feel free to set the size smaller or larger, but do recognize that this impacts the hardware you'll need to run it successfully!). We use just 4 diffusion steps and set guidance_scale to 0.0, which disables classifier-free guidance; LCM-LoRA models are designed to work with very low or no guidance.
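If you want to experiment, here are some alternative values you could try (the numbers are illustrative, not recommendations from the DreamShaper authors; LCM-LoRA is typically run with 1-8 steps and either no guidance or small guidance values):

size = 768                # a larger canvas; needs more GPU memory
num_inference_steps = 8   # LCM-LoRA typically works well within 1-8 steps
guidance_scale = 1.5      # small guidance values can improve prompt adherence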

Loading the DreamShaper 7 pipeline with LCM-LoRA adapters

The next step involves actually creating the DreamShaper 7 pipeline. We're using HuggingFace's DiffusionPipeline for this purpose.

The DiffusionPipeline is the quickest way to load any pretrained diffusion pipeline from the Hub for inference (HuggingFace, n.d.).

We do this by initializing the DiffusionPipeline from the pretrained Lykon/dreamshaper-7 model. Subsequently, we check if CUDA is available - in other words, if you can run this pipeline on your GPU - and if so, enable it. This will speed up running the pipeline significantly.

Then, we use the LCMScheduler with the pipeline configuration and load the latent-consistency/lcm-lora-sdv1-5 weights. These are LoRA weights, meaning that the model was finetuned using the LoRA technique. However, it was done in a particular way: to enable fast inference. In fact, using these weights reduces the number of diffusion steps needed for a good image from the usual few dozen to just a handful.

Multistep and onestep scheduler (Algorithm 3) introduced alongside latent consistency models in the paper Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference by Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, and Hang Zhao. This scheduler should be able to generate good samples from LatentConsistencyModelPipeline in 1-8 steps (HuggingFace, n.d.).

Finally, we return the pipeline.

Here's the code:

def load_dreamshaper_lora_pipeline():
    """
    Load the DreamShaper 7 model with LCM LoRA adapters for fast inference.
    """

    # Create a DiffusionPipeline using the pretrained DreamShaper 7 model
    pipeline = DiffusionPipeline.from_pretrained("Lykon/dreamshaper-7")

    # Use CUDA if available
    if torch.cuda.is_available():
        pipeline.to("cuda")

    # Use the LCM LoRA adapters for fast inference
    pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
    pipeline.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

    return pipeline
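As an optional extra, and assuming a reasonably recent diffusers release, you can fuse the loaded LoRA weights into the base model so the adapter adds no overhead during inference:

# Optional: merge the LoRA weights into the base model's layers,
# removing the adapter overhead at inference time
pipeline.fuse_lora()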

Asking for the prompt

Then, we ask the user for the prompt - in other words, what they want to visualize:

def ask_for_prompt():
    """
    Ask the user for a prompt.
    """
    prompt = input("What do you want to visualize?\n")

    return prompt

Generating the images

This is followed by a function that generates the images. It takes the pipeline, the prompt and some extra settings:

def generate_images(pipeline, prompt, num_inference_steps, guidance_scale, size):
    """
    Generate images using the pipeline.
    """
    results = pipeline(
        prompt=prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        height=size,
        width=size
    )

    return results
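If you'd like several candidates to pick from, the underlying pipeline also accepts a num_images_per_prompt argument. Here's a small variation on the call above (four images require correspondingly more GPU memory):

results = pipeline(
    prompt=prompt,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
    height=size,
    width=size,
    num_images_per_prompt=4  # generate four candidate images in one call
)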

Showing the final image

Subsequently, we show the final image - this part is just visualizing the image with Matplotlib.

def show_image(results):
    """
    Show an image.
    """
    # Create a figure without any border or axis
    fig, ax = plt.subplots(figsize=(30, 30))
    ax.imshow(results.images[0])
    ax.axis('off')  # Turn off axis labels and ticks

    # Show the image without borders
    plt.subplots_adjust(left=0, right=1, top=1, bottom=0)  # Remove extra white space
    plt.show()
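If you'd rather keep the result than only display it, saving is a one-liner: by default, the pipeline returns PIL images, which come with a save method (the filename is just an example):

# The pipeline returns PIL images by default, so we can save directly
results.images[0].save("generated.png")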

Combining everything together

Finally, we tie everything together in a main def:

def main():
    """
    Main function.
    """
    # Load the pipeline
    pipeline = load_dreamshaper_lora_pipeline()

    # Ask for a prompt
    prompt = ask_for_prompt()

    # Generate images
    results = generate_images(pipeline, prompt, num_inference_steps, guidance_scale, size)

    # Show the image
    show_image(results)


if __name__ == "__main__":
    main()
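One thing this script doesn't do is control randomness: every run produces a different image for the same prompt. If you want reproducible outputs, you can pass a seeded torch.Generator to the pipeline call, as sketched below (the seed is an arbitrary example, and the generator's device should match the pipeline's):

# Create a seeded generator on the same device as the pipeline
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device=device).manual_seed(42)

# Passing the generator makes runs with the same seed reproducible
results = pipeline(
    prompt=prompt,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
    height=size,
    width=size,
    generator=generator
)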

Generated examples

Let's now run the script.

> python dreamshaperpipeline.py

...and now take a look at what it produces for some basic prompts.

An orange at a beach:

The skyline of New York City during sunset, dreamscape:

I also let ChatGPT generate a more complex prompt.

Create an image that combines the concept of 'bioluminescent jungle' with 'steampunk cityscape.' Imagine a lush, glowing forest filled with exotic flora and fauna juxtaposed against a sprawling metropolis of intricate, Victorian-inspired machinery. The blending of natural wonder and mechanical innovation should be visually stunning and captivating.

This is what it looks like:

Here's another one:

Imagine a world where gravity is reversed, and people live on the undersides of floating islands in the sky. Create an image that showcases the everyday life of the island-dwellers, from their upside-down houses and gardens to their unique modes of transportation. Highlight the challenges and innovations of living in a world with 'reverse gravity.'

Pretty awesome!

References

Appendix: Full code

Here's the full code if you're interested:

import torch
from diffusers import DiffusionPipeline, LCMScheduler
import matplotlib.pyplot as plt

size = 512 # 512x512 pixels
num_inference_steps = 4 # number of diffusion steps
guidance_scale = 0.0 # no guidance


def load_dreamshaper_lora_pipeline():
    """
    Load the DreamShaper 7 model with LCM LoRA adapters for fast inference.
    """

    # Create a DiffusionPipeline using the pretrained DreamShaper 7 model
    pipeline = DiffusionPipeline.from_pretrained("Lykon/dreamshaper-7")

    # Use CUDA if available
    if torch.cuda.is_available():
        pipeline.to("cuda")

    # Use the LCM LoRA adapters for fast inference
    pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
    pipeline.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

    return pipeline


def ask_for_prompt():
    """
    Ask the user for a prompt.
    """
    prompt = input("What do you want to visualize?\n")

    return prompt


def generate_images(pipeline, prompt, num_inference_steps, guidance_scale, size):
    """
    Generate images using the pipeline.
    """
    results = pipeline(
        prompt=prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        height=size,
        width=size
    )

    return results


def show_image(results):
    """
    Show an image.
    """
    # Create a figure without any border or axis
    fig, ax = plt.subplots(figsize=(30, 30))
    ax.imshow(results.images[0])
    ax.axis('off')  # Turn off axis labels and ticks

    # Show the image without borders
    plt.subplots_adjust(left=0, right=1, top=1, bottom=0)  # Remove extra white space
    plt.show()


def main():
    """
    Main function.
    """
    # Load the pipeline
    pipeline = load_dreamshaper_lora_pipeline()

    # Ask for a prompt
    prompt = ask_for_prompt()

    # Generate images
    results = generate_images(pipeline, prompt, num_inference_steps, guidance_scale, size)

    # Show the image
    show_image(results)


if __name__ == "__main__":
    main()
