
How to Create Interpolation Videos with Stable Diffusion and Deforum

Using frame interpolation to create videos with Stable Diffusion and Deforum

Generative AI

Image generated by the author using Deforum

Introduction

Everywhere you look, you see images generated by algorithms such as Stable Diffusion and Midjourney. Video, however, is a far more challenging prospect.

I work with a media production company, and AI videos have a long way to go before they become widely used within the industry. The quality is not there yet, by a mile.

Challenges

There are several challenges, such as flickering between frames, resolution, and computation, to name a few.

I should also mention the ongoing debate about copyright issues for algorithms trained on content from the internet. It’s clear that the current approach is too liberal. I’ll follow this development with great interest.

Current use-cases

However, media production is all about creating engaging and unique content, and AI can certainly be a component of that already.

One use case is applying a different style to an original video, as in The Crow, where a dancer was first recorded and an algorithm then restyled each frame.

We will see progress on that topic constantly going forward, but in this post, I’ll show you a simple approach to get you started. We will create a short video by transforming images through interpolation using DeForum.

Let’s get started! 🙂


Step 1: Setting up VastAI

To use Deforum, you need a GPU. If you don’t have one on your computer, the easiest and cheapest alternative is vast.ai.

Instance Configuration

First, you need to select which image to use. I’ll go for the pytorch/pytorch image, version "1.13.1-cuda11.6-cudnn8-devel". It’s not the only option that works, but it does the job.

For launch mode, I’ll go for the most straightforward option: a standard Jupyter notebook.

The last setting you need to change is the disk space: increase it, since the models are massive.

Selecting a GPU

Next, it’s time to select a GPU. There are thousands of options, but a standard 3090 will work fine. Make sure that the upload and download speeds are decent.

The price should be around $0.4 per hour, but feel free to pay more if you want a better GPU.


Step 2: Install Deforum

Deforum explains the installation process on their GitHub, but I’ll also write it here to make this tutorial easier to follow.

2.1: Open instance

To open your instance, go to "Instances" in the menu to the left and click the blue "Open" button to the right.

2.2: Open terminal

Once inside, open a terminal by clicking "New" and selecting the terminal option.

Run the following two commands to create a conda environment and initialize conda, then close the terminal for the changes to take effect.

conda create -n dsd python=3.10 -y
conda init

Open a new terminal just like before and activate the conda environment.

conda activate dsd

2.3: Install everything

Install everything by running the following commands:

git clone https://github.com/deforum-art/deforum-stable-diffusion.git
cd deforum-stable-diffusion
python install_requirements.py

# Add conda environment to jupyter
conda install -c anaconda ipykernel
python -m ipykernel install --user --name=dsd

After a few minutes, you can test if everything is working correctly by running the following:

python Deforum_Stable_Diffusion.py

Step 3: Create a video

3.1 Open notebook

When everything is installed and working, close the terminal, and open "Deforum_Stable_Diffusion.ipynb" inside the deforum-stable-diffusion folder.

Change the kernel to dsd and run the first three cells. It can take a little time for the third cell to finish.

3.2 Select a starting image [optional]

If you want to, you can start from an original image. I’m going to use the following image from Pexels.

Photo by Nout Gons: https://www.pexels.com/photo/city-street-photo-378570/

You need to make sure that the image is a reasonable size; otherwise, it won’t fit into GPU memory. I resized this image to 801×512 (Deforum will crop the sides to 768×512).
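The exact crop comes from Stable Diffusion’s requirement that dimensions be multiples of 64, which is why 801×512 becomes 768×512. A minimal sketch of that rounding (the helper name is my own, not part of Deforum):

```python
def round_down_to_64(width, height):
    """Round each dimension down to the nearest multiple of 64,
    the size constraint Stable Diffusion works with."""
    return (width // 64) * 64, (height // 64) * 64

print(round_down_to_64(801, 512))  # (768, 512)
```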

To upload the image, click upload, and place it somewhere reasonable. For example, I put it under /deforum-stable-diffusion.

3.3 Deforum settings

As soon as you open the Notebook, you’ll see hundreds of settings you can adjust. Unfortunately, it’s not a perfect interface because many settings are only relevant based on previous choices.

Note: The individual settings can be hard to find and don’t come in the order that I write them. It’s best to search for the location using ctrl+F or cmd+F.

First, we change the animation_mode to Interpolation and interpolate_x_frames to some reasonable value. interpolate_x_frames decides how many frames the algorithm should generate between your keyframes.

animation_mode = 'Interpolation'
interpolate_x_frames = 16

Next, we create our prompts by changing animation_prompts. The numbers you see to the left don’t matter for Interpolation.

For other use cases, that number decides when to start using a specific prompt, but for Interpolation, these become our keyframes, and we generate interpolate_x_frames frames between each.

Here are my prompts, but you can write whatever you want.

animation_prompts = {
    0: "a dystopian warzone with zombies, dark clouds in the sky, night",
    1: "a dystopian warzone, dark clouds in the sky",
    2: "a dystopian city, clouds in the sky",
    3: "a city",
    4: "a beautiful city",
    5: "a beautiful and modern city with blue sky",
    6: "a beautiful and modern city with blue sky and green trees"
}
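With these seven keyframes and interpolate_x_frames = 16, you can estimate the length of the final video up front. A back-of-the-envelope calculation, assuming Deforum generates interpolate_x_frames frames between each consecutive pair of keyframes:

```python
def estimate_frames(num_keyframes, interpolate_x_frames):
    # Keyframes plus the interpolated frames between each consecutive pair
    return num_keyframes + (num_keyframes - 1) * interpolate_x_frames

frames = estimate_frames(7, 16)
print(frames)                  # 103
print(round(frames / 15, 1))  # about 6.9 seconds at 15 fps
```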

If you don’t use an original image, continue to 3.4; otherwise, follow the instructions below.

Now, we change W and H to the dimensions of our selected image. In my case, that’s 801×512.

W = 801 #@param
H = 512 #@param

To use the image, we need to change the following settings as well:

use_init = True #@param {type:"boolean"}
strength = 0.6 #@param {type:"number"}
init_image = "city-street.jpg" #@param {type:"string"}

The strength setting decides how much of the original image to keep: a value close to 1 preserves most of it. init_image should, of course, point to the location of your image.
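One way to think about strength: it controls how many denoising steps are applied on top of the init image. This is a rough sketch of the idea, not Deforum’s exact implementation:

```python
def effective_steps(total_steps, strength):
    # Hypothetical illustration: a higher strength keeps more of the
    # init image, so fewer denoising steps run on top of it.
    return round((1 - strength) * total_steps)

print(effective_steps(50, 0.6))  # 20
print(effective_steps(50, 0.9))  # 5
```

At strength = 1.0, no denoising would run at all and the init image would pass through unchanged.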

3.4 Generate images

Run cells 3, 4, and 5. First, the algorithm will generate keyframes based on your prompts in animation_prompts.

When the keyframes are done, it will continue to create the intermediate frames. This can take some time if you have many prompts in animation_prompts and a large interpolate_x_frames.

3.5 Create video

Create a new cell at the bottom of your Notebook, and paste the following code. Then run the cell to create your video.

import os

import cv2

# Collect this run's generated frames, sorted so they come out in order
image_files = sorted(os.listdir(args.outdir))
image_files = ["{}/{}".format(args.outdir, i) for i in image_files if '.png' in i]
image_files = [i for i in image_files if args.timestring in i]

# Write the frames to an AVI file at 15 fps
out = cv2.VideoWriter('video.avi', cv2.VideoWriter_fourcc(*'DIVX'), 15, (args.W, args.H))

for f in image_files:
    out.write(cv2.imread(f))
out.release()

Here’s what my video looks like:

And here’s one more example:


That’s it for my tutorial; happy hacking! If you create something cool, please share.

Thank you for reading! 🙂


Related Articles