Today, let’s introduce a new AI project, StoryDiffusion, which generates sequences of coherent images and videos.
Because the generated images stay consistent from frame to frame, it can be used to create comics or long videos.
Official examples can be viewed at the end of the article via the link provided, so we won’t go into detail here.
Key Features:
StoryDiffusion tells a story by generating sequences of coherent images and videos. The project consists of two main parts:
- Consistent Self-Attention, a mechanism for generating character-consistent images over long sequences. It is plug-and-play and compatible with all image diffusion models based on SD1.5 and SDXL. In the current implementation, users need to provide at least three text prompts for the Consistent Self-Attention module; 5–6 prompts are recommended for better layout arrangement.
- A motion predictor for long-range video generation, which predicts the motion between conditioning images in a compressed image semantic space, enabling larger-scale motion prediction.
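To give a feel for the first idea, here is a toy sketch (my own paraphrase, not the project's actual code): each image in a batch attends not only to its own tokens but also to tokens randomly sampled from the other images in the batch, which pushes the batch toward a consistent character appearance.

```python
import torch

def consistent_self_attention(q, k, v, sample_ratio=0.5):
    """Toy sketch of the idea: augment each image's keys/values with
    tokens sampled from all images in the batch.
    q, k, v: (batch_of_images, num_tokens, dim)"""
    B, N, D = k.shape
    n_sample = max(1, int(N * sample_ratio))
    # Randomly sample token indices (shared across the batch for simplicity).
    idx = torch.randperm(N)[:n_sample]
    # Gather the sampled tokens from every image and share them with all images.
    shared_k = k[:, idx, :].reshape(1, B * n_sample, D).expand(B, -1, -1)
    shared_v = v[:, idx, :].reshape(1, B * n_sample, D).expand(B, -1, -1)
    k_aug = torch.cat([k, shared_k], dim=1)  # (B, N + B*n_sample, D)
    v_aug = torch.cat([v, shared_v], dim=1)
    # Standard scaled dot-product attention over the augmented keys/values.
    attn = torch.softmax(q @ k_aug.transpose(1, 2) / D ** 0.5, dim=-1)
    return attn @ v_aug  # (B, N, D)
```

Because the extra keys/values are just concatenated, this drops into an existing attention layer without retraining, which is what "plug-and-play" means here.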
Installation
Installation is straightforward: just run a few commands in sequence.
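The steps look roughly like the following (the repo URL and demo script name are my assumptions based on the project's GitHub page; double-check them against the official README):

```shell
# Clone the repository (assumed URL)
git clone https://github.com/HVision-NKU/StoryDiffusion.git
cd StoryDiffusion

# Install Python dependencies
pip install -r requirements.txt

# Launch the local Gradio demo (script name may differ in your checkout)
python gradio_app_sdxl_specific_id.py
```

After the demo starts, open the printed local URL in a browser to reach the interface described below.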
Usage
Let’s take a look at the interface.
So, you need to fill in a “Character Description” and a “Comic Description,” and that’s basically it.
Programmers often debate whether to indent code with tabs or spaces, so let’s use that debate as the topic of our comic.
Character Description
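A character description is just a short appearance prompt. Something like this (purely illustrative, not one of the official examples):

```
a programmer, a young man with short black hair, wearing glasses and a plaid shirt
```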
Comic Description
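The comic description is a list of prompts, one per panel, each describing what the character is doing. An illustrative example for our tabs-versus-spaces story (again, my own wording, not the official sample):

```
sitting at a desk, writing code
arguing with a colleague about tabs versus spaces
pointing at the screen, looking frustrated
shaking hands with the colleague, smiling
```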
Then click “Generate” to get the image!
Examples
Let’s look at a few examples. Sticking with the tabs-versus-spaces theme, we can try different comic styles.
Then let’s try a few different themes:
Will AI replace human jobs?
Which is a better pet: cat or dog?
Which operating system is better: Windows or macOS?