Story Flicks

Overview:

Story Flicks creates a fully functional short video clip with pictures, audio, and subtitles.

How it works, step by step:

User Input (React): User enters a prompt in the React app.
API Request (React -> FastAPI): React sends the prompt to a FastAPI endpoint.
Story Generation (FastAPI + LLM): FastAPI calls the LLM to generate the story.
TTS (FastAPI + TTS Engine): FastAPI uses a TTS engine to create the audio.
Image Generation (FastAPI + Image Model): FastAPI uses an image model to generate images for each part of the story.
Subtitle Generation (FastAPI + STT): FastAPI uses STT to create the subtitle file.
Video Assembly (FastAPI + MoviePy/FFmpeg): FastAPI combines everything into a video.
Video Delivery (FastAPI -> React): FastAPI sends the video back to the React app for the user to view/download.
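The steps above can be sketched as a simple chain of stages. This is only a rough illustration, not the actual Story Flicks code: the names VideoJob, run_pipeline, and every stage function are hypothetical, and each stage is a stub that returns placeholder file names instead of calling a real LLM, TTS engine, image model, STT engine, or MoviePy/FFmpeg.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VideoJob:
    """Carries the artifacts produced by each stage (hypothetical structure)."""
    prompt: str
    story_parts: List[str] = field(default_factory=list)
    audio_files: List[str] = field(default_factory=list)
    image_files: List[str] = field(default_factory=list)
    subtitle_file: str = ""
    video_file: str = ""


def generate_story(job: VideoJob, parts: int) -> None:
    # Stub for the LLM call: one placeholder sentence per story part.
    job.story_parts = [
        f"Part {i + 1} of a story about {job.prompt}" for i in range(parts)
    ]


def synthesize_audio(job: VideoJob) -> None:
    # Stub for the TTS engine: one audio file name per story part.
    job.audio_files = [f"part_{i + 1}.mp3" for i in range(len(job.story_parts))]


def generate_images(job: VideoJob) -> None:
    # Stub for the image model: one image file name per story part.
    job.image_files = [f"part_{i + 1}.png" for i in range(len(job.story_parts))]


def generate_subtitles(job: VideoJob) -> None:
    # Stub for the STT step that would produce a subtitle file from the audio.
    job.subtitle_file = "story.srt"


def assemble_video(job: VideoJob) -> None:
    # Stub for the MoviePy/FFmpeg step that combines images, audio, subtitles.
    job.video_file = "story.mp4"


def run_pipeline(prompt: str, parts: int = 3) -> VideoJob:
    """Run the stages in the same order as the steps listed above."""
    job = VideoJob(prompt=prompt)
    generate_story(job, parts)
    synthesize_audio(job)
    generate_images(job)
    generate_subtitles(job)
    assemble_video(job)
    return job


if __name__ == "__main__":
    finished = run_pipeline("a sleepy owl who cannot find her nest")
    print(finished.video_file)
```

Passing one job object through every stage keeps the order of the steps easy to follow; in a real backend each stub would be replaced by a call to the corresponding model or library.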

Luckily, users do not need to know any of this. Simply complete and submit a web form, and within 2-4 minutes the video is ready to watch.

Intended Use:

The intended use case is video clips of up to 1-2 minutes.

Intended Audience:

The tool was initially created for short bedtime stories, but it can be adapted to other content to some degree.

Customization:

Some minor changes can be made to the application without any re-coding. The system is built with the Pydantic library, and the Python (.py) configuration files include code that makes certain settings required.
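As a rough illustration of how Pydantic can make certain settings required, here is a minimal sketch. The class name StorySettings, the function load_settings, and every field name and default are assumptions made for this example and are not taken from the Story Flicks configuration files; only the 1-10 image range comes from the description in this document.

```python
from pydantic import BaseModel, Field, ValidationError


class StorySettings(BaseModel):
    """Hypothetical settings model; field names are illustrative only."""

    # Required: there is no default, so validation fails if it is missing.
    prompt: str
    # Optional settings with assumed defaults.
    language: str = "English"
    voice: str = "female"
    # Constrained to the 1-10 image range mentioned in this document.
    image_count: int = Field(default=3, ge=1, le=10)


def load_settings(raw: dict) -> StorySettings:
    """Validate a plain dict and return a typed settings object."""
    return StorySettings(**raw)


if __name__ == "__main__":
    print(load_settings({"prompt": "A sleepy owl looks for her nest"}))
    try:
        # Missing the required prompt and out of the allowed image range.
        load_settings({"image_count": 20})
    except ValidationError as err:
        print(err)
```

With a model like this, changing a default value is a one-line edit, while a missing or out-of-range setting is rejected before any video generation starts.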

Minor themes and styles can be changed.

When a user's prompt includes specifics such as gender, age, and other criteria, the AI will try to produce content based on the age of the main characters or the story topic when the characters are human.

If small animals, rodents, bugs, insects, etc. are mentioned, it will usually create cartoon or children's storybook themed content.

Video Creation Example:

Note: Right-click inside the video and choose “Open in a new tab” to view a larger video.

Capabilities:

With good user prompts, the tool can create short, fun, or informative videos. It is especially useful for prompts that ask for a list of five items or that follow a beginning-middle-ending structure.

Videos can currently be created from 1-10 input images, in several different languages, with about 10 voices to choose from, including child, male, and female voices.