Make it Pop #02 - AI ads for software? Here's the step-by-step
A step-by-step guide to AI-generated ads for software, plus AI creative news updates and competition reminders

Hey everyone,
This week, I wanted to solve a problem that’s been bothering me.
We’ve all seen the incredible AI-generated videos for fashion, CPG, and cars. But what about software?
The second you try to show a software UI in a dynamic scene, the text just... melts. It becomes a garbled, unusable mess. This is a huge blocker for those of us in tech who want to use these tools to create marketing content for our apps.
So, I ran an experiment to build a workflow that actually works. And I’m happy to say, it was a success.
This video is a proof of concept (not an official ad) that I created. Below is the step-by-step process for how I made it.
Step 1: Create your "digital set"
First, you need a stage. I didn't want to just animate a flat screenshot; I wanted the UI to live in a real scene. The easiest way to do this is to generate your scene with a "green screen" placeholder.
I used Nano Banana to create this static image of a laptop in a study.

Prompt: "A cozy and well-used desk setup in a New York City apartment. On the wooden desk, a modern silver laptop features a vibrant, uniform green screen. The desk is adorned with an open textbook, notebooks, a vintage desk lamp casting warm light, a vase with flowers, and various stationery. The window is prominently open, revealing a clear view of classic New York city buildings, tall and stately, suggesting a Midtown or Upper East Side location. Cars and street lights are visible on the street below, hinting at evening or a busy urban day. White sheer curtains are partially drawn, framing the cityscape and gently billowing inwards. The room has a warm, lived-in feel with shelves and walls decorated with photos and small plants."
Step 2: Place your UI screenshot/image on the green screen
I took a clean screenshot of the Gemini App UI and used Nano Banana to place it onto the green screen.
This gives us our "base image" to create different shots.

Prompt: “Composite the provided screenshot onto the laptop's screen. Crucially, stretch and distort the asset as necessary to perfectly fill the entire display area of the laptop's screen, leaving no gaps or black borders. Ensure precise perspective matching, realistic lighting integration, and accurate reflections, so the UI appears natively and completely displayed on the laptop.”
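Side note: if you ever need the UI to stay pixel-perfect at this stage, you can also do the composite deterministically instead of asking a model. Here's a rough chroma-key sketch with OpenCV; the green thresholds are assumptions you'd tune per image, and a real version would sort the detected corners before warping:

import cv2
import numpy as np

scene = cv2.imread("set.png")        # laptop with the green screen
ui = cv2.imread("screenshot.png")    # the clean UI screenshot

# Mask the green screen region (HSV thresholds are a starting point)
hsv = cv2.cvtColor(scene, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, np.array([40, 80, 80]), np.array([80, 255, 255]))

# Fit a quadrilateral to the screen area
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
screen = max(contours, key=cv2.contourArea)
quad = cv2.approxPolyDP(screen, 0.02 * cv2.arcLength(screen, True), True)
quad = quad.reshape(-1, 2).astype(np.float32)  # assumes 4 corners, already ordered

# Warp the screenshot into the screen's perspective and paste it in
h, w = ui.shape[:2]
src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
M = cv2.getPerspectiveTransform(src, quad)
warped = cv2.warpPerspective(ui, M, (scene.shape[1], scene.shape[0]))

out = scene.copy()
out[mask > 0] = warped[mask > 0]
cv2.imwrite("base_image.png", out)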
Step 3: Generate your "shots"
This is where it gets cool. Now that you have your base image, you can use it with image-to-image generation to create new angles and shots without losing the UI.

Prompt: “Composite the provided screenshot onto the laptop's screen. Stretch and distort the screenshot as necessary to perfectly fill the entire display area of the laptop's screen, leaving no gaps or black borders. Ensure precise perspective matching, realistic lighting integration, and accurate reflections, so the UI appears natively and completely displayed on the laptop.”
I uploaded the base image from step 2 plus the screenshot again (so Nano Banana knows what’s on the screen), then I added the same prompt but with a different ending:
Left photo: give me an extreme close-up photo of the laptop with the screenshot. Maintain the text as is on the screenshot
Right photo: give me a close-up photo of the laptop with the screenshot. Maintain the text as is on the screenshot

Prompt: “Woman sitting in the chair, looking at her laptop” + reference image
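For anyone scripting this step too, the multi-image trick maps directly onto the SDK: you pass both images plus the prompt in contents. Another sketch, with the same assumed model alias as before:

from google import genai
from google.genai import types

client = genai.Client()

def image_part(path):
    with open(path, "rb") as f:
        return types.Part.from_bytes(data=f.read(), mime_type="image/png")

response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # assumed alias for Nano Banana
    contents=[
        image_part("base_image.png"),  # base image from step 2
        image_part("screenshot.png"),  # the screenshot again
        "Composite the provided screenshot onto the laptop's screen. [...] "
        "give me an extreme close-up photo of the laptop with the screenshot. "
        "Maintain the text as is on the screenshot",
    ],
)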

Step 4: The animation
Here is the most important part. How do you animate this static image without the text on the screen breaking?
The answer is Veo 3.1's "first/last frame" feature, fed with high-quality images.
By uploading high-quality images for both the first frame and the last frame, you "lock" the core elements of the scene. Veo understands that the beginning and end states of the UI should be identical, so it preserves the text fidelity.
All the motion you see (the slow zoom out) is generated by Veo 3.1 between these two keyframes:
Prompt 1: The scene opens tightly focused on the close up of the laptop. The camera slowly begins to pull back in a smooth, fluid dolly motion, uncovering more of the laptop's silver chassis. As it recedes, the lens gently tilts upward and arcs slightly to the right, unveiling a warm wooden desk surface beneath the laptop. Throughout this continuous move, the environment's quiet energy, productive ambiance, and modern tech aesthetics.
Prompt 2: The shot opens tightly focused on a sleek laptop screen displaying the Gemini app interface. The camera smoothly zooms out, slowly pulling back to reveal a well-organized desk. Subtle depth of field shifts maintain focus transitions from objects close to the camera to those further away, ensuring a fluid visual experience that invokes a serene yet productive mood.
(I only used the last 3 seconds of this clip)
Prompt 3: The scene opens with a rear view of a woman seated at a wooden desk near a large window. Camera is still. She gradually turns her head toward the lens, her expression shifting into a soft, warm smile. The mood is calm, inviting, and intimate.
Note: text clarity decreases as the distance between the laptop and the camera increases.
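If you're driving Veo 3.1 through the API rather than a UI, the first/last-frame pairing looks roughly like this. The model id and the last_frame config field are my reading of the docs at the time of writing, so treat both as assumptions:

import time
from google import genai
from google.genai import types

client = genai.Client()

def frame(path):
    with open(path, "rb") as f:
        return types.Image(image_bytes=f.read(), mime_type="image/png")

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",  # assumed model id
    prompt="The scene opens tightly focused on the close up of the laptop. [...]",
    image=frame("close_up.png"),  # first frame
    config=types.GenerateVideosConfig(
        last_frame=frame("wide_shot.png"),  # last frame, which "locks" the UI
    ),
)

# Video generation is async; poll the operation until it finishes
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("shot.mp4")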
Why this is a game-changer
This "first/last frame" method is the key. It’s a simple trick that finally unlocks the ability for tech and SaaS marketers to create high-quality, dynamic video content.
I could have spent a lot more time perfecting this, but I wanted to share the workflow as soon as I knew it worked.
Outside of creating videos, you can also take these pictures and use them on your website, in ads, or in pitch decks… My trick for presentations is to convert the AI-generated video into a GIF (I use Gifski; keep it under 50 MB) and drop the GIF into your slides.
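If you want to script that conversion, here's a small helper; it assumes ffmpeg and gifski are on your PATH, and the fps/width values are just starting points for staying under 50 MB:

import glob
import subprocess
import tempfile

def video_to_gif(video, gif, fps=15, width=960):
    """Extract frames with ffmpeg, then let gifski assemble the GIF."""
    with tempfile.TemporaryDirectory() as tmp:
        # Dump the clip to numbered PNG frames
        subprocess.run(["ffmpeg", "-i", video, f"{tmp}/frame%04d.png"], check=True)
        frames = sorted(glob.glob(f"{tmp}/frame*.png"))
        # gifski does the palette work for a high-quality, small GIF
        subprocess.run(
            ["gifski", "--fps", str(fps), "--width", str(width), "-o", gif, *frames],
            check=True,
        )

video_to_gif("demo.mp4", "demo.gif")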
I hope this guide helps you. Let me know what you create with it!
Btw: the voiceover was done with Hume AI and the music with Suno.
⚡ AI creative news updates you should know
19 - 26 October 2025
Higgsfield Popcorn is a new AI storyboard generator that lets you create cinematic, multi-frame visual sequences with consistent characters, lighting, and mood from one prompt or a set of reference images.
Runway introduced "Workflows," a drag-and-drop nodal interface for chaining AI models into custom pipelines, alongside expanded Model Fine-Tuning for personalized video generation, cutting prototyping time & costs.
Midjourney announced at their office hours that V8 is in development, along with updates on V7.1 and a new style-creator UI; the team said a full interface redesign is coming.
ByteDance’s Seed3D 1.0 is a new 3D generative model that transforms a single 2D image into a full 3D asset (with detailed geometry, realistic textures, and PBR materials) ready for simulation.
LTX‑2 by Lightricks is a full-fledged AI video engine: text or image to 4K video with synced audio and lipsync, up to 10 seconds of continuous video, and 25/50 FPS options for fluid animation, gameplay, or slow-mo effects. This gives creators more control.
HeyGen launched Motion Designer, an AI tool for creating custom motion graphics from text prompts, skipping templates and software like After Effects. It empowers non-motion designers to build engaging animations fast.
🏆 AI creator competitions worth joining
If you’ve got a video or concept brewing, these competitions are open right now, and they’re giving real prizes + visibility to your creative AI work.
Chroma Awards: three competition divisions (Film, Music Videos, and Games), each with unique categories, rules, and prizes.
AI Film Award by 1 Billion Summit: the winner will be awarded a prize of USD $1 million. Entries must use Google’s AI models.
Thanks for reading and supporting me! :)
If this newsletter sparked an idea, share it with a friend who’s building with AI.
Stay curious,
Khulan