- Make it Pop by Khulan
Make it Pop #04 - A guide to directing AI images
A 5-step framework for AI image directing, turning 30, and a competition deadline extended

I’ve been creating with AI as a tool for a while now, for campaigns, experiments, brand visuals, and honestly just for fun. And the more I use it, the more I realize something simple: Prompt engineering < Art direction.
Whether you are shooting with a physical camera or generating pixels, the principles are the same. You have to make decisions about the world, the light, and the action before you hit the shutter (or the return key).
And like everything else in creativity, good direction comes from a few habits you can practice. Here is the 5-step framework:
Step 1: Build the world (before the subject)
When we start prompting, our instinct is to describe the thing we want. But art directors don't start with the subject; they start with the world the subject lives in.
Before you cast your character, ask yourself: What is the temperature of this world? Is it warm and nostalgic? Is it sterile and cold? This gives the model a world to honor.
Basic prompt: A young woman sitting in a car at night

Art directed prompt: Cinematic film still, 1980s, interior of a vintage Chevy Nova at night, heavy rain on the windows, cold blue atmosphere, fogged glass, feeling of isolation and suspense. A young woman sitting in a car at night

Other prompt options:
Nostalgic & warm: Soft light, golden-hour palette, hazy.
Clinical & sterile: Hard fluorescent light, desaturated green/blue tint, uneasy.
Tense & paranoid: High-contrast shadows (chiaroscuro), tight, voyeuristic angles.
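If you build prompts in code, here is a minimal Python sketch of the world-first habit. The preset keys and the `build_world_prompt` helper are illustrative assumptions, not part of any image tool's API. The descriptors themselves come straight from the options above.

```python
# Hypothetical mood presets; the descriptor text is taken from the options above.
WORLD_PRESETS = {
    "nostalgic_warm": "Soft light, golden-hour palette, hazy",
    "clinical_sterile": "Hard fluorescent light, desaturated green/blue tint, uneasy",
    "tense_paranoid": "High-contrast shadows (chiaroscuro), tight, voyeuristic angles",
}


def build_world_prompt(subject: str, mood: str) -> str:
    """Describe the world first, then place the subject inside it."""
    world = WORLD_PRESETS[mood]  # raises KeyError for an unknown mood
    return f"{world}. {subject}"


if __name__ == "__main__":
    print(build_world_prompt("A young woman sitting in a car at night", "clinical_sterile"))
```

Swapping the mood key while keeping the subject fixed is a quick way to see how much the world alone changes the image.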
Step 2: Place the camera
The biggest giveaway of an "undirected" AI image is that the camera is usually just... there. It’s eye-level, medium distance, flat. To make an image feel intentional, you need to take control of the perspective.
Basic prompt: A young woman in a video arcade, 80s style, wide shot.

Art directed prompt: Cinematic film still, 1980s, dramatic low-angle shot looking up at a young woman at an arcade cabinet, her face lit by the game's glow, shot on ARRI Alexa with a vintage 35mm anamorphic lens, soft bokeh from neon lights, shallow depth of field, natural lens distortion, shot on Kodak Vision3 500T film stock.

Other prompt options:
Top-down vertical: A "God's Eye" view, feels objective or graphic.
Extreme close-up: Tightly frames just the eyes or mouth for high tension.
Dutch angle: The camera is tilted, making the scene feel disorienting and unstable.
Step 3: Direct the action
A static subject can feel uncanny because life is never truly still. If you just ask for a person, the AI will often give you a mannequin staring blankly at the lens.
Great art direction fixes this by directing movement. It doesn't have to be an explosion; it can be a micro-movement. It’s about the "breath" between actions.
Basic prompt: A young woman in a dark room holding a walkie-talkie, looking scared, dramatic pose.

Art directed prompt: Cinematic film still, 1984, close-up on a young woman's face in a dark room, she is breathing heavily, mouth slightly open, eyes wide looking scared, listening intently to a walkie-talkie, subtle motion, shot on Kodak Vision3 500T.

Other prompt options:
Eyes darting off-screen: Creates instant narrative suspense.
Heavy breathing / visible breath: Shows fear, cold, or exertion without a "pose".
Hair/fabric whipping in wind: Adds dynamism and energy to a still photo.
Step 4: Paint with light
Light is your emotional engine. It is the single most important tool for setting the mood.
If you don't specify the light, the AI will usually default to a generic "studio softbox" look or flat lighting. You want to be specific about where the light is coming from and how it hits the subject.
Basic prompt: A young woman in the woods at night with a flashlight, cinematic lighting.

Art directed prompt: Cinematic film still, 1980s, a young woman in dark woods at night, lit by the harsh beam of her flashlight cutting through the mist, deep shadows, high contrast, practical lighting, shot on Kodak Vision3 500T.

Other prompt options:
Practical lighting: Light only comes from sources in the scene (e.g., a lamp, flashlight, TV).
Backlight / rim light: Light from behind the subject, creating a silhouette or "halo" effect.
Chiaroscuro: Extreme high-contrast with deep, dark shadows, creating mystery.
Step 5: The "texture" (embrace imperfection)
Finally, we need to break the digital seal. AI models are trained to optimize for perfection: smooth skin, sharp focus, perfect symmetry. But to the human eye, perfection often feels fake. Realism lives in the errors.
We want to add "glitch" terms to make the image feel tactile and lived-in.
Basic prompt: A young woman in a phone booth at night.

Art directed prompt: Cinematic film still, 1980s, a young woman in a glass phone booth, rain streaking down the glass, heavy film grain, high ISO noise, slight motion blur, chromatic aberration, unpolished, raw aesthetic, shot on Kodak Vision3 500T.

Other prompt options:
Film grain / high ISO noise: Gritty texture that breaks up "digital skin".
Motion blur: Natural streaking of movement that feels dynamic.
Halation: A soft, reddish "bloom" around bright highlights, classic to analog film.
Putting it all together
World + Camera + Action + Light + Texture
This is the simplest structure I’ve found to reliably produce images that feel designed, not random.
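For anyone who scripts their prompts, the five layers can be sketched as a small Python structure. This is only one way to organize it: the `ShotDirection` class and `compose_prompt` function are hypothetical names, and joining layers with commas is an assumption about what your image model accepts, not a rule.

```python
from dataclasses import dataclass, fields


@dataclass
class ShotDirection:
    """One field per layer of the framework; all names here are illustrative."""
    world: str = ""
    camera: str = ""
    action: str = ""
    light: str = ""
    texture: str = ""


def compose_prompt(direction: ShotDirection) -> str:
    """Join the non-empty layers in framework order, separated by commas."""
    layers = [getattr(direction, f.name).strip() for f in fields(direction)]
    return ", ".join(layer for layer in layers if layer)


if __name__ == "__main__":
    shot = ShotDirection(
        world="Cinematic film still, 1980s, dark woods at night, mist",
        camera="dramatic low-angle shot",
        action="a young woman breathing heavily, eyes darting off-screen",
        light="lit by the harsh beam of her flashlight, deep shadows, high contrast",
        texture="heavy film grain, slight motion blur",
    )
    print(compose_prompt(shot))
```

Keeping each layer in its own field makes it easy to spot the one you skipped, and to change a single layer while holding the other four constant.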
Test these five layers on your next image. You’ll immediately see the difference in cohesion, tone, and storytelling. If you try this workflow, I’d love to see what you create.
I purposely left out “Subject” because that’s a whole other post on its own.
Photos above generated with Nano Banana.
⚡ AI creative news updates you should know
10 - 17 November 2025
Higgsfield Transitions is a new tool that enables seamless animated transitions between any combination of photos and videos using a built-in library of cinematic effects.
Flow by Google now supports a new “Image” tab powered by the Nano Banana and Imagen 4 models; you can edit generated images or add them as “ingredients” to prompt further creations.
World Labs has launched Marble, a multimodal ‘world model’ that turns text, images, videos or layouts into editable 3D worlds you can export and explore.
Kling 2.5 Turbo now supports start & end-frame input, letting you define the first and last shot of a video and have the model generate the motion between them. 2.5 Turbo gives a better cost-to-quality ratio than previous versions.
Higgsfield Angles lets you change the camera perspective of any image: upload a photo and adjust angles via a 3-D cube or sliders to get new viewpoints in seconds. You can rethink visual composition without reshoots or reprompts.
Krea introduces a node-based workflow system that lets you chain image, video, audio and 3D models into reusable pipelines on one canvas. You can build, automate and share full creative workflows with no tool-hopping.
🏆 AI creator competitions worth joining
If you’ve got a video or concept brewing, these competitions are open right now, and they’re giving real prizes + visibility to your creative AI work.
Wonder Film Festival: Chance to win up to $6,000 and be considered as a filmmaker in the next chapter of Wonder’s Anthology series.
AI Film Award by 1 Billion Summit: The winner will be awarded a prize of USD $1 million. Must use Google’s AI models.
On a personal note, I’ve been juggling a full-time job that has felt more than full-time lately, so I haven’t had the time to post on social media.
I’m turning 30 on Saturday, and it’s made me reflect on my 20s: what I want to carry forward, and what I’m ready to leave behind.
One thing keeps coming up: courage. Chutzpah. A bigger gut.
Being on a visa in the US can make my choices feel limited, but when I’m honest with myself… it’s still a choice to be in the US. I could choose a different country, a different path. What I really need is more courage, to trust myself, to back myself, and to move with a bigger sense of boldness in my 30s.
Khulan