Make it Pop #12 - Guide to World Models

From camera-op simulations to live-generated worlds: why World Models are a bigger creative breakthrough than most people think.

In the creative AI space, we’ve spent the last two years or so getting good at prompting frames. We’ve learned how to describe a single moment in time to get a stunning image or a short video clip. But a massive shift is happening right under our feet.

World models like Genie 3 or Runway GWM-1 are a bigger creative breakthrough than most people realize. With these systems, you’re not prompting a single frame or a scene-by-scene sequence to be stitched together in a video editor. You’re operating a camera inside a simulated world.

What are World Models (and how are they different from AI video models)?

Traditional AI generates pixels based on statistical patterns. World models, however, are neural networks that learn an internal simulation of how the world behaves, understanding geometry, lighting, and physics.

  • Traditional AI video models: Predict what the next set of pixels should look like based on visual patterns.

  • World Models: Predict what the next state of the world should be. They don’t just "draw" a car; they "understand" how a car moves, how it reflects light, and how it interacts with objects in a 3D environment.

An analogy to make this concrete: how After Effects and Blender work differently:

  • Traditional AI video models = After Effects: In After Effects, if you want a shadow, you essentially have to fake it, building it up frame by frame. You are manipulating pixels to create an illusion of depth.

  • World Models = Blender: In Blender, you don't "create" a shadow; you simply set up a light source and an object, and the engine understands the physics required to cast that shadow for you.

Current World Models on the market

  • Google Genie 3: A leading example that generates immersive 3D worlds from text or images that users can explore and interact with in real time.

  • Runway GWM-1: A family of models that understand how scenes behave, not just how they appear. It includes GWM-Worlds for interactive environment exploration at 24fps and GWM-Avatars for digital humans with realistic expressions and lip-sync.

  • LingBot-World (Alibaba Wan 2.2): An open-source powerhouse capable of nearly 10 minutes of continuous, stable generation with zero "long-term drift" (where objects usually deform over time). It supports 16 FPS interactivity with sub-1-second latency.

  • Decart AI (Oasis): An experiential, real-time model where every frame is generated autoregressively from your actual keyboard and mouse inputs, essentially creating a "playable" video game. I would also call their Lucy 2 model a live interactive model, even though it isn't officially classified as a "world model".
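The autoregressive loop behind these "playable" models is worth making concrete. Here is a minimal sketch in Python, with a stub standing in for the actual neural network (all names are my own illustration, not Decart's API): each new frame is conditioned on a limited window of previous frames plus the player's latest input.

```python
from collections import deque

def stub_world_model(history, action):
    """Stand-in for the neural world model: given recent frames and the
    player's input, produce the next 'frame' (here just a text label)."""
    return f"frame_{len(history)}_{action}"

def run_session(actions, context_len=4):
    """Generate one frame per input, autoregressively."""
    history = deque(maxlen=context_len)  # limited temporal memory
    frames = []
    for action in actions:               # e.g. keyboard/mouse inputs
        frame = stub_world_model(history, action)
        history.append(frame)            # each frame is fed back in
        frames.append(frame)
    return frames

print(run_session(["W", "W", "A", "click"]))
# → ['frame_0_W', 'frame_1_W', 'frame_2_A', 'frame_3_click']
```

The fixed-size `deque` is also why long-horizon memory is hard for these systems: anything older than the context window is simply gone, which is the "temporal memory" limitation covered later in this issue.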

5 ways to use Genie 3 for creative work today

From my LinkedIn post

1 Video → 1400+ frames

  • Traditional AI image generation takes 30+ seconds to give you 1 static image. If the angle is wrong, you start over.

  • Genie 3 workflow: Upload a reference image to define the world. Enter the simulation for 60 seconds, then download the video. At a standard 24 fps, you now have 1,400+ unique frames to choose from.
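The arithmetic behind that headline number is simple. A quick sketch (the function name is my own; the ffmpeg command in the comment assumes you have ffmpeg installed):

```python
def frames_in_session(duration_s: float, fps: float) -> int:
    """Total still frames in a recorded session of duration_s seconds."""
    return int(duration_s * fps)

for fps in (24, 30, 60):
    print(f"{fps} fps x 60 s = {frames_in_session(60, fps)} frames")
# 24 fps x 60 s = 1440 frames  <- the "1400+" in the headline
# 30 fps x 60 s = 1800 frames
# 60 fps x 60 s = 3600 frames

# To extract every frame of the downloaded clip as still images,
# one option is ffmpeg (run in a shell):
#   ffmpeg -i genie3_session.mp4 frames/frame_%04d.png
```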

2 Simulate a camera operator

  • Unlike traditional video generation, where you can only generate one scene at a time, in Genie 3 you are the camera operator. You can simulate a physical camera moving through the set.

  • Genie 3 workflow: Define your environment (the set). Define your character as a "Camera Rig" or "Drone." Control the movement physically: walk forward for a dolly shot, or rotate for a pan.

3 CGI concepting

  • Use Genie 3 to "concept" the CGI world first. Generate a 3D simulation to validate the art direction, lighting, and layout before deploying artists to build the final assets.

  • Genie 3 workflow: Describe the world (and the character & actions). Toggle to "First Person" or "Third Person" perspective. Hand off the generated reference to your 3D team as a visual guide, or test it with actors and a green screen.

4 Game design concepting

  • Skip greyboxing: with Genie 3 you can upload a sketch or describe a world, then play it immediately. Walk around inside your idea right away to see if it works.

  • Genie 3 workflow: Upload your level sketch or concept art. Toggle to "First Person" or “Third Person” perspective. Direct the character to run through the environment to test the flow.

5 Property walk-through

  • Turn a 2D floor plan into a navigable 3D tour: Genie 3 can transform a static layout into a walk-through experience.

  • Genie 3 workflow: Upload the floor plan image to define the layout and furniture. Toggle to "First Person" perspective. Walk through the result and download the 60-second video.

I should say: I paid for the Ultra subscription to access Genie 3 out of my own pocket, even though I work at Google. I'm dedicated to this newsletter, folks! 🤣

Are world models production-ready?

Are world models perfect today? Not even close. We are still facing significant hurdles:

  • Sim-to-real gap: While these models understand physics, grounding their dynamics in 100% accurate physical priors remains a challenge.

  • Visual fidelity vs. speed: To run at "live" frame rates (like Decart's 20 FPS), models often sacrifice resolution or produce "fuzzy" distant details. You can always upscale the final video afterward.

  • Temporal memory: While models like LingBot are pushing into the 10-minute range, many still struggle with precise inventory or object control over very long horizons.

But remember: early CGI looked a little weird, and early AI video was a hallucinogenic mess. It will only get better from here!! Watch this space in 2026!

📰 AI creative news updates 26th Jan - 4th Feb 2026

  • Kling AI launched Video 3.0, a unified multimodal video model with native audio, multi-shot storytelling, character consistency, and 15s cinematic control.

  • Decart AI rolled out Lucy 2, a real-time generative video model that edits and generates videos live with minimal latency. It moves AI video from offline clips to interactive, live transformation, enabling real-time livestream effects.

  • Deezer made its AI-music detection tool available commercially to other platforms to tag and exclude AI-generated tracks. It helps combat streaming fraud, protect artist royalties, and set industry standards as AI music floods streaming services.

  • Alibaba Cloud open-sourced its Tongyi Z-Image foundation model for photorealistic, stylized image generation with fine-tuning support. It's a high-quality, efficient image model that runs on modest hardware, accelerating creative workflows.

  • Google DeepMind rolled out Project Genie 3, an experimental AI prototype that lets users create, explore, and remix interactive virtual worlds from text and images. This will benefit game design, simulation, training, and creative world building.

  • Higgsfield launched Vibe Motion, an AI motion-design tool powered by Claude that lets creators prompt motion and refine it live instead of manual editing in After Effects.

  • Luma AI launched Ray3.14, its newest AI video model with native 1080p output that's 4× faster and 3× cheaper, plus better stability and motion consistency. It removes key quality-speed-cost tradeoffs in generative video.

🏆 AI creative competitions worth joining

If you've got a video or concept brewing, these competitions are open right now, and they're offering real prizes plus visibility for your creative AI work.

AI Film Festival (aiffi)

Over $10,000 in prizes, project visibility, potential funding for future productions, selected videos featured on the festival's streaming platform, official screenings across 5 countries in 2026, and more.

AI Film Awards Cannes 2026

Unleash your creativity at the prestigious Cannes stage!

On a personal note, girlzzzz & guyzzzz, please audit your subscriptions to all the random creative AI tools. I was looking at my credit card statement and realized I've been paying for months for random subscriptions because I just wanted to test the tools and forgot to cancel. There are currently two expensive monthly line items on my statement: my Equinox membership and all these creative-tooling subscriptions 💀 How did we get here lol. Back in the day, before AI, it was cheaper to just pay for one Adobe Creative Cloud license than for all these AI tools combined.

Khulan