
AI rewrites the playbook for your organization

Dipping into the digital future: Google drops “Nano Banana” and lets AIs play inside AI-made worlds

Hi Futurist,

In the last Dip, I asked in a poll whether I should move from once every two weeks to a weekly edition. Not a huge response, but still, I'm going for it. Digital Dips will soon hit your inbox every week, in a slightly different format too. More on that later. Last week, some OpenAI staff hinted at a new internal breakthrough. Several AI researchers posted cryptic tweets with phrases like “feel the AGI” alongside a photo of a humanoid robot, legs crossed, cigarette in mouth. My guess is that it has something to do with training AI models to work inside robots. Which, funnily enough, is exactly what Google is working on. As always, I'm sending you insights, inspiration, and innovation straight to your inbox. Let's dive into the depths of the digital future together and discover the waves of change shaping our industry.

💡 In this post, we're dipping in:
  • 📣 Byte-Sized Breakthroughs: AI shifts from assistant to operator, Google drops “Nano Banana” with Photoshop-killing precision, and Genie 3 lets AIs play inside AI-made worlds. Zapier turns agents into teammates, Claude sneaks into Chrome, and AI Mode in Search starts booking your dinner plans.

  • 🎙️ MarTech Maestros: Zapier didn’t panic, it pushed. A company-wide “code red” on AI became the spark for bold experiments and a shift in culture. CEO Wade Foster tells us how urgency fuels progress, proving that leading in AI is less about tools and more about people.

  • 🧐 In Case You Missed It: OpenAI sharpens its coding edge, Google experiments with video, and Anthropic’s Claude turns into a coding companion. From AI idols to digital twins and from live translations to humanoid robot brains, the field refuses to slow down.

Do you have tips, feedback, or ideas? Or just want to give your opinion? Feel free to share them at the bottom of this dip. That's exactly what I'm looking for.

No time to read? Listen to this episode of Digital Dips on Spotify and stay updated while you're on the move. The link to the podcast is only available to subscribers. If you haven't subscribed already, I recommend doing so.

Quick highlights of the latest technological developments.

Headstory: AI rewrites the playbook for your organization

We thought AI would just help us get work done. Turns out, that was the wrong assumption. It's not here to assist. It's here to operate. And GPT-5 is the proof. A general agent, built not just to generate text, but to execute economically valuable work. Law, logistics, sales, engineering. In half the cases, it already outperforms human experts. GPT-5 didn't just raise the bar. It made the old bar irrelevant. Your job is no longer to do the work. It's to manage the agents who will. Soon we won't assign tasks to colleagues; we'll assign them to agents. We won't review dashboards; we'll review what agents produced while we were asleep.

This shift flips the core design of how organizations operate. Instead of dashboards, tickets, and to-do lists, we'll have agent command centres, workflows built for delegation, and systems that reflect back decisions, not raw data. Your tools will still matter. But your ability to manage agents, just like people, will matter more. Take Walmart as an example. They are consolidating dozens of fragmented AI tools into four central “super agents”, serving customers, employees, engineers, and suppliers. They're using the Model Context Protocol (MCP) to unify systems and scale agent-to-agent collaboration across the enterprise.

Walmart is building the future. Most are stuck in the past. Not because of the tech, but because of the operating model. I've seen it enough at companies I visited for workshops or training sessions. They're stuck on “exploring use cases”, on approval processes, or worse, pretending nothing has changed. Tools are rolled out without workflows. Agents are launched without context. IT can't keep up. Culture can't adapt. And leadership doesn't know what “good” looks like in this new world. They are still set up for a world where humans are the workers, not the supervisors of AI. The telltale sign: everything gets routed through “the AI working group”.

Meanwhile, the opportunity cost is growing. Agents are already reducing response times in support. Writing product specs. Reviewing contracts. Doing actual work. At Zapier, AI agents now handle 40% of customer tickets before a human even sees them. But resistance remains. Many companies act as if AI won’t change the world, until they can’t afford to ignore it any longer.

Last month, OpenAI dropped the cost of GPT-4o by 80%. At one-fifth of the old price per token, the same budget you used yesterday now gives you 5x the output. I believe that alone should make every executive re-run the numbers. But the bigger story is what it implies: in a world where the cost of AI inference is plummeting, the way you build your organization must change. Historically, you built for what was economically viable today. Moore's Law, with its slow and steady slope, shaped every roadmap. AI's cost curve is falling far faster, and that shifts the strategic calculus: instead of building conservative, ROI-proven features, smart companies are starting to build for where AI is going, not just what's possible today.

The biggest myth? That agents need better models to get better results. To build good agents, start with one word: context. Not just any context, the right context. The workflows, tools, data, and domain knowledge your business already runs on. AI agents don’t magically understand your business. You have to teach them, like any new hire. No matter how powerful the models get, the bottleneck isn’t the AI. It’s the inputs. AI agents don’t succeed because they’re clever. They succeed because they’re given the right context. Context is king. It always will be. So, the real work isn’t building agents. It’s feeding them.
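
To make “feeding the agent” concrete, here is a minimal, hypothetical sketch: the agent's system prompt is assembled from the workflow description, the tool catalog, and the domain documents your business already maintains. Every file name and helper below is illustrative, not a reference to any specific product or framework.

```python
# Hypothetical sketch: an agent is only as good as the context you assemble for it.
# All file names and helpers below are illustrative placeholders.
from pathlib import Path

def build_agent_context(workflow_file: str, tool_specs: list[str], domain_docs: list[str]) -> str:
    """Combine workflow, tools, and domain knowledge into one system prompt."""
    workflow = Path(workflow_file).read_text()                             # how the process runs today
    tools = "\n".join(f"- {spec}" for spec in tool_specs)                  # what the agent may call
    knowledge = "\n\n".join(Path(doc).read_text() for doc in domain_docs)  # the domain it must respect
    return (
        "You are an operator for our claims-processing workflow.\n\n"
        f"WORKFLOW:\n{workflow}\n\n"
        f"AVAILABLE TOOLS:\n{tools}\n\n"
        f"DOMAIN KNOWLEDGE:\n{knowledge}\n\n"
        "Follow the workflow exactly; escalate to a human when a step is ambiguous."
    )

# Illustrative usage:
# prompt = build_agent_context(
#     "claims_workflow.md",
#     ["lookup_policy(policy_id)", "create_payout(claim_id, amount)"],
#     ["underwriting_rules.md", "tone_of_voice.md"],
# )
```

The point of the sketch: the model is interchangeable, the assembled context is not.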

The building blocks are already here. The agent stack is rapidly maturing:

  • State-of-the-art reasoning has jumped ahead at an insane rate

  • Costs are falling so fast you can give models more work, more often

  • Context windows now allow deep, multi-step instructions

  • Thinking budgets let you trade compute for quality

  • Tool use and RAG enable (external or internal) knowledge access

  • Agents can now call other agents to collaborate, plan, act, and review (a minimal sketch follows this list)
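
Here is that minimal sketch: one plan-act-review loop in which a model proposes a tool call, the system executes it, and the result is fed back in. `call_model` is a stand-in for any LLM API, and the two tools (one of which is simply another agent) are stubs. The names are hypothetical, but real agent frameworks follow the same pattern.

```python
# Hypothetical sketch of the plan-act-review loop behind tool use, RAG, and agent-to-agent calls.
# `call_model` stands in for a real LLM API; the tools below are stubs.
import json

def search_knowledge_base(query: str) -> str:
    # In practice this would be RAG over internal documents.
    return f"Top result for '{query}' (stub)"

def ask_specialist_agent(task: str) -> str:
    # An "agent" can itself be exposed as a tool to another agent.
    return f"Specialist agent's answer to '{task}' (stub)"

TOOLS = {
    "search_knowledge_base": search_knowledge_base,
    "ask_specialist_agent": ask_specialist_agent,
}

def call_model(messages: list[dict]) -> dict:
    """Placeholder model: asks for one knowledge-base search, then finishes."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search_knowledge_base", "args": {"query": "refund policy"}}
    return {"answer": "Drafted the reply using the retrieved policy."}

def run_agent(task: str) -> str:
    messages = [{"role": "user", "content": task}]
    while True:
        decision = call_model(messages)                        # plan
        if "answer" in decision:                               # review / finish
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])   # act
        messages.append({"role": "tool",
                         "content": json.dumps({"tool": decision["tool"], "result": result})})

print(run_agent("Answer this support ticket about a late refund."))
```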

If you ask me what to do next: go deep, not wide. Build AI agents vertically, around the core processes of your business. Sales ops. Claims processing. Code review. Whatever makes your company tick. These agents aren’t “assistants.” They’re operators. They carry domain knowledge, handle structured workflows, follow specialized instructions, and evolve with your data. Then, build for where the tech is going, not just what’s viable now. The next model. The next price drop. The next context expansion. The speed of AI progress means those who build for tomorrow are always ahead. If you wait for ROI, you’ll miss the upside.

This shift demands a new kind of professional. As AI handles execution, the value of human work changes. I believe leaders need to invest in:

  • Judgment – What to build. What to ignore. Where to spend time.

  • Strategic thinking – Not how, but why. Connecting work to outcomes.

  • Creative problem-solving – When Plan A fails, try Plan D.

  • Continuous learning – Learn fast. Unlearn faster.

  • AI literacy – Know how to guide, evaluate, and question AI tools.

Execution is cheap. Taste is expensive. Build teams that know the difference. So, the questions business leaders should ask themselves now are:

  • What workflows in our org are ready to be agent-led?

  • Do our agents have access to the right data, tools, and context?

  • Have we designed for collaboration between agents and humans?

  • What assumptions about cost, value, and feasibility need to be updated?

  • Are our teams fluent in AI, or still stuck in PowerPoint and spreadsheets?

  • Who’s in charge of rethinking workflows from the ground up?

  • What do we need to learn faster than our competitors?

Because here’s the truth: nobody else has the playbook for your business. You have to write it yourself. And the faster you learn, the faster you lead. AI is not waiting for you to catch up. The gap between the fast and the slow is widening. This isn’t a roadmap problem. It’s a leadership problem. And the opportunity won’t wait.

Let’s not build another roadmap. Let’s build what your future team will manage. Let’s build the agents.

Google’s ‘Nano Banana’ image model is here

TL;DR

Google just dropped Gemini 2.5 Flash Image, an upgrade that doesn’t just generate images, it understands them. You can now edit, blend, or completely transform images with natural language prompts. It keeps character consistency across different scenes and supports multi-image fusion, allowing up to 13 visuals to be merged into one. Real estate cards, product shots, and branded assets can now be generated with pixel-level control. Oh, and it gets context too.

Read it yourself?

Sentiment

For weeks, this model was floating around X under the codename “Nano Banana.” People loved it. They called it the Photoshop killer. The buzz? Deserved. The model nails consistency. It can track characters and elements even when they’re not fully visible. It understands what it’s looking at and what you mean when you want to change it.

My thoughts

This is it. Google isn't just ahead in benchmarks; the results speak for themselves. When OpenAI launched its image model months ago, it struggled: characters changed, the wrong things were edited, image fusion was messy. Gemini 2.5 Flash Image fixes it all. It can merge up to 13 images, maintain perfect character consistency, and interpret context to generate new angles or lighting setups. It's not just impressive tech. It unlocks actual new use cases: product design, marketing campaigns, e-commerce visuals, architecture. Even interior designers now have a smarter tool in their stack.
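
If you want to try it yourself, here's a hedged sketch of an image edit with Gemini 2.5 Flash Image through the google-genai Python SDK. The exact call shape, the preview model id, and the response handling are assumptions on my part; check Google's current documentation before copying this.

```python
# Hedged sketch: editing an image with Gemini 2.5 Flash Image via the google-genai SDK.
# The model id and response handling are assumptions; verify against Google's docs.
from google import genai
from PIL import Image

client = genai.Client()  # picks up the API key from the environment

product = Image.open("sneaker.png")
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=["Place this sneaker on a concrete floor in warm evening light; keep the logo unchanged.",
              product],
)

# Generated images come back as inline bytes alongside any text parts.
for part in response.candidates[0].content.parts:
    if part.inline_data:
        with open("sneaker_edited.png", "wb") as f:
            f.write(part.inline_data.data)
```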

An AI playing in the mind of another AI

TL;DR

Google's Genie 3 can create entire worlds, on the fly. Their new interactive world simulator doesn't just generate 3D environments, it also reacts in real time to another AI. That AI is SIMA, an embodied agent dropped into Genie's universe, learning to move, explore, and complete tasks. It's like a sandbox for AI to train itself. The full simulation loop (input, environment, response) is 100% AI-generated. The next step? Using this to train more generally intelligent systems.

Sentiment

People online are stunned. Just two years ago, Google was still catching its breath after Bard’s messy debut, and now, they’re the ones actually moving the field forward. Even Elon chimed in, claiming xAI and Tesla are doing the same. But the reactions aren’t just excitement. Some are seriously wondering if this is what the start of the singularity looks like.

My thoughts

In our last Dip, I spoke about Genie 3 and shared my view: "World models like Genie 3 bring us closer to AGI. Because they give AI a sandbox to learn in. … And because these worlds are consistent, persistent, and interactive, with memory and complex systems, it becomes possible to train AI in situations that mimic real life at scale.” That is precisely what Google has done. By letting SIMA interact with Genie 3, basically taking control and playing, the AI isn’t just responding to text, it’s learning through experience. Like a toddler pushing buttons, bumping into walls, solving puzzles. And all of this happens in a persistent, responsive world that exists inside a model. Two AIs, learning from each other. This is the future. And it’s already working. What this means is simple and profound: Google is now flooding itself with a mountain of new training data. Not just for future AI models, but also for robotics, and yes, even AGI, because they’re openly talking about it now.

More byte-sized breakthroughs:

  • Zapier’s agents now pass tasks like teammates
    Zapier’s new Agents act like teammates across your workflows, passing tasks smoothly from one to the next, no manual handoffs. Describe your workflow in plain language, and Copilot builds your full agent team from start to finish. Agents now pull live data from Google Drive, Dropbox, and Box, so your automations stay current. Turn your best agents into templates and share them with the Zapier community.

  • Claude for Chrome enters the browser
    Anthropic is piloting Claude for Chrome in a private beta. No more switching tabs: from calendar to inbox, Claude handles it inside your browser. It can click buttons, fill forms, even reply to messages, all while watching what's on your screen to get things done. To keep things safe, Claude asks before doing anything risky and only gets permissions you approve.

  • Google’s AI Mode in Search becomes your on‑demand agent
    AI Mode in Search now helps you act, not just ask, from booking restaurants to finding tickets. It uses live browsing, partner integrations, Google Maps and the Knowledge Graph to take real steps online. Partners include OpenTable, SeatGeek, Ticketmaster and more. Results are personalized to your tastes, and you can share AI Mode results to plan a trip or dinner. AI Mode is now live in English in over 180 countries.

A must-see webinar, podcast, or article that’s too good to miss.

Inside Zapier’s AI shift. From ‘code red’ to culture change.

When Zapier declared a company-wide “code red” on AI, it wasn't panic, it was momentum. Zapier's CEO Wade Foster shares how urgency turned into action, sparking experiments, rewarding curiosity and pushing teams past hesitation. The conversation dives into the cultural shift behind AI adoption, showing why leadership isn't just about tools but about people. A practical look at guiding an established company through change, without losing sight of speed, impact, and well-being. A must-listen for leaders who want to (re)build their organization in the age of AI.

A roundup of updates that are too cheesy to ignore.

  • Qwen-Image-Edit offers precise bilingual text editing and versatile semantic tweaks for your images.

  • Wan2.2-S2V goes open source, offering cinema-quality audio-driven animations for filmmakers and creators.

  • HunyuanVideo-Foley launches an open-source Text-Video-to-Audio framework for perfect sound in film and game production.

  • Descript integrates ElevenLabs' v3 for enhanced AI voice generation with a simple settings update.

  • ElevenLabs Music API lets developers integrate high-quality AI music, with over 750k songs created since launch.

  • ElevenLabs Chat Mode debuts, letting you create text-only assistants for streamlined customer interactions.

  • Anthropic’s Claude Code introduces Agents to streamline your coding tasks with expert precision.

  • Anthropic’s Claude Code’s Learning mode guides coders with strategic gaps to enhance their skills.

  • Anthropic’s Claude Code now plays nice with GitHub, offering easier APIs, ready templates, and expanded event support.

  • Anthropic will start training its AI models on user data, including new chat transcripts and coding sessions, unless users choose to opt out.

  • Firecrawl v2 speeds up with 10x faster scraping and new semantic crawling, securing a $14.5M boost from Nexus VP.

  • Google’s Gemini App lets you turn sketches into prototypes with a snap and description.

  • Google Translate unveils AI-powered live translations and a beta language practice feature for iOS and Android.

  • Google Vids introduces gen AI features to transform images into dynamic videos. Now, create custom AI avatars to deliver messages hassle-free.

  • Google’s Stax unveils an experimental tool to streamline LLM evaluation with autoraters.

  • Google Stitch unveils Canvas, a tool for visualizing entire user flows to boost design consistency.

  • DeepSeek-V3.1 debuts with dual-mode inference, enhanced agent capabilities for rapid solutions, and a longer context window.

  • n8n’s AI Orchestrator optimizes chat efficiency by routing requests to the most suitable models.

  • n8n introduces chat streaming for instant, seamless word-by-word replies in automations.

  • Runway's Game Worlds Beta lets you explore characters and stories in real time.

  • Runway's Act-Two adds Voices, letting you customize character audio for enhanced storytelling.

  • LTX Studio introduces Camera Motion controls for precise, consistent shots.

  • LTXV's Multimodal Context composes shots by feeding images and scene cues.

  • DynamicsLabs’ Mirage 2 turns any image into a live, interactive world you and friends can explore in real-time.

  • Microsoft Copilot Labs introduces 3D modeling, empowering creators to shape the future.

  • Meta teams up with Midjourney to supercharge its AI image and video offerings.

  • Apple considers Gemini AI for a 2026 Siri makeover, boosting voice control with cloud innovation.

  • xAI’s Grok-2 goes open-source, inviting developers to explore its versatile capabilities.

  • xAI's Grok for iOS debuts object highlighting, which can point to specific objects in Grok Vision mode.

  • xAI launches Grok Code Fast 1 on GitHub Copilot and more, speeding up agentic coding for developers.

  • Higgsfield Records unveils Kion, the first AI Idol from the world's first AI record label.

  • Higgsfield Speak 2.0 enhances your videos with emotional, contextual dialogue in 70+ languages.

  • Higgsfield introduces Mini Apps with Nano Banana for precise control, free for a year.

  • Synclabs introduces Lipsync-2-Pro. Edit high-res videos seamlessly, preserving even the finest details.

  • NVIDIA unveils Jetson AGX Thor. It’s NVIDIA’s new humanoid robot brain, now officially on sale for $3,499.

  • Cohere unveils Command A Reasoning, boosting enterprise applications with top-tier reasoning efficiency.

  • GitHub unveils Copilot Agents, letting you delegate coding tasks directly from any page.

  • HeyGen's Digital Twin now mirrors your every move, elevating video authenticity effortlessly.

  • Lindy Build autonomously fixes coding issues, transforming app dreams into bug-free realities while you lunch.

  • OmniHuman-1.5 crafts dynamic, minute-long videos from a single image using advanced speech interpretation.

  • OpenAI’s Codex CLI 0.24 now accepts image inputs, offering a wave of enhancements for users.

  • OpenAI’s Codex unveils new features with an IDE extension, GitHub code reviews, and a revamped CLI powered by GPT-5.

  • OpenAI’s ChatGPT launches Project-only memory, enabling context use within projects without external memory interference.

  • OpenAI’s Realtime API is out of beta for creating voice agents. They also introduce gpt-realtime, their most advanced speech-to-speech model yet.

  • Kimi Slides transforms your ideas into decks in minutes, with Adaptive Layout and auto image search coming soon.

  • Manus Mail transforms inbox chaos by turning your forwarded emails into actionable summaries and to-do lists.

  • Krea unveils its first Real-time Video generation model, now open for beta testing.

  • Prime Intellect’s Environments Hub invites you to crowdsource open AI environments for the next wave of AGI progress.

  • nDreams enters Horizon Worlds with a new VR title, leading a shift for Quest developers.

How was your digital dip in this edition?

You're still here? Let me know your opinion about this dip!


This was it. Our fortieth digital dip together. It might seem like a lot, but remember: this wasn't even everything that happened in the past few weeks. This was just a fraction.

I help executive teams redesign their operating models around agent-native work. Not strategy decks. Real workflows. If you want to go from pilot to production, faster than your competitors, I can help you get there. Just reply to this email or hit me up on LinkedIn.

Looking forward to what tomorrow brings! ▽

-Wesley