A look back at 2025, and a glimpse of what’s coming
Dipping into the digital future: OpenAI’s most serious work model yet, and Google makes “fast and smart” the new default

Hi Futurist,
I hope you had a great Christmas with your family and loved ones. Maybe you took a moment to reflect on the past year. Because 2025 was crazy. We went from a few percent on the ARC-AGI-2 benchmark… to 75%. From 1 hour of autonomous development work… to 4.5 hours. From debating AI’s energy use… to training models in space. Seriously. I’m not making that up. But hold on. Compared to what’s coming, 2025 was calm. 2026 will be wild. That’s why I’m sending you insights, inspiration and innovation, straight to your inbox. Let’s dive into the digital future and catch the waves shaping our industry.
💡 In this post, we're dipping in:
📣 Byte-Sized Breakthroughs: We open with a fast rewind of 2025, a year where AI progress didn’t just accelerate, it collapsed decades of timelines into months. You’ll see how new models, agents, and benchmarks quietly crossed from impressive into unsettling, with autonomy, cost, and capability jumping at once. We break down why GPT-5.2 and Gemini 3 Flash matter not as demos, but as economic turning points for real knowledge work.
🎙️ MarTech Maestros: AI stops being a helper and starts becoming a decision-maker, running entire workflows instead of single tasks. We look at early evidence. Faster design cycles, higher sales, and fewer operational mistakes once agents are in the loop. The focus isn’t hype, it’s structure. Why today’s org charts and processes don’t fit agentic systems. And we use BCG’s analysis to explain why this is shaping up to be the next real source of business value.
🧐 In Case You Missed It: A rapid-fire roundup of launches that are too cheesy, too strange, or too impressive to skip. Visual coding, autonomous research, self-improving agents, real-time translation, AI-generated worlds, voices, games, videos, and apps… all landing at once. Not a curated thesis. Just a snapshot of how wide and weird the frontier has become.
Do you have tips, feedback or ideas? Or just want to give your opinion? Feel free to share it at the bottom of this dip. That's what I'm looking for.

No time to read? Listen to this episode of Digital Dips on Spotify and stay updated while you’re on the move. The link to the podcast is only available to subscribers. If you haven’t subscribed already, I recommend doing so.

Quick highlights of the latest technological developments.
Headstory: A look back at 2025, and a glimpse of what’s coming
Just over a month ago, none of the following models existed. There was no Claude Opus 4.5. No Google Nano Banana Pro. No GPT-5.2. No Gemini 3.0. No Grok 4.1. Blink. And suddenly they’re here. Stacked on top of each other like geological layers forming in real time. Never in human history have we watched capability compound at this speed. It’s dizzying. Back in 2019, superforecasters thought artificial general intelligence was something our grandchildren might deal with. Eighty years away. Now we’re arguing about definitions. About thresholds. About whether the line has already been crossed. Since then we’ve picked up multimodality, deep reasoning, tools, agents that act. Whatever you call it, the timeline didn’t just shorten. It collapsed.
Now zoom in on just the last twelve months. Really do it. Sit with it for a second. Reasoning models only showed up at the end of 2024. A month later, DeepSeek emerged as the first serious open alternative. And at that same point last year, Google was… well. Let’s be polite. Gemini 1.5 was not striking fear into anyone’s roadmap. Fast forward a single year and the landscape is almost unrecognizable. Google is now a leader, feared even by OpenAI. New baselines. New expectations. The pace isn’t linear. It’s exponential.
Which brings us to now. December 2025. And this is where things quietly cross from impressive into unsettling. We’re no longer talking about systems that just respond. We’re looking at Google’s AI agent SIMA 2, which can define objectives, attempt to reach them, evaluate its own output, retrain on its mistakes, and run the loop again. And again. Each pass sharper than the last. Sometimes solving problems it’s never seen before. Sometimes outperforming humans at tasks we assumed were… well. Ours.
Think back to early 2025. Claude 3.7 Sonnet felt magical because it could compress an hour of developer work into minutes. That alone changed workflows. Changed pricing. Changed expectations. Now look at where we are with Opus 4.5. On real tasks we’re seeing a median autonomy window of nearly five hours. And the upper bound stretches far beyond that. That’s a single model carrying work across most of a morning. And here’s the part that should make you pause. We haven’t even fully measured GPT-5.2 or Gemini 3 yet. Let that land. If this is where 2025 ends, I don’t think most people have internalized what 2026 and 2027 are lining up to deliver.
Put the curve into human terms. If progress continues at anything resembling its current slope, we’re less than three years away from automating work that takes a person a month. Less than five years from automating a year’s worth of effort. And that’s assuming, for some absurd reason, that nothing about this acceleration feeds back into itself. Which of course it will. Tools build tools. Productivity funds progress. The slope steepens because it always does.
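If you want to check that math, here’s a minimal back-of-the-envelope sketch in Python. The 1-hour-to-4.5-hours-in-a-year pace comes from the paragraphs above; the constant doubling time, the 160-hour working month, and the 2,000-hour working year are my own rough assumptions, not measured figures.

```python
import math

# Rough extrapolation of autonomous-work horizons.
# Pace from this issue: autonomy grew from ~1 hour to ~4.5 hours
# in roughly twelve months; we assume that exponential pace holds.
start_hours, end_hours, months = 1.0, 4.5, 12.0
doubling_time = months / math.log2(end_hours / start_hours)  # ~5.5 months

def months_until(target_hours: float, current_hours: float = end_hours) -> float:
    """Months until the autonomy window reaches target_hours."""
    return math.log2(target_hours / current_hours) * doubling_time

MONTH_OF_WORK = 160    # hours in a working month (my assumption)
YEAR_OF_WORK = 2_000   # hours in a working year (my assumption)

print(f"doubling time: {doubling_time:.1f} months")
print(f"a month of work: ~{months_until(MONTH_OF_WORK) / 12:.1f} years away")
print(f"a year of work:  ~{months_until(YEAR_OF_WORK) / 12:.1f} years away")
```

Run it and you get a doubling time of about 5.5 months, a month of work roughly 2.4 years out, and a year of work roughly 4 years out, which is where the “less than three” and “less than five” above come from.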
Benchmarks tell the same story, just without the poetry. ARC-AGI-2, long treated as a kind of psychological wall, didn’t fall. It shattered. Using the Poetiq system paired with GPT-5.2 at high settings, performance climbed to around 75% on the public evaluation set. At a cost that would’ve sounded like science fiction a year ago. To put it in perspective, the human baseline is 60%… That’s not a marginal improvement. That’s a regime change. A leap of roughly fifteen percentage points over what used to be state of the art. Still not convinced that by the end of next year the world will look completely different than it does today?
Well, then there’s GDPval. This one matters. A lot. Released only a few months ago, it measures something painfully concrete: can an AI actually do real economic knowledge work? The kind people are paid for. When it launched, the best model in the world at that time, Claude, scored under fifty percent. That was September. Today, GPT-5.2 variants are clearing seventy. Let’s be explicit about what that means. In more than seven out of ten cases, the model performs at or above the level of a human domain expert. That cuts both ways. You can use it to generate serious economic value. And it can almost certainly do the same kind of work you do. Better than you’d like. Things were already strange. They’re about to get stranger.
So what does all of this mean as we step into 2026?
Here’s the paradox. 2025 felt calm. Somehow manageable. 2026 won’t be. For knowledge workers, for leaders, for organizations built around human throughput, this is where the ground starts to move. Not because there will be less work. But because there will be vastly more possible work. The bottleneck shifts. From execution to imagination. From capacity to direction. If you stare at each model release in isolation, it’s easy to shrug. Another upgrade. Another benchmark. But step back. Look at the curve. We’re not walking along it. We’re inside it, right in the middle of an exponential. Miss it, and it feels like nothing is changing. Until everything does.
Introducing GPT-5.2, OpenAI’s most serious work model yet
TL;DR
GPT-5.2 is OpenAI’s most capable model series so far for professional knowledge work and long-running agents. It delivers clear improvements in spreadsheets, presentations, coding, vision, long-context understanding, and tool use. On GDPval, it beats or ties human professionals on 70.9% of real knowledge-work tasks across 44 occupations, while operating faster and at a fraction of the cost. GPT-5.2 sets new state-of-the-art results across coding, reasoning, science, math, and vision benchmarks, and is now rolling out in ChatGPT and the API.
Read it yourself?
Sentiment
GPT-5.2 is a meaningful step forward. It follows instructions better and is more willing to attempt hard tasks. Code generation is clearly stronger than GPT-5.1: more careful, more complete, and more autonomous. Vision and long-context handling stand out, especially when working with large codebases or complex images. But online, the recurring complaint is speed. Thinking mode is slow, Pro is even slower, and sometimes it thinks for a long time and still fails. When it works, it’s impressive. When it doesn’t, it’s frustrating. The general verdict fits the headline well: incredibly impressive, but too slow.
My thoughts
The real story of GPT-5.2 has almost nothing to do with the benchmark charts OpenAI showed on stage. It’s about economics. Two benchmarks matter here: GDPval and ARC-AGI. Start with ARC-AGI. GPT-5.2 Pro hits 90.5% accuracy at a cost of $11.64 per task. A year ago, reaching slightly lower performance cost thousands of dollars per task. In some estimates, tens of thousands. That’s a roughly 390× cost reduction in twelve months. This benchmark was designed to resist brute-force scaling. It was meant to show that models could not generalise. For years, it did exactly that. Now it doesn’t. Not because the tasks got easier, but because the efficiency curve collapsed. At $11.64 per task, AI reasoning crossed human-cost parity without anyone really noticing. At that price, even low-paid human labour is no longer cheaper. Then there’s GDPval, which might be even more important. GPT-5.2 matches or beats human domain experts on real-world knowledge work more than 70% of the time. These are tasks that take humans four to eight hours. Presentations. Spreadsheets. Analysis. Planning. The implication is blunt: this model can do economically valuable work, and it can do a lot of the work many knowledge workers do today. This is the first time it feels undeniable. Not “soon”. Not “in a few years”. Yesterday. AI crossed the threshold where it outperforms humans at scale on knowledge work that actually matters economically.
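A quick sanity check on those numbers, sketched in Python. Only the $11.64 per task and the roughly 390× reduction come from the paragraph above; the hourly wage and the time a person needs per task are illustrative assumptions I picked to make the parity point concrete.

```python
# Sanity-checking the ARC-AGI economics. Only the $11.64 per task and the
# ~390x reduction come from the text above; the rest are my assumptions.
cost_per_task = 11.64            # GPT-5.2 Pro, $ per ARC-AGI task
reduction_factor = 390           # claimed cost drop over twelve months
implied_cost_last_year = cost_per_task * reduction_factor
print(f"implied cost a year ago: ~${implied_cost_last_year:,.0f} per task")

# Human-cost parity, under illustrative assumptions:
hourly_wage = 20.0               # $/hour, assumed low-paid knowledge work
hours_per_task = 1.0             # assumed time a person needs per task
human_cost = hourly_wage * hours_per_task
print(f"human: ${human_cost:.2f} vs model: ${cost_per_task:.2f} per task")
print("model cheaper than human:", cost_per_task < human_cost)
```

The implied cost a year ago lands around $4,540 per task, squarely in the “thousands of dollars” range, and at $11.64 the model undercuts even a $20-an-hour human on an hour-long task.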
Google’s Gemini 3 Flash just made “fast and smart” the new default
TL;DR
Google expanded the Gemini 3 family with Gemini 3 Flash, a frontier-level model built for speed and scale. It keeps Pro-grade reasoning while delivering Flash-level latency and efficiency, and it’s priced aggressively ($0.50 per 1M input tokens, $3 per 1M output tokens). It’s rolling out broadly across the Gemini app, AI Mode in Search, and developer/enterprise channels like Gemini API, Google AI Studio, Gemini CLI, Vertex AI, and Gemini Enterprise, with strong benchmark results and pricing that undercuts previous tiers while staying seriously capable.
Sentiment
Online, the vibe is pretty clear: people are impressed. A lot of the comments zoom in on the same thing. This feels like a genuine milestone. The idea that you can get Pro-level reasoning with Flash speed, without paying “frontier tax,” is exactly what developers have been hoping for. The general expectation is that this will quietly upgrade everyday search and workflows for millions, and the developer crowd is already speculating about what this does for agentic tools, rapid iteration, and shipping faster experiences without blowing up budgets.
My thoughts
What really gets me is the timeline. Gemini 2.5 Pro dropped at the end of March. A little over six months later, we’re already looking at better quality, dramatically faster inference, and economics that make you rethink what’s “worth” running in production. That cadence is unreal. And yeah, again, the economics here are the headline for me. When you get better performance for something like a quarter of the cost, deployment strategy changes instantly. You stop rationing capability. You start shipping it. More assistants running in parallel. More realtime UX. More experimentation. Less “we’ll save the good model for premium users.” This is the commoditization curve speeding up in public. What used to take years now gets compressed into months. And that’s exactly how you unlock the next wave of innovation… not by waiting for one magical breakthrough, but by making frontier-level intelligence cheap, fast, and default.
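To make that pricing tangible, here’s a tiny cost sketch at the quoted Flash rates. The request shape, 8,000 input tokens and 1,000 output tokens, is an invented example, not a Google figure.

```python
# Per-request cost at the Gemini 3 Flash rates quoted above.
# The request shape (8k tokens in, 1k tokens out) is an invented example.
PRICE_PER_INPUT_TOKEN = 0.50 / 1_000_000    # $0.50 per 1M input tokens
PRICE_PER_OUTPUT_TOKEN = 3.00 / 1_000_000   # $3.00 per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at the quoted rates."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

cost = request_cost(8_000, 1_000)
print(f"one request: ${cost:.4f}")                      # $0.0070
print(f"a million such requests: ${cost * 1e6:,.0f}")   # $7,000
```

At less than a cent per request, a million requests cost about $7,000, which is why “stop rationing capability” stops being a slogan and becomes a line item.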
More byte-sized breakthroughs:
ChatGPT opens its own app store
OpenAI has launched an App Directory inside ChatGPT, a clear move towards turning the chatbot into an all-in-one app. Users can now browse tools that connect ChatGPT directly to services like Adobe, Canva, Shopify, Lovable and Replit, all without leaving the chat; and the list goes on. Existing connectors have been reframed as apps. It’s a cleaner, more ambitious setup. The business model, however, is still a work in progress.
Stripe prepares shops for AI agents
Stripe has introduced the Agentic Commerce Suite, a new way for businesses to sell directly through AI agents. AI agents become part of shopping flows, because humans are no longer the ones clicking the buy button. Instead of building custom links for every assistant, companies can connect their product catalogue once and manage discovery, checkout, payments, and fraud from Stripe.
ChatGPT gets a faster, sharper image upgrade
OpenAI has released a new version of ChatGPT Images, built on its latest image model. Image creation is now up to four times faster, with more accurate edits that keep faces, lighting, and composition consistent. You can tweak small details or fully rework an image without losing its original feel. A new Images space inside ChatGPT also makes it easier to explore styles and ideas. It brings image work closer to a smooth, creative flow.

A must-see webinar, podcast, or article that’s too good to miss.
When AI agents take over the work
AI has changed how tasks are done, but not how most companies run. That is about to shift. We are moving from tools that assist people to agents that plan, decide, and act across full workflows. Early results show faster design, higher sales, and fewer errors in core operations. This article from the Boston Consulting Group explains why agentic AI is becoming the next source of real business value.

A roundup of updates that are too cheesy to ignore.
Cursor now fixes your toughest bugs with Debug Mode and more in update 2.2.
Cursor lets you visually design in your codebase, transforming changes into code automatically.
Splinetool’s 3D for Hana merges 3D and 2D design, letting you transform shapes seamlessly in real-time.
Synclabs introduces react-1, a 10 billion parameter model for creative control in video post-production.
Google’s Gemini 2.5 supercharges TTS with emotional finesse and multi-speaker mastery.
Google’s Gemini Deep Research empowers developers to embed Google’s top-tier autonomous research capabilities.
Google’s Deep Research now crafts visual reports with custom images and simulations for AI Ultra users.
Google’s SIMA 2 tests AI self-improvement, outperforming humans in a 3D world through autonomous learning.
Google Translate debuts real-time headphone translations, keeping speaker tone and cadence intact.
Google’s Opal integrates with Gemini, letting you create AI-powered mini apps for personalized experiences.
Google’s T5Gemma 2 rolls out with advanced multimodal capabilities and support for 140+ languages.
Google’s A2UI debuts as an open-source protocol, empowering agents to craft interactive UIs.
Google’s Disco GenTabs transforms your open browser tabs into custom apps using Gemini 3.
Google’s YouTube Playables Builder lets creators craft mini-games using text, video, or image prompts.
Google's remote MCP servers now offer easy app access to BigQuery and Google Maps.
PixVerse’s new V5.5 ties audio to video and crafts multiple shots in just one tap.
Zoom excels in AI exams, boosting its AI Companion 3.0 with sharper summaries and enhanced reasoning.
Invideo’s Performances transforms your raw footage into Hollywood-caliber films while preserving every authentic emotion.
Invideo's Vision turns your ideas into storyboards with 9 shots.
World unveils a super app with crypto payments and encrypted chat to enhance human interaction.
Resemble AI’s Chatterbox Turbo debuts as the fastest open-source AI voice model, adding emotions to every output.
Spaitial AI’s Echo crafts rich 3D worlds from text or images, making spatial creativity accessible to all.
Xiaomi unveils MiMo-V2-Flash MoE model, delivering blazing speed and record-breaking performance.
Runway’s GWM Worlds creates infinite, real-time environments from static scenes for immersive exploration.
Alibaba’s Wan2.6 transforms your ideas into cinematic HD videos with seamless multi-speaker dialogue and narrative coherence.
Alibaba’s Qwen-Image-Layered debuts with Photoshop-grade layering, offering infinite image decomposition for creators.
Alibaba’s Wan2.6 Image debuts with advanced interleaved text-and-image storytelling for compelling visual narratives.
Alibaba’s Qwen3-TTS unveils VoiceDesign & VoiceClone for creating and cloning voices with unmatched expressiveness and speed.
Meta’s SAM Audio debuts a unified AI model to isolate and edit complex audio mixtures with ease.
Meta’s Instagram unveils AI-powered Edits to effortlessly blur objects and tag outfits in videos.
OpenAI’s Realtime API unveils new audio models slashing hallucinations and boosting accuracy.
OpenAI’s Codex introduces Skills for task automation with reusable bundles and easy commands.
OpenAI’s GPT-5.2-Codex takes on your toughest coding challenges with new long-context understanding.
Tencent’s HY World 1.5 invites you to craft and roam new worlds with its real-time 3D modeling breakthrough.
Black Forest Labs’ FLUX.2 [max] debuts with real-time web searches and consistent image rendering.
ElevenLabs Agents now chat on WhatsApp, unifying support across all channels.
Lovable Connectors integrate seamlessly with Perplexity, ElevenLabs, Firecrawl, and Miro for more powerful apps.
Exa’s AI People Search lets you semantically sift through a billion profiles using Exa embeddings.
Loveart AI lets you craft last-minute presentations with web and PDF insights.
Manus now allows you to edit Nano Banana Pro slides with AI precision and ease.
Manus Design View closes the design gap with the Mark Tool for precise visual tweaks.
Mistral OCR 3 advances document processing with unmatched accuracy and efficiency.
Firecrawl's /agent tool scours the web for data you can't reach, boosting research efficiency.
Zai open-sourced GLM-4.7 outshines its predecessor with superior coding, reasoning, and tool mastery.
ClickUp unleashes Super Agents, AI assistants with human-like skills to streamline your workflow.
MiniMax M2.1 debuts as the ultimate open-source coding and AI agent model, outperforming rivals with state-of-the-art benchmarks.
Vidu Agent transforms concepts into professional-grade videos with one click, offering template replication for swift production.
Decart AI’s Lucy Motion transforms your images into videos with precise path-drawn animations.
Shopify’s SimGym launches with digital customers to transform your site testing without live traffic.

How was your digital dip in this edition? You’re still here? Let me know your opinion about this dip!

This was it. Our forty-sixth digital dip together. It might seem like a lot, but remember: this wasn’t even everything that happened in the past few weeks. This was just a fraction.
As we step into 2026, one thing’s clear: change is the constant. The organizations that thrive won’t be the ones with the fanciest tools, but the ones that know how to lead through change. Teams that embrace it, manage it well, and treat it as a mindset rather than a project will be the ones that win. If that’s where you want to be, I’d love to help.
Wishing you a bold, brilliant and inspiring new year. Let’s make it count.
Looking forward to what the next year brings! ▽
-Wesley