AI has begun to improve itself. GPT-5.4 is the first real proof.

Hi {{firstName|Futurist}},

The next AI shockwave is closer than you think. Something big is coming. Morgan Stanley warns that a massive leap in AI capability could hit in the first half of 2026, driven by extreme compute scaling inside U.S. labs. The impact? A sharp rise in productivity. Serious job disruption. And even power shortages, as intelligence turns into the most important economic resource on the planet. At the same time, Sam Altman told sophomores they will graduate into a reality where artificial general intelligence is simply… there. Looking away is not an option. Neither is closing your eyes and keeping this outside your world. This is not a distant-future story. This happened in the past week.

On today’s menu: self-building models, AI that does the work, and search that finally sees everything. GPT-5.4 is already shaping GPT-5.5, Microsoft is turning prompts into finished tasks, and Google is mapping text, image, video, and audio in one go.

So grab your favorite snack, settle in, and let's dip into what's cooking. No time to read? Listen to this episode of Digital Dips on Spotify and stay updated while you’re on the move. The link to the podcast is only available to subscribers. If you haven’t subscribed already, I recommend doing so.

🍟 Crispy bites

Fresh tech nuggets. Short, sharp, snackable.

GPT-5.4 raises the AI bar with professional-grade performance

TL;DR

OpenAI has rolled out GPT-5.4 in ChatGPT, the API, and Codex, marking a notable leap in AI’s reasoning, coding, and task management capabilities. This version consistently outperforms its predecessors, achieving an 83% success rate on the GDPval benchmark, which compares models against human experts on a range of professional tasks. With new agentic workflows and enhanced tool execution, GPT-5.4 sets the stage for more reliable and efficient AI-driven processes. Looking ahead, even greater advancements are in the pipeline as AI's role in its own development keeps growing.

Read it yourself

Why this matters

GPT-5.4 achieves a success rate of 83% on the GDPval benchmark, surpassing GPT-5.2's 71%.
Consistently produces more refined results in professional settings, excelling in spreadsheets and presentations.
With improved token efficiency, it cuts costs and speeds up processing in the API.
Introduces a new level of planning and adaptability, allowing users to steer responses proactively mid-answer.
Advances in computer tool use and webpage interaction promise enhanced user experiences.

My Taste

What stands out about GPT-5.4 is its ability to narrow the gap between AI and human capability, significantly outperforming previous models. As Ethan Mollick put it: “If you give a seven-hour task to AI, even with failure rates and the need to check results, you’d save an average of 4 hours and 38 minutes.” Crazy numbers. The rapid growth in its capabilities signals a move towards more self-sufficient AI systems. As we approach a future where models can drive much of their own development, each iteration brings us closer to autonomous generative processes. This version, GPT-5.4, is already helping to build GPT-5.5, just as GPT-5.3 helped create GPT-5.4. That means within a few iterations, AI models could be building newer, better, more capable versions of themselves. And that’s even crazier than it sounds, and exciting at the same time.

Google launches Gemini Embedding 2 for multimodal data

TL;DR

Google has introduced Gemini Embedding 2, its first natively multimodal embedding model, now in public preview. This model maps text, images, videos, audio, and documents into a single embedding space, supporting over 100 languages. Initial reports show better precision and recall in tasks involving different media types. The release positions Google ahead of competitors like OpenAI, which primarily focus on text embeddings. The ability to process data across multiple formats from a single query marks a significant leap in AI functionality.

Read it yourself

Why this matters

Gemini Embedding 2 supports up to 8192 text tokens, 6 images, 120 seconds of video, audio data, and 6-page PDFs.
Processes interleaved inputs, allowing combined media types in a single request.
Outperforms leading models in tasks involving text, image, video, and audio.
OpenAI's embedding API still focuses on text only, while Gemini covers five data formats.
Google's approach lets companies search across all modalities simultaneously, improving search precision.
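
For the curious, here is a minimal sketch of what "one embedding space" buys you. It is not Google's API or method. The vectors below are tiny, hand-written stand-ins for real embeddings, and the file names and the `search` helper are hypothetical. The point is only that once text, images, video, and audio all live in the same vector space, a single query can rank them together with plain cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical library: (media type, name, toy embedding).
# Real embeddings would come from a multimodal model and have far more dimensions.
library = [
    ("text",  "Q3 revenue memo",      [0.9, 0.1, 0.0]),
    ("image", "chart_of_revenue.png", [0.8, 0.2, 0.1]),
    ("video", "office_party.mp4",     [0.0, 0.1, 0.9]),
    ("audio", "earnings_call.mp3",    [0.7, 0.3, 0.2]),
]

def search(query_vec, items, top_k=3):
    """Rank every item, whatever its media type, against one query vector."""
    scored = [(cosine(query_vec, vec), kind, name) for kind, name, vec in items]
    return sorted(scored, reverse=True)[:top_k]

# Toy query vector, standing in for an embedded question about revenue.
query = [1.0, 0.0, 0.0]
results = search(query, library)

for score, kind, name in results:
    print(f"{score:.3f}  {kind:<5}  {name}")
```

In this toy setup the memo, the chart, and the earnings call rank above the party video, even though they are three different formats. With text-only embeddings you would need a separate pipeline per format to get there.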

My Taste

Google shipped a model that processes all media formats. Most others focus on text. This is how you build a competitive edge. When a company can search across all formats natively, it can integrate data in ways that others, like OpenAI with its text-focused model, simply can't. It’s not just about processing power. It’s about influence and reach. Google’s investment in truly multimodal search will redefine what's possible in strategic communication tools.

Microsoft partners with Anthropic on Copilot Cowork

TL;DR

Microsoft has launched Copilot Cowork, a significant upgrade to its AI capabilities within Microsoft 365. Unlike its predecessor, this tool doesn’t just suggest: you can delegate tasks entirely. The shocking part? Microsoft teamed up with Anthropic for this AI's backbone, challenging assumptions about its partnership strategy. With this move, Microsoft is relying on a competitor’s technology, signaling a dramatic shift in AI deployment.

Read it yourself

Why this matters

Copilot Cowork integrates deeply with M365 apps and files
Operates safely within Microsoft 365’s security and governance guidelines
Uses Anthropic's Claude Cowork AI instead of OpenAI
Stems from a partnership with Anthropic despite Microsoft's 27% stake in OpenAI
Marks a strategic pivot from building in-house to collaborating with former rivals

My Taste

Microsoft just changed the AI playbook. They introduced a system that does the work for millions of users. One prompt: “Prepare me for Thursday’s client meeting.” It scans your emails. Reads your files. Builds the deck. Runs the numbers. Blocks time in your calendar. Drafts the follow-up. Done. Here’s what matters: the engine is not OpenAI, where Microsoft owns 27%. It’s Anthropic. The same company behind Claude Cowork, the launch that shook the software market earlier this year. Instead of fighting, Microsoft integrated. Copilot Cowork now runs inside Microsoft 365, grounded in your real data (emails, Teams chats, calendars, files) and operating across the full workspace. Assistants answered questions. Workers execute tasks. What does this mean? The AI assistant era was a warm-up. The AI worker era just began.

Replit Agent 4 prioritizes creative collaboration

TL;DR

Replit has launched Agent 4, a new AI designed to enhance creativity and reduce coordination tasks in software development. This agent shifts focus from mundane technical chores to fostering creative processes, enabling projects to move 10 times faster. Replit Agent 4 offers seamless design integration, parallel task execution, and the ability to manage everything from apps to slide decks in one place. The goal is to put creativity front and center in the development process.

Read it yourself

Why this matters

Replit Agent 4 speeds up production by 10x
Allows simultaneous design and build on an infinite canvas
Supports parallel task execution, visible progress, and automatic conflict resolution
Consolidates all project assets, from code to launch materials, in a single environment
Equipped to transform rough ideas into functional applications with minimal guidance

My Taste

Speed is easy. Coordination is hard. The market is obsessed with velocity. Lovable races past $400M ARR. Cursor crosses $2B. Same metric, same story: How fast can one person ship alone? Replit is playing another game. Agent 4 is built on a simple belief: solo builders hit limits. So the product leans into collaboration. Shared canvas. Multiple agents working at once. Teamwork inside the core loop. This isn’t single-player mode. It’s co-op. Because when AI writes the code, execution stops being the bottleneck. Direction does. Which version makes sense? Which one survives contact with reality? Shipping 100,000 projects a day sounds impressive. But speed without alignment creates noise. Speed with alignment creates value. The difference sits in the stack. Replit owns it end to end: design, build, data, auth, hosting, deploy. No passing files around. No fragile bridges. Just one environment where parallel work actually holds together. Solo builders will always chase the fastest tool. Teams will choose the system that keeps everyone in sync. Replit is betting the future belongs to teams.

🧀 Cheesy pick

A cheesy selection of three tools and one tasty rabbit hole.

Varg.ai turns one brief into hundreds of ready-to-publish ad videos.
Shipper.now builds full apps from plain English prompts instantly.
Glaze makes real desktop apps from plain language AI chat.
Bonus: A practical guide to building real skill with Claude.

🍱 Leftovers

A roundup of updates that are too cheesy to ignore.

OpenAI secretly coding up a GitHub rival, setting sights on potential customer sales.
OpenAI develops BiDi audio model for glitch-free, interactive sound experiences.
OpenAI’s ChatGPT expands its reach to Excel and soon to Sheets, streamlining your data work.
OpenAI unveils a new flagship model with enhanced financial tools for smarter spreadsheets and presentations.
OpenAI’s Codex Security application agent hunts and patches code vulnerabilities to turbocharge your deployment.
OpenAI’s ChatGPT unveils interactive visuals for a deeper dive into math and science concepts.
OpenAI’s ChatGPT enriches math and science education with dynamic, interactive visual aids.
OpenAI unveils Symphony, automating project board management with agent orchestration.
OpenAI’s Video API unveils Sora 2, boasting custom characters, vertical exports, and scene extensions.
LTX Studio rolls out Dubbing and Captions for enterprise teams to boost video localization.
LTX Desktop launches as a local, open-source video editor optimized for NVIDIA GPUs.
LTX introduces LTX-2.3: the world's fastest 4K video generator with seamless native dialogue.
Flow Studio unveils Wonder 3D for rapid creation of detailed 3D assets from text and images.
Exa Deep launches agents that optimize your searches with rapid, structured results.
Stripe is expanding billing for LLM tokens with automatic usage tracking and Stripe routing.
Google’s Canvas AI Mode now available in the U.S., offering dynamic planning, creative writing, and coding support directly in Search.
Google Workspace's Gemini update speeds up Docs, Sheets, and Slides while Drive now summarizes your results instantly.
Google’s Gemini CLI's new Plan mode navigates your codebase to craft a detailed development blueprint.
Google Maps integrates Gemini models for a navigation experience that's smarter and more explorative than ever.
Google unveils Workspace CLI with 40+ agent skills for Drive, Gmail, and Calendar mastery.
Android is making AI agents and assistants more helpful for Android apps and users.
Cursor unveils Automations: a tool to streamline agent launches within coding environments, triggered by code changes, Slack, or timers.
Luma unveils creative AI agents powered by 'Unified Intelligence' to streamline end-to-end creative work.
Luma debuts Uni-1, their first model merging understanding and generation towards unified intelligence.
Netflix acquires InterPositive, Ben Affleck's AI firm, to boost post-production editing capabilities.
xAI’s Grok Voice Mode now lets you attach pictures and files directly in the app.
xAI’s Grok 4.20 Beta unleashes a 16-agent AI swarm for SuperGrok users, offering unparalleled customization.
Anthropic’s Claude Marketplace rolls out a streamlined AI procurement hub for enterprises, now in limited preview.
Anthropic’s Claude Code introduces Code Review, deploying agents to sniff out bugs in your pull requests.
Anthropic’s Claude Code introduces /btw for seamless side chain conversations while coding.
Anthropic launches The Anthropic Institute to deepen public dialogue on powerful AI.
Anthropic’s Claude synchronizes Excel and PowerPoint for seamless data integration and effortless updates.
Anthropic’s Claude empowers users to create interactive charts and diagrams directly in chat, now in beta for all plans.
Anthropic’s Claude Opus 4.6 and Claude Sonnet 4.6 now support a 1-million-token context window.
Base44 unleashes Superagents, giving everyone an AI coworker with zero setup across 100+ services.
Tencent launches WorkBuddy, an AI desktop agent for effortless task automation and management.
Helix 02 autonomously tidies your living room, resetting your home just the way you like it.
Harvey unveils Agent Builder, empowering teams to create autonomous agents with 25K+ workflows.
Ethereum ERC-8183 introduces Agentic Commerce, offering job escrow with evaluator attestation on Ethereum.
Runway Characters transform apps and websites with customizable, real-time avatars, turning online interactions into conversations.
AMI secures $1.03B to develop AI with persistent memory and real-world intelligence.
Hume’s TADA debuts as an open-source TTS, delivering zero hallucination, 5x speed, and extended audio clarity.
Meta has acquired Moltbook, a viral social network designed for AI agents.
Adobe rolls out an AI assistant for Photoshop, enhancing image editing with new Firefly features.
Zoom unveils an AI-powered office suite and announces that AI avatars for meetings are launching this month.
Freepik launches Speak: create lip-synced talking videos with custom voices in 30+ languages.
Cloudflare’s new /crawl endpoint automates site crawling into HTML, Markdown, or JSON in one API call.
Perplexity's Personal Computer ensures 24/7 local access to your files and apps with a Mac mini.
Perplexity’s Computer for Enterprise automates workflows using 20 models and 400+ app integrations.
Perplexity API is now a full-stack platform that replaces your model provider, search, and embeddings.
Firecrawl CLI launches as your go-to tool for web scraping, searching, and browsing with agents.
Mastercard unveils Crypto Partner Program to revolutionize global money transfers with 85 industry collaborators.
WordPress launches my.WordPress.net, a browser-based private workspace for creating personal sites with no signup or hosting needed.
Parallel CLI lets agents search, extract, and enrich web data effortlessly with just a terminal command.
Genspark unleashes Claw, the AI employee that handles complex tasks via your personal cloud computer.
Peacock dives into AI with mobile-first vertical videos, live sports, and gaming innovations.
Okara introduces the world's first AI CMO. It pioneers automated marketing with smart agents boosting web traffic instantly.

How’d this digital dip taste?

You made it to the bottom. Quick taste test before you go.

This was it. Our fifty-second digital dip together. Forward this to someone who hasn’t noticed AI is building itself now.

If GPT-5.4 is already building what comes next, the question is simple: where do you still sit in the loop? I help teams move from watching to applying. From testing prompts… to setting up systems that actually do the work. If you want, I’ll help you spot the first 2–3 workflows you can hand over to AI this month. Reply to this email and let’s talk.

Looking forward to what tomorrow brings! ▽

-Wesley
