
ChatGPT just got promoted to a knowledge worker

Dipping into the digital future: the assistant era is over, the agent runs the show

Hi Futurist,

The past few weeks have been a rollercoaster. Personally, and in the world of AI. First of all, I can tell you: there’s nothing more pure, more overwhelming, than seeing a child come into the world. After stepping away from writing Digital Dips for eight weeks, I was reminded why I started this newsletter a year and a half ago. The pace isn’t just fast. It’s all over the place. Google, Mistral, OpenAI, Anthropic, xAI, Manus, Higgsfield… and it keeps coming. Sam Altman said it best: “We are past the event horizon. The takeoff has started.” There’s only one way left to go. And that’s up…

I’ve been trying to finish this edition for two weeks now. But every time I thought I was done, something new dropped. Another breakthrough. Another leak. Another shift. It just doesn’t stop. This isn’t just noise. There’s something fundamental happening here. A change every C-level executive, manager or professional should be paying attention to. If the headstory doesn’t grab you, I honestly don’t know what will. Anyway, I’m back. After eight weeks off, I’ve packed this newsletter with over 130 news updates. And it still feels like I’ve only scratched the surface. Therefore, I am sending you insights, inspiration, and innovation straight to your inbox. Let’s dive into the depths of the digital future together and discover the waves of change shaping our industry.

💡 In this post, we're dipping in:
  • 📣 Byte-Sized Breakthroughs: Agent Mode is here. ChatGPT now thinks, acts, and gets things done. Meanwhile, Grok breaks records, Kimi K2 slashes costs and Perplexity’s Comet rethinks browsing.

  • 🎙️ MarTech Maestros: The 2025 Marketing AI Report is out. AI is mission-critical, but most teams still lack training and a plan. This year’s report reveals the growing gap between ambition and support.

  • 🧐 In Case You Missed It: A mountain of updates you can’t scroll past. Over 130 product launches, features, and tools. Cheesy or not, they matter. You won’t read them all, but you’ll leave smarter than when you opened.

Do you have tips, feedback or ideas? Or just want to give your opinion? Feel free to share it at the bottom of this dip. That's what I'm looking for.

No time to read? Listen to this episode of Digital Dips on Spotify and stay updated while you’re on the move. The link to the podcast is only available to subscribers. If you haven’t subscribed already, I recommend doing so.

Quick highlights of the latest technological developments.

Headstory: ChatGPT just got promoted to a knowledge worker

Agent Mode is here. A browser inside a brain. A computer that thinks. ChatGPT now operates on its own desktop, choosing tools, acting on your behalf, and solving real problems. It doesn’t just help, it handles. It doesn’t just suggest, it does. ChatGPT no longer supports your workflows, it runs them. One prompt in, task out. And not just any task. Research reports. Bookings. Briefings. Presentations. Running software. Sending emails. Even online shopping. It’s not a chatbot anymore. It’s a knowledge worker. At scale.

So, what can it actually do? Anything that used to take multiple browser tabs, a notepad, a calculator, a text editor and a bunch of back-and-forth. It reads dense files. It logs into secure websites. It creates executive summaries. It builds spreadsheets. It compares vendors. It generates images. It builds entire reports and then schedules the next one. For marketers, analysts, strategists and operators, this means fewer loops, faster output, and hands-on work only when needed. Think of it as your assistant that doesn’t sleep, doesn’t guess, and doesn’t forget.

This is what autonomy looks like. Give it a job. It opens tabs. Logs in. Extracts data. Runs the numbers. Generates results. It even asks smart questions if the prompt is unclear. When things get sensitive, it pauses and hands the wheel back to you. There’s a human-in-the-loop safeguard. But make no mistake, this loop is shrinking. We're stepping into a world where AI does more than respond. It reasons. And acts. Independently.

The shift is bigger than it looks. You’re not just delegating tasks. You’re starting to manage workflows. Real ones. Agent Mode doesn’t ask what to do step-by-step. It figures it out. It double-checks. It runs what-if scenarios. If it needs help, it asks. If not, it continues. This is the kind of autonomy that rewires how work gets done. Professionals used to spend hours stitching tools together. Now, it’s a conversation with an agent who already knows the job.

ChatGPT Agent is the first real glimpse of an AI tool used by millions worldwide moving from assistant to operator. Google’s Agent Mode, Manus, and Genspark all offer similar capabilities. The tool becomes the teammate. A system that can think, execute, and adjust in real time. The overhead of multitasking collapses. The cost of complex workflows drops. And suddenly, the scope of what one person, or one team, can accomplish expands dramatically. We’re not talking about faster emails. We’re talking about business units made of silicon.

Even in this first release, the results are remarkable. In OpenAI’s internal benchmarks, designed to evaluate performance on economically valuable knowledge-work tasks, ChatGPT Agent’s output was comparable to or better than that of humans in roughly half the cases, across a range of task durations. And yet, this is still the weakest version we’ll ever use. It’s the floor, not the ceiling.

This is the beginning of the next leap. OpenAI just showed us the blueprint. Combine models that reason with models that act. Equip them with tools like a browser, a file system, a visual interface, a terminal, deep research, image creation and full memory. Make them think before doing. Let them adapt mid-task. This is not a productivity boost. It’s the foundation for AI that operates with human-level agency. When you give an agent goals instead of instructions, and it delivers, the game changes. Every knowledge-heavy function now has a scalable partner. No new hire needed. Just a new prompt.
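
To make the blueprint concrete, here is a minimal sketch of the reason–act loop that agentic systems like this are built around: the model proposes a step, a tool executes it, and the observation feeds back so the model can adapt mid-task. This is an illustrative sketch only, not OpenAI’s implementation; the planner, tool names, and action format are all hypothetical.

```python
# Hypothetical sketch of an agent's reason-act-observe loop.
# The plan format ({"tool": ..., "input": ...}) is invented for illustration.

def run_agent(goal, tools, plan_step, max_steps=10):
    """Loop: reason about the goal, pick a tool, act, observe, repeat."""
    history = []
    for _ in range(max_steps):
        # 1. Reason: the model proposes the next action given goal + history.
        action = plan_step(goal, history)
        if action["tool"] == "finish":
            return action["result"], history
        # 2. Act: execute the chosen tool (browser, terminal, file system...).
        observation = tools[action["tool"]](action["input"])
        # 3. Observe: feed the result back so the model can adapt mid-task.
        history.append((action, observation))
    raise RuntimeError("step budget exhausted")

# Stub "browser" tool and a scripted planner, standing in for a real model.
def fake_browser(query):
    return f"top result for {query!r}"

def scripted_planner(goal, history):
    if not history:
        return {"tool": "browser", "input": goal}
    return {"tool": "finish", "result": history[-1][1]}

result, steps = run_agent("EU AI Act summary",
                          {"browser": fake_browser}, scripted_planner)
```

The safeguards described above slot naturally into this loop: before executing a sensitive action, the loop pauses and hands control back to the human instead of calling the tool.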

For businesses, this is execution at scale. Imagine a small team with the operational muscle of a much larger one. AI agents handle research, planning, documentation, synthesis, analysis, and more. They work across departments. They integrate with APIs. They create value from data that used to hide. And they don’t need onboarding. This opens a competitive edge not based on size, but on how well you orchestrate your agent stack.

This demands new thinking. What work stays human? What gets handed off? What’s the protocol when an AI makes a call, and gets it wrong? How do we track accountability in a multi-agent world? Do you train staff to write better prompts… or to manage a team of AI agents?

GPT-5 will take it further. This current Agent Mode will become its chain of thought. It won’t use tools as plugins. They’ll be part of its reasoning. Ask it to run a market analysis, build a presentation, find partners and send invites. It won’t ask how. It will just do it. The next wave of AI won’t assist. It’ll run operations. End to end. We’ve just passed the starting line. And business as usual won’t make it to the finish. I expect GPT-5 to launch between the end of August and the beginning of October.

And all of this is just the prologue. A lot has happened over the past few days. Here’s a quick summary:

  1. An AI model from OpenAI came in second at the World Coding Championship. It’s expected to be the open source model that OpenAI will release in the coming weeks. And at the same time, it might be the last time a human developer beats an AI model.

  2. OpenAI announced that an experimental version of their best reasoning model achieved gold medal-level performance at the world’s most prestigious math competition (IMO). Under the same time constraints as humans and without using tools. Why is this such a big deal? Because the model was trained as a general-purpose model, not specifically to do maths. In other words, it’s general intelligence. It exceeded the expectations of every expert. Even OpenAI’s own. Their first reasoning model, o1, thinks for seconds. Deep Research thinks for minutes. This one thinks for hours. A stripped-down version of this model is expected to power GPT-5.

  3. There are several benchmarks designed to measure AGI-level intelligence. Since Grok 4 hit almost 16% on ARC-AGI-2 (more on that below), it’s time for the third benchmark to roll out, and this one involves games. The idea: humans can quickly understand and finish these games; AI models can’t. There are three games. Within a day, an AI model had already completed game 1. Two more to go.

With Agent Mode, we’ve officially crossed level 3 out of the 5 AI levels I wrote about in the 16th edition of Digital Dips. Based on these announcements, we’re just months away from level 4. This should be global news. Every political leader, C-level executive or professional should be thinking about the impact, and what to do next.

The assistant era is ending. The agent era has begun.

Grok 4 achieves new milestones in AI benchmarks

TL;DR

Grok 4, the newest player from xAI, has raised the bar in AI performance by achieving a state-of-the-art score of 15.9% on the challenging ARC-AGI-2 test, nearly doubling the previous top score. In addition to acing hardcore reasoning exams, Grok 4 is equipped with features like a 256k context window and an efficient voice mode. However, while its benchmark achievements are impressive, some users question its effectiveness in real-world applications.

Read it yourself?

Sentiment

Grok 4 came with high expectations, thanks to its impressive benchmarking achievements and new features. However, while it excels in controlled testing environments, there are concerns about its practicality outside of benchmarks. Some users question its bias, pointing out how it might reflect personal perspectives rather than objectivity. Additionally, its performance in real-world applications has been underwhelming for some, suggesting that it might be more of a showpiece of technology than a practical tool for everyday tasks.

My thoughts

Grok has made headlines over the past few weeks, and not always for the right reasons. Most recently, they launched AI Companions within Grok 4, including ones that actively seek out sexual relationships. Let’s talk about the model. Grok 4 stands out with its unique method where multiple agents collaborate on tasks before deducing the best solution. This distributed approach suggests the future of AI might lean towards leveraging compute power over innate capabilities. But as AI models surpass existing benchmarks, the ultimate test remains reality itself: can they perform in real-world scenarios like inventing technology or solving practical problems?

For those developing consumer AI products, it's essential to remember that most people are only familiar with names like ChatGPT. The technicalities behind the model are less relevant to them. This means that even as models like Grok 4 push the boundaries, early adopters in application fields may have a wide margin for innovation as public awareness grows. As Grok 4 continues to evolve, it's crucial to bridge the gap between these groundbreaking benchmarks and tangible real-world applications.

But at the same time, I have doubts about whether this model is acting more like a spokesperson for Elon himself, given how often Grok cites Elon’s opinions in its answers. I think that’s risky, especially coming from someone aiming to be first in building AGI. That said, we don’t really know why Grok does this. Not even the team behind Grok fully understands it. The team has stated they’ve addressed the issue.

Moonshot releases Kimi K2, the game-changing open-source AI model

TL;DR

Kimi K2 marks a new era in open-source AI, excelling at coding and agentic tasks at a price up to 90% lower than competitors. With its 1 trillion total parameters, it leverages a Mixture of Experts (MoE) architecture to outperform larger models in efficiency and accuracy, even overtaking giants like Meta's Llama and Claude Opus 4 on coding benchmarks. Despite lacking multimodal capabilities, Kimi K2 is an impressive step toward more accessible advanced agentic intelligence.

Sentiment

Its ability to outperform models like Meta's Llama with significantly fewer resources has left industry insiders and tech enthusiasts quite impressed. Developed with a fraction of the resources used by formidable competitors like Meta, Kimi K2 still manages to outperform others on coding benchmarks, specifically Claude Opus 4, and does so at a significantly reduced cost. Built on an open-source MoE architecture similar to DeepSeek-V3 but with 50% more parameters, Kimi K2 leverages a network of experts to offer efficient and accurate solutions by routing inputs through a specialized network suited for the task.
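
The routing idea behind MoE can be sketched in a few lines: a gating network scores every expert, only the top-k experts actually run, so active compute stays a small fraction of the total parameter count. This is a toy illustration of the concept, not Kimi K2’s actual architecture; the gate scores and expert functions are placeholders.

```python
import math

# Toy sketch of Mixture-of-Experts routing. A real MoE layer learns the
# gate and experts; here both are hard-coded stand-ins for illustration.

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k=2):
    """Route input x through only the k highest-scoring experts."""
    weights = softmax(gate_scores)
    top = sorted(range(len(experts)), key=lambda i: weights[i], reverse=True)[:k]
    norm = sum(weights[i] for i in top)  # renormalize over chosen experts
    return sum(weights[i] / norm * experts[i](x) for i in top)

# Four "experts", each a trivial stand-in for a specialized sub-network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x / 2]
gate_scores = [0.1, 2.0, 0.2, 1.5]  # pretend the gate prefers experts 1 and 3
y = moe_forward(10.0, experts, gate_scores, k=2)
```

With k=2 out of four experts, only half the expert networks run per token, which is how a trillion-parameter model like Kimi K2 keeps inference cost low.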

My thoughts

Moonshot AI gained an edge with two strategic insights: acknowledging that human data is a finite resource and training Kimi K2 with synthetic data to simulate real-world tool use scenarios. By employing a self-judging reinforcement learning system, the model can act as its own critic, refining its coding tasks. This development not only proves economically efficient, reducing training costs compared to past Meta projects, but it also highlights shifting strategies in AI development. Mark Zuckerberg's pivot towards ASI reveals changing priorities amidst competitive pressures, while delays from OpenAI hint at possible safety concerns or potential competition with models like Kimi K2. Meanwhile, China’s ability to develop competitive AI models with fewer resources continues to make Silicon Valley boardrooms sweat. The capabilities and low cost of Kimi K2's API underscore its potential to democratize advanced AI, making it accessible to a broader audience without the hefty price tag.
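
The "model as its own critic" idea can be sketched as a propose-and-judge loop: sample several candidate solutions, score each with a critic, keep the best. This is a heavily simplified illustration of self-judging refinement in general, not Moonshot’s training pipeline; the generator and critic below are toy stand-ins.

```python
# Hypothetical sketch of self-judged candidate selection: the same system
# that generates solutions also scores them and keeps the winner.

def best_of_n(prompt, generate, critique, n=4):
    """Sample n candidates and return the one the critic scores highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=critique)

# Toy generator and critic: candidates are strings, shorter is "better".
def toy_generate(prompt):
    return prompt + "!"

def toy_critic(candidate):
    return -len(candidate)  # the critic prefers terser output

best = best_of_n("fix the bug", toy_generate, toy_critic, n=3)
```

In a real system the critic’s scores would also feed back as a reinforcement signal, so the generator improves rather than just filtering its own output.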

More byte-sized breakthroughs:

  • Perplexity introduces Comet, the agentic browser of the future
    Perplexity has launched Comet, an agentic browser that redefines web interaction. Imagine a browser that not only navigates the web but also executes commands on your behalf. With Comet, you're one step closer to a truly hands-free online experience, transforming how you interact with today's internet. It’s designed for the modern internet user, offering seamless functionality that empowers efficiency and productivity.

  • Midjourney introduces seamless image-to-video animations
    Midjourney has launched its AI video model, making it easier than ever to transform your generated images into dynamic videos. With just a click, users can animate their images using either automatic or manual settings, allowing for both subtle and lively motion. This innovation marks a significant step towards real-time AI simulations, offering a fun, affordable way to explore video creation.

  • Discover tools that work with Claude
    Anthropic has launched a brand-new directory of tools designed to integrate seamlessly with Claude, transforming it from a simple virtual assistant into a comprehensive AI collaborator. With connectors to popular services like Notion, Canva, and Stripe, Claude now leverages the full context of your workspace, delivering more relevant and actionable responses.

A must-see webinar, podcast, or article that’s too good to miss.

The 2025 Marketing AI Report

AI adoption is booming, 60% of marketers now pilot or scale AI tools. Yet 68% still lack formal training, and 75% have no roadmap. The paradox? AI is seen as “critically important,” but most teams still fly blind. The 2025 Marketing AI Report shows a widening gap between marketers pushing ahead with AI and the companies meant to support them. AI is now critical to marketing success, but most teams still lack training, strategy, and support. This year’s report gives a sober look at progress, pressure points, and what’s still missing from most boardrooms.

A roundup of updates that are too cheesy to ignore. In total, I’ve packed over 130 news bites into this newsletter, the kind I think will interest a wide mix of professionals. I know you won’t read them all. But at least you’ll be up to speed on the past 8 weeks.

  • Mistral announces Agents API to enhance AI capabilities with code execution, image generation, web search, MCPs and more.

  • Mistral’s Le Chat unleashes deep research, voice mode, multilingual reasoning, and image editing for seamless productivity.

  • Mistral Small 3.2 debuts with sharper instruction execution and minimal repetition. Enhanced function calls cement its role as a robust AI choice.

  • Mistral debuts Mistral Code, an AI-powered vibe coding client to rival GitHub Copilot.

  • Mistral releases Codestral Embed, a state-of-the-art model for high-performance code retrieval and semantic understanding.

  • Mistral launches Magistral as the first reasoning model built for domain-specific, transparent, and multilingual intelligence.

  • Mistral launches Mistral Compute, an AI infrastructure project to keep nations, enterprises, and research labs ahead in global AI innovation.

  • Mistral releases Voxtral, a family of cutting-edge open-source speech models for seamless multilingual interaction.

  • Mistral releases Devstral models, boosting agentic coding with cost-efficient performance.

  • Anthropic’s new Claude voice mode beta on mobile lets you chat and manage tasks hands-free.

  • Anthropic’s Claude Code connects seamlessly to remote MCP servers.
    Skip local setup to easily pull context right into your workspace.

  • Anthropic launches a dedicated space for your Claude artifacts and embeds AI directly into your creations.

  • Opera releases Neon, an AI-powered browser that codes and performs tasks like shopping and form-filling.

  • Databricks One is a new, streamlined experience that gives business users fast and secure access to data and AI insights.

  • Databricks’ Agent Bricks auto-optimizes agent systems and creates domain-specific datasets for your tasks.

  • LleverageAI eases workflow with instant, conversational automation. Create automations without coding, integrating 2,000+ tools in seconds.

  • Telegram partners with xAI to bring Grok AI to over a billion users, backed by $300M in funding and shared subscription revenue.

  • Synthesia lets you become an AI avatar by recording a quick video anytime, anywhere.

  • Synthesia adds clickable buttons to videos, creating personalized and interactive experiences.

  • Odyssey introduces AI video you can watch and interact with, in real time.
    They call it interactive video.

  • Factory unleashes Droids: autonomous agents crafting software with your engineering tools.

  • Retool launches Agents, AI coworkers that connect to your databases and APIs to work autonomously in your company.

  • Krea 1 launches as Krea’s first image model, delivering superior aesthetic control, style references, and custom training support.

  • Krea adds Hunyuan3D-2.1 to its suite; the new model produces high-fidelity 3D with PBR textures, making generations look more photorealistic.

  • Krea adds support for start frames in Veo 3, letting you create videos from any image.

  • Perplexity on WhatsApp lets you schedule daily tasks, reminders, and more using simple English.

  • Perplexity expands agentic shopping with broader merchant coverage and smarter product discovery.

  • Perplexity launches Labs, a powerful tool for complex tasks, from reports to dynamic dashboards.

  • ElevenLabs’ Batch Calling for Conversational AI automates outreach by initiating hundreds of personalized calls simultaneously.

  • ElevenLabs v3 launches as the most expressive Text to Speech model, supporting 70+ languages and dynamic audio tags.

  • Elevenlabs’ ElevenReader streams AI-voiced content right from your browser.

  • ElevenLabs Conversational AI now supports MCP, enabling instant connections to Salesforce, HubSpot, Gmail, and more.

  • ElevenLabs Voice Design v3 lets you create any voice imaginable with enhanced quality and expressive range.

  • 11ai launches a voice-first AI assistant that integrates MCP for seamless task management.

  • Chatterbox, a state-of-the-art TTS model that outperformed ElevenLabs (v2) in blind tests, is now open source under the MIT license.

  • Kling 2.1 debuts as an image-to-video model, with 720p and 1080p modes, promising advanced video creation for creatives.

  • Tencent and Tencent Music launch HunyuanVideo-Avatar, a model that animates photos with speech and singing in multi-character scenes.

  • Runway’s Layout Sketch for Gen-4 References lets you start fresh or build on existing images with new elements and compositions.

  • Runway’s Act-Two elevates motion capture with improved tracking and generation quality for enterprise and creative pros.

  • Runway’s Chat Mode unlocks Gen-4 Images, Videos, and References in a single conversational interface.

  • Runway unveils Game Worlds, turning images into mood boards with References.

  • Hume EVI 3 launches as a speech-language model that understands and generates any voice with tone, rhythm, timbre, and style like never before.

  • Hume's EVI 3 can also clone your voice and style with its latest TTS and speech-to-speech models.

  • DeepSeek-R1-0528 launches with better benchmarks, improved front-end, fewer hallucinations, and JSON plus function calling support.

  • PlayAI launches Speech Editor to edit audio like text documents with AI-powered precision—no re-recording needed.

  • Replicate launches Kontext Chat. It lets you edit images simply by chatting with them.

  • PlayDiffusion launches as an open-source diffusion model for precise AI speech editing, enabling fine-grained modification of existing audio.

  • Gamma lets you create social content in square, portrait, or story formats, no resizing needed, hassle-free. Now with AI image editing.

  • Gamma now exports directly to Google Slides, letting your designs flow seamlessly into your Google workspace.

  • Manus launches Slides, generating stunning, structured presentations from a single prompt. Edit, export, and share your decks effortlessly.

  • Manus launches video generation that turns prompts into full stories, scenes, and animations in minutes.

  • Manus now integrates with OneDrive, allowing direct upload and export.
    Just click the upload button and connect to OneDrive to get started!

  • Manus supercharges image search with rich, accurate visuals that fuel creative storytelling across websites, slides, and more.

  • Manus automates presentation decks, delivering professional layouts, polished visuals, and clear content in minutes.

  • Manus turns messy data into clean, interactive charts. Upload your raw dataset, describe what you need, and let Manus do the heavy lifting.

  • Manus integrates Veo3 to deliver sharper visuals and cinematic storytelling with natural audio sync.

  • Manus Scheduled Task automates your daily routines, from 7 AM market reports to weekly surveys, so you can focus on what matters.

  • Manus Cloud Browser saves your login status with consent for seamless sessions and uninterrupted automation.

  • Manus introduces Audio to transform reports into podcasts, enable hands-free reviews, and read scripts aloud for seamless multitasking.

  • Captions’ Mirage Studio launches expressive video creation with lifelike actors that laugh, sing, and rap on your command.

  • Genspark launches AI Secretary to streamline Gmail, Calendar, Drive & Notion with smart email management and automated meeting scheduling.

  • Genspark launches AI Browser next week, blending smart browsing with seamless AI integration.

  • Genspark launches Parallel Tool Calls, delivering up to 10X faster AI workflows across research, image, video, and sheets.

  • Genspark launches AI Docs, the first full-agentic AI document creator with instant drafting and free designer templates.

  • Genspark launches AI Pods to turn any content into a professional podcast with a single prompt.

  • LumaLabs Modify Video lets you reimagine any clip with director-level control over style, character, and setting.

  • HeyGen launches Video Agent, the first Creative Operating System that crafts scripts, casts actors, and edits your videos automatically.

  • HeyGen AI Studio debuts, redefining video creation with full avatar control over voice, movement, and expression.

  • HeyGen launches Product Placement for creating scroll-stopping UGC ads with hyper-realistic avatars and no hassle.

  • OpenAI’s o3-pro is now available to all Pro users in ChatGPT and the API.

  • OpenAI’s ChatGPT now connects to more internal sources with real-time context, retaining user-level permissions.

  • OpenAI reveals "Hive," a Recording feature that transcribes meetings into structured Canvas docs, launching soon for business accounts.

  • OpenAI Projects in ChatGPT get boosted with deep research, voice mode, better memory, file uploads, and mobile model selector.

  • OpenAI Canvas now lets you download your work as PDF, docx, markdown, or code files like .py and .js.

  • OpenAI Codex introduces Best-of-N, generating multiple responses at once to help users find the best solution faster.

  • OpenAI is developing collaborative document and chat features in ChatGPT to challenge Google and Microsoft’s dominance.

  • OpenAI is set to launch an AI-powered web browser to directly challenge Chrome and transform how users browse the internet.

  • OpenAI prepares to launch its new open model next week, reportedly similar to o3 mini.

  • OpenAI plans in-chat shopping for ChatGPT, partnering with Shopify to monetize free users.

  • OpenAI now edits faces in high fidelity, preserving every detail while adding mustaches and expressions within ChatGPT and the API.

  • Bland TTS launches the first voice AI to perfectly mimic human speech from a single audio sample.

  • Microsoft Copilot is your new shopping sidekick, delivering smarter finds, better deals, and personalized picks, all in one spot.

  • Microsoft previews Copilot 3D, transforming images into 3D models, for a new dimensional leap.

  • Flowith’s Agent Neo lets you render friends, build your pet’s dream house, and explore it in AR and Vision Pro.

  • Google NotebookLM upgrades for businesses & educators with Mind Maps, Discover Sources, and Audio Overviews in 80+ languages.

  • Google AI Edge Gallery lets users download and run AI models on phones, away from the cloud.

  • Google Search Live with voice launches in AI Mode, letting you chat and get instant audio answers with on-screen links.

  • Google Search now uses Gemini 2.5 Pro and Deep Search in AI Mode, plus it can now use AI to call local businesses to check pricing on your behalf.

  • Google’s Gemini CLI lands in developer terminals, boosting coding efficiency with open-source AI.

  • Google’s Veo 3 debuts in Gemini API, delivering advanced video with optional audio at scalable rates.

  • Google Gemini 2.5 Pro offers a sneak peek at its smartest AI yet before full rollout.

  • Google’s Gemini app launches scheduled actions to automate tasks and deliver personalized updates.

  • Google Gemini’s new feature transforms your photos into dynamic, sound-enhanced videos.

  • Google Sheets now supports Gemini for instant text generation, summarization, and categorization directly from a cell.

  • Google unveils Gemini Embedding for 100+ languages in the Gemini API, now live at $0.15/M tokens.

  • Google’s DeepMind welcomes the Windsurf team to enhance AI innovation with new talent.

  • Cognition acquires Windsurf, integrating agentic IDE with Devin’s codebase savvy.

  • Higgsfield launches Speak, the fastest way to create motion-driven talking videos with cinematic voice and emotion.

  • Higgsfield Canvas lets you paint products onto images with pixel-perfect precision.

  • Higgsfield Soul launches with 50+ fashion-grade presets for ultra-realistic photos.

  • Higgsfield Soul Inpaint adds pixel-perfect control to Soul’s signature style, letting you edit clothes, hair, and objects seamlessly.

  • Higgsfield launches Soul ID, delivering fully personalized characters with fashion-grade, high-aesthetic realism.

  • Higgsfield UGC Builder lets you create cinematic videos effortlessly with total scene control.

  • Move AI advances 3D animation with enhanced hand tracking, new desktop apps, and a precision motion editor in beta.

  • Granola launches File Uploads beta to analyze files and meetings in one place.

  • SkyReels V2 launches as the first open-source AI video tool, delivering cinematic content at under 10% of typical costs.

  • Tencent launches Hunyuan Game, the first AI engine for game production featuring instant character design, full AI art pipeline and real-time canvas.

  • Descript’s Underlord AI co-edits your videos, bringing seamless vibe editing to every project.

  • Firecrawl launches Fire Enrich as an open-source Clay alternative, auto-filling missing CSV data like decision makers and company size.

  • Firecrawl launches Firestarter as an open-source chatbot to crawl websites, train bots on multiple sources and auto-generate endpoints.

  • Firecrawl launches Fireplexity as an open source Perplexity clone delivering AI-powered answers with cited sources.

  • Firecrawl launches FireGEO as your open-source Semrush for AI, tracking website presence on AI search platforms.

  • Apple executives discuss potential bid for AI startup Perplexity AI to boost talent and technology.

  • Apple mulls Mistral acquisition to bolster its AI capabilities and close the tech gap.

  • Oakley Meta Glasses launch, designed to amplify human potential through immersive AR experiences.

  • Meta takes a nearly 3% stake, around €3 billion ($3.5 billion), in the parent company of Ray‑Ban and Oakley.

  • Meta’s WhatsApp launches new features in the Updates tab like channel subscriptions and ads.

  • Meta adds AI-powered summaries to WhatsApp, making chats easier to follow.

  • Meta acquires Play AI to enhance its AI-driven voice technology.

  • Meta’s building AI infrastructure on a Manhattan-sized scale, their 5GW AI supercluster goes live in 2026.

  • Meta wants access to unshared photos in your camera roll for AI-generated suggestions and themes.

  • Meta’s Project Omni empowers AI bots on AI Studio to proactively follow up, initiating context-aware, personality-driven messages.

  • Salesforce’s Agentforce 3.0 + MCP connects Agents to any system or data source securely and at scale.

  • Airtable relaunches as an AI-native app platform, blending no-code business apps with scalable automation powered by intelligent agents.

  • Harvey launches Deep Research for legal, powered by OpenAI's API and Harvey’s legal reasoning to support top law firms and enterprises.

  • Xiaomi unveils AI glasses with 12MP camera, Snapdragon AR1 chip, and 8.6-hour battery.

  • Skywork Deep Research delivers visual, structured reports with real data, citations, and insights, no more boring AI summaries.

  • Lovable's Agent Mode, now in beta, empowers autonomous thinking, planning, and action for enhanced project success.

  • Shopify’s Storefront MCP connects directly to OpenAI Responses API for seamless product search, cart management, and checkout link creation.

  • Cloudflare and top publishers block AI crawlers by default, requiring payment for content use.

  • Morphic’s One-shot Character Model lets you train a full character from just one image.

  • Proactor launches the world’s first proactive AI agent, delivering real-time transcription, live summaries, and task execution before you even ask.

  • Freepik AI suite now features Text to Speech with natural voices in 30+ languages. Bring your stories to life with authentic sound.

  • Orchids launches the first AI tool to chat your way to apps and websites that don’t look AI-made.

  • Moonvalley launches the world’s first fully licensed AI video model for professional filmmaking.

  • Hugging Face launches a $299 robot poised to shake up the robotics industry.

  • MiniMax Hailuo 02 brings 1080p video generation at record-low cost. It nails complex scenes like gymnastics with stunning realism.

  • MiniMax teases instant video creation with vibe-first AI Video Agent.

    No tools. No timeline. Just talk to the agent and get a clean video back.

  • MiniMax Agent turns prompts into full product builds. It handles complex tasks, breaks down steps, and delivers finished work, no code needed.

  • Amazon teams up with Anthropic to launch an AI Agent marketplace, boosting opportunities for startups and AWS customers.

  • Amazon launches Kiro as their agentic IDE, streamlining spec-driven development and automating tedious coding tasks.

  • Alibaba’s Qwen Chat releases Desktop version with MCP support for smarter management.

  • LTX Studio unlocks 60-second AI video generation, offering speed and affordability on consumer GPUs.

  • Black Forest Labs’ FLUX.1 Kontext debuts with generative flow matching for seamless image creation and editing.

  • Suno 4.5+ unlocks new ways to create: swap vocals, flip instrumentals, or spark a song from any playlist.

How was your digital dip in this edition?

You're still here? Let me know your opinion about this dip!


This was it. Our thirty-seventh digital dip together. It might seem like a lot, but remember: this wasn't even everything that happened in the past few weeks. This was just a fraction.

If you’ve got questions about your future, your team’s, or your organization’s, I’d be happy to swing by and help you map things out. To prepare for what’s coming. Because that’s what this moment asks of us. A new world doesn’t need old thinking. Let’s figure out your next step, together. Drop me an e-mail.

Looking forward to what tomorrow brings! ▽

-Wesley