GPT-5.5, Claude 4.7 & AI Agents: The New Work OS Explained

Hi {{firstName|Futurist}},

April didn’t just feel fast. The last two weeks felt like everything hit at once. Not only four new models(!), but also new tools. New layers. New ways of working. We saw GPT-5.5 and Claude Opus 4.7 raise the bar. ChatGPT Images 2.0 made design sharper and cleaner. But the real fun sat in the leftovers. Microsoft turned Copilot into a full inbox operator. Google pushed Gemini deeper into Workspace. OpenAI rolled out Workspace Agents. Anthropic gave Claude memory and managed agents. Alibaba and Qwen kept open models moving. April was one of those months where almost every update could have been a headline on its own. All those updates point to one thing: AI is becoming the operating layer of work.

On today’s menu: Four new AI models from Anthropic, OpenAI, Moonshot and DeepSeek. A clear US vs China split. ChatGPT Images 2.0, and a wave of agent upgrades. Plus a leftovers section so packed it could run the show on its own.

So grab your favorite snack, settle in, and let's dip into what's cooking. No time to read? Listen to this episode of Digital Dips on Spotify and stay updated while you’re on the move. The link to the podcast is only available to subscribers. If you haven’t subscribed already, I recommend doing so.

🍟 Crispy bites

Fresh tech nuggets. Short, sharp, snackable.

US labs are moving from model upgrades to platform control

TL;DR

OpenAI and Anthropic both moved their top models forward, but the bigger story is what sits around the models. GPT-5.5 pushes harder into agentic coding, computer use, research, documents, spreadsheets, and long-context work. Claude Opus 4.7 improves coding reliability, high-resolution vision, memory, review workflows, and safer cyber use. OpenAI raises the price of intelligence while building ChatGPT, Codex, and browser work into one larger work platform. Anthropic keeps Claude Opus 4.7 at the same price as Opus 4.6, but openly shows that its stronger Mythos Preview remains limited because of cyber risk. That creates a clean contrast with China’s model releases. The US race is moving toward control, access, safety gates, and margin. China is pushing open models and cost pressure from the other side.

Anthropic / OpenAI

Why this matters

April 2026 brings GPT-5.5 from OpenAI and Claude Opus 4.7 from Anthropic within days of each other.
Anthropic released Claude Opus 4.7 with the same API pricing as Opus 4.6, $5 per million input tokens and $25 per million output tokens.
Claude Opus 4.7 adds stronger long-running coding, higher-resolution vision, better instruction following, file-based memory and a new xhigh effort level.
Anthropic keeps Claude Mythos Preview limited because of cyber risk, while Opus 4.7 becomes the safer public release with new cyber controls.
GPT-5.5 rolls out to ChatGPT, Codex, Microsoft Foundry, and the API, with API pricing listed at $5 per million input tokens and $30 per million output tokens.
GPT-5.5 Pro is priced much higher for API use, at $30 per million input tokens and $180 per million output tokens.
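
To make the price gap concrete, here is a rough sketch in Python that applies the listed per-million-token prices to one workload. The workload size (2 million input tokens, 500,000 output tokens) and the function name job_cost are illustrative assumptions, not figures from either lab, and real bills may differ because of caching, batching, or tiered discounts.

```python
# Listed API prices in USD per million tokens (input, output), as quoted above.
PRICES = {
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.5 Pro": (30.00, 180.00),
}


def job_cost(model, input_tokens, output_tokens):
    """Return the list-price cost in USD for one job on the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens / 1_000_000) * input_price + (
        output_tokens / 1_000_000
    ) * output_price


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for model in PRICES:
        print(f"{model}: ${job_cost(model, 2_000_000, 500_000):.2f}")
```

On this assumed workload the two flagship models land within a few dollars of each other, while GPT-5.5 Pro costs six times as much as GPT-5.5, since both its input and output prices are six times higher.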

My thoughts

The US model race now feels less like a contest over chat quality and more like a land grab for the operating layer of work. Let me explain why this matters. OpenAI is not only selling a smarter model; it is folding ChatGPT, Codex, browsing, documents, spreadsheets and software control into one place. That looks very much like the start of a superapp for work.

Anthropic is playing a different card. It is saying, we have stronger models than the one we ship, but we will not release everything at once. That is rare. OpenAI is pressing on distribution and pricing power. Anthropic is pressing on trust and controlled access. Across the water, China is making the same type of agent capability cheaper and more open. The US lead still looks strong, but the business model is starting to matter as much as the model itself.

The pricing on GPT-5.5 tells the entire story. GPT-5 launched at $0.63 per million input tokens, GPT-5.4 moved to $2.50, and GPT-5.5 now sits at $5.00. That is roughly an 8x increase in about eight months. And this happens while inference costs are supposed to be falling. That is not a small detail. That is margin expansion at software scale, wrapped inside model progress. OpenAI is turning model quality into platform gravity. Every agent built on top of GPT-5.5 pays rent to the same company building ChatGPT, Codex, and browser-based work into one larger product.

Anthropic takes the opposite route. It shows the ceiling with Mythos Preview, then ships the safer model and explains why. That level of transparency is unusual. Put both together and the US story becomes very clear: the next race is about who controls access to the most capable work systems, and who gets to decide which capabilities reach the market.
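
The 8x figure is plain arithmetic on the input prices quoted above. A minimal Python sketch, where the dictionary and function names are illustrative and the prices are the ones stated in this issue:

```python
# Input prices in USD per million tokens, as quoted in this issue.
INPUT_PRICE = {"GPT-5": 0.63, "GPT-5.4": 2.50, "GPT-5.5": 5.00}


def price_multiple(old_model, new_model):
    """Return how many times more expensive new_model input is than old_model input."""
    return INPUT_PRICE[new_model] / INPUT_PRICE[old_model]


if __name__ == "__main__":
    print(f"GPT-5 -> GPT-5.5:   {price_multiple('GPT-5', 'GPT-5.5'):.2f}x")
    print(f"GPT-5 -> GPT-5.4:   {price_multiple('GPT-5', 'GPT-5.4'):.2f}x")
    print(f"GPT-5.4 -> GPT-5.5: {price_multiple('GPT-5.4', 'GPT-5.5'):.2f}x")
```

The overall multiple comes out just under 8, and it splits into two steps: about 4x from GPT-5 to GPT-5.4, then a clean doubling to GPT-5.5.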

China’s open models put pressure on the US AI stack

TL;DR

China’s latest model releases are aimed straight at the cost and control layer of AI. Kimi K2.6 is open-sourced and pushes hard into coding, long-running agents, design work, tool use, and agent swarms. DeepSeek-V4 Preview adds open weights, a default 1M context length, and two model tiers: Pro for stronger work and Flash for cheaper, faster use. The important point is not whether every benchmark beats GPT-5.5 or Claude Opus 4.7. It is that Chinese labs are making “good enough for serious work” cheaper, more portable, and more open. That ties directly back to the US story. While OpenAI raises prices and Anthropic gates its strongest model, China is putting pressure on the market from below.

Moonshot / DeepSeek

Why this matters

Kimi K2.6 is open-sourced and available through Kimi.com, the Kimi app, the API, and Kimi Code.
K2.6 focuses on long-horizon coding, tool use, visual agents, design generation, full-stack workflows, proactive agents, and agent swarms.
Kimi reports a 12-hour coding run with more than 4,000 tool calls, improving local model inference from around 15 to 193 tokens per second.
Kimi also reports a 13-hour overhaul of an 8-year-old financial matching engine, changing more than 4,000 lines of code and lifting throughput sharply.
Agent Swarm scales from K2.5’s 100 sub-agents and 1,500 steps to K2.6’s 300 sub-agents and 4,000 coordinated steps.
DeepSeek-V4 ships with open weights, a Pro model with 1.6T total parameters and a Flash model with 284B total parameters.
DeepSeek-V4 makes one million tokens the default context length and uses token compression plus sparse attention to cut compute and memory use.

My thoughts

China is not trying to copy the US model release playbook. China’s strategy feels like pressure applied at the plumbing layer. Kimi and DeepSeek are not only chasing chatbot quality. They are making long context, agentic coding, tool calling, and multi-agent execution cheaper to run and easier to move around. That matters because the US story above shows the other side of the same coin: better closed models, tighter access, higher prices, and more careful release decisions.

Open models do not have to be the absolute best on every test to change the market. They only need to be strong enough on the work that burns the most tokens: coding, research, long documents, tool-heavy tasks. The Kimi examples are especially interesting because they move beyond demos. Twelve-hour runs, five-day autonomous operations, reusable document skills, full presentations, spreadsheets, landing pages, and custom resumes. Messy work. Real work.

Set next to the US story, the pattern is obvious. US labs are building high-control platforms. Chinese labs are turning capability into a commodity faster than many expected. That tension will shape the next phase of AI more than another small benchmark win.

OpenAI launches ChatGPT Images 2.0

TL;DR

OpenAI's new ChatGPT Images 2.0 is now available, offering significantly improved image generation with accurate text rendering and cross-language capabilities. This model responds better to detailed instructions and can create visuals that align closely with user requests. It's rolling out to all users today, highlighting OpenAI's focus on precision and utility in AI-generated images. Expect a noticeable shift in how visual content is created and utilized across different languages and formats.

Read it yourself

Why this matters

ChatGPT Images 2.0 is available to all OpenAI users starting today.
It excels at rendering text and following detailed instructions.
It renders in-image text accurately across multiple languages.
The model draws on expanded visual and global knowledge for improved outputs.
It is better at creating sophisticated, layout-heavy visuals.

My thoughts

OpenAI just dropped ChatGPT Images 2.0, and suddenly your LinkedIn title might need an update. One prompt. Clean text. Real faces. No six fingers. No three-day design loops. It feels like Photoshop, Midjourney and a full design team compressed into a single sentence. Images 2.0 shows a clear trend: AI models are increasingly proficient at tasks requiring high precision, like text rendering in images. This pushes us to rethink how automated tools can handle detailed, nuanced work traditionally thought beyond their scope. The cross-language capability is a huge leap in making AI tools more universally applicable, removing language barriers in visual content. In the end, it’s not about what the tool can do. It’s about what you decide is worth making. Everyone can create. Few can choose. And that’s where the real gap still lives.

🧀 Cheesy pick

A cheesy selection of three tools and one tasty rabbit hole.

Basedash turns your data into answers in plain language.
Rocket New turns your idea into a live project with a single command.
Superset lets AI agents build and ship your product from a single prompt.
Bonus: The real AI challenge is not tech, but redesigning how we work and lead.

🍱 Leftovers

A roundup of updates that are too cheesy to ignore.

MiniMax M2.7, now open source, aces SWE-Pro and Terminal Bench 2—grab it on Hugging Face!
HeyGen launches CLI, empowering your AI agent to create and deliver videos right from the terminal.
HeyGen’s HyperFrames goes open source, transforming HTML into MP4 with agent-native magic.
HeyGen Skills crafts a lasting avatar of you for every future video, ensuring your message hits home.
HeyGen’s HyperFrames rolls out editable Timeline for seamless video updates directly in preview mode.
Microsoft Copilot turns your inbox into a taskmaster by completing to-dos with a forwarded email.
Microsoft Word introduces Copilot to track changes and comment with your enterprise context.
Microsoft Power Apps supercharges your workflow with Copilot, app skills, and agents integration.
Microsoft SharePoint unveils AI-driven Skills for end user customization and best practice sharing.
Microsoft Copilot Cowork rolls out to streamline your Microsoft 365 workflows with new coordination and control features.
Microsoft Copilot transforms missed meetings into concise video recaps within Copilot Chat.
Microsoft Copilot activates Agent Mode as the default workspace in Word, Excel, and PowerPoint, enhancing your document-driven tasks.
Microsoft Outlook’s new Agent Mode lets Copilot manage your inbox and calendar stress-free.
Microsoft unwraps VibeVoice ASR: a no-context-loss, speaker-aware audio transcription model that masters 50+ languages for free.
Microsoft Copilot becomes your inbox and calendar assistant, streamlining Outlook tasks.
Anthropic’s Claude Code introduces Routines to automate tasks with scheduled prompts and API triggers.
Anthropic’s Claude Cowork adds Dispatch, which lets you start a conversation on your desktop and finish it on your phone.
Anthropic unveils Claude Design: chat with Claude to craft prototypes and presentations.
Anthropic’s Claude for Word enhances Pro and Max plans, teaming up with Opus 4.7 for seamless productivity.
Anthropic’s Cowork introduces Claude's live artifacts for dynamic dashboards and trackers integrated with your apps.
Anthropic’s Mythos AI model, designed for cybersecurity, faces unauthorized access challenges.
Anthropic’s Claude Code adds /ultrareview, which sends bug-hunting agents to sniff out issues before critical merges.
Anthropic’s Claude Managed Agents enter public beta with memory that learns from every session.
Runway lets a Character attend your video calls, so you don't have to.
Lovable integrates with Databricks, enabling anyone to create live data apps without technical expertise.
OpenAI unveils GPT‑5.4‑Cyber, offering advanced cyber capabilities for top-tier security pros.
OpenAI’s Codex evolves: now multitasking on Mac, connecting apps, crafting images, and remembering your preferences.
OpenAI launches GPT-Rosalind, a cutting-edge model advancing biology and medicine research.
OpenAI Chronicle enhances Codex memories, using screen context to keep you in the workflow zone.
OpenAI unveils a token-classification model to detect and mask PII in text.
OpenAI’s ChatGPT unveils Workspace Agents to tackle complex tasks and workflows across team tools.
OpenAI unveils ChatGPT for Clinicians and HealthBench Professional, revolutionizing clinical chat tasks.
OpenAI’s ChatGPT integrates with Google Sheets, enabling direct Q&A and updates within your spreadsheets.
OpenAI’s Codex Auto-review mode powers through lengthy tasks with fewer approvals, ensuring safer execution.
Google Chrome introduces 'Skills' to save and execute your favorite AI prompts with a single click.
Google’s Gemini 3.1 Flash TTS debuts with multi-language support and more expressive voices for enhanced audio experiences.
Google’s Gemini lands on Mac, turning desktop decluttering into a breeze with its new Swift app.
Google’s Gemini CLI introduces Subagents, empowering you to delegate tasks with expert precision.
Google Chrome's new AI Mode lets you compare pages side-by-side without tab switching.
Google’s Deep Research unveils Gemini API-powered agents for precise, transparent research with native charting and fully cited reports.
Google introduces Workspace Intelligence to unify and contextualize data across Docs, Sheets, and more for all Workspace users.
Google unveils Gemini Enterprise Agent Platform, streamlining AI agent management and optimization for businesses.
Google Workspace Intelligence debuts to unify app data with advanced reasoning from Gemini.
Google’s Gemini Embedding 2 debuts in API and Vertex AI, with enhanced stability for multimodal apps.
Google Cloud introduces Knowledge Catalog, a universal engine to boost complex task accuracy for enterprises.
Google’s Ask Gemini transforms your chat into a command center for seamless task management.
Google’s Gemini Enterprise unveils enhanced Agent Designer for transparent, natural-language workflow creation.
Midjourney’s V8.1 update brings back iconic aesthetics with native 2K HD rendering, now 3x faster and cheaper.
HCompany’s HoloTab unleashes AI-powered performance in your browser, outperforming major models at a fraction of the cost.
Kling AI Skill launches with Text/Image-to-Video, 4K Image Gen, and Agent compatibility, cutting dev costs for creatives.
Kling's Video 3.0 series now lets you create stunning 4K videos with a one-click feature.
OpenRouter integrates video generation with top models, enhancing your multi-modal creation toolkit.
OpenRouter’s Create-Agent-TUI lets you craft your own agent interface with customizable features and a sleek terminal UI.
Mistral launches Forge, a platform enabling enterprises to build custom AI models from scratch on their own data.
Mistral Workflows lets enterprises automate AI processes with production-grade reliability and durability.
Windsurf 2.0 centralizes agent management and keeps tasks flowing with Devin cloud delegation.
Tencent Hunyuan’s HY-World 2.0 open-sources multimodal magic for generating interactive 3D worlds from text, images, and videos.
Telegram’s Agentic Bots are now just 2 taps away, ready to integrate with your favorite dev services.
Alibaba-ATH unveils Happy Oyster, a real-time world-building tool; early access now open.
Alibaba’s Qwen3.6 debuts as a powerful open-source model with agentic coding and multimodal reasoning.
Alibaba’s Qwen-Image-2.0-Pro boosts image quality and multilingual text rendering, now ranking #9 worldwide for Text-to-Image.
Base44’s Superagents integrate with iMessage, letting you accomplish tasks directly in your chat.
Perplexity’s Personal Computer syncs your local files, native apps, and browser via the Perplexity Mac App.
Firecrawl introduces its web-agent: build open-source AI agents for web interaction and data scraping.
Zapier introduces a governance toolkit, granting enterprises precise AI control without stifling innovation.
Krea’s Seedance Effects revolutionizes your videos with one-click motion and style transformations.
Luma Labs’ Innovative Dreams revolutionizes filmmaking with Hybrid Production using AI and real-time collaboration.
LemonSlice-2.1 Flash powers the fastest interactive avatars with Modal and LiveKit synergy.
Salesforce launches Headless 360, transforming platforms into API-driven ecosystems accessible via Slack and voice commands.
Salesforce Headless 360 democratizes its platform with APIs, transforming how you build on Agentforce and Slack.
Exa Search boosts agentic search with frontier LLMs and 20x faster latencies.
Coinbase Agentic.market powers up the agentic economy by enabling seamless service discovery and integration.
Genspark Build unveils its AI-assisted website and app creation tool.
Devin empowers you to lead a squad of virtual Devins, each with its own infrastructure to tackle complex tasks simultaneously.
Odyssey revolutionizes simulation with Odyssey-2 Max, a breakthrough in real-time world interaction.
SpaceXAI teams up with CursorAI to craft top-notch coding and AI models, with a $60 billion acquisition option on the table.
Replit Security Agent offers rapid app security reviews with AI, cutting false positives by 90%.
Ideogram Custom Models lets you train image models to generate on-brand visuals tailored to your art direction.
Xiaomi’s MiMo-V2.5-Pro rivals top models with agentic prowess and 1,000+ tool task mastery.
Foundry's new Hosted agents provide each AI with its own enterprise-grade sandbox for seamless operation.
xAI’s Grok Voice Think Fast 1.0 conquers the Tau Voice Bench, excelling at complex, noisy workflows.
Higgsfield MCP unveils seamless content creation inside agent platforms like OpenClaw and Hermes Agent.
Pika Agents: your AI partners with a voice, face, and personality for seamless creation.
Replit Slides debuts AI-powered, beautifully designed presentations to upstage your old slides.

How’d this digital dip taste?

You made it to the bottom. Quick taste test before you go.

This was it. Our fifty-fifth digital dip together. Forward this to someone who still thinks AI is optional in their business stack.

If four new models are released within days, which one should you choose? Well, choosing is only the first step. Choosing the right platform is even more important. Closed or open? Control or flexibility? Price today or margin tomorrow? These are not tech decisions. They are business decisions. I help teams cut through the noise, pick the right stack, and set up AI in a way that actually fits their workflows, data, and ambitions. If you are rethinking your AI foundation, let’s build it properly.

Looking forward to what tomorrow brings! ▽

-Wesley
