AI in April 2026: The Compute Crunch and the Platform Wars

26 min read
#ai #agentic-ai

April was the month the bill arrived for the agent boom.

Agentic workflows sent token usage vertical, and both sides of the market felt it. On the consumer side, Uber spent its full-year AI budget in four months. On the provider side, GitHub paused new Copilot signups and Anthropic briefly pulled Claude Code from its $20 plan.

The labs shipped anyway. Anthropic released Opus 4.7, Mythos (gated to security partners only), Claude Design, Routines, a hosted agent runtime, and a desktop redesign. OpenAI dropped GPT-5.5, GPT-Rosalind for life sciences, GPT-5.4-Cyber, Image 2, a privacy filter, and a major Codex update. Google unveiled two TPU v8 chips at Cloud Next and rebranded Vertex AI as the Gemini Enterprise Agent Platform. Meta Superintelligence Labs shipped Muse Spark, its first model since the lab was formed.

The deals reshuffled the stack. OpenAI ended its Microsoft exclusivity and landed on AWS the next day. SpaceX took a $60 billion option on Cursor. China blocked Meta's $2B acquisition of Manus.

And the layoffs hit hard. Snap cut 1,000. Meta cut 8,000 to fund AI capex and is using employee keystrokes to train its agents.

Here is what happened.

The Compute Crunch

The bills came due.

GitHub pauses Copilot signups. From April 20, GitHub stopped new sign-ups for Copilot Pro, Pro+, and student plans. Opus models were dropped from Pro (kept on Pro+). GitHub said agent sessions were consuming far more compute than the plans were built for. Existing subscribers can cancel for a refund within a month.

The Claude Code A/B that escaped. On April 21, Anthropic's logged-out pricing page briefly showed Claude Code removed from the $20 Pro plan. The change was an internal test that leaked to production and was rolled back within hours. Pro users on Opus 4.7 were running Claude Code sessions up to 3x longer than on Opus 4.6.

GitHub's infrastructure is buckling. GitHub now processes 275 million commits per week, on track for 14 billion in 2026. Agent-driven pull requests jumped from 4 million in September to 17 million in March, a 325% rise. Five major incidents hit in the first two days of April. GitHub plans to move half its traffic to Azure Central US by July.

The pattern is industry-wide. Agentic workflows have outrun the compute supply, and the providers are responding by deprioritizing cheap individual plans and shifting capacity to existing and enterprise customers. Flat-rate plans were priced for a chat box. They are not built for an agent that runs for hours.
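To see how fast the math breaks, here is a toy break-even sketch. The burn rate and token price below are assumptions for illustration, not any provider's real numbers:

```python
# Invented illustrative rates: how fast an always-on agent outruns a flat plan.
flat_monthly_usd = 20.0
tokens_per_agent_hour = 300_000       # assumed burn for a busy coding agent
usd_per_mtok = 10.0                   # assumed blended token price

cost_per_hour = tokens_per_agent_hour / 1e6 * usd_per_mtok
breakeven_hours = flat_monthly_usd / cost_per_hour
print(f"${cost_per_hour:.2f}/agent-hour -> flat plan covered after "
      f"{breakeven_hours:.1f} agent-hours/month")
```

At these assumed rates, a $20 flat plan is underwater after roughly seven agent-hours a month. That is less than one working day of a single always-on agent, before parallelism.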

AI Labs

Anthropic

Anthropic shipped Opus 4.7 (its new flagship), Mythos (a more capable model above Opus, gated to security partners), Claude Design (a new product), Managed Agents (a hosted runtime), Routines (scheduled Claude Code jobs), and a redesigned Claude Code desktop app. All while cleaning up a leak of Claude Code's source.

Claude Code source leak. On March 31, Anthropic accidentally shipped the full source of Claude Code to public npm. A 59.8 MB source map inside version 2.1.88 exposed roughly 513,000 lines across 1,906 TypeScript files. The cause was a Bun bug that served source maps in production. Mirrors of the leaked source spread quickly (e.g. yasasbanukaofficial/claude-code), and a clean-room rewrite called claw-code hit 50,000 GitHub stars in two hours and has since crossed 100,000. The leak revealed 44 unshipped feature flags including a "KAIROS" autonomous daemon mode and an "undercover mode" prompt for public repos. Anthropic issued takedowns, then later said the takedowns were broader than intended.

Project Glasswing and Claude Mythos Preview. On April 7, Anthropic announced Project Glasswing, an effort to secure critical software using a previously unannounced model called Claude Mythos Preview. Mythos sits above Opus, with the largest gains in math, long-context reasoning, software engineering, and cybersecurity. Anthropic says it has found thousands of zero-days in major operating systems and browsers, including a 17-year-old FreeBSD RCE filed as CVE-2026-4747. Mythos is not generally available. Access is gated to about 40 critical-infrastructure organizations, including AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic committed up to $100M in usage credits and $4M in donations to open-source security groups.

Claude Managed Agents. On April 9, Anthropic shipped Managed Agents in public beta. It is a hosted runtime for long-horizon agent work: sandboxed code execution, checkpointing, credential management, scoped permissions, and end-to-end tracing. Pricing is standard token rates plus $0.08 per session-hour. Early customers include Notion, Rakuten, and Asana.

Claude Code desktop redesign and Routines. On April 14, Anthropic shipped a full redesign of the Claude Code desktop app for Mac and Windows. The new layout adds a Mission Control sidebar for parallel sessions filtered by status, an integrated terminal, an in-app file editor, a rebuilt diff viewer, and a side chat at Cmd+;. Routines launched in research preview alongside it. These are scheduled or event-triggered Claude Code jobs that run on Anthropic's web infrastructure, so tasks continue without your laptop open.

Claude Opus 4.7. Released April 16. 87.6% on SWE-Bench Verified (up from 80.8%). 64.3% on SWE-Bench Pro (up from 53.4%). 70% on CursorBench (up from 58%). Vision input lifted from 1,568px to 2,576px. Pricing unchanged at $5 / $25 per million input/output tokens. Available in Claude products, the API, Amazon Bedrock, Vertex AI, and Microsoft Foundry. Anthropic deliberately reduced Opus 4.7's offensive cyber capabilities and gated higher-risk security work behind a separate Cyber Verification Program. The "real" cyber model is Mythos, behind Glasswing.

Claude Design. On April 17, Anthropic launched Claude Design as a research preview from Anthropic Labs. It generates prototypes, slide decks, and one-pagers from natural-language prompts, powered by Opus 4.7. It can read your codebase and design files to apply an existing design system. Exports go to PDF, PPTX, HTML, Canva, or a handoff bundle for Claude Code. VentureBeat called it a Figma challenger.

Claude Code shipped daily. The Claude Code changelog ran from v2.1.89 on April 1 to v2.1.123 on April 29. Highlights: an xhigh effort tier and /effort slider, a non-interactive /ultrareview for CI, a 1-hour prompt cache, a /team-onboarding command, a /recap session summary, native binary CLI, Vim visual mode, custom themes, MCP auto-retry, PostToolUse hooks that can rewrite tool output, and a Linux subprocess sandbox. Claude Code is now updating itself almost every day.

Usage-based pricing for the enterprise. Anthropic moved Claude Enterprise from a flat ~$200/user/month plan with bundled tokens to a $20-per-seat fee plus usage-based compute for deployments above 150 seats. The deeper signal arrived early in the month, when Anthropic blocked Pro and Max subscribers from running their flat-rate plans through third-party agent frameworks like OpenClaw. Flat-rate Claude is no longer a workaround for unlimited tokens.

The Pentagon loss in February was meant to slow Anthropic. Instead the company shipped a frontier model, a tier above it, a new product line, a hosted runtime, and a desktop redesign in three weeks.

OpenAI

OpenAI shipped a major Codex update (turning it into a full developer workstation), GPT-5.5 (its new flagship), GPT-Rosalind (a vertical model for life sciences), GPT-5.4-Cyber (a defensive cyber variant), Workspace Agents (always-on AI teammates), Chronicle (a screen-watching memory feature for Codex), GPT Image 2, the OpenAI Privacy Filter, and a major AWS deal.

Codex for almost everything. On April 16, OpenAI shipped a major Codex update that turns it from a coding agent into a full developer workstation. Codex now supports computer use (clicking and typing in Mac apps), an in-app browser, multi-file and multi-terminal views, remote SSH devbox access, and 90+ new plugins (Atlassian Rovo, CircleCI, GitLab Issues, Microsoft Suite, Neon, Render). Codex hit 3 million weekly active users on April 8, up from 2 million the month before.

Chronicle. Announced April 20, Chronicle is a screen-watching memory feature for Codex. A sandboxed background agent periodically takes screenshots of your screen, sends them to OpenAI for OCR and visual analysis, and saves Markdown summaries locally so Codex always knows what you have been working on. It is an opt-in research preview, ChatGPT Pro only, macOS only, and not available in the EU, UK, or Switzerland. Raw screen captures sit unencrypted in a temp directory and auto-delete after six hours. Comparisons to Microsoft Recall followed within a day, along with prompt-injection concerns.

Workspace Agents. Launched April 22 in research preview. Codex-powered, always-on cloud agents that connect to Slack, Google Workspace, Salesforce, Notion, and Atlassian to run multi-step workflows like preparing reports, writing code, and replying to messages. Agents can be shared and improved across a team. Available in ChatGPT Business, Enterprise, Edu, and Teachers plans. Free until May 6, when credit-based pricing kicks in.

This is the AI teammates pattern shipping as default product. Custom GPTs were a content layer: prompts, files, and personas. Workspace Agents are an action layer. Each one has its own identity, access, and runtime. Every employee gets a fleet that lives in Slack and Salesforce alongside them.

GPT-5.4-Cyber. Released April 14. A defensive cyber variant of GPT-5.4 with lowered refusal boundaries for vetted security work and binary reverse-engineering capability. Restricted to OpenAI's Trusted Access for Cyber program, with $10M in API grants for vetted defenders. Partners include CrowdStrike, Cloudflare, and JPMorgan. OpenAI says it has helped fix 3,000+ vulnerabilities.

GPT-Rosalind. Released April 16. A frontier reasoning model for life sciences, named after Rosalind Franklin. Specialized for chemistry, protein engineering, genomics, sequence-to-function interpretation, and experimental planning. Available as research preview through Trusted Access. Partners include Amgen, Moderna, Allen Institute, and Thermo Fisher.

GPT Image 2 / ChatGPT Images 2.0. Launched April 21. Up to 4K output, accepts up to 16 reference images, renders multilingual text (including CJK) cleanly, and runs a reasoning step before generation for spatial layout. Took #1 across all Image Arena categories within 12 hours. Pricing ranges from $0.01 per image (low quality) up to $0.41 per image (high quality 4K).

OpenAI Privacy Filter. Released April 22 on Hugging Face under Apache 2.0. An open-weight on-device model that detects and redacts PII across eight categories. 1.5B total parameters, ~50M active for fast single-pass redaction. 96% F1 on industry benchmarks. OpenAI cautioned it should be a redaction aid, not a safety guarantee.
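The Privacy Filter itself is a model, but the redact-before-send workflow it slots into can be sketched with a toy stand-in. Everything below — the regexes, the category names, the `redact` helper — is illustrative, not OpenAI's API:

```python
import re

# Toy stand-in for an on-device PII redactor. A real model classifies spans;
# two regex categories are enough to show the redact-before-send workflow.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a [CATEGORY] placeholder."""
    for category, pattern in PATTERNS.items():
        text = pattern.sub(f"[{category}]", text)
    return text

prompt = "Contact jane.doe@example.com or +1 415 555 0100 about the invoice."
print(redact(prompt))
# A redaction aid, not a guarantee: novel formats slip through a static filter.
```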

GPT-5.5. Released April 23. Improvements in coding, computer use, and deep research. Rolled out to ChatGPT Plus, Pro, Business, and Enterprise, plus Codex. API and GPT-5.5 Pro followed the next day. OpenAI shipped 5.5 less than eight weeks after 5.4.

OpenAI on AWS. On April 28, one day after the Microsoft exclusivity ended, AWS and OpenAI announced an expanded partnership. Three offerings hit limited preview on Amazon Bedrock: OpenAI frontier models, Codex on Bedrock, and Bedrock Managed Agents powered by OpenAI. Managed Agents runs the OpenAI harness on Amazon Bedrock AgentCore. Each agent gets its own identity, action logs, and execution inside the customer's environment.

OpenAI is no longer just shipping models. It is shipping a different model for every job (cyber, life sciences, image, privacy, frontier). And the harness around each one (the app, the tools, the memory, the runtime) now matters more than the model itself. The harness, not the model, is where OpenAI's edge lives.

Google

Google's month was anchored by Cloud Next, which ran April 22 to 24 at Mandalay Bay (32,000 attendees, over 260 announcements). It was the clearest signal of the year that Google has stopped trying to win on models alone and is going all-in on the control plane for agents. Outside the conference, Google also shipped Gemini Notebooks, native Mac and Windows Gemini apps, and the 10th Gemini Drop (Personal Intelligence going global, 3-minute Lyria 3 Pro music, and 3D interactive visuals in chat).

Two TPU v8 chips, not one. For the first time, Google split its TPU into training and inference chips. TPU 8t (codenamed Sunfish, designed by Broadcom) is the training chip. TPU 8t superpods scale to 9,600 chips with 2 PB of shared HBM, targeting up to 2.8x better training price-performance than Ironwood. TPU 8i (codenamed Zebrafish, designed by MediaTek) is the inference chip, with about 80% better performance per dollar. Both are on TSMC's 2nm node, targeted for late 2027.

Gemini Enterprise Agent Platform. Vertex AI was rebranded as the Gemini Enterprise Agent Platform. Agentspace was absorbed into Gemini Enterprise. The Agent2Agent (A2A) protocol hit version 1.2 and is now governed by the Linux Foundation's Agentic AI Foundation. Google says 150 organizations are running A2A in production.

Agent Studio and Agent Designer. The platform shipped two builders: a low-code Agent Studio for developers and a no-code Agent Designer for business users. An Agent Inbox lets teams monitor long-running agents working in secure cloud sandboxes.

Workspace Intelligence. A new layer that gives Workspace AI real-time context from Gmail, Chat, Calendar, and Drive. Users no longer have to repeat context to Gemini on every query.

Workspace Studio. A no-code automation builder inside Workspace. Includes an "Ask a Gem" step so flows can call private Gems for summaries or document drafting.

Agentic Data Cloud. A new Knowledge Catalog tags and connects data across the enterprise. Cross-Cloud Lakehouse standardizes on Apache Iceberg for querying data across Google Cloud and AWS.

Wiz security agents. First Wiz-branded agents under Google Cloud: a Threat Hunting agent, a Detection Engineering agent, and a Third-Party Context agent.

Model Garden expands. Over 200 foundation models, including Anthropic's Claude Opus 4.7, Google's Lyria 3 audio model, and Gemini 3.1 Flash Image. Also shipped: web-browsing agent Project Mariner, managed MCP servers across Google Cloud, and the Virgo megascale network fabric.

Veo 3.1. Veo 3.1 accepts reference images for precise mobile-ready video. Veo 3.1 Lite runs at the same speed as Veo 3.1 Fast at less than half the cost.

DESIGN.md goes open source. On April 21, Google Labs open-sourced the DESIGN.md draft spec from Google Stitch. It pairs YAML design tokens with markdown rationale so agents can generate brand-consistent UIs. Apache 2.0.

Gemini Notebooks. On April 8, Google launched Notebooks inside the Gemini app, with bidirectional sync to NotebookLM. NotebookLM stays for grounded research. Gemini Notebooks focuses on creating finished work.

Gemini Mac and Windows apps. Native Gemini app shipped for Mac and Windows in April. Option+Space global hotkey, screen awareness, Imagen, and Veo built in.

April Gemini Drop. The 10th Gemini Drop shipped Personal Intelligence (now global, connects favorite Google apps for personalized answers), 3-minute music tracks via Lyria 3 Pro, and 3D interactive visuals inside chat.

The pattern is hard to miss. Google is no longer just selling Gemini. It is selling the full stack underneath agents. Custom silicon for training and inference, the platform to build agents, the protocol for them to talk, the data plane underneath, security agents on top, and desktop and notebook surfaces for the human in the loop.

The Agent Infrastructure Wars

The major labs and clouds now ship managed agent harnesses. The harness is becoming the product. Microsoft, quiet on models this month, made its move here.

Anthropic Claude Managed Agents. Hosted runtime with sandboxes, checkpoints, credentials, scoped permissions, and tracing. Standard token rates plus $0.08 per session-hour.

Microsoft Hosted Agents in Foundry Agent Service. Each session gets an isolated sandbox with a persistent filesystem, integrated identity, and scale-to-zero economics. Pricing: $0.0994 per vCPU-hour and $0.0118 per GiB-hour during preview. Companion previews of Toolbox and Memory shipped alongside.

Google Gemini Enterprise Agent Platform. Agent Studio (low-code) and Agent Designer (no-code) ship with an Agent Inbox for monitoring long-running agents inside secure Google Cloud sandboxes. Google says 150 organizations are running A2A in production.

Amazon Bedrock. Bedrock now runs both OpenAI's and Anthropic's harnesses side by side. The OpenAI harness landed April 28 with per-agent identity, full action logs, and execution inside the customer environment. Anthropic's full stack (Opus 4.7, Mythos Preview, Claude Managed Agents) also runs on Bedrock. AWS is the only place customers can choose between the two.

The product these companies are now selling is not the model. It is the harness around the model: the sandbox, the identity, the credentials, the trace, the budget, and the audit log. This is the agent operating system layer the 2026 predictions post called the next battleground. In April, it became the actual battleground.
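Concretely, "the harness" reduces to a per-agent session record that every one of these platforms now maintains in some form. A minimal sketch, with invented field names (this is not any vendor's schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentSession:
    """One agent run as the harness vendors now model it: identity, sandbox,
    scoped credentials, a budget, and an audit trail. Field names are
    illustrative, not any platform's schema."""
    agent_id: str                      # per-agent identity, not the user's
    sandbox: str                       # isolated execution environment
    credential_scopes: list[str]       # least-privilege tool access
    budget_usd: float                  # hard spend cap for the run
    spent_usd: float = 0.0
    audit_log: list[str] = field(default_factory=list)

    def record(self, action: str, cost_usd: float) -> None:
        """Charge an action against the budget and append it to the audit log."""
        if self.spent_usd + cost_usd > self.budget_usd:
            raise RuntimeError(f"budget exceeded for {self.agent_id}")
        self.spent_usd += cost_usd
        self.audit_log.append(
            f"{datetime.now(timezone.utc).isoformat()} {action} ${cost_usd:.2f}"
        )

session = AgentSession("agent-042", "sbx-eu-1", ["repo:read", "ci:run"], budget_usd=5.0)
session.record("clone repository", 0.10)
session.record("run test suite", 0.75)
print(f"spent ${session.spent_usd:.2f} of ${session.budget_usd:.2f}")
```

The design choice every vendor converged on is visible in the shape: the session, not the user, is the unit of identity, budget, and audit.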

What's interesting is that AWS is playing this differently. Instead of trying to out-build Anthropic and OpenAI on harness, it opened up Bedrock and brought both stacks in. It is betting that the harness will be hard to win head-on, so the smarter move is to be the neutral ground where the customer picks. If that bet holds, the cloud underneath the harness becomes the real moat.

Cursor 3, SDK, and SpaceX's $60B Option

Cursor shipped a major UI overhaul and a TypeScript SDK, and SpaceX's IPO filing revealed a $60 billion option to acquire the company, all in one month.

Cursor 3. Released April 2. The biggest interface overhaul since 2023, rebuilt around an Agents Window for orchestrating many parallel agents (locally, in worktrees, in the cloud, on remote SSH). It also added Design Mode and cloud agents.

Cursor SDK. Released April 29 in public beta. A TypeScript SDK that lets developers build agents on the same runtime, harness, and models that power Cursor. Agents can run locally or on dedicated Cursor cloud VMs, with codebase indexing, MCP servers, skills, hooks, and named subagents. Early customers include Rippling, Notion, Faire, and C3 AI.

SpaceX's $60 billion option. On April 21, SpaceX disclosed in its IPO filing that it had an option to acquire Cursor for $60 billion later this year, or pay $10 billion for joint work if it chose not to exercise. The next day, CNBC reported that Microsoft had previously evaluated buying Cursor before passing, partly because of expected antitrust scrutiny over the Copilot overlap.

xAI, Mistral, Cursor. Reports surfaced of a three-way alliance under discussion: Cursor's coding tools, Mistral's frontier models, and SpaceX/xAI compute. None of the three has confirmed.

The strategic logic is hard to miss. Owning the top coding lab means owning the compute that trains the models and the product that informs the next generation of training data. SpaceX brings the orbital compute pitch. xAI brings the frontier model. Mistral brings open weights and Europe. Cursor brings the surface where the work happens. Google already owns this loop through Google Cloud, Gemini, and Antigravity. Everyone else is now trying to assemble it.

Headless 360 and the Shift to Agent-Native Products

At TrailblazerDX on April 15, Salesforce announced Headless 360. The pitch: expose the entire Salesforce platform (data, workflows, business logic) as APIs, MCP tools, and CLI commands. Agents can use it without ever opening a browser.

Headless 360 added 60+ new MCP tools and 30 preconfigured coding skills, plus a new experience layer that renders interactions across Slack, Voice, and WhatsApp.

This flips the SaaS interaction layer. The old order was a UI for humans first, with an API on the side. The new order is to build for agents first, with the human UI as just one of many surfaces. Your next billion users are not humans.

Headless 360 is one of many. Notion, Atlassian, Stripe, Asana, and now Salesforce all ship MCP servers as first-party products. The product strategy at every horizontal SaaS company is converging: be the best toolset an agent can call.
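What "shipping an MCP server" means in practice: a product exposes its operations as typed tools that an agent can discover and call. A hand-rolled registry in the same spirit (the tool name and fields here are invented; real servers use the MCP SDKs and wire protocol):

```python
import inspect

TOOLS: dict = {}

def tool(fn):
    """Register a function as an agent-callable tool, MCP-style:
    its docstring and signature double as the discoverable schema."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def create_case(subject: str, priority: str = "normal") -> dict:
    """Open a support case. Illustrative stand-in for a CRM operation."""
    return {"id": "case-001", "subject": subject, "priority": priority}

def list_tools() -> list:
    """What an agent sees at discovery time: names, docs, parameters."""
    return [
        {"name": name, "doc": fn.__doc__,
         "params": list(inspect.signature(fn).parameters)}
        for name, fn in TOOLS.items()
    ]

def call(name: str, **kwargs) -> dict:
    """Dispatch a tool call by name, as an agent runtime would."""
    return TOOLS[name](**kwargs)

print(list_tools())
print(call("create_case", subject="Refund request", priority="high"))
```

The point is the contract: the docstring and signature become the interface an agent discovers at runtime, with the human UI as one client among many.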

The Convergence of Desktop Apps

Every major lab now has a native desktop app, and they all look the same.

Anthropic redesigned Claude Code for desktop on April 14. Sidebar of parallel sessions filtered by status, integrated terminal, file editor, diff viewer, side chat at Cmd+;.

OpenAI shipped Codex for almost everything on April 16. Multi-file and multi-terminal views, in-app browser, computer use, remote devbox.

Google shipped a native Gemini Mac app and Windows app in mid-April. Global hotkey, screen awareness, image and video generation built in.

Cognition shipped Windsurf 2.0 on April 15 with an Agent Command Center. A Kanban-style surface that groups all sessions by status (local Cascade and cloud Devin together). One-click handoff to a Devin VM with browser and computer use. Spaces bundle agent sessions, PRs, files, and shared context around a task.

Amazon previewed Quick on April 28: a native Mac and Windows desktop app that connects to local files plus Google Workspace, Microsoft 365, Slack, Zoom, Salesforce, Airtable, Dropbox, and Teams. Proactive OS-level notifications and chat-driven generation of docs, decks, and dashboards. No AWS account required.

The convergence is complete. Every one of these apps has the same shape: a left rail of agent sessions, a working surface in the middle, a chat side panel, a status filter, an embedded terminal or browser, and screen awareness as table stakes. This is the command-center shape: one human watching and steering many agents at once, with chat as one input among many, not a chat box that occasionally calls a tool.

Layoffs Continue

April was another heavy month for cuts, with AI cited as the explicit reason.

Snap cuts 1,000. On April 15, Snap announced layoffs of about 1,000 (around 16% of its workforce) and the elimination of 300+ open roles. CEO Evan Spiegel said advances in AI let teams "reduce repetitive work." Expected savings: over $500M annualized by H2 2026, with $95M to $130M in severance charges.

Meta cuts 8,000 and ties it to AI capex. On April 23, Meta announced about 8,000 layoffs (roughly 10%) plus 6,000 unfilled roles, effective late May. Reality Labs, Facebook social, recruiting, sales, and global operations were hit. Middle-management layers were compressed. At a town hall, Zuckerberg said the cuts were driven by AI infrastructure spend. Personnel and infrastructure are Meta's two biggest costs, and scaling one means cutting the other. Capex guidance moved up to $125B to $145B (from $115B to $135B). Expense outlook stayed flat at $162B to $169B. He warned more cuts may follow.

Two forces are pulling on the same budget. Companies are pouring money into AI infrastructure and tools, while the tiny-teams pattern plays out in real time, showing that smaller teams (and individuals) can deliver more with AI. Leaders now have to choose where the next dollar goes, and the cuts are what that choice looks like in practice.

At Meta, the link between cuts and AI was even more direct. Two days before the layoffs were announced, the company started recording the work of the employees still on payroll.

Meta Trains Agents on Its Own Employees

On April 21, Meta launched the Model Capability Initiative (MCI). Tracking software now captures employee mouse movements, clicks, keystrokes, and screenshots across hundreds of apps including Google, LinkedIn, Wikipedia, GitHub, Slack, Atlassian, Threads, and Manus. The stated goal is to teach Meta's AI agents how to do white-collar computer-use work.

Internal employee chat reportedly called the program "dystopian", with worries about exposing passwords, immigration status, and personal information.

The structure here is the part that matters. The agents being trained on employee screens are the ones being staffed up to do the same work that the layoffs took away two days later. Record everything, train an agent on it, replace the role. Meta is the loudest version, but the same pattern is showing up across the industry.

China Blocks the Manus Sale

On April 27, China's National Development and Reform Commission ordered Meta to unwind its $2 billion acquisition of Manus, the Singapore-based AI agent firm with Chinese roots. Manus had rebased in Singapore in 2025 to escape this kind of review. Beijing reached across the border anyway.

The US exit door is closing for Chinese AI startups. Founders will stay onshore, take domestic capital, and build on the domestic stack. The two AI ecosystems are no longer just diverging on chips and models. They are diverging on who is allowed to own the company.

Uber Blows Its AI Budget

Uber CTO Praveen Neppalli Naga told The Information that Uber spent its full 2026 AI budget in four months. Claude Code rolled out in December. Usage doubled by February. By April the CTO said he was "back to the drawing board". 95% of Uber engineers now use AI tools monthly. 70% of committed code originates from AI. Per-engineer API spend ran $500 to $2,000 a month. Uber's annual R&D is $3.4 billion.

Uber will not be the last. Token-priced coding agents scale nonlinearly: parallel agents, long sessions, retry loops on failed tasks. A flat seat license priced one engineer. A token-priced agent prices every action that engineer's agents take.
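The nonlinearity is easy to see in a back-of-envelope model. Every number below is invented for illustration; only the shape of the curve matters:

```python
def monthly_agent_cost(engineers: int, sessions_per_day: int,
                       agents_per_session: int, tokens_per_agent_hour: int,
                       hours_per_session: float, retry_rate: float,
                       usd_per_mtok: float, workdays: int = 21) -> float:
    """Token cost scales with every multiplier at once, not with headcount."""
    tokens = (engineers * sessions_per_day * agents_per_session
              * tokens_per_agent_hour * hours_per_session
              * (1 + retry_rate) * workdays)
    return tokens / 1e6 * usd_per_mtok

# A seat license is flat; the agent bill compounds across five multipliers.
seat = 500 * 20  # hypothetical $20/seat for 500 engineers
light = monthly_agent_cost(500, 2, 1, 200_000, 0.5, 0.2, 10.0)
heavy = monthly_agent_cost(500, 4, 3, 200_000, 2.0, 0.5, 10.0)
print(f"seats ${seat:,.0f}  light ${light:,.0f}  heavy ${heavy:,.0f}")
```

Under these assumptions, heavy adoption costs 30x light adoption for the same headcount: double the sessions, triple the agents, quadruple the session length, and the bill moves by the product, not the sum.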

The implication for other companies. Most 2026 AI budgets were sized like SaaS seat licenses. They will miss by a multiple. Finance teams will start asking for per-engineer token budgets, hard caps, and model routing (cheap model for easy turns, Opus only when needed). Procurement will push back on pure usage-based pricing and ask for committed-spend discounts or hybrid plans. Engineering leaders who got the productivity story right will still get hit by the cost story. The same adoption curve that made the rollout look good is the curve that blows the budget.
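The mitigations finance teams will ask for — routing and hard caps — amount to a few lines of policy in front of the model call. A sketch with invented model names and thresholds:

```python
def route(prompt_tokens: int, needs_long_horizon: bool,
          spent_today_usd: float, daily_cap_usd: float = 50.0) -> str:
    """Pick a model tier per request: cheap by default, expensive only when
    the task needs it, and refuse once the engineer's daily cap is hit.
    Model names and thresholds are invented for illustration."""
    if spent_today_usd >= daily_cap_usd:
        return "blocked: daily token budget exhausted"
    if needs_long_horizon or prompt_tokens > 50_000:
        return "frontier-model"     # expensive tier, used sparingly
    return "small-model"            # cheap tier for easy turns

print(route(2_000, False, 12.0))    # easy turn, under cap
print(route(80_000, True, 12.0))    # long agentic task
print(route(2_000, False, 50.0))    # cap hit
```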

The other shoe drops on the vendor side. Anthropic moved Claude Enterprise to usage-based pricing. OpenAI moved Workspace Agents to credits. Cursor's flat $20 plan covers roughly $180 of token-equivalent usage, a subsidy that cannot survive heavy agent use. The cheap-seat era of AI coding tools is ending. Pricing is converging on what it actually costs to serve, and the customer absorbs the variance.

New Models

April was another record month. Most labs shipped at least one frontier model. Many shipped a second specialized model alongside it (cyber, life sciences, image, voice, robotics, privacy).

China kept pace and undercut on price. DeepSeek V4 ships at roughly 1/7 the cost of GPT-5.4. Kimi K2.6 runs agent swarms up to 300 sub-agents and 4,000 coordinated steps. Qwen 3.6 Plus hits 78.8% on SWE-Bench Verified. GLM-5.1 is built around plan-execute-test-fix-optimize loops. All open weights or near-open licenses. Google also expanded its open lineup with Gemma 4 (Apache 2.0, 256K context, native vision and audio across 140+ languages), shipped Gemini 3.1 Flash TTS and Veo 3.1 Lite, and pushed embodied AI forward with Gemini Robotics-ER 1.6 (Boston Dynamics Spot now reads pressure gauges autonomously).

| Model | Provider | Type | Highlight |
| --- | --- | --- | --- |
| Claude Mythos Preview | Anthropic | Frontier | Tier above Opus. Largest gains in math, long context, software engineering, cybersecurity. Gated to Glasswing partners only. |
| Claude Opus 4.7 | Anthropic | LLM | 87.6% SWE-Bench Verified. 64.3% SWE-Bench Pro. 70% CursorBench. Vision up to 2,576px. Same price as 4.6. |
| GPT-5.5 | OpenAI | LLM | Improvements in coding, computer use, and deep research. API and GPT-5.5 Pro followed the next day. |
| GPT-5.4-Cyber | OpenAI | LLM | Defensive cyber variant of GPT-5.4. Binary RE capability. Trusted Access only. |
| GPT-Rosalind | OpenAI | Vertical | Frontier reasoning model for life sciences. Trusted Access. Amgen, Moderna, Allen Institute partners. |
| GPT Image 2 | OpenAI | Image | 4K output. 16 reference images. Reasoning before generation. #1 across all Image Arena categories within 12 hours. |
| OpenAI Privacy Filter | OpenAI | PII | Open-weight Apache 2.0. 1.5B total / ~50M active. 96% F1. Eight PII categories. |
| Gemma 4 | Google | Open LLM | E2B, E4B, 26B MoE, 31B Dense. Apache 2.0. 256K context. Native vision and audio. 140+ languages. |
| Veo 3.1 Lite | Google | Video | Free tier launched April 2. 10 free clips/month via Vids. Native audio. 720p/1080p. |
| Gemini 3.1 Flash TTS | Google | TTS | Inline audio tags for pace, tone, delivery. 70+ languages. SynthID watermarked. Elo 1,211. |
| Gemini Robotics-ER 1.6 | Google DeepMind | Embodied | High-level robot reasoning. Boston Dynamics Spot reads pressure gauges autonomously. |
| Muse Spark | Meta MSL | Multimodal | First Meta Superintelligence Labs model. Ships in WhatsApp, Instagram, Facebook, Messenger, Meta AI glasses. |
| Qwen 3.6 Plus | Alibaba | LLM | 1M context. 78.8% SWE-Bench Verified. Hybrid linear attention plus sparse MoE. |
| Qwen 3.6 Max Preview | Alibaba | LLM | 260K context. Closed weights. Tops six agentic coding benchmarks. $1.30 / $7.80 per million tokens. |
| Kimi K2.6 | Moonshot | Open LLM | 1T MoE, 32B active, 262K context. INT4 native. Agent swarm to 300 sub-agents and 4,000 steps. |
| GLM-5.1 | Z.ai | Open LLM | 754B MoE. MIT license. 58.4 SWE-Bench Pro. Built for plan-execute-test-fix-optimize loops. |
| DeepSeek V4 | DeepSeek | Open LLM | V4-Pro: 1.6T total / 49B active. V4-Flash: 284B / 13B. 1M context. ~1/7 the price of GPT-5.4. MIT. |

The themes from March accelerated. Computer use is table stakes. Open-source from China (Kimi K2.6, GLM-5.1, Qwen 3.6, DeepSeek V4) keeps closing the gap at a fraction of the cost. Vertical models are now a category (Cyber, Rosalind, Privacy Filter). Every model now ships with an agent harness around it, not just a chat box.

Takeaway

Three forces ran into each other in April.

The first force is capability. Opus 4.7, GPT-5.5, Mythos Preview, Kimi K2.6, DeepSeek V4. Coding agents that run for hours. Vertical models for defense and life sciences. Frontier models trained on agentic feedback loops. The capability curve did not slow down.

The second force is cost. GitHub paused signups. Anthropic ran an A/B that the public spotted. Uber spent its full year of AI budget in four months. Anthropic moved Claude Enterprise to usage-based. OpenAI moved Workspace Agents to credits. Capability is no longer the constraint. Compute and dollars are.

The third force is the platform. Google rebranded Vertex AI as the Gemini Enterprise Agent Platform and split TPU into a training chip and an inference chip. Anthropic shipped Managed Agents. Microsoft shipped Hosted Agents. AWS opened Bedrock as neutral ground for both OpenAI's and Anthropic's harnesses. Salesforce shipped Headless 360. Meta started recording its own employees to train an agent fleet. Cursor shipped an SDK built on the same harness that powers Cursor. Every layer of the stack is converging on one shape: a managed harness that gives each agent its own identity, sandbox, credentials, budget, and audit trail.

Capability got cheaper. Compute got tighter. The platform got real.

That was April.

Enjoyed this post?

If this brought you value, consider buying me a coffee. It helps me keep writing.