Hi everyone,

A crisp spring week here in Boston. I just got back from a long walk down Back Bay.

This week's frontier model news was dominated by GPT-5.5. The reviews have been positive enough that I started using it through the Codex app on my Mac for daily work. This issue itself was put together as a collaboration between GPT-5.5 and Opus, and I'll admit GPT-5.5 is leveling the playing field.

The model is the headline. The bigger question for ops leaders is what changes when we stop asking these systems questions and start supervising them.

Signals This Week

1. Work is becoming the training set for its own replacement.

Microsoft offered voluntary buyouts to up to 7% of its US workforce. Meta reportedly plans to cut roughly 8,000 jobs as it ramps AI infrastructure spending. Separately, Reuters and TechCrunch reported that Meta will capture certain employee mouse movements, clicks, keystrokes, and screen interactions to train models on computer work. OpenAI is coming at the same problem from the invited side with Chronicle, an opt-in Codex preview that builds memories from screen captures so Codex can understand what you are working on. The pattern is consistent: make office work easier for models to observe, reduce labor expense, and spend more on AI infrastructure. (Axios, TechCrunch, Windows Central, CNBC)

2. AI ROI is becoming an operating model problem. Gartner's April report says only 39% of tech leaders believe current AI efforts will improve financial performance, and points to data quality, governance, talent, and change management as the foundations that separate stronger outcomes from stalled ones. BCG says 50-55% of US jobs may be reshaped by AI in the next two to three years, while only 10-15% are vulnerable to elimination over a longer horizon. HBR adds another layer: executives and managers are not aligned on how AI is actually landing inside companies. Put together, the gap is not just technology. It is whether companies can redesign work, retrain people, and measure value clearly enough for AI to show up in the numbers. (Gartner, BCG, HBR)

3. A new insider threat is emerging from the AI tools employees adopt themselves. A Vercel employee signed into Context.ai with their corporate Google Workspace account and granted "Allow All" OAuth permissions. Context.ai had been compromised months earlier by infostealer malware. The attacker used the grant to pivot into the employee's Workspace, then into Vercel's production systems, then into customer environment variables. This isn't a phished account or a malicious insider. It's a new category: the employee adopts AI to be more productive, and the act of adoption becomes the launchpad for the breach. Most companies don't know which AI tools their employees have already bound corporate credentials to.  (Vercel, TechCrunch)
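If your security team wants a starting point, the audit is mechanically simple. Here is a minimal sketch in Python. The token records mimic the shape returned by the Google Workspace Admin SDK's tokens.list endpoint (clientId, scopes, displayText); actually fetching them requires admin credentials and a Directory API client, which is out of scope here, and the approved-app list and "broad scope" set are illustrative assumptions.

```python
# Hypothetical audit of third-party OAuth grants bound to corporate accounts.
# Record shape follows the Workspace Admin SDK tokens resource; the policy
# logic (approved list, broad-scope set) is an illustrative assumption.

BROAD_SCOPES = {
    "https://mail.google.com/",               # full Gmail access
    "https://www.googleapis.com/auth/drive",  # full Drive access
}

def risky_grants(tokens, approved_clients):
    """Flag grants to unapproved apps, plus any grant holding a broad scope."""
    flagged = []
    for t in tokens:
        unapproved = t["clientId"] not in approved_clients
        broad = bool(BROAD_SCOPES & set(t["scopes"]))
        if unapproved or broad:
            flagged.append(t["displayText"])
    return flagged
```

Even an approved app gets flagged if it holds a broad scope — which is exactly the "Allow All" failure mode in the Vercel incident.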

🎯 The Chatbot Became a Workforce. Now Someone Has to Manage It.

The most interesting thing I've heard from operators this month has nothing to do with model benchmarks. CFOs and COOs getting their first real exposure to AI as a working tool told me almost the same thing: the value isn't that the model is clever. The value is that it grinds through forty pages of reports and thirty emails, then hands back a morning briefing they can act on. That feels less like chatbot novelty and more like a junior analyst finally showing up on time.

OpenAI released GPT-5.5 (codename "Spud") on April 23. The headlines focused on benchmarks and pricing. The more durable framing is OpenAI's own: a model that can take a messy, multi-part task, plan, use tools, check its own work, and keep going with little instruction. Greg Brockman, OpenAI's president, told Alex Kantrowitz on Big Technology that GPT-5.5 is "a step towards a new way of getting work done with a computer."

That is where GPT-5.5 matters. Frontier models are now good enough for most everyday office work: summarizing, drafting, extracting, comparing, synthesizing, and explaining. Most operators do not need a model to win every benchmark. They need a tool that can see the right material, use the right systems, and hand work back in a usable form.

Ethan Mollick, the Wharton professor whose One Useful Thing newsletter has tracked applied AI for three years, framed GPT-5.5 in three layers: the model itself, the apps people use, and the harness, which Mollick defines as "the tools that an AI can use and how the AI models are hooked up to these tools." That last term matters, and most of us are not familiar with it.

Put plainly, a harness is the work wrapper around the model: context, permissions, memory, files, scheduled runs, tool access, and a place to put the output. Claude Code, Codex, OpenClaw, Claude Cowork, and ChatGPT workspace agents are all harnesses, each one wrapping a frontier model in different tools and workflow logic. ChatGPT by itself can summarize a document you upload. A work harness can read the folder where the reports live, compare them to last week, pull related emails, create a briefing, and do it again tomorrow morning. Same model family. Very different business value.
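To make the wrapper concrete, here is a minimal sketch of one harness pass in Python. Everything in it is illustrative: summarize stands in for a real frontier-model API call, and the folder layout is assumed. The point is the shape — the model is one step inside a loop of context, output, and (in a real harness) permissions and scheduling.

```python
from datetime import date
from pathlib import Path

def summarize(text: str) -> str:
    # Stand-in for a frontier-model API call; returns a placeholder briefing.
    return f"Briefing drawn from {len(text.split())} words of source material."

def run_briefing(reports_dir: str, out_dir: str) -> str:
    """One harness pass: gather context, call the model, file the output."""
    reports = sorted(Path(reports_dir).glob("*.txt"))        # context the model may see
    material = "\n\n".join(p.read_text() for p in reports)
    briefing = summarize(material)                           # the model is one step, not the system
    out_path = Path(out_dir) / f"briefing-{date.today()}.md" # a place to put the output
    out_path.write_text(briefing)
    return briefing
```

A real harness adds permissions, memory of last week's reports, tool access, and a scheduler around this loop — which is precisely what a bare chatbot lacks.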

This is why the model race feels different now. GPT-5.5, Claude Opus, Gemini, and soon a growing set of open weight models are all close enough for the work most operators actually need. For everyday use cases, the question is less "can the model do it?" and more "can the surrounding system make it usable, repeatable, permissioned, and easy to trust?" When raw model intelligence stops being the main bottleneck, the value moves to what is wrapped around it.

Once you accept that the work system matters more than the model, the next question is who runs it. OpenAI introduced workspace agents in ChatGPT, with admin controls over tools and permissions, an inspection layer through a Compliance API, and the ability to suspend agents when needed. John Sviokla's April 24 briefing called it bluntly: "GPTs are dead. Long live the long-lived agent." A GPT was a personal artifact. A workspace agent is operational infrastructure.

That is why the layoff and monitoring stories matter too. They are too quickly read as "AI replaced 8,000 people," which is too simple. The more useful read is that companies are trying to identify which parts of work can be observed, repeated, measured, and handed to software. A harness is how that happens.

That is the real story behind GPT-5.5, and it is bigger than a model release. What used to be a chatbot now needs a manager. But don't start with a custom agent strategy. Start by learning what your company already has approved and what a harness can do on one task you already own.

Pick one. Daily email triage is the most universal. Ask your operations or technology team what AI harnesses you already have access to through existing licenses. Then have someone sit with you for 30 minutes and walk through one of those tools, hands-on, on a real task.

The test: can an approved tool read new mail since yesterday, group it by sender and topic, surface anything urgent or deadline-driven, flag customers or projects you care about, and put a one-page summary somewhere you already review? Run it for a week, pay attention to where it breaks, and separate two kinds of breakdowns.
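For the mechanically minded, the grouping-and-flagging step looks something like this sketch. It is an assumption-laden stand-in: a keyword pass substitutes for the model's actual judgment, and the email field names are invented for illustration.

```python
from collections import defaultdict

# Assumption: a simple keyword pass stands in for the model's judgment here.
URGENT_MARKERS = ("urgent", "deadline", "asap", "by eod")

def triage(emails):
    """Group new mail by sender and surface anything deadline-driven."""
    groups = defaultdict(list)
    urgent = []
    for e in emails:
        groups[e["sender"]].append(e["subject"])
        if any(m in e["subject"].lower() for m in URGENT_MARKERS):
            urgent.append(e["subject"])
    return {"groups": dict(groups), "urgent": urgent}
```

In a real harness, the model replaces the keyword list — but the surrounding plumbing (reading mail, writing the summary, running daily) stays, and that plumbing is where most failures live.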

A model problem is the tool getting the meaning wrong, missing an obvious deadline, summarizing poorly, or hallucinating a name. Those are real, but rarer than people expect with frontier models on this kind of work.

A harness problem is the tool not seeing the calendar, not opening the customer folder, not writing into the summary doc, not knowing to skip newsletters, not running on schedule, or failing because OAuth was never completed. Those are missing pieces of the wiring around the model.

Most of what you find will be the second kind. That gap is the actual work of AI adoption today.
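One way to keep the week's log honest is to tag each breakdown as you record it. A minimal sketch, where the issue labels are my own illustrative taxonomy of the two kinds of breakdown:

```python
# Illustrative taxonomy: model problems vs. harness problems.
MODEL_ISSUES = {"wrong_meaning", "missed_deadline", "poor_summary", "hallucinated_name"}
HARNESS_ISSUES = {"calendar_unseen", "folder_unopened", "doc_unwritable",
                  "newsletters_included", "schedule_missed", "oauth_incomplete"}

def classify(issue: str) -> str:
    """Tag a logged breakdown as a model problem or a harness problem."""
    if issue in MODEL_ISSUES:
        return "model"
    if issue in HARNESS_ISSUES:
        return "harness"
    return "unknown"

def tally(issues):
    """Count a week's breakdowns by kind."""
    counts = {"model": 0, "harness": 0, "unknown": 0}
    for issue in issues:
        counts[classify(issue)] += 1
    return counts
```

If the harness column dominates your tally, the fix is wiring and permissions, not a better model.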

That is how non-technical leaders should learn AI now. Not by memorizing model names. By learning what a harness is, where one already exists in their stack, and which everyday work is ready to move from "I ask a chatbot" to "I supervise a coworker."

📡 The Wire

SpaceX buys the right to buy Cursor for $60B. SpaceX announced a deal that gives it the option to acquire Cursor for $60B later this year, or pay $10B for ongoing collaboration if it walks away.

High earners race ahead on AI as the workplace divide widens. A new FT/Focaldata poll of 4,000 US and UK workers finds 60% of top earners use AI daily, versus 16% of lower earners. The heaviest users are workers in their thirties, not the youngest. AI is complementing expertise before it democratizes it. That should worry anyone responsible for training the next generation of analysts.

AI-generated client emails are pushing lawyers' fees up, not down. Law firms are warning clients that AI-generated letters, emails, and patent applications may raise bills, not lower them. The pattern applies beyond law: AI makes it easy to produce more output, but the downstream cost of reviewing and acting on that output still lands somewhere.

💬 Overheard

Codie Sanchez (via Claire Vo)

📚 What I'm Consuming

📄 OffDeal: How an AI-Native Investment Bank Runs on Claude. Two bankers closed 8 deals worth $91M in year one by consolidating 12+ workflows into a single agent. The operator-relevant detail: bankers, not engineers, extend the agent by writing modular skills.

🗞️ The AI Transformation Manifesto (McKinsey, April 2026). Twelve themes from large-scale AI transformations. The through-line: advantage comes from organizational capability, not the technology.

🎙️ How the Best Companies Use AI (The AI Daily Brief, April 19, 27 min). Good synthesis of PwC, McKinsey, and Ramp's internal "Glass" system. AI leaders treat it as a growth lever, not productivity theater.

▶️ How to Set Up Claude Cowork on Mac: Seven Steps, No Code. Practical walkthrough across email, calendar, Notion, desktop, and scheduled tasks. Good starting point if the harness recipe made you curious.

▶️ How I Created OpenClaw, the Breakthrough AI Agent (Peter Steinberger, TED). The origin story of OpenClaw, told by the founder himself. A good reset on what one person with the right tools can ship in 2026.

🌙 After Hours

Frankenstein (2025). Dir. Guillermo del Toro | 149 min | ★★★★★

I went in with low expectations, fully expecting a boring remake with a more modern filmmaking gloss. It truly blew me away. The creativity of the story lies in weaving two arcs together inside a seemingly unrelated Arctic isolation framing (which turns out to be symbolically relevant): Victor's progression from childhood experiences to obsessive creation, and the Creature's emotional arc. The latter is what really locked me in, and it was done exceptionally well.

I was rooting for the Creature the whole time.

🧪 Quanta Lab

The story drowned out by GPT-5.5 this week is that open weight models are getting close to the same bar. Alibaba's Qwen 3.6-35B-A3B runs on consumer hardware with around 20GB of RAM. Moonshot's Kimi K2.6 added long-horizon agent swarm capabilities. DeepSeek's V4 Preview arrived with a 1-million-token context window.

That changes the procurement conversation. The question worth raising with your CIO this quarter is which workloads actually need a frontier vendor, and which ones could run on a good-enough open model sitting behind your firewall. Closed models still win on integration depth, harness maturity, and governance tooling, but the gap is narrow enough that the question is finally worth asking.

Earn Your Complexity applies. Most ops workloads don't need the most expensive intelligence on the planet. They need a competent harness around a good-enough model.

🎙️ Listen

Prefer to listen? Quanta Bits is also available on Apple Podcasts and Spotify.

How this gets made

I collaborate with Spock, my AI agent. He researches extensively: scanning, filtering, and surfacing what's relevant across my business. I read, listen, and watch what resonates, and decide what matters. I provide direction, we draft together. The editorial judgment is mine. He'd tell you the same. Most logical.🖖
