Opus didn't get worse, you're watching economics in real-time
Everyone's saying Opus got worse since Christmas. It didn't. Anthropic is tuning the parameters around the model to survive subsidized subscriptions.
Not everyone, but a lot of people in my X feed are saying the same thing this week. Opus feels worse than it did at Christmas. The replies pile up under every Anthropic post. The same take keeps showing up in the Discords I'm in for AI builders. The model got dumber. Anthropic nerfed it. Something changed.
I use Opus every day for my actual job. I build with it, ship with it, and spend more time inside it than I do inside most of my own tools. So I'd notice if it got worse. And the honest answer is that yes, it feels different. But not in the way people are saying. The model didn't change. The economics around it did, and we're watching them play out in public.
What actually changed
If you'd asked me in mid-December what Opus felt like, I would have said unreasonable. It was doing things I wouldn't have bet on two months earlier. Ralph loops running for hours on end, specs being drafted while I was in meetings, the kind of sustained reasoning that used to be theoretical.
Over Christmas, Anthropic roughly doubled the usage allowances on Pro and Max for the holidays. Everyone I know ran Opus with thinking mode permanently on, because why wouldn't you. I certainly did. The Taylor Otwell plugin I built was one of the things I was pushing hardest, and it paid for itself in a week.
Then mid-January the vibe shifted. Not with a blog post, not with a model card update, but with a hundred small tells:
- Thinking modes got simplified from multiple selectable budgets to a single 'auto'.
- The auto budget feels smaller than the old manual budgets did.
- Token usage crept up across the same tasks, because the model was fixing more of its own mistakes from not thinking as hard the first time.
- Rate limits tightened. Quietly. No announcement.
- Calls started hanging. 'Overloaded' errors came back. The retry-with-backoff dance returned.
None of that is a model change. That's infrastructure under pressure, getting tuned to stay standing. Every single one of those knobs sits around the model, not inside it.
The economics nobody's willing to say out loud
Pro and Max subscriptions are heavily subsidized. $200 a month for what I use of Opus is not a price. It's an introductory offer dressed up as a SKU. The working theory at every AI lab right now is that you burn money on consumer subs to buy habits, and you make it back on enterprise API usage once people are hooked.
Opus is the most expensive model they run. Without aggressive prompt caching, running Opus 24/7 on a Pro plan would not be a sustainable trade for Anthropic even by charitable accounting. Token caching is the reason any of this works at all.
So what happens when you double usage allowances over Christmas, give everyone a holiday and a reason to explore, and millions of developers respond by running the single most expensive model they offer on max thinking for two weeks? You get a beautiful two weeks, followed by a very uncomfortable bill, followed by a quiet tuning pass.
The tuning pass is what we're in now. Let everyone use what they paid for, but throttle just enough to keep the system standing, and just enough to stop the heaviest users from ruining everyone else's experience. It's the same balancing act every capacity-constrained business does, except it happens inside a model you personally like, so it feels like a betrayal.
Sonnet 5 is going to feel like old Opus
Here's the part I'm fairly sure about. Sonnet has always been cheaper to run than Opus. When Sonnet 5 ships, it will get shipped with the full treatment. Max thinking budgets. Generous rate limits. None of the invisible throttling that's currently load-shedding Opus traffic. Because the compute math on Sonnet makes that affordable in a way Opus doesn't.
The result is going to be predictable. People will try Sonnet 5, feel the full power of an unthrottled frontier model, and say it feels like 'old Opus'. Some will declare Opus obsolete. The migration will quietly take the load off the expensive tier.
And then, with the pressure gone, the Opus restrictions will quietly lift. Not because anything got fixed. Because the load shifted elsewhere and they can afford to let it breathe again.
This isn't a conspiracy. It's not a product scandal. It's the visible shape of capacity planning for a model that costs real money to serve. The same pattern will repeat with every new tier for years.
What I'm actually doing about it
I'm not switching off Opus. I'm still N times more productive with it on than off, and the kind of work I'm doing at Ryde is exactly the kind that rewards long-horizon reasoning even when the model is in a slightly worse mood than it was in December.
What I have done is stop being wasteful. A few things I changed in the last two weeks, in case they're useful:
- I'm more deliberate about which model gets which task. Sonnet for scaffolding, drafts, explorations, anything that doesn't need deep reasoning. Opus for the hard thinking, the one-shot architectural calls, the things where a bad answer costs me an hour. The Ralph loop I wrote about makes this easy because model selection is one line.
- I lean on prompt caching like my life depends on it. If you're not structuring your prompts so the expensive context is cacheable, you're paying for things Anthropic is also paying for, and you're accelerating the exact squeeze we're in.
- I stopped using thinking mode as a default. Turning it on for everything was the luxury version of working, and the luxury version is over. It's back to being a tool I reach for when I need it, not the setting I leave on.
- I watch for the 'overloaded' signal as a real signal. If the API is fighting me, I swap to a cheaper tier for the next hour instead of drumming my fingers through exponential backoff. Most of the time the cheaper tier is fine.
None of this is sacrifice. It's just the productive shape of using a heavily subsidized product that's currently being rebalanced in real time.
The bigger pattern
The thing I'd actually like people to take away is that model quality and infrastructure quality are going to keep getting confused for each other, and not just with Anthropic. Every lab is running the same playbook. Subsidized consumer tier, frontier model as the hook, invisible throttling when the math stops working, new tier to shift load, repeat. If you measure the experience and not the model weights, you will periodically feel betrayed. If you understand what's actually happening, you get a much cheaper emotional ride and a much better read on what to build on top of.
Opus didn't get worse. It got popular, at a price that couldn't sustain the popularity, and the adults in the room are quietly keeping the lights on while pretending nothing is happening. Once Sonnet 5 is out and the load moves, you'll feel Opus come back. Not because they fixed it. Because they can finally afford to let it run properly again.
Watch the knobs, not the weights. That's where the story is.
Hi, I'm Mischa. I've been Shipping products and building ventures for over a decade. First exit at 25, second at 30. Now Partner & CPO at Ryde Ventures, an AI venture studio in Amsterdam. Currently shipping Stagent and Onoma. Based in Hong Kong. I write about what I learn along the way.
Keep reading: The most skeptical about AI haven't shipped with it.