
Model Routing: Why I Use Different AI Models for Different Tasks

Nathan Atherton · Staff Software Engineer · February 5, 2026 · 6 min read

When I started using Claude Code seriously, I used Opus for everything. It's the most capable model, so why wouldn't I? It took about two weeks (and a surprisingly large bill) to realise that was the wrong approach. Not just for cost - for quality and speed too.

The Problem with One-Size-Fits-All

Using Opus (the most powerful model) for a lint check is wasteful in three ways:

  • Speed. Opus is slower than Sonnet. For a simple "run prettier and check the output" task, the extra reasoning capability doesn't help - it just adds latency.
  • Cost. Opus costs significantly more per token. When you're running verification after every implementation agent finishes, those costs add up fast.
  • Context. Every token Opus processes is context it could be using for something more important. Wasting context on trivial tasks means less room for complex ones.

On the flip side, using Sonnet for complex architectural decisions is a false economy. The cheaper model might miss subtle issues that Opus catches, and the time spent debugging its mistakes costs more than the Opus tokens would have.

My Routing Strategy

After experimenting for a few months, I settled on this routing table:

| Task Type                              | Model   | Why                                    |
|----------------------------------------|---------|----------------------------------------|
| Complex implementation, refactors      | Opus    | Needs deep understanding, fewer errors |
| Debugging complex issues               | Opus    | Requires reasoning about system state  |
| Architecture and design review         | Opus    | Judgement-intensive, high stakes       |
| Code review                            | Opus    | Needs to catch subtle issues           |
| Verification (lint, prettier, types)   | Sonnet  | Mechanical checks, speed matters       |
| Simple bug fixes, nit fixes            | Sonnet  | Straightforward, well-defined scope    |
| File formatting, import sorting        | Sonnet  | Pure pattern application               |
| Trivial checks (file exists, etc.)     | Haiku   | Near-instant, minimal reasoning        |
| Codebase research, exploration         | Explore | Optimised for read-heavy tasks         |
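
In practice, the table is just a lookup with a conservative default. A minimal sketch in Python (the task-type keys are my own labels for illustration, not a Claude Code API):

```python
# Hypothetical sketch: the routing table as data. Unknown task types
# fall back to Opus - the "when in doubt" rule I cover later.
ROUTING_TABLE = {
    "implementation": "opus",
    "debugging": "opus",
    "architecture-review": "opus",
    "code-review": "opus",
    "verification": "sonnet",
    "simple-fix": "sonnet",
    "formatting": "sonnet",
    "trivial-check": "haiku",
    "research": "explore",
}

def pick_model(task_type: str) -> str:
    """Return the model for a task type, defaulting to Opus when unsure."""
    return ROUTING_TABLE.get(task_type, "opus")
```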

How I Implement This

In my Claude Code configuration, model routing is defined in the CLAUDE.md that every agent receives. When the team lead spawns an agent, it specifies the model based on the task type:

## Agent Model Selection

| Task Type                    | Agent Config                                    |
|------------------------------|-------------------------------------------------|
| Implementation, refactors    | subagent_type: "general-purpose" (default opus) |
| Verification, lint, prettier | model: "sonnet"                                 |
| Code review, architecture    | subagent_type: "general-purpose" (default opus) |
| Trivial checks               | model: "haiku"                                  |
| Codebase research            | subagent_type: "Explore"                        |

The team lead knows these rules, so when it creates a "run typecheck" task, it automatically assigns it to Sonnet rather than Opus.
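
For agents I spawn repeatedly, the model choice can also live in the agent definition itself. Claude Code custom subagents are markdown files with YAML frontmatter that accepts a model field - here's a minimal sketch of a verification agent (the name, description, and prompt are mine, not built-ins; check your version's subagent docs):

```markdown
---
name: verifier
description: Runs lint, prettier, and typecheck, then reports pass/fail.
model: sonnet
---

Run the project's lint, format check, and typecheck commands.
Report each result as pass or fail, quoting only the relevant output.
```

Saved as .claude/agents/verifier.md, this pins every verification run to Sonnet without the team lead having to remember the rule each time.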

The Results

After implementing model routing, I saw three improvements:

1. Faster feedback loops

Verification agents using Sonnet return results 30-50% faster than Opus. When you're running verification after every implementation agent finishes, this compounds. A full feature build with 5 implementation steps and 5 verification steps is noticeably faster.

2. Better quality where it matters

By reserving Opus for complex tasks, I actually get better results from it. The context window isn't cluttered with lint output and prettier logs. When Opus is thinking about a complex refactor, it has more room to reason.

3. Lower costs without sacrificing quality

My average daily spend dropped by about 40% after implementing routing. But the quality of implementation work (the expensive Opus tasks) stayed the same or improved. I'm spending less overall while getting more from the expensive model.

When Routing Goes Wrong

I've made routing mistakes. The most common one: underestimating task complexity.

A "simple bug fix" that's actually a subtle race condition needs Opus, not Sonnet. Sonnet will confidently produce a fix that addresses the symptom but not the root cause. The fix passes tests, looks correct, and breaks in production three days later.

My rule of thumb: when in doubt, route to Opus. The cost of using Opus on a task that could have been Sonnet is a few extra pence. The cost of using Sonnet on a task that needed Opus is hours of debugging.

Practical Advice

If you're starting with model routing:

  1. Start with everything on Opus. Get comfortable with AI-assisted development first. Optimise later.
  2. Move verification to Sonnet first. This is the safest optimisation - lint and typecheck are mechanical tasks where Sonnet matches Opus in quality.
  3. Use Haiku for truly trivial checks. "Does this file exist?" or "What's the current git branch?" Don't waste any real model's context on these.
  4. Monitor your burn rate. A custom status line showing cost per hour makes the impact of routing decisions immediately visible (a sketch follows this list).
  5. Never route complex debugging to a cheaper model. This is where I've seen the worst outcomes. Debugging requires the deepest reasoning.
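
On point 4: Claude Code can render a custom status line via a statusLine command in settings.json - it pipes session JSON to your script and displays whatever the script prints. Here's a minimal Python sketch, assuming the stdin JSON exposes the model name, total cost, and session duration (field names match the statusline docs at the time of writing; verify against your version):

```python
#!/usr/bin/env python3
"""Status line sketch: model name plus burn rate in USD/hour.

Assumes Claude Code pipes session JSON on stdin with
model.display_name, cost.total_cost_usd, and cost.total_duration_ms.
Treat these field names as assumptions - check your version's docs.
"""
import json
import sys

data = json.load(sys.stdin)
model = data.get("model", {}).get("display_name", "?")
cost = data.get("cost", {}).get("total_cost_usd", 0.0)
duration_ms = data.get("cost", {}).get("total_duration_ms", 0)

hours = duration_ms / 3_600_000
rate = cost / hours if hours else 0.0
print(f"{model} | ${cost:.2f} session | ${rate:.2f}/hr")
```

Wire it up with something like "statusLine": {"type": "command", "command": "python3 ~/.claude/statusline.py"} in your settings file, and the burn rate sits in view all day.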

Model routing isn't about being cheap. It's about matching the right tool to the job. A lighter model on a mechanical task is faster and less noisy, not just cheaper. That's the real win.