Model Selection in the Claude Agent SDK
Haiku for development, Sonnet for production, Opus for maximum capability. How to set a fallback model and why switching models mid-session via hooks is possible.
The three tiers of Anthropic models
Every agent you build runs on a model. Picking the right one affects cost, speed, and output quality. Anthropic's lineup has three tiers with distinct trade-offs:
| Model | Use case | Trade-offs |
|---|---|---|
| Haiku | Development, simple tasks, high-volume automations | Fast and cheap, lower reasoning capability |
| Sonnet | Most production agents, client-facing deliverables | Strong reasoning, moderate cost |
| Opus | Complex analysis, maximum capability tasks | Highest quality, highest cost |
Pass the model name directly to your query options:
for await (const message of query(messages(), {
model: "claude-sonnet-4-5", // or claude-haiku-4-5, claude-opus-4-5
permissionMode: "bypassPermissions",
dangerouslyAllowBypassPermissions: true,
})) {
// handle messages
}
Sonnet is the default fallback model if you set a fallback and your primary model becomes unavailable. If you do not configure a fallback, the SDK uses Sonnet automatically in failure scenarios.
The development-to-production model pattern
Use Haiku during development. You are testing whether your agent structure works — whether tool calls fire correctly, sessions resume properly, hooks execute at the right moments. None of that depends on which model you run. Haiku costs a fraction of Sonnet and is significantly faster.
Switch to Sonnet when:
- You are testing output quality against a real deliverable (a client brief, a competitive summary, a board update)
- The task requires multi-step reasoning or synthesis across multiple sources
- You are making the final deployment decision
Switch to Opus when the quality difference is visible and the client will notice it — typically complex financial analysis, nuanced strategic recommendations, or anything where a subtle error has real consequences.
For most fractional operator workflows — research, reporting, meeting prep, competitive snapshots — Sonnet handles everything. The Haiku-to-Opus progression is about cost efficiency and incremental quality gains, not a ceiling on what each model can do.
Setting a fallback model
A fallback activates when the primary model hits its budget limit or encounters a temporary failure:
for await (const message of query(messages(), {
model: "claude-opus-4-5",
fallbackModel: "claude-sonnet-4-5",
maxBudgetUsd: 2.0,
permissionMode: "bypassPermissions",
dangerouslyAllowBypassPermissions: true,
})) {
// handle messages
}
If the session exceeds $2 while running Opus, the SDK switches to Sonnet and continues. The session does not fail mid-task.
This matters most for long-running agents — a Monday morning brief that sweeps five client folders, or a competitive research pipeline pulling from a dozen sources. Without a fallback, a budget cap terminates the run. With one, you get a slightly lower-quality result instead of no result at all.
Switching models mid-session
You can change models during a session based on what the agent is doing. A hook that detects a specific tool call can trigger a model escalation.
The practical use case: an agent runs on Haiku for routine data gathering (reading files, pulling notes, checking statuses), then escalates to Sonnet when it hits the synthesis step — writing the actual client-facing deliverable. You pay for intelligence where it shows, not across the full run.
The implementation is covered in the hooks guide. For most operators, the simpler pattern — Haiku in development, Sonnet in production — delivers the right trade-off without the added complexity.
The operator default: Sonnet for production agents, Haiku during testing. That single pattern handles 90% of real fractional workflows: pre-call research, weekly status briefs, competitive snapshots, meeting summaries. Add Opus selectively when you need the extra reasoning depth and the client deliverable justifies the cost difference.
---
Author: FractionalSkill