Guides Claude Agent SDK

Model Selection in the Claude Agent SDK

Haiku for development, Sonnet for production, Opus for maximum capability. How to set a fallback model and why switching models mid-session via hooks is possible.

3 min read

Claude model selection SDK Haiku Sonnet Opus agent fallback model SDK model option query agent model cost

The three tiers of Anthropic models

Every agent you build runs on a model. Picking the right on…

The development-to-production model pattern

Use Haiku during development. You are testing whether your …

Setting a fallback model

A fallback activates when the primary model hits its budget…

Switching models mid-session

You can change models during a session based on what the ag…

The three tiers of Anthropic models

Every agent you build runs on a model. Picking the right one affects cost, speed, and output quality. Anthropic's lineup has three tiers with distinct trade-offs:

Model	Use case	Trade-offs
Haiku	Development, simple tasks, high-volume automations	Fast and cheap, lower reasoning capability
Sonnet	Most production agents, client-facing deliverables	Strong reasoning, moderate cost
Opus	Complex analysis, maximum capability tasks	Highest quality, highest cost

Pass the model name directly to your query options:

for await (const message of query(messages(), {
  model: "claude-sonnet-4-5",  // or claude-haiku-4-5, claude-opus-4-5
  permissionMode: "bypassPermissions",
  dangerouslyAllowBypassPermissions: true,
})) {
  // handle messages
}

Sonnet is the default fallback model if you set a fallback and your primary model becomes unavailable. If you do not configure a fallback, the SDK uses Sonnet automatically in failure scenarios.

The development-to-production model pattern

Use Haiku during development. You are testing whether your agent structure works — whether tool calls fire correctly, sessions resume properly, hooks execute at the right moments. None of that depends on which model you run. Haiku costs a fraction of Sonnet and is significantly faster.

Switch to Sonnet when:

You are testing output quality against a real deliverable (a client brief, a competitive summary, a board update)
The task requires multi-step reasoning or synthesis across multiple sources
You are making the final deployment decision

Switch to Opus when the quality difference is visible and the client will notice it — typically complex financial analysis, nuanced strategic recommendations, or anything where a subtle error has real consequences.

For most fractional operator workflows — research, reporting, meeting prep, competitive snapshots — Sonnet handles everything. The Haiku-to-Opus progression is about cost efficiency and incremental quality gains, not a ceiling on what each model can do.

Setting a fallback model

A fallback activates when the primary model hits its budget limit or encounters a temporary failure:

for await (const message of query(messages(), {
  model: "claude-opus-4-5",
  fallbackModel: "claude-sonnet-4-5",
  maxBudgetUsd: 2.0,
  permissionMode: "bypassPermissions",
  dangerouslyAllowBypassPermissions: true,
})) {
  // handle messages
}

If the session exceeds $2 while running Opus, the SDK switches to Sonnet and continues. The session does not fail mid-task.

This matters most for long-running agents — a Monday morning brief that sweeps five client folders, or a competitive research pipeline pulling from a dozen sources. Without a fallback, a budget cap terminates the run. With one, you get a slightly lower-quality result instead of no result at all.

Switching models mid-session

You can change models during a session based on what the agent is doing. A hook that detects a specific tool call can trigger a model escalation.

The practical use case: an agent runs on Haiku for routine data gathering (reading files, pulling notes, checking statuses), then escalates to Sonnet when it hits the synthesis step — writing the actual client-facing deliverable. You pay for intelligence where it shows, not across the full run.

The implementation is covered in the hooks guide. For most operators, the simpler pattern — Haiku in development, Sonnet in production — delivers the right trade-off without the added complexity.

The operator default: Sonnet for production agents, Haiku during testing. That single pattern handles 90% of real fractional workflows: pre-call research, weekly status briefs, competitive snapshots, meeting summaries. Add Opus selectively when you need the extra reasoning depth and the client deliverable justifies the cost difference.

---

Author: FractionalSkill

The three tiers of Anthropic models

The development-to-production model pattern

Setting a fallback model

Switching models mid-session

Ready to Start Building?

Get New Guides First