Model Selection in the Claude Agent SDK

Haiku for development, Sonnet for production, Opus for maximum capability. How to set a fallback model and why switching models mid-session via hooks is possible.

3 min read
Claude model selection SDK Haiku Sonnet Opus agent fallback model SDK model option query agent model cost

The three tiers of Anthropic models

Every agent you build runs on a model. Picking the right one affects cost, speed, and output quality. Anthropic's lineup has three tiers with distinct trade-offs:

ModelUse caseTrade-offs
HaikuDevelopment, simple tasks, high-volume automationsFast and cheap, lower reasoning capability
SonnetMost production agents, client-facing deliverablesStrong reasoning, moderate cost
OpusComplex analysis, maximum capability tasksHighest quality, highest cost

Pass the model name directly to your query options:

for await (const message of query(messages(), {
  model: "claude-sonnet-4-5",  // or claude-haiku-4-5, claude-opus-4-5
  permissionMode: "bypassPermissions",
  dangerouslyAllowBypassPermissions: true,
})) {
  // handle messages
}

Sonnet is the default fallback model if you set a fallback and your primary model becomes unavailable. If you do not configure a fallback, the SDK uses Sonnet automatically in failure scenarios.

The development-to-production model pattern

Use Haiku during development. You are testing whether your agent structure works — whether tool calls fire correctly, sessions resume properly, hooks execute at the right moments. None of that depends on which model you run. Haiku costs a fraction of Sonnet and is significantly faster.

Switch to Sonnet when:

  • You are testing output quality against a real deliverable (a client brief, a competitive summary, a board update)
  • The task requires multi-step reasoning or synthesis across multiple sources
  • You are making the final deployment decision

Switch to Opus when the quality difference is visible and the client will notice it — typically complex financial analysis, nuanced strategic recommendations, or anything where a subtle error has real consequences.

For most fractional operator workflows — research, reporting, meeting prep, competitive snapshots — Sonnet handles everything. The Haiku-to-Opus progression is about cost efficiency and incremental quality gains, not a ceiling on what each model can do.

Setting a fallback model

A fallback activates when the primary model hits its budget limit or encounters a temporary failure:

for await (const message of query(messages(), {
  model: "claude-opus-4-5",
  fallbackModel: "claude-sonnet-4-5",
  maxBudgetUsd: 2.0,
  permissionMode: "bypassPermissions",
  dangerouslyAllowBypassPermissions: true,
})) {
  // handle messages
}

If the session exceeds $2 while running Opus, the SDK switches to Sonnet and continues. The session does not fail mid-task.

This matters most for long-running agents — a Monday morning brief that sweeps five client folders, or a competitive research pipeline pulling from a dozen sources. Without a fallback, a budget cap terminates the run. With one, you get a slightly lower-quality result instead of no result at all.

Switching models mid-session

You can change models during a session based on what the agent is doing. A hook that detects a specific tool call can trigger a model escalation.

The practical use case: an agent runs on Haiku for routine data gathering (reading files, pulling notes, checking statuses), then escalates to Sonnet when it hits the synthesis step — writing the actual client-facing deliverable. You pay for intelligence where it shows, not across the full run.

The implementation is covered in the hooks guide. For most operators, the simpler pattern — Haiku in development, Sonnet in production — delivers the right trade-off without the added complexity.

The operator default: Sonnet for production agents, Haiku during testing. That single pattern handles 90% of real fractional workflows: pre-call research, weekly status briefs, competitive snapshots, meeting summaries. Add Opus selectively when you need the extra reasoning depth and the client deliverable justifies the cost difference.

---

Author: FractionalSkill

Keep Going

Ready to Start Building?

Pick the next step that matches where you are right now.

Tutorial
Claude Code Basics

Start with the terminal basics. A hands-on, step-by-step guide to your first 10 minutes with Claude Code.

Start the Tutorial
Guide
AI-Powered Workflows

Automate your client work. Learn how to connect AI tools into workflows that handle repetitive tasks for you.

Read the Guide
Community
Join the Community

Connect with other fractional leaders building with AI. Share workflows, get feedback, and learn from operators who are ahead of you.

Apply to Join