Deep Thinking in Claude Code

When to use extended thinking, how ultrathink works, and which tasks actually benefit from longer reasoning time.


---
title: How Claude Code's extended thinking actually works
description: Extended thinking is on by default in Claude Code. Here's how to control it, when to push it harder, and when to leave it alone.
author: FractionalSkill
---

How Claude Code's extended thinking actually works

Most operators assume Claude Code works like a chatbot -- you send a prompt, it generates a response. The mental model is linear: prompt in, answer out.

The real process has an extra step most people never see.

Before responding to anything, Claude Code performs internal reasoning. It works through the problem, considers approaches, checks for gaps, and then writes the response. That reasoning phase is extended thinking. It runs by default on every session. You don't have to turn it on.

What you can control is how deep that reasoning goes -- and when to push it much harder than the default.

What extended thinking is and what it costs you

Extended thinking is Claude's internal scratchpad. Before it writes a single word of its response, it reasons through the problem in a separate pass. That reasoning is what makes it possible for Claude to catch mistakes before they appear in the output, weigh tradeoffs between approaches, and produce a plan that actually holds together.

On current models (Opus 4.6 and Sonnet 4.6), this reasoning is adaptive. Claude allocates more or less thinking based on how complex the task appears. A simple question about a file gets light thinking. A multi-step architectural decision gets more. The unit Anthropic uses to measure this reasoning work is "thinking tokens" -- essentially the computational cost of that internal reasoning pass, billed separately from the response you read.

You're charged for those thinking tokens whether you see them or not. That's the tradeoff worth knowing upfront.

By default, thinking appears as a collapsed stub in the UI. You can see the full internal reasoning by pressing Ctrl+O to toggle verbose mode -- the thinking shows up as gray italic text above Claude's response. For most sessions, you won't need to read it. But when Claude produces something unexpected, opening verbose mode lets you trace exactly where its reasoning went wrong.

> What you're actually paying for: Thinking tokens are billed even when the summary is collapsed. If you're running long sessions on complex work, those tokens add up. You can limit the budget with the MAX_THINKING_TOKENS environment variable, or disable thinking entirely with Option+T (macOS) or Alt+T (Windows/Linux).
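As a minimal sketch of the budget cap mentioned above, the variable can be exported in the shell that launches Claude Code. This assumes the variable is spelled `MAX_THINKING_TOKENS` (with underscores) and that your version and model honor it; the value 8000 is purely illustrative, not a recommendation.

```shell
# Cap the thinking budget for sessions launched from this shell.
# 8000 is an arbitrary illustrative value (assumption), not a recommended setting.
export MAX_THINKING_TOKENS=8000

# Confirm what child processes will inherit.
echo "MAX_THINKING_TOKENS=$MAX_THINKING_TOKENS"

# Then start Claude Code as usual from the same shell:
# claude
```

Because the variable is exported only in the current shell, closing the terminal or running `unset MAX_THINKING_TOKENS` returns you to the default behavior.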

The one keyword that actually changes reasoning depth

Here's a distinction the official documentation makes clearly, and that most people using Claude Code get wrong.

Phrases like "think hard," "think more carefully," or "reason through this" do not change how much thinking Claude allocates. They're read as regular prompt instructions -- natural language that Claude interprets, not a mechanism that adjusts the thinking token budget. Claude will sound more deliberate, but it isn't actually doing more reasoning work under the hood.

The word that does work is: ultrathink

Include ultrathink anywhere in your prompt, and Claude Code sets its effort level to high for that turn. On Opus 4.6 and Sonnet 4.6, that means the model dynamically allocates significantly more thinking tokens before it responds. It's documented in the official Claude Code reference as a single-turn override that doesn't permanently change your effort setting.

You: ultrathink a migration plan for moving our client reporting 
     system from manual spreadsheets to automated data pulls. 
     Current state: 4 clients, 3 different Excel templates, 
     one shared Google Sheet with version conflicts.

This is different from asking Claude to "think carefully." The ultrathink keyword is mechanically recognized -- it changes resource allocation, not just tone.

The practical difference shows up on hard problems. Complex architectural decisions, debugging sessions where previous attempts have failed, and multi-client workflow designs where the wrong choice creates rework across several engagements are all situations where the deeper reasoning pass produces meaningfully better output.

> Use ultrathink selectively. It costs more and takes longer. For routine tasks -- drafting a client update, reformatting a document, running a search -- default thinking is appropriate. Save the keyword for decisions that are hard to reverse.

How to configure thinking depth across a session

Beyond the single-turn ultrathink override, Claude Code gives you several ways to tune reasoning depth across an entire session or globally.

Effort levels control thinking depth for the current session on Opus 4.6 and Sonnet 4.6. Run /effort inside a session to adjust, or set the CLAUDE_CODE_EFFORT_LEVEL environment variable before launching.

| Effort level | What it does | When to deploy it |
| --- | --- | --- |
| Low | Minimal thinking allocation | Drafts, reformatting, low-stakes queries |
| Default | Adaptive reasoning | Most everyday work |
| High | Maximum thinking allocation | Architecture, planning, complex debugging |
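The environment-variable route might look like the sketch below. It assumes the variable is spelled `CLAUDE_CODE_EFFORT_LEVEL` (with underscores) and that lowercase level names such as `high` are accepted values; `claude_effort` is a hypothetical helper name, not part of Claude Code.

```shell
# Hypothetical helper: start one session at a given effort level
# without changing the environment of the parent shell.
claude_effort() {
  level="$1"
  shift
  # The assignment prefix applies only to this single claude invocation.
  CLAUDE_CODE_EFFORT_LEVEL="$level" claude "$@"
}

# Example (assumes "high" is an accepted value):
# claude_effort high
```

Scoping the variable to one invocation keeps the default effort level in place for every other session you open from the same terminal.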

/config lets you toggle thinking mode globally. If you disable thinking here, it stays off until you re-enable it. This persists in ~/.claude/settings.json as alwaysThinkingEnabled.

showThinkingSummaries: true in your project's settings.json expands the thinking display by default, so you see the full reasoning in every session without having to toggle verbose mode manually. Useful when you're debugging a complex engagement workflow and want to track Claude's reasoning at each step.
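A project-level settings file with that key could be sketched as follows. The `.claude/settings.json` location inside the project is an assumption here, and the sketch writes into a scratch directory so it cannot overwrite a real file; in an existing project you would merge the key into the settings you already have.

```shell
# Create a scratch "project" so this sketch never touches real settings.
demo_project="$(mktemp -d)"
mkdir -p "$demo_project/.claude"

# Minimal settings file containing only the key discussed above.
# The .claude/settings.json path is an assumption about project layout.
cat > "$demo_project/.claude/settings.json" <<'EOF'
{
  "showThinkingSummaries": true
}
EOF

# Show the result.
cat "$demo_project/.claude/settings.json"
```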

For multi-client work, the most practical setup is to leave effort at default, use ultrathink for the moments that genuinely warrant deep reasoning, and keep verbose mode off until something goes wrong.

Pairing extended thinking with plan mode

Extended thinking compounds when combined with plan mode. This is where the feature earns its place in a fractional workflow.

Plan mode (activated with Shift+Tab twice, or --permission-mode plan at startup) prevents Claude from writing or editing any files. It can only read and reason. When you add ultrathink to a prompt in plan mode, Claude spends its full reasoning capacity producing a plan -- no file edits, no premature execution, no assumptions written into your project before you've reviewed them.
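If you start most planning sessions from the terminal, the startup flag can be wrapped in a small shell function. This is a sketch: `plan_session` is a hypothetical name, and the only Claude Code option it relies on is the `--permission-mode plan` flag described above.

```shell
# Hypothetical wrapper: open Claude Code directly in plan mode.
# Any extra arguments are passed through to claude unchanged.
plan_session() {
  claude --permission-mode plan "$@"
}

# Example: start a read-only planning session in the current project.
# plan_session
```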

For client engagements where a wrong architectural decision creates days of cleanup, this two-pass workflow is worth the extra time:

1. Start in plan mode
2. Prompt with ultrathink and a specific, bounded problem
3. Read the plan
4. Adjust it before Claude touches a single file
5. Switch to regular mode to execute

You: ultrathink a plan to consolidate our three client onboarding 
     workflows into one repeatable process. Each client currently 
     has separate folders, separate templates, and different 
     naming conventions. We want one structure that works for all.

Claude will read your file structure, reason through the consolidation, and produce a written plan with tradeoffs before making any changes. The ultrathink keyword means it won't take shortcuts in that reasoning phase.

> For new or complex engagements: Run plan mode with ultrathink on the first session when you're picking up an unfamiliar codebase or designing a new workflow. You want the full reasoning pass before Claude starts writing anything.

What to check when thinking doesn't help

Extended thinking improves output on hard problems. It doesn't fix problems caused by missing context.

If Claude produces a weak plan even after ultrathink, the issue is usually one of these:

  • The prompt is underspecified. Claude can reason deeply on a vague question and still produce a vague answer. Give it the details that matter: current state, desired outcome, and the specific constraints your client has.
  • The right files aren't in scope. Claude's thinking operates on what it can read. Use @filename to pull specific files into the conversation before asking for a plan.
  • The effort level was overridden. Check whether MAX_THINKING_TOKENS is set too low in your environment, or whether someone disabled thinking in the global config. Run /config to verify the current state.
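The environment side of that last check can be done from the terminal before you open a session. The sketch below assumes the underscore spellings `MAX_THINKING_TOKENS` and `CLAUDE_CODE_EFFORT_LEVEL`, and that the global toggle is stored under `alwaysThinkingEnabled` in `~/.claude/settings.json`; `thinking_report` is a hypothetical helper name.

```shell
# Hypothetical helper: print the thinking-related settings named in this article.
thinking_report() {
  for var in MAX_THINKING_TOKENS CLAUDE_CODE_EFFORT_LEVEL; do
    # Indirect lookup via eval keeps this portable to plain sh.
    eval "value=\${$var:-}"
    if [ -n "$value" ]; then
      echo "$var=$value"
    else
      echo "$var is not set"
    fi
  done

  settings="$HOME/.claude/settings.json"
  if [ -f "$settings" ]; then
    # Show the global toggle if it has been written to the settings file.
    grep '"alwaysThinkingEnabled"' "$settings" \
      || echo "alwaysThinkingEnabled not found in $settings"
  else
    echo "no settings file at $settings"
  fi
}

thinking_report
```

An unexpectedly low `MAX_THINKING_TOKENS` value or a disabled global toggle in this output is a likely explanation for shallow plans.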

The reasoning capacity is only as useful as the inputs it has to work with. Extended thinking amplifies good context. It doesn't substitute for it.
