The Query Function: Two Modes Explained
Single message mode vs streaming input mode. Why the async generator unlocks hooks, session resumption, and multi-turn conversations that plain string prompts cannot.
Two ways to call query
The query function is the entry point for every agent you build with the SDK. But it has two distinct calling modes, and the one you pick determines what features you have access to.
Single message mode — pass a plain string as the prompt:
for await (const message of query({
  prompt: "Write a haiku about Monday mornings.",
  options: { model: "claude-sonnet-4-5" },
})) {
  // handle messages
}
Clean and simple, but it locks you out of several features:
- Image uploads
- Queued or batched messages
- Interrupts
- Hooks
Streaming input mode — pass an async generator function:
async function* messages() {
  yield {
    type: "user" as const,
    message: {
      role: "user" as const,
      content: "Write a haiku about Monday mornings.",
    },
  };
}
for await (const message of query({
  prompt: messages(),
  options: { model: "claude-sonnet-4-5" },
})) {
  // handle messages
}
This unlocks the full SDK feature set: hooks, session resumption, and multi-turn conversations all work, along with the image uploads, queued messages, and interrupts that single message mode rules out.
For any agent you plan to put in production, use streaming input mode. The syntax difference is small. The capability difference is significant.
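As one illustration of what streaming input mode makes room for, here is a minimal sketch of a message body that carries an image alongside text. The imageContent helper and the local block types are hypothetical names written for this example. The block layout follows the Anthropic Messages API convention for base64 images, so check it against the SDK's exported types before relying on it.

```typescript
// Hypothetical local types, modeled on Messages API content blocks.
type TextBlock = { type: "text"; text: string };
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: "image/png"; data: string };
};

// Hypothetical helper: builds the content of one user message that
// pairs a text instruction with a base64-encoded PNG.
function imageContent(text: string, base64Png: string): [TextBlock, ImageBlock] {
  return [
    { type: "text", text },
    {
      type: "image",
      source: { type: "base64", media_type: "image/png", data: base64Png },
    },
  ];
}
```

Use the returned array in place of the plain string as the content of a yielded user message.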
> Note on TypeScript v2: A second version of the TypeScript SDK interface is in preview as of this writing. It may simplify some of this syntax in the future. Until the v2 interface is stable and production-ready, use the v1 patterns shown here.
The async generator in detail
The * after function is what makes it an async generator. Without it, the function is a regular async function, and the yield statements inside it become syntax errors before the SDK ever sees the prompt.
async function* messages() { // note the *
  yield {
    type: "user" as const,
    message: {
      role: "user" as const,
      content: "First message",
    },
  };
  yield {
    type: "user" as const,
    message: {
      role: "user" as const,
      content: "Second message",
    },
  };
}
Think of it as a conveyor belt. The SDK processes the first yielded message completely — all tool calls, all reasoning, the full response — then picks up the second. This is how you build sequential multi-message conversations without session management overhead.
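The conveyor belt behavior comes from how async generators work in the language itself: the generator body only advances when the consumer asks for the next value. The sketch below shows that with a stand-in consumer. It is not the SDK, and runDemo, DemoMessage, and consume are hypothetical names used only for this demonstration.

```typescript
// Hypothetical message shape, kept minimal for the demo.
type DemoMessage = { content: string };

async function runDemo(): Promise<string[]> {
  const log: string[] = [];

  async function* messages(): AsyncGenerator<DemoMessage> {
    log.push("generator: yielding first");
    yield { content: "First message" };
    // Execution pauses at the yield above until the consumer pulls again.
    log.push("generator: yielding second");
    yield { content: "Second message" };
  }

  // Stand-in for the SDK: finishes its work on one message before pulling the next.
  async function consume(input: AsyncIterable<DemoMessage>): Promise<void> {
    for await (const message of input) {
      log.push(`consumer: start ${message.content}`);
      await Promise.resolve(); // simulated agent work
      log.push(`consumer: done ${message.content}`);
    }
  }

  await consume(messages());
  return log;
}
```

Awaiting runDemo() returns the log in strict order: the first message is yielded, started, and finished before the generator is asked for the second.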
What for await gives you
The query function streams messages back as the agent works. The for await loop processes each one. Not every message needs handling — you pick which types you care about.
for await (const message of query({ prompt: messages(), options })) {
  switch (message.type) {
    case "system":
      // Session started: contains session ID, model, tools list
      break;
    case "assistant":
      // Claude's output: text blocks, tool calls, thinking blocks
      break;
    case "result":
      // Session ended: contains cost, duration, turn count, final result
      break;
  }
}
The three message types you will use most:
| Type | When it fires | What it contains |
|---|---|---|
| system | Once at session start | Session ID, model, available tools |
| assistant | During agent work | Text blocks, tool calls |
| result | Once at session end | Final result, cost, duration, turn count |
In practice, most agents only need the result message to get the final answer. But the system message is where you grab the session ID for conversation resumption, and the assistant messages are where you log tool usage or stream output to a UI.
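A rough sketch of handling all three message types in one pass might look like the following. The StreamMessage and ContentBlock types are simplified, hypothetical stand-ins for the SDK's message types (the real ones carry more fields), and fakeStream is a made-up generator that lets you try the handler without calling the SDK.

```typescript
// Simplified, hypothetical stand-ins for the SDK's message types.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string };
type StreamMessage =
  | { type: "system"; subtype: "init"; session_id: string }
  | { type: "assistant"; message: { content: ContentBlock[] } }
  | { type: "result"; subtype: "success"; result: string };

type Summary = { sessionId: string; toolsUsed: string[]; finalAnswer: string };

// Collects the session ID, the names of tools the agent called, and the final answer.
async function summarize(stream: AsyncIterable<StreamMessage>): Promise<Summary> {
  const summary: Summary = { sessionId: "", toolsUsed: [], finalAnswer: "" };
  for await (const message of stream) {
    switch (message.type) {
      case "system":
        summary.sessionId = message.session_id; // keep for later resumption
        break;
      case "assistant":
        for (const block of message.message.content) {
          if (block.type === "tool_use") summary.toolsUsed.push(block.name);
        }
        break;
      case "result":
        summary.finalAnswer = message.result;
        break;
    }
  }
  return summary;
}

// Fake stream standing in for query(...), for trying the handler offline.
async function* fakeStream(): AsyncGenerator<StreamMessage> {
  yield { type: "system", subtype: "init", session_id: "session-123" };
  yield { type: "assistant", message: { content: [{ type: "tool_use", name: "Read" }] } };
  yield { type: "assistant", message: { content: [{ type: "text", text: "Done." }] } };
  yield { type: "result", subtype: "success", result: "Done." };
}
```

With the fake stream above, summarize(fakeStream()) resolves to a session ID of "session-123", a single tool name "Read", and a final answer of "Done.".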
A minimal working pattern
This is the structure most production agents use as their foundation:
import { query } from "@anthropic-ai/claude-agent-sdk";
async function* messages() {
  yield {
    type: "user" as const,
    message: {
      role: "user" as const,
      content: "Your prompt here.",
    },
  };
}
let response = "";
let sessionId = "";
for await (const message of query({
  prompt: messages(),
  options: {
    model: "claude-sonnet-4-5",
    permissionMode: "bypassPermissions",
    // Required alongside bypassPermissions; only use in a trusted environment
    allowDangerouslySkipPermissions: true,
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    response = message.result;
    sessionId = message.session_id;
  }
}
console.log(response);
console.log("Session:", sessionId);
Every variation you build — with custom tools, session resumption, hooks, Slack integration — extends this pattern. The structure stays the same.
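One such variation is feeding the generator from a queue, so new user messages can be added while the agent is still running. The MessageQueue class below is a hypothetical helper written for this example and is not part of the SDK. It is a minimal sketch of the idea, without error handling or backpressure.

```typescript
// Result of waiting on the queue: either an item or a signal that the queue closed.
type Slot<T> = { done: false; value: T } | { done: true };

// Hypothetical helper: a push queue that can be consumed as an async generator.
class MessageQueue<T> {
  private items: T[] = [];
  private waiters: Array<(slot: Slot<T>) => void> = [];
  private closed = false;

  // Add a message; hand it straight to a waiting consumer if there is one.
  push(item: T): void {
    if (this.closed) throw new Error("queue is closed");
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter({ done: false, value: item });
    } else {
      this.items.push(item);
    }
  }

  // Signal that no more messages will arrive.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ done: true });
    }
  }

  private async next(): Promise<Slot<T>> {
    if (this.items.length > 0) {
      return { done: false, value: this.items.shift() as T };
    }
    if (this.closed) {
      return { done: true };
    }
    return new Promise<Slot<T>>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Yields buffered items first, then waits for new ones until close() is called.
  async *stream(): AsyncGenerator<T> {
    while (true) {
      const slot = await this.next();
      if (slot.done) return;
      yield slot.value;
    }
  }
}
```

The intended use is to pass queue.stream() as the prompt, call queue.push(...) whenever a new user message arrives, and call queue.close() to end the conversation.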
---
Author: FractionalSkill