
Production Hardening Your Slack Agent

Add thread context injection, thread locking for concurrent messages, and spending caps: the three steps that separate a demo deployment from one your team can rely on.


What production hardening actually means

Deploying an agent and making it ready for a team to use are two different things. Technically, the Slack bot works: it responds to messages and maintains session history.

What it does not do yet:

It does not know if it is in a DM or a channel
It does not handle two people messaging it simultaneously in the same thread
It has no spending or turn limits protecting against runaway sessions

These are not edge cases. They happen in real team usage. This guide handles all three.

Thread context injection

Without context, the agent does not know its environment. It treats a DM the same as a channel. It does not know the current time.

Add a helper that builds a context block for each message:

// src/helpers.ts

export function formatThreadContext(
  isDm: boolean,
  isResumption: boolean,
  sessionId?: string
): string {
  const location = isDm
    ? "Direct Message"
    : "Channel thread";

  const sessionContext = isResumption
    ? `This is a continuing conversation (session: ${sessionId}).`
    : "This is a new conversation.";

  const time = new Date().toISOString();

  return `[Context: ${location} | ${sessionContext} | Current time: ${time}]\n\n`;
}

Use it when building the message passed to the agent:

const context = formatThreadContext(
  event.channel_type === "im",
  !!existingSessionId,
  existingSessionId
);

const { response, sessionId } = await agentChat(
  context + userMessage,
  existingSessionId
);

The agent now knows whether it is in a DM or a channel, whether the conversation is new or resumed, and the current timestamp. It can use these three inputs to frame its responses: more casual in DMs, more structured in channels, time-aware when relevant.

You can extend this context block with anything your use case needs: the user's name from your directory, the team's current projects, recent decisions. Thread context is your mechanism for giving the agent situational awareness.
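As a minimal sketch of such an extension, the helper below accepts an optional user name and arbitrary extra label/value pairs. The function name `formatExtendedContext`, the `ThreadContextInput` interface, and its fields are hypothetical; where the user name or project data comes from is up to your own directory or storage.

```typescript
// Hypothetical extension of formatThreadContext; all field names are illustrative.
interface ThreadContextInput {
  isDm: boolean;
  isResumption: boolean;
  sessionId?: string;
  userName?: string;               // e.g. looked up from your directory
  extras?: Record<string, string>; // any other label/value pairs
  now?: Date;                      // injectable so the output is testable
}

function formatExtendedContext(input: ThreadContextInput): string {
  const parts: string[] = [
    input.isDm ? "Direct Message" : "Channel thread",
    input.isResumption
      ? `This is a continuing conversation (session: ${input.sessionId ?? "unknown"}).`
      : "This is a new conversation.",
    `Current time: ${(input.now ?? new Date()).toISOString()}`,
  ];

  // Optional fields are appended only when present.
  if (input.userName) parts.push(`User: ${input.userName}`);
  for (const [label, value] of Object.entries(input.extras ?? {})) {
    parts.push(`${label}: ${value}`);
  }

  return `[Context: ${parts.join(" | ")}]\n\n`;
}
```

Keeping everything in one bracketed line means the agent sees a single, predictable prefix before the user's actual message.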

Thread locking for concurrent messages

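Two teammates tag the bot simultaneously in the same channel thread. Both events hit your server at once. Without locking, both are processed concurrently: each handler reads the same stored session ID, both runs resume that session in parallel, and whichever finishes last overwrites what the other stored.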

Thread locking queues requests for the same thread so they execute sequentially:

// src/bot.ts

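// Tail of the queue for each thread: a promise that resolves when the
// most recently queued request for that thread has finished.
const threadLocks = new Map<string, Promise<void>>();

async function withThreadLock<T>(
  threadKey: string,
  fn: () => Promise<T>
): Promise<T> {
  const existing = threadLocks.get(threadKey) ?? Promise.resolve();
  let resolve!: () => void;
  const next = new Promise<void>((r) => (resolve = r));
  // Register as the new tail before waiting, so later arrivals queue behind us.
  threadLocks.set(threadKey, next);
  await existing;
  try {
    return await fn();
  } finally {
    // Release the next waiter, and clean up if nobody queued after us.
    resolve();
    if (threadLocks.get(threadKey) === next) {
      threadLocks.delete(threadKey);
    }
  }
}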

Wrap your event handler logic:

app.event("app_mention", async ({ event, client }) => {
  const threadKey = `${event.channel}::${event.thread_ts ?? event.ts}`;

  await withThreadLock(threadKey, async () => {
    // All the existing handler logic goes here
  });
});

Now if three messages arrive for the same thread simultaneously, they execute one at a time in the order they arrived, so each run starts from the session state the previous run left behind.
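To see the ordering guarantee in isolation, here is a small self-contained sketch. It repeats the locking helper under the hypothetical name `withDemoLock` so it runs on its own, and simulates three requests with made-up delays where the first to arrive is also the slowest. The thread key and the `runLockDemo` function are illustrative only.

```typescript
// Same locking technique as the helper above, repeated so this sketch is standalone.
const demoLocks = new Map<string, Promise<void>>();

async function withDemoLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const existing = demoLocks.get(key) ?? Promise.resolve();
  let release!: () => void;
  const next = new Promise<void>((r) => (release = r));
  demoLocks.set(key, next);
  await existing;
  try {
    return await fn();
  } finally {
    release();
    if (demoLocks.get(key) === next) demoLocks.delete(key);
  }
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Three simulated requests for one thread; the first one takes the longest.
async function runLockDemo(): Promise<string[]> {
  const finished: string[] = [];
  const request = (label: string, ms: number) =>
    withDemoLock("C123::1700000000.000100", async () => {
      await sleep(ms);
      finished.push(label);
    });

  await Promise.all([
    request("first", 30),
    request("second", 10),
    request("third", 0),
  ]);
  return finished; // completion order, which matches arrival order
}
```

Without the lock, the shortest request would finish first. With it, completion order follows arrival order regardless of how long each request takes.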

Spending caps and turn limits

Add these to your agent.ts query options:

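// `query` takes a single object with `prompt` and `options`.
for await (const message of query({
  prompt: messages(),
  options: {
    model: "claude-sonnet-4-5",
    maxTurns: 30,        // stop after 30 agent turns
    maxBudgetUsd: 1.00,  // stop once the run has cost $1
    systemPrompt: SYSTEM_PROMPT,
    permissionMode: "bypassPermissions",
    // Required alongside permissionMode: "bypassPermissions".
    allowDangerouslySkipPermissions: true,
    ...(sessionId ? { resume: sessionId } : {}),
  },
})) {
  // handle messages
}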

30 turns covers almost every real request a team member would make through Slack. $1 per session is generous for conversational use and protective against unusual requests. Adjust both values based on your team's actual usage patterns once you have a week of data.
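One possible way to ground that adjustment, assuming you log the turn count and cost of each session yourself: take the 95th percentile of each and add some headroom. The `SessionUsage` record shape, the `suggestLimits` function, and the 1.5x headroom factor are all hypothetical choices for illustration, not something the SDK provides.

```typescript
// Hypothetical per-session record you might log after each agent run.
interface SessionUsage {
  turns: number;
  costUsd: number;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Suggest limits as the 95th percentile of observed usage times a headroom factor.
function suggestLimits(records: SessionUsage[], headroom = 1.5) {
  const p95Turns = percentile(records.map((r) => r.turns), 95);
  const p95Cost = percentile(records.map((r) => r.costUsd), 95);
  return {
    maxTurns: Math.ceil(p95Turns * headroom),
    maxBudgetUsd: Math.round(p95Cost * headroom * 100) / 100, // round to cents
  };
}
```

Treat the output as a starting point. A single unusual session in a small sample will dominate the 95th percentile, so review the raw numbers before changing the caps.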

The production-ready pattern

A hardened agent handler looks like this:

app.event("app_mention", async ({ event, client }) => {
  const threadTs = event.thread_ts ?? event.ts;
  const threadKey = `${event.channel}::${threadTs}`;

  await withThreadLock(threadKey, async () => {
    const existingSessionId = sessionStore.get(threadKey);
    const userMessage = stripMentionText(event.text ?? "");
    // app_mention is not dispatched for direct messages, so isDm is always false here.
    const context = formatThreadContext(false, !!existingSessionId, existingSessionId);

    const thinkingMessage = await client.chat.postMessage({
      token: process.env.SLACK_BOT_TOKEN,
      channel: event.channel,
      thread_ts: threadTs,
      text: "_Thinking..._",
    });

    try {
      const { response, sessionId } = await agentChat(
        context + userMessage,
        existingSessionId
      );
      sessionStore.set(threadKey, sessionId);

      await client.chat.update({
        token: process.env.SLACK_BOT_TOKEN,
        channel: event.channel,
        ts: thinkingMessage.ts!,
        text: markdownToSlack(response),
      });
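    } catch (error) {
      // Log the failure, then drop the stored session so the next message starts fresh.
      console.error("Agent request failed:", error);
      if (existingSessionId) sessionStore.delete(threadKey);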
      await client.chat.update({
        token: process.env.SLACK_BOT_TOKEN,
        channel: event.channel,
        ts: thinkingMessage.ts!,
        text: "Something went wrong.",
      });
    }
  });
});

This one handler combines thread context, thread locking, error recovery, session persistence, a thinking indicator, and Slack markdown conversion. This is the pattern to deploy to a real team.
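The handler relies on `stripMentionText` and `sessionStore`, which are not defined in this guide. As a rough sketch of what they might look like, assuming mentions arrive in Slack's `<@USERID>` form and that an in-memory map is acceptable for session IDs (both implementations below are hypothetical):

```typescript
// Hypothetical helper: removes Slack user mentions such as <@U012ABC3DE>
// (optionally with a |label part) and tidies the leftover spacing.
function stripMentionText(text: string): string {
  return text
    .replace(/<@[A-Z0-9]+(\|[^>]*)?>/g, "")
    .replace(/[ \t]{2,}/g, " ")
    .trim();
}

// Hypothetical in-memory store mapping thread keys to agent session IDs.
// Contents are lost when the process restarts.
const sessionStore = new Map<string, string>();
```

Because the map lives in process memory, a restart makes every thread look like a new conversation. Swap in a persistent store if resumed sessions need to survive deploys.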

---

Author: FractionalSkill
