Pattern
Difficulty: Beginner

Thinking Indicator

Fill the gap between sending a prompt and the first token with an accessible thinking state that's announced, respects reduced motion, and gives way to streamed text.

By Den Odell

Problem

The user sends a prompt and, for a beat, nothing changes. The model is loading context and composing its first token, but on screen the composer just emptied and the transcript sits still. That gap, usually one to three seconds and sometimes longer, is where users assume the request didn’t go through, so they send it again.

Teams that do add feedback here often reach for the same generic spinner they use for a data fetch. But a spinner says “loading,” and this isn’t loading in the usual sense: the model is working on a reply, and the feedback should say so. Worse, most of these indicators are purely visual: a screen reader user sends a prompt and hears absolute silence until a wall of text is dumped into the page.

It’s a small state that does a lot of work. It’s the handoff between the user’s action and the model’s reply, and if it reads as “nothing happened,” the whole interface feels unreliable before a single token has arrived.

Solution

The instant a prompt is sent, put a thinking indicator in the assistant’s slot in the Message Thread, and don’t wait for a network response to change the screen. Style it as its own thing (animated dots, a pulsing avatar) so it reads as “the assistant is composing,” distinct from a page-level spinner. When the first token arrives via Streaming Response, the indicator is simply replaced by the streamed content; the same message slot transitions from thinking to streaming to complete.
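
As a rough sketch of that lifecycle, the message slot can be modeled as a small reducer. The names here (`createAssistantMessage`, `applyStreamEvent`, `displayState`) and the event shapes are assumptions for illustration, not part of any library. The point is that “thinking” needs no status of its own: it is just a streaming message with no content yet.

```javascript
// Hypothetical helpers; names and event shapes are illustrative assumptions.

// Create the assistant's slot the moment the prompt is sent,
// before any network response has come back.
function createAssistantMessage(id) {
  return { id, role: 'assistant', status: 'streaming', content: '' };
}

// Fold one stream event into the message without mutating it.
function applyStreamEvent(message, event) {
  if (message.status === 'complete') return message; // ignore late events
  switch (event.type) {
    case 'token':
      return { ...message, content: message.content + event.text };
    case 'done':
      return { ...message, status: 'complete' };
    default:
      return message;
  }
}

// What the slot should render: 'thinking' | 'streaming' | 'complete'.
function displayState(message) {
  if (message.status === 'streaming' && message.content.length === 0) {
    return 'thinking';
  }
  return message.status;
}
```

Creating the slot in the submit handler, rather than in the response callback, is what lets the indicator appear immediately; the first `token` event then moves the same slot from thinking to streaming with no separate swap step.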

Make it accessible. Render the indicator inside a polite live region with visually-hidden text like “Assistant is thinking…” so it’s announced, rather than only animated. Because AI responses effectively always take long enough to warrant feedback, you usually don’t need the delay-before-showing trick a normal Loading State uses, but do provide a static, non-animated presentation under prefers-reduced-motion.

For agents that do real work before answering (searching, calling tools, reasoning), upgrade the indicator to reflect the current phase (“Searching the docs…”, “Writing…”). Status text that changes as the work moves along reassures people far more than a spinner that could mean anything, because they can see it making progress rather than sitting stuck.

Example

The examples below use React and cover the indicator with its accessible announcement, the reduced-motion treatment, the swap to streamed text, and phase-aware status for agents.

The Indicator

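function ThinkingIndicator({ label = 'Assistant is thinking' }) {
  // role="status" already implies a polite live region; aria-live is spelled out for clarity.
  // The visually hidden text is what gets announced; the dots are decoration only.
  return (
    <div className="thinking" role="status" aria-live="polite">
      <span className="sr-only">{label}…</span>
      <span className="dots" aria-hidden="true">
        <span /><span /><span />
      </span>
    </div>
  );
}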

Animated Dots, With a Reduced-Motion Fallback

The animation is decoration; users who ask for reduced motion get a static indicator instead of pulsing dots.

.sr-only {
  position: absolute; width: 1px; height: 1px;
  padding: 0; margin: -1px; overflow: hidden; clip: rect(0 0 0 0);
  white-space: nowrap; border: 0;
}

.dots { display: inline-flex; gap: 0.25rem; }
.dots span {
  width: 0.5rem; height: 0.5rem; border-radius: 50%;
  background: currentColor; opacity: 0.4;
  animation: blink 1.4s infinite both;
}
.dots span:nth-child(2) { animation-delay: 0.2s; }
.dots span:nth-child(3) { animation-delay: 0.4s; }

@keyframes blink { 0%, 80%, 100% { opacity: 0.4; } 40% { opacity: 1; } }

@media (prefers-reduced-motion: reduce) {
  .dots span { animation: none; opacity: 0.7; }
}

Swapping Thinking for Streamed Text

The indicator is just the empty-and-streaming state of an assistant message; once content exists, it’s gone.

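function AssistantMessage({ content = '', status }) {
  // Empty + streaming = still thinking; any content = show the text.
  // content defaults to '' so a slot created before any token arrives doesn't throw.
  if (status === 'streaming' && content.length === 0) {
    return <ThinkingIndicator />;
  }
  // aria-busy marks the message as still updating while tokens stream in
  return <div className="content" aria-busy={status === 'streaming'}>{content}</div>;
}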

Phase-Aware Status for Agents

When the model is doing real work before replying, name the phase; honest progress beats an indeterminate spin.

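// Decorative animated dots, reusing the .dots styles from above
function Dots() {
  return (
    <span className="dots">
      <span /><span /><span />
    </span>
  );
}

function AgentStatus({ phase }) {
  // phase updates arrive from the server: 'thinking' | 'searching' | 'writing'
  // Unknown phases fall back to a neutral label instead of a made-up one
  const label = {
    thinking: 'Thinking',
    searching: 'Searching the docs',
    writing: 'Writing the answer',
  }[phase] ?? 'Working';

  return (
    <div className="thinking" role="status" aria-live="polite">
      <span className="sr-only">{label}…</span>
      <span aria-hidden="true">{label} <Dots /></span>
    </div>
  );
}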
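
How phase updates reach the client depends on your backend. As one possible sketch, assume the stream delivers events shaped like `{ type: 'phase', phase: 'searching' }` alongside token events (that shape, and the helper names `labelFor` and `labelsToShow`, are assumptions for illustration). The helper below turns a sequence of such events into the labels a user would see, skipping consecutive repeats so the live region isn’t asked to re-announce a phase that hasn’t changed.

```javascript
// Hypothetical event shape: { type: 'phase', phase: string } | { type: 'token', text: string }
const PHASE_LABELS = new Map([
  ['thinking', 'Thinking'],
  ['searching', 'Searching the docs'],
  ['writing', 'Writing the answer'],
]);

// Unknown phases get a neutral label rather than an invented one.
function labelFor(phase) {
  return PHASE_LABELS.get(phase) ?? 'Working';
}

// Labels in the order they would be shown, with consecutive duplicates dropped.
function labelsToShow(events) {
  const labels = [];
  for (const event of events) {
    if (event.type !== 'phase') continue; // tokens and other events don't change the label
    const label = labelFor(event.phase);
    if (labels[labels.length - 1] !== label) labels.push(label);
  }
  return labels;
}
```

Only phases the server actually reports ever produce a label here, which keeps the status text tied to real work.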

Benefits

  • The dead gap before the first token stops reading as “nothing happened,” so users don’t re-send prompts.
  • A purpose-built indicator communicates “the assistant is composing,” which is more reassuring than a generic loading spinner.
  • Wrapping it in a live region means screen reader users are told the assistant is working instead of sitting in silence.
  • It’s the natural empty state of a streaming message, so it costs almost nothing to add once you’re already streaming.
  • Phase-aware status turns a long agent wait into visible progress, which keeps people waiting far more happily than a spinner that never changes.

Tradeoffs

  • An animated indicator with no real progress information can still feel like stalling if the wait runs long; phase labels help, but only if they’re honest.
  • Live-region announcements need restraint; the thinking message plus streamed tokens can double-announce if you’re not careful about which region speaks.
  • Fabricated phase labels (“Analyzing…” when nothing is being analyzed) erode trust the moment users notice they’re theater.
  • Motion needs a reduced-motion fallback, which is easy to skip and excludes users when you do.
  • The transition from indicator to first token can flash if the swap isn’t handled in the same message slot.
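
One way to keep announcements restrained is to route everything through a single polite region and decide what to say from state changes alone. The sketch below is one possible policy, not the only reasonable one: announce the thinking state once, stay silent while tokens stream, and read the finished reply once it completes. `stateOf` and `announcementFor` are hypothetical names, and the message shape (`status`, `content`) is assumed to match the examples above.

```javascript
// Hypothetical helpers; the { status, content } message shape is an assumption.
function stateOf(message) {
  if (!message) return 'none';
  if (message.status === 'streaming' && message.content.length === 0) {
    return 'thinking';
  }
  return message.status; // 'streaming' or 'complete'
}

// Returns the text for the single live region, or null to stay silent.
function announcementFor(prev, next) {
  const before = stateOf(prev);
  const after = stateOf(next);
  if (before === after) return null;              // no state change, nothing new to say
  if (after === 'thinking') return 'Assistant is thinking…';
  if (after === 'complete') return next.content;  // read the finished reply once
  return null;                                    // streaming tokens stay silent
}
```

Under this policy the visible message container would not itself be a live region, so the streamed text is spoken only once, on completion.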

Summary

The pause between sending a prompt and the first token is brief, but leaving it blank makes an AI interface feel broken. Drop a thinking indicator into the assistant’s message slot immediately, announce it through a polite live region, give reduced-motion users a static version, and let real tokens replace it as they stream. For agents that work before answering, name the phase so the wait reads as progress. It’s a tiny state, but it’s the one that tells users their message landed.
