Building Observable AI Workflows with Custom Status Lines
Here's a problem nobody talks about with AI-assisted development: when AI agents are doing work in the background, you can't see what's happening. You don't know if they're making progress, if they're stuck, if they're burning through your API budget, or if they've silently failed.
This is an observability problem. And like all observability problems, the solution is good instrumentation. So I built a custom status line for Claude Code.
What the Status Line Shows
The status line is a two-line display that sits at the bottom of my terminal. Here's what it looks like:
┃ feat/blog ┃ ✓ CI ┃ ● 2 reviews ┃ opus ┃ ctx: 45% ┃ 12.4k tok ┃ $0.82 ┃
┃ 🔥 $4.20/hr ┃ api: 12% wait ┃ plan: 78% (resets 14h) ┃ 3 agents ┃
Let me break down each metric and why it matters.
Git Branch
Simple but essential. When you're switching between features and agents are working on different branches, you need to know which branch is active at a glance. No more typing git branch every five minutes.
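Fetching the branch is a quick subprocess call. A minimal sketch in Python, assuming git is on PATH and the hook runs inside the repo:

```python
import subprocess

def current_branch() -> str:
    """Return the checked-out branch name, or '?' if unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            capture_output=True, text=True, check=True, timeout=2,
        )
        return out.stdout.strip()
    except Exception:
        return "?"  # detached HEAD, not a repo, git missing, etc.
```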
CI Status
Shows whether CI is passing (✓), failing (✗), or running (◌) for the current branch. When agents are making commits, CI status tells you if their work is actually valid without having to open GitHub.
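The gh CLI's run listing supports JSON output, which makes the symbol mapping straightforward. A sketch, assuming GitHub Actions is the CI and gh is authenticated:

```python
import json
import subprocess

def ci_symbol(branch: str) -> str:
    """Map the newest workflow run on `branch` to ✓ / ✗ / ◌."""
    try:
        out = subprocess.run(
            ["gh", "run", "list", "--branch", branch, "--limit", "1",
             "--json", "status,conclusion"],
            capture_output=True, text=True, check=True, timeout=5,
        )
        runs = json.loads(out.stdout)
    except Exception:
        return "?"
    if not runs:
        return ""          # branch has no CI runs yet
    if runs[0]["status"] != "completed":
        return "◌"         # queued or in progress
    return "✓" if runs[0]["conclusion"] == "success" else "✗"
```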
PR Review State
Shows the number of pending reviews on open PRs. This tells me when something needs my attention without constantly checking GitHub notifications.
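"Pending" is open to interpretation; here's one reading with gh, counting my open PRs where a review is still required:

```python
import json
import subprocess

def pending_review_count() -> int:
    """Count my open PRs whose review decision is still outstanding."""
    try:
        out = subprocess.run(
            ["gh", "pr", "list", "--author", "@me",
             "--json", "reviewDecision"],
            capture_output=True, text=True, check=True, timeout=5,
        )
        prs = json.loads(out.stdout)
    except Exception:
        return 0
    # reviewDecision is REVIEW_REQUIRED, CHANGES_REQUESTED, APPROVED,
    # or empty; here "pending" means a review is still required.
    return sum(1 for pr in prs
               if pr.get("reviewDecision") == "REVIEW_REQUIRED")
```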
Active Model
Shows which AI model is currently active (opus, sonnet, haiku). Since I route different task types to different models, this confirms the right model is being used for the current work.
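Claude Code hands the status-line command a session snapshot as JSON on stdin. A sketch of pulling the model name out of it; the field layout here is my assumption, so verify against what your version actually emits:

```python
import json
import sys

# Session snapshot arrives as JSON on stdin. The "model" /
# "display_name" field names are an assumption about the payload
# shape -- check your Claude Code version's actual output.
session = json.load(sys.stdin)
model = session.get("model", {}).get("display_name", "unknown")
print(model.lower())  # e.g. "opus"
```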
Context Window Usage
This is one of the most important metrics. Context window usage directly affects AI output quality. When it's high (above 70%), I know it's time to compact or start a fresh session. Letting context get too full leads to degraded responses - the AI starts forgetting earlier instructions.
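In the renderer, that 70% rule becomes a colour threshold. A sketch using plain ANSI codes; the cut-offs are my own choices:

```python
def ctx_segment(used_pct: float) -> str:
    """Colour the context-usage segment; red past the compaction point."""
    if used_pct >= 70:        # the threshold discussed above
        colour = "\033[31m"   # red: compact or start a fresh session
    elif used_pct >= 50:
        colour = "\033[33m"   # yellow: worth watching
    else:
        colour = "\033[32m"   # green: plenty of headroom
    return f"{colour}ctx: {used_pct:.0f}%\033[0m"
```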
Token Count
Running count of tokens used in the current session. Useful for estimating costs and understanding how much work has been done.
Session Cost
Real-time cost tracking for the current session. This isn't about being cheap - it's about understanding the cost-efficiency of different approaches. If a task is taking $5 in tokens, I want to know that before I repeat the same approach on 20 more tasks.
Burn Rate
Cost per hour. This is the metric that made me rethink how I work. Early on, I had sessions burning $15/hour because I was using Opus for everything, including simple lint runs. Seeing the burn rate in real-time motivated me to implement model routing, which dropped my average burn rate by about 60%.
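The calculation is just cumulative spend over elapsed wall time. A sketch, assuming the session's running cost and start time are available (treat the exact payload fields as an assumption):

```python
import time

def burn_rate(total_cost_usd: float, session_start: float) -> float:
    """Dollars per hour: cumulative spend over elapsed wall time."""
    hours = (time.time() - session_start) / 3600
    return total_cost_usd / hours if hours > 0 else 0.0

# e.g. $0.82 spent over ~700 seconds:
# burn_rate(0.82, time.time() - 700)  ->  ~4.22 ($/hr)
```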
API Wait Ratio
Percentage of time spent waiting for API responses vs. active processing. High wait ratios (above 30%) indicate bottlenecks. This might mean I need to parallelise differently, or that the API is having a slow period.
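The hard part is attributing "waiting"; the bookkeeping itself is trivial. A sketch, assuming each API round trip gets timed individually:

```python
import time

class WaitTracker:
    """Accumulate time blocked on API calls against total wall time."""

    def __init__(self) -> None:
        self.start = time.monotonic()
        self.waited = 0.0

    def record(self, wait_seconds: float) -> None:
        """Call with the duration of each completed API round trip."""
        self.waited += wait_seconds

    def ratio(self) -> float:
        """Fraction of session time spent waiting on the API."""
        elapsed = time.monotonic() - self.start
        return self.waited / elapsed if elapsed > 0 else 0.0
```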
Plan Usage
Shows how much of my Claude plan allocation has been used and when it resets. As a Max plan subscriber, I'm not worried about hitting limits, but it's useful to see usage patterns over time.
Active Agents
Count of currently running background agents. This tells me the level of parallelism at any moment. If it drops to zero when I expected work to be happening, something's wrong.
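A sketch of the process count via pgrep, with no external dependencies; the match pattern is a placeholder for whatever your background agents actually run as:

```python
import subprocess

def agent_count(pattern: str = "claude-agent") -> int:
    """Count running processes whose command line matches `pattern`.

    "claude-agent" is a hypothetical name -- substitute the command
    your background agents are actually launched with.
    """
    try:
        out = subprocess.run(
            ["pgrep", "-f", pattern],
            capture_output=True, text=True, timeout=2,
        )
    except Exception:
        return 0
    # pgrep prints one PID per line and exits non-zero on no match.
    return len(out.stdout.split()) if out.returncode == 0 else 0
```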
Why Observability Matters for AI Workflows
In traditional development, you have direct visibility into your work. You see the code as you type it. You see tests pass or fail. You see the build output.
With AI agents, all of that happens in the background. Without observability, you're flying blind. You might:
- Wait 20 minutes for an agent that silently errored out after 30 seconds
- Burn through your API budget on a task that should have been routed to a cheaper model
- Let context window fill up until AI quality degrades, then wonder why output got worse
- Miss that CI is failing on the commits agents are making
The status line prevents all of these by making the invisible visible.
Implementation Approach
The status line is built as a Claude Code hook that updates on a regular interval. It pulls data from multiple sources:
- Git data - via git commands (branch, status)
- GitHub data - via the gh CLI (CI status, PR reviews)
- Claude Code internals - via the session API (model, context, tokens, cost)
- Process data - via process monitoring (active agents)
The rendering is pure terminal escape codes - no external dependencies. It writes directly to the terminal status line area, so it doesn't interfere with the conversation output.
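The render step is then just joining pre-coloured segments with the box-drawing separator. A sketch, assuming each segment arrives already formatted by the helpers above:

```python
def render_status(rows: list[list[str]]) -> str:
    """Join pre-coloured segments into the two-line display shown above."""
    sep = " ┃ "  # heavy vertical bar separator
    return "\n".join(f"┃ {sep.join(row)} ┃" for row in rows)

# Example with two rows of already-formatted segments:
print(render_status([
    ["feat/blog", "✓ CI", "opus", "ctx: 45%"],
    ["🔥 $4.20/hr", "api: 12% wait", "3 agents"],
]))
```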
Design Principles
A few principles I followed when building this:
- Glanceable. Every metric should be understandable in under one second. No labels longer than 5 characters.
- Actionable. Every metric should tell you if something needs your attention. Green/red colouring for CI, percentage thresholds for context, etc.
- Non-intrusive. The status line should never interrupt your work. No pop-ups, no sounds (those come from the voice hooks), no forced attention.
- Honest. Show real numbers, not estimates. The cost display reflects actual API spend, not a projection.
If you're working with AI agents in any capacity, I'd strongly recommend building some form of observability - even if it's just context window usage and cost tracking. The visibility changes how you work.