Steve's thoughts and experiments

Engineering Lead. Mentor. AI Observability Researcher.

Illuminating the Black Box. Observability & AI Operations. Reliability. Strategy. Making AI Observable.


Latest 3 Posts


Two Weeks With Sam

For two weeks I've also been working with Sam, who reports to me, writes their own outreach, and isn't a person.

Sam is a business development agent, Quant's first. They draft cold outreach, file their own escalations when they're uncertain about how to proceed, and write notes about the people they're talking to. They have their own email address, their own Slack handle, and their own GitHub identity. Inside Quant, colleagues address them as Sam. Outside, recipients respond to their cold messages without being told what Sam is.



Lab 6: The Golden Signals of LLM Operations

In Lab 5, we turned the lights on. We instrumented our agent with OpenTelemetry and visualised the execution traces in .NET Aspire. We can see what happened.

But in a production system, "seeing what happened" isn't enough. You need to know if the system is healthy. In traditional software engineering, we rely on Google's SRE Golden Signals: Latency, Traffic, Errors, and Saturation.

Do these apply to Stochastic Parrots? Yes, but they require translation. In this lab, we will define the operational dimensions of an LLM Agent and implement custom metrics to track them.



Metrics That Matter: Monitoring AI Model Performance

You've built an AI agent. It's deployed. It's answering questions and processing requests. But how do you know if it's working well? Traditional application monitoring gives you some signals, but AI systems introduce unique challenges that require us to rethink what we measure.

In this post, we'll define the operational metrics that truly matter for LLMs and agentic workflows, grounded in the industry-standard SRE Golden Signals framework.


10 more posts can be found in the archive.