observability

Expose build metadata for debugging deploys

When you’re on call, you eventually ask: “what version is running?” I expose a tiny /version endpoint that returns build metadata derived from debug.ReadBuildInfo plus a few variables set at build time. The goal isn’t perfect SBOMs; it’s fast debugging context during an incident.
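
A minimal sketch of what that endpoint can look like, assuming the usual ldflags trick; the variable names (version, commit) and the port are illustrative:

```go
package main

import (
	"encoding/json"
	"net/http"
	"runtime/debug"
)

// Stamped at build time, e.g.:
//   go build -ldflags "-X main.version=v1.2.3 -X main.commit=$(git rev-parse --short HEAD)"
// Variable names are illustrative.
var (
	version = "dev"
	commit  = "unknown"
)

func versionHandler(w http.ResponseWriter, r *http.Request) {
	info := map[string]string{"version": version, "commit": commit}
	// ReadBuildInfo works in module-aware builds; vcs.* settings are
	// stamped automatically when building from a checkout.
	if bi, ok := debug.ReadBuildInfo(); ok {
		info["go"] = bi.GoVersion
		for _, s := range bi.Settings {
			if s.Key == "vcs.revision" || s.Key == "vcs.time" {
				info[s.Key] = s.Value
			}
		}
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(info)
}

func main() {
	http.HandleFunc("/version", versionHandler)
	http.ListenAndServe(":8080", nil)
}
```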

tracing for structured logging and distributed tracing

The tracing crate is the modern standard for instrumentation in async Rust. It provides structured logging with spans (representing units of work) and events (point-in-time records). Spans can be nested, creating a tree that represents causality. I instrument functions with the #[instrument] attribute so spans are entered and exited automatically, even across .await points.
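
A small sketch of the shape this takes; the crate features (env-filter and json for tracing-subscriber) and the field names are assumptions:

```rust
// Cargo.toml (assumed): tracing = "0.1",
// tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
use tracing::{info, instrument};
use tracing_subscriber::EnvFilter;

// #[instrument] opens a span per call; skip(payload) keeps the raw
// argument out of the span while still recording its length as a field.
#[instrument(skip(payload), fields(payload_len = payload.len()))]
fn handle_request(user_id: u64, payload: &str) {
    info!("handling request"); // event attached to the current span
    parse(payload);            // nested call creates a child span
}

#[instrument(skip(payload))]
fn parse(payload: &str) {
    info!(bytes = payload.len(), "parsed payload");
}

fn main() {
    // JSON output with a RUST_LOG-style filter.
    tracing_subscriber::fmt()
        .json()
        .with_env_filter(EnvFilter::from_default_env())
        .init();

    handle_request(42, r#"{"k":"v"}"#);
}
```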

Grafana dashboards as code with JSON provisioning

Grafana visualizes Prometheus metrics through configurable dashboards. Dashboard JSON models define panels, queries, and layouts programmatically. Panel types include timeseries for graphs, stat for single values, table for tabular data, and gauge for values tracked against thresholds. Provisioning loads these JSON files from disk, so dashboards live in version control alongside the services they describe.
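
A pared-down dashboard model to show the shape; since JSON allows no comments, treat everything here (metric names, uid, schemaVersion) as illustrative. Dropped into a provisioned dashboards directory, Grafana loads it at startup:

```json
{
  "title": "Service Overview",
  "uid": "service-overview",
  "schemaVersion": 39,
  "panels": [
    {
      "type": "timeseries",
      "title": "Request rate",
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
      "targets": [
        {
          "expr": "sum(rate(http_requests_total[5m])) by (status)",
          "legendFormat": "{{status}}"
        }
      ]
    },
    {
      "type": "stat",
      "title": "p99 latency",
      "gridPos": { "h": 8, "w": 6, "x": 12, "y": 0 },
      "targets": [
        {
          "expr": "histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))"
        }
      ]
    }
  ]
}
```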

Request ID + structured logging (Express + pino)

Debugging distributed requests is miserable without a stable request id, so I generate it once at the edge, echo it back via x-request-id, and attach it to every log line with a child logger. I keep this intentionally boring because it delivers 80% of the value of full distributed tracing at a fraction of the setup cost.
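
Roughly what that middleware looks like; a sketch assuming Express 4 and pino, with the header name taken from the prose above:

```typescript
import express from "express";
import pino from "pino";
import { randomUUID } from "node:crypto";

const logger = pino();
const app = express();

// Hang a per-request child logger off the request object (typed loosely).
app.use((req: express.Request & { log?: pino.Logger }, res, next) => {
  // Reuse an inbound id from a trusted edge, otherwise mint one.
  const requestId = req.header("x-request-id") ?? randomUUID();
  res.setHeader("x-request-id", requestId);
  // Every line logged through req.log now carries the request id.
  req.log = logger.child({ requestId });
  next();
});

app.get("/hello", (req: express.Request & { log?: pino.Logger }, res) => {
  req.log?.info({ path: req.path }, "handling request");
  res.json({ ok: true });
});

app.listen(3000);
```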

OpenTelemetry tracing for Node HTTP

Once services talk to other services, logs alone don’t tell you where time goes. I like OpenTelemetry because it’s vendor-neutral: export to Honeycomb, Datadog, Tempo—whatever your team uses. I keep the initial setup minimal: instrument http, express, and the other libraries the service actually uses, and add custom spans only where the automatic ones leave gaps.
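
A minimal bootstrap, assuming the sdk-node, auto-instrumentations, and OTLP exporter packages; the service name and the fs tweak are assumptions:

```typescript
// tracing.ts — import this before the rest of the app (or use node --require).
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

const sdk = new NodeSDK({
  serviceName: "checkout", // assumed name; set per service
  // Defaults to http://localhost:4318/v1/traces; point it at your collector.
  traceExporter: new OTLPTraceExporter(),
  instrumentations: [
    getNodeAutoInstrumentations({
      // fs spans are usually noise; disabling them is a common tweak.
      "@opentelemetry/instrumentation-fs": { enabled: false },
    }),
  ],
});

sdk.start();

// Flush buffered spans on shutdown so the last requests aren't lost.
process.on("SIGTERM", () => {
  sdk.shutdown().finally(() => process.exit(0));
});
```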

httptrace to debug DNS/connect/TLS timing in production-like runs

When an outbound call is slow, it’s not always the server—it can be DNS, connection reuse, or TLS handshakes. net/http/httptrace lets you instrument a single request and see exactly where time is going. I attach a trace to the request context and record a timestamp at each phase: DNS lookup, connection, TLS handshake, and first response byte.
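
The core of it, sketched against the standard library; printing relative timestamps is enough to spot which phase is eating the budget:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	req, err := http.NewRequest("GET", "https://example.com", nil)
	if err != nil {
		panic(err)
	}

	start := time.Now()
	trace := &httptrace.ClientTrace{
		DNSStart: func(httptrace.DNSStartInfo) { fmt.Println("dns start ", time.Since(start)) },
		DNSDone:  func(httptrace.DNSDoneInfo) { fmt.Println("dns done  ", time.Since(start)) },
		ConnectDone: func(network, addr string, err error) {
			fmt.Println("connected ", time.Since(start), network, addr, err)
		},
		TLSHandshakeDone: func(_ tls.ConnectionState, err error) {
			fmt.Println("tls done  ", time.Since(start), err)
		},
		GotConn: func(info httptrace.GotConnInfo) {
			// Reused=true means the slow part wasn't connection setup.
			fmt.Println("got conn  ", time.Since(start), "reused:", info.Reused)
		},
		GotFirstResponseByte: func() { fmt.Println("first byte", time.Since(start)) },
	}

	// Hooks fire only for requests carrying this context.
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```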

Lograge-Style JSON Logging Without Extra Gems

Production debugging is about logs that are searchable. Emit JSON with consistent keys (request_id, member_id, duration). Even if you later add Lograge, this structure maps cleanly onto it.
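
One way to do this with nothing but ActiveSupport::Notifications; append_info_to_payload is the standard Rails hook, while current_member stands in for whatever auth helper you have:

```ruby
# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  # Enrich the instrumentation payload; the subscriber below reads it.
  def append_info_to_payload(payload)
    super
    payload[:request_id] = request.request_id
    payload[:member_id]  = current_member&.id # hypothetical helper
  end
end

# config/initializers/json_request_logging.rb
require "json"

ActiveSupport::Notifications.subscribe("process_action.action_controller") do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  p = event.payload
  Rails.logger.info({
    method:     p[:method],
    path:       p[:path],
    status:     p[:status],
    controller: p[:controller],
    action:     p[:action],
    duration:   event.duration.round(1), # milliseconds
    request_id: p[:request_id],
    member_id:  p[:member_id]
  }.compact.to_json)
end
```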

Web Vitals reporting to an API endpoint

Synthetic performance tests don’t always match what real users experience. Web Vitals reporting gives you field data (CLS, LCP, INP, etc.). I collect vitals on the client and POST them to a lightweight endpoint, then aggregate on the backend. Sampling keeps the volume manageable: you don’t need every session, just enough to see the distribution.
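
The client half, assuming the web-vitals package and a hypothetical /api/vitals endpoint; the 10% sample rate is an assumption to tune:

```typescript
import { onCLS, onINP, onLCP, type Metric } from "web-vitals";

const SAMPLE_RATE = 0.1; // report ~10% of sessions (assumed rate)
const sampled = Math.random() < SAMPLE_RATE;

function report(metric: Metric) {
  if (!sampled) return;
  const body = JSON.stringify({
    name: metric.name,   // "CLS" | "INP" | "LCP"
    value: metric.value,
    id: metric.id,       // lets the backend dedupe repeated reports
    page: location.pathname,
  });
  // sendBeacon survives page unload; fall back to fetch with keepalive.
  if (!navigator.sendBeacon?.("/api/vitals", body)) {
    fetch("/api/vitals", { method: "POST", body, keepalive: true });
  }
}

onCLS(report);
onINP(report);
onLCP(report);
```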

gRPC unary interceptor for auth and timing logs

Interceptors are the cleanest way to standardize cross-cutting behavior in gRPC. I use a unary interceptor to extract authorization from metadata, validate it, attach the principal to context.Context, and log the method name and duration. This keeps service handlers focused on business logic.
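
The interceptor in outline; validateToken is a stub standing in for whatever verification you actually run (JWT, introspection, etc.):

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/metadata"
	"google.golang.org/grpc/status"
)

type ctxKey string

const principalKey ctxKey = "principal"

// validateToken is a stand-in for real verification.
func validateToken(token string) (string, error) {
	return "user-123", nil
}

func authTimingInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok || len(md.Get("authorization")) == 0 {
		return nil, status.Error(codes.Unauthenticated, "missing authorization")
	}
	principal, err := validateToken(md.Get("authorization")[0])
	if err != nil {
		return nil, status.Error(codes.Unauthenticated, "invalid token")
	}
	// Handlers read the principal from ctx instead of re-parsing metadata.
	ctx = context.WithValue(ctx, principalKey, principal)

	start := time.Now()
	resp, err := handler(ctx, req)
	log.Printf("method=%s duration=%s err=%v", info.FullMethod, time.Since(start), err)
	return resp, err
}

func main() {
	srv := grpc.NewServer(grpc.UnaryInterceptor(authTimingInterceptor))
	_ = srv // register services here, then srv.Serve(lis)
}
```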

Prometheus metrics for request latency

During an incident, the first question is usually “is it getting worse?”, and log lines don’t answer that well. Prometheus-style metrics make it easy to track rates and percentiles over time. I instrument request duration as a histogram and label by method, route, and status code, keeping label cardinality bounded.
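
A sketch with the official Go client; the bucket choice and the hardcoded status label are simplifications (the ResponseWriter wrapper at the end of this section shows how to capture the real status):

```go
package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Histogram labeled by method, route, and status; DefBuckets is a
// starting point, tune to your latency profile.
var requestDuration = promauto.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "http_request_duration_seconds",
		Help:    "HTTP request latency.",
		Buckets: prometheus.DefBuckets,
	},
	[]string{"method", "route", "status"},
)

// instrument labels by the route pattern, not the raw URL, so
// cardinality stays bounded.
func instrument(route string, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next(w, r)
		// Status hardcoded for brevity; wrap the ResponseWriter to
		// record the real code.
		requestDuration.WithLabelValues(r.Method, route, "200").
			Observe(time.Since(start).Seconds())
	}
}

func main() {
	http.HandleFunc("/hello", instrument("/hello", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	}))
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```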

HTTP Timeouts + Retries Wrapper (Faraday)

I wrapped external HTTP calls once I realized most “flaky APIs” were actually my fault: no timeouts, unclear retries, and logs that didn’t tell a story. In “Client with timeouts,” I centralize a Faraday connection with explicit open_timeout and timeout, layer retry middleware with backoff on top for idempotent requests, and raise on error statuses so failures surface instead of lingering.
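
The shape of that client, assuming the faraday and faraday-retry gems; the host and the timeout/retry numbers are illustrative:

```ruby
require "faraday"
require "faraday/retry"
require "json"

class ApiClient
  def self.connection
    @connection ||= Faraday.new(url: "https://api.example.com") do |f|
      f.options.open_timeout = 2   # seconds to establish the connection
      f.options.timeout      = 5   # seconds for the whole request
      f.request :retry,
                max: 3,
                interval: 0.2,
                backoff_factor: 2,
                retry_statuses: [429, 503],
                methods: [:get]    # retry only idempotent verbs
      f.response :raise_error      # surface 4xx/5xx as Faraday errors
      f.adapter Faraday.default_adapter
    end
  end

  def self.get_json(path)
    JSON.parse(connection.get(path).body)
  end
end
```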

Capture status code and bytes written via ResponseWriter wrapper

When you need accurate request logs or metrics, you can’t rely on “what you intended to write” — you need what actually got written. Wrapping http.ResponseWriter to capture WriteHeader and count bytes is a simple way to record status codes and response sizes faithfully.
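
A minimal wrapper; note that a production version would also forward Flusher/Hijacker if handlers need them:

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// statusRecorder captures the status code and byte count as they are
// actually written, not as the handler intended.
type statusRecorder struct {
	http.ResponseWriter
	status int
	bytes  int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

func (r *statusRecorder) Write(b []byte) (int, error) {
	if r.status == 0 {
		r.status = http.StatusOK // implicit 200 on first Write
	}
	n, err := r.ResponseWriter.Write(b)
	r.bytes += n
	return n, err
}

func logMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rec := &statusRecorder{ResponseWriter: w}
		start := time.Now()
		next.ServeHTTP(rec, r)
		log.Printf("%s %s status=%d bytes=%d duration=%s",
			r.Method, r.URL.Path, rec.status, rec.bytes, time.Since(start))
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello"))
	})
	log.Fatal(http.ListenAndServe(":8080", logMiddleware(mux)))
}
```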