Sampling logs to reduce noise during high-QPS incidents

Structured logs are great, but at high QPS they can become their own outage: too much I/O, too much storage, and noisy dashboards. zap includes a sampler that, within each tick interval, keeps the first N entries for a given level and message and then keeps only every Mth entry after that. I like this for “successful request” logs while keeping errors unsampled. The important point is to be intentional: sampling should never hide rare failures, so error logs should stay at full fidelity, and metrics should be your primary signal for latency and error rate. The code below builds a core logger and wraps it with zapcore.NewSamplerWithOptions. In production, I tune the sampling rates based on real throughput, and I document them so operators understand why they see fewer log lines during traffic spikes.