reliability

HTTP client timeout with AbortController (fetch)

Unbounded network calls will eventually hang, and then your Node process gets stuck with slow requests chewing up the connection pool. I wrap fetch with an AbortController timeout so every outbound call has an upper bound. The key is distinguishing be
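The same upper-bound idea can be sketched outside Node. The snippet below is a Python asyncio analogue, not the fetch/AbortController wrapper itself: call_with_timeout, UpstreamTimeout, and the two fake upstream coroutines are hypothetical names made up for illustration. It only shows that a deadline turns a hang into a distinct, catchable error.

```python
import asyncio


class UpstreamTimeout(Exception):
    """Hypothetical error type: the call exceeded its deadline."""


async def call_with_timeout(make_call, timeout_s):
    """Await make_call() but give up after timeout_s seconds."""
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Re-raise as our own type so callers can tell a deadline miss
        # apart from other failures of the underlying call.
        raise UpstreamTimeout(f"no response within {timeout_s}s") from None


async def fast_upstream():
    # Stand-in for a healthy dependency.
    await asyncio.sleep(0)
    return "ok"


async def slow_upstream():
    # Stand-in for a dependency that hangs longer than we tolerate.
    await asyncio.sleep(1.0)
    return "late"


if __name__ == "__main__":
    print(asyncio.run(call_with_timeout(fast_upstream, 0.5)))
    try:
        asyncio.run(call_with_timeout(slow_upstream, 0.05))
    except UpstreamTimeout as exc:
        print("timed out:", exc)
```

Raising a dedicated exception type is a design choice for this sketch; it keeps deadline misses separate from whatever errors the wrapped call raises on its own.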

Rate limiting by IP + user (Express)

A single abusive client can ruin your latency budget for everyone else, so I rate limit early rather than trying to ‘detect abuse’ after the outage starts. I combine an IP bucket with a user bucket: IP protects unauthenticated endpoints, user protects
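A rough way to picture the two-bucket idea is a pair of token buckets, one keyed by IP and one keyed by user. This is a Python sketch rather than Express middleware, and TokenBucket, DualLimiter, the capacities, and the injectable clock are all assumptions for illustration. A real deployment would likely keep the counters in a shared store and evict idle keys.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens and refills at `refill_per_sec`."""

    def __init__(self, capacity, refill_per_sec, now):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.now = now
        self.tokens = float(capacity)
        self.updated = now()

    def allow(self):
        t = self.now()
        elapsed = t - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class DualLimiter:
    """Checks the IP bucket first, then the user bucket if a user is known."""

    def __init__(self, ip_capacity, user_capacity, refill_per_sec, now=time.monotonic):
        self.ip_capacity = ip_capacity
        self.user_capacity = user_capacity
        self.refill_per_sec = refill_per_sec
        self.now = now
        self.ip_buckets = {}    # grows without bound in this sketch
        self.user_buckets = {}

    def _bucket(self, table, key, capacity):
        if key not in table:
            table[key] = TokenBucket(capacity, self.refill_per_sec, self.now)
        return table[key]

    def allow(self, ip, user_id=None):
        # The IP check runs for every request, authenticated or not.
        if not self._bucket(self.ip_buckets, ip, self.ip_capacity).allow():
            return False
        if user_id is None:
            return True
        # Note: a request rejected here has already spent one IP token.
        return self._bucket(self.user_buckets, user_id, self.user_capacity).allow()
```

In a web framework this check would run as early as possible in the request pipeline and answer with HTTP 429 when allow returns False.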

Request timeout handling with Rack::Timeout

Long-running requests tie up worker threads and degrade overall application responsiveness. Rack::Timeout enforces request timeouts at the Rack layer, killing requests that exceed configured limits. I set conservative timeouts (15-30 seconds) and hand

Cache Stampede Protection with race_condition_ttl

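When a hot key expires, every request that misses can try to recompute it at the same moment and stampede your DB. race_condition_ttl lets one process recompute while the others keep serving the stale value for a short, bounded window. This is a reliability pattern masquerading as caching.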
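To make the mechanism concrete, here is a small in-memory Python sketch of the behavior described above. It is not Rails code: StaleWhileRecomputeCache and its fetch signature are hypothetical, and the semantics are my reading of the pattern (the first caller to find an expired entry pushes its expiry forward by race_ttl, then recomputes, while other callers keep getting the old value).

```python
import threading
import time


class StaleWhileRecomputeCache:
    def __init__(self, now=time.monotonic):
        self._now = now
        self._lock = threading.Lock()
        self._entries = {}  # key -> (value, expires_at)

    def fetch(self, key, ttl, race_ttl, compute):
        with self._lock:
            now = self._now()
            entry = self._entries.get(key)
            if entry is not None:
                value, expires_at = entry
                if now < expires_at:
                    return value  # fresh, or stale-but-extended
                if now - expires_at <= race_ttl:
                    # First caller to notice the expiry: extend the stale
                    # entry so everyone else is served from it meanwhile.
                    self._entries[key] = (value, now + race_ttl)
        # Recompute outside the lock so readers are never blocked on it.
        fresh = compute()
        with self._lock:
            self._entries[key] = (fresh, self._now() + ttl)
        return fresh
```

If compute raises, the extended stale entry stays in place until its short window runs out, after which the next caller tries again.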

Transactional Outbox for Reliable Event Publishing

I used a transactional outbox when I needed my database write and my event publish to succeed or fail together. In the OutboxEvent model, I treated the outbox like a queue: a durable row per event, a dedupe_key for idempotency, and a ready scope that pull

Exponential backoff with jitter for retries

Retries are dangerous when they synchronize; that’s how you turn a minor outage into a stampede. I implement exponential backoff with jitter so clients spread out naturally. The Retry helper takes a shouldRetry predicate and always checks ctx.Done() b
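A minimal Python sketch of the idea follows. It is not the Retry helper described above (which checks ctx.Done(), a Go idiom); retry, backoff_delay, and their parameters are hypothetical, and the "full jitter" variant used here, a uniform random delay between zero and the exponential ceiling, is one common choice among several.

```python
import random
import time


def backoff_delay(attempt, base_s, cap_s, rng=random.random):
    """Full jitter: uniform in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng() * ceiling


def retry(fn, should_retry, max_attempts=5, base_s=0.1, cap_s=5.0,
          sleep=time.sleep, rng=random.random):
    """Call fn() until it succeeds, retrying only errors should_retry accepts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            out_of_attempts = attempt == max_attempts - 1
            if out_of_attempts or not should_retry(exc):
                raise
            # Randomized delay so many clients do not retry in lockstep.
            sleep(backoff_delay(attempt, base_s, cap_s, rng))
```

The sleep and rng parameters exist so the timing can be controlled in tests; a caller would normally leave them at their defaults and pass something like lambda e: isinstance(e, ConnectionError) as the predicate.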

Safe YAML decoding (KnownFields) for config files

YAML configuration is convenient, but it’s also a footgun when typos silently get ignored. I enable KnownFields(true) so unknown keys cause an error, which turns “silent misconfig” into a fast failure. That is especially useful during refactors when f

DB-Level “no overlapping ranges” with exclusion constraint

Scheduling/booking is tricky. Postgres exclusion constraints prevent overlapping time ranges at the database layer, which is far more reliable than application checks because two concurrent requests can both pass a check-then-insert validation. Rails can still validate, but the DB is the source of truth.

Robust Webhook Verification (HMAC + Timestamp)

Webhooks are a security boundary. Verify signatures with constant-time compare, include a timestamp window to prevent replay, and store processed event IDs to make handlers idempotent.
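The three checks above can be sketched with Python's standard library. The signing scheme here (HMAC-SHA256 over "timestamp.body", hex-encoded) is an assumption, since every provider defines its own format; WebhookVerifier, sign, and the 300-second tolerance are likewise hypothetical. The in-memory set of seen IDs stands in for a durable store.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Assumed scheme: hex HMAC-SHA256 over b"<timestamp>.<body>"."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class WebhookVerifier:
    def __init__(self, secret: bytes, tolerance_s: int = 300, now=time.time):
        self.secret = secret
        self.tolerance_s = tolerance_s
        self.now = now
        self.seen_event_ids = set()  # use a database table in practice

    def verify(self, timestamp: int, body: bytes, signature: str, event_id: str) -> str:
        # 1. Reject anything outside the timestamp window (replay defence).
        if abs(self.now() - timestamp) > self.tolerance_s:
            return "stale"
        # 2. Constant-time comparison avoids leaking how many characters matched.
        expected = sign(self.secret, timestamp, body)
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return "bad_signature"
        # 3. Idempotency: handle each event ID at most once.
        if event_id in self.seen_event_ids:
            return "duplicate"
        self.seen_event_ids.add(event_id)
        return "ok"
```

Because the timestamp is part of the signed message, an attacker cannot refresh an old payload by editing the timestamp alone. Verification must run over the raw request bytes, not a re-serialized body.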

Atomic “Read + Mark Processed” with UPDATE … RETURNING

If you have a queue table, avoid races by selecting and updating in one statement. Postgres UPDATE … RETURNING is the simplest building block for a correct custom queue / maintenance pipeline.

Database-Backed Unique Slugs with Retry

Slug generation is deceptively racy under concurrency. Use a unique index plus retry with a suffix. Keep it deterministic and fast; don’t query in a loop without bounds.

Robust environment parsing for bool/int with explicit defaults

I like config that fails loudly when it’s wrong. The helpers below parse environment variables with explicit defaults and good error messages, which prevents subtle “zero value” behavior in production. This is especially important for flags like ENABL
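As an illustration of "explicit defaults, loud failures", here is a Python sketch; these are not the helpers the text refers to. env_bool, env_int, the accepted spellings, and treating an empty value as unset are all assumptions, and the variable names in the usage note are made up.

```python
import os

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_bool(name, default, environ=os.environ):
    """Return default when unset or empty; raise on unrecognized values."""
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(
        f"{name}={raw!r} is not a boolean; use one of "
        f"{sorted(_TRUE | _FALSE)}"
    )


def env_int(name, default, minimum=None, maximum=None, environ=os.environ):
    """Return default when unset or empty; raise on non-integers or out-of-range."""
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"{name}={raw!r} is not an integer") from None
    if minimum is not None and value < minimum:
        raise ValueError(f"{name}={value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{name}={value} is above the maximum {maximum}")
    return value
```

Usage would look like workers = env_int("WORKERS", 4, minimum=1) at startup, so a typo such as WORKERS=eight stops the process with a message naming the variable instead of silently falling back.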