reliability

Defensive Deserialization for ActiveJob

Deploys happen while jobs are in the queue. Be defensive: accept both old and new payload shapes, and keep migrations forward-compatible. This prevents “deploy broke jobs” incidents.

Postgres advisory lock for one-at-a-time work

Sometimes you just need ‘only one worker does this thing at a time’, and building distributed locks from scratch is risky. Postgres advisory locks are a pragmatic option when your DB is already the source of truth. I derive a deterministic lock key (l

SQL migration safety: add column nullable, backfill, then constrain

The fastest way to surprise yourself in production is ADD COLUMN ... NOT NULL DEFAULT ... on a large table. My safe pattern is: add the column nullable with no default, backfill in batches, then add the NOT NULL constraint. If I need a default for new

Graceful shutdown for Node HTTP servers

Deploys got a lot calmer once I treated SIGTERM as a first-class signal instead of an afterthought. In Kubernetes (and most PaaS platforms), you’re expected to stop accepting new requests quickly while finishing in-flight work. The worst failure mode

HTTP client timeout with AbortController (fetch)

Unbounded network calls eventually will hang, and then your Node process gets stuck with slow requests chewing up the connection pool. I wrap fetch with an AbortController timeout so every outbound call has an upper bound. The key is distinguishing be

Rate limiting by IP + user (Express)

A single abusive client can ruin your latency budget for everyone else, so I rate limit early rather than trying to ‘detect abuse’ after the outage starts. I combine an IP bucket with a user bucket: IP protects unauthenticated endpoints, user protects

Cache Stampede Protection with race_condition_ttl

If a hot key expires, you can stampede your DB. race_condition_ttl lets one process recompute while others serve stale content briefly. This is a reliability pattern masquerading as caching.

Transactional Outbox for Reliable Event Publishing

I used a transactional outbox when I needed my database write and my event publish to succeed or fail together. In OutboxEvent model, I treated the outbox like a queue: a durable row per event, a dedupe_key for idempotency, and a ready scope that pull

DB-Level “no overlapping ranges” with exclusion constraint

Scheduling/booking is tricky. Postgres exclusion constraints prevent overlapping time ranges at the database layer—far more reliable than application checks. Rails can still validate, but the DB is the source of truth.

Robust Webhook Verification (HMAC + Timestamp)

Webhooks are a security boundary. Verify signatures with constant-time compare, include a timestamp window to prevent replay, and store processed event IDs to make handlers idempotent.

Atomic “Read + Mark Processed” with UPDATE … RETURNING

If you have a queue table, avoid races by selecting and updating in one statement. Postgres UPDATE … RETURNING is the simplest building block for a correct custom queue / maintenance pipeline.

Database-Backed Unique Slugs with Retry

Slug generation is deceptively racy under concurrency. Use a unique index plus retry with a suffix. Keep it deterministic and fast; don’t query in a loop without bounds.