After getting burned by long-lived connections that slowly accumulate bad state (or get killed by the network) and then blow up during peak traffic, I got strict about pg pooling. I keep the pool small per instance and scale horizontally instead of cranking max and hoping for the best. I also set application_name so DBA tooling can immediately attribute connections to the right service. The small but critical detail is a connection acquisition timeout (e.g. connectionTimeoutMillis): without it, requests can wait indefinitely for a free client, and you’ll misdiagnose the problem as ‘the API is slow’ when the real issue is ‘the DB pool is exhausted’. Once pool timeouts surface as first-class errors, capacity planning becomes much clearer.
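Here is a rough sketch of what that setup can look like. The numbers, the `orders-api` service name, and the helper names (`buildPoolSettings`, `isAcquireTimeout`, `acquire`) are placeholders I made up for illustration, not recommendations. The sketch also assumes the pool reports an acquisition timeout as an `Error` whose message mentions a timeout while connecting; check what your pg version actually throws before relying on that match.

```typescript
// Subset of the options that the pg Pool constructor accepts.
interface PoolSettings {
  max: number;                     // upper bound on clients per instance
  connectionTimeoutMillis: number; // how long to wait for a client before failing
  idleTimeoutMillis: number;       // close clients that sit idle this long
  application_name: string;        // shows up in pg_stat_activity
}

// Hypothetical helper: small pool, explicit timeouts, attributable connections.
// All numeric values are illustrative assumptions.
function buildPoolSettings(serviceName: string): PoolSettings {
  return {
    max: 5,
    connectionTimeoutMillis: 2000,
    idleTimeoutMillis: 30000,
    application_name: serviceName,
  };
}

// Assumption: an acquisition timeout is an Error whose message mentions
// both "timeout" and "connect". Adjust to match your pg version.
function isAcquireTimeout(err: unknown): boolean {
  return err instanceof Error && /timeout.*connect/i.test(err.message);
}

// Anything with a connect() method, so the wrapper works with a real pool or a stub.
interface Connectable<C> {
  connect(): Promise<C>;
}

// Hypothetical wrapper: rethrow acquisition timeouts under a distinct name so
// logs and metrics can tell "could not get a DB client in time" from "the API is slow".
async function acquire<C>(pool: Connectable<C>): Promise<C> {
  try {
    return await pool.connect();
  } catch (err) {
    if (isAcquireTimeout(err)) {
      const wrapped = new Error("DB pool acquisition timed out: " + (err as Error).message);
      wrapped.name = "PoolAcquireTimeout";
      throw wrapped;
    }
    throw err;
  }
}

// With the "pg" package installed, usage would look roughly like:
//   const pool = new Pool(buildPoolSettings("orders-api"));
//   const client = await acquire(pool);
//   try { await client.query("SELECT 1"); } finally { client.release(); }
```

Keeping the timeout classification in one place means you can count `PoolAcquireTimeout` errors per instance, which is the number that actually tells you whether to add instances or fix slow queries.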