refactor(api): add redis-backed rate limiting fallback

2026-03-30 23:23:56 +02:00
parent bcfb18393e
commit ef5e8016a4
9 changed files with 357 additions and 61 deletions
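
For orientation, the behaviour the documentation changes below describe (shared Redis-backed counters when `REDIS_URL` is configured, with an in-memory fallback for local or degraded operation) could be sketched roughly as follows. This is a minimal illustration, not the contents of `rate-limit.ts`: `RedisLike`, `MemoryCounter`, `RateLimiter`, and `fallbackHits` are hypothetical names, and the fixed-window `INCR` plus `PEXPIRE` scheme is an assumption about one common way to build such a counter.

```typescript
// Hypothetical sketch; none of these names are taken from rate-limit.ts.

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
}

// Assumed subset of a Redis client (INCR and PEXPIRE are standard Redis commands).
interface RedisLike {
  incr(key: string): Promise<number>;
  pexpire(key: string, ms: number): Promise<number>;
}

// Process-local fixed-window counter used when Redis is absent or failing.
class MemoryCounter {
  private hits = new Map<string, { count: number; resetAt: number }>();

  // Increments the counter for `key` and returns the count in the current window.
  increment(key: string, windowMs: number, now: number = Date.now()): number {
    const entry = this.hits.get(key);
    if (!entry || entry.resetAt <= now) {
      this.hits.set(key, { count: 1, resetAt: now + windowMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

class RateLimiter {
  private memory = new MemoryCounter();
  private limit: number;
  private windowMs: number;
  private redis?: RedisLike;
  // Exposed so callers can monitor how often the fallback path is taken.
  fallbackHits = 0;

  constructor(limit: number, windowMs: number, redis?: RedisLike) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.redis = redis;
  }

  async check(key: string): Promise<RateLimitResult> {
    if (this.redis) {
      try {
        const count = await this.redis.incr(key);
        // First hit in a window sets the expiry. INCR and PEXPIRE are not
        // atomic together here; a Lua script or MULTI would close that gap.
        if (count === 1) await this.redis.pexpire(key, this.windowMs);
        return this.toResult(count);
      } catch {
        // Redis unreachable: degrade to the in-memory counter below.
      }
    }
    this.fallbackHits += 1;
    return this.toResult(this.memory.increment(key, this.windowMs));
  }

  private toResult(count: number): RateLimitResult {
    return {
      allowed: count <= this.limit,
      remaining: Math.max(0, this.limit - count),
    };
  }
}
```

In a sketch like this, a limiter built without a client (for example when `REDIS_URL` is unset) always takes the in-memory path, and a counter such as `fallbackHits` gives operators a signal for the "silently running on the fallback path" risk the updated docs call out.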
@@ -58,9 +58,9 @@ The previously critical SSE and browser parser coverage issues were addressed du

 ### Medium

-1. Rate limiting is process-local and not deployment-grade.
-   Evidence: [rate-limit.ts](/home/hartmut/Documents/Copilot/capakraken/packages/api/src/middleware/rate-limit.ts) uses an in-memory `Map` and explicitly notes that multi-instance deployments need Redis-backed replacement.
-   Risk: protections weaken as soon as the app scales horizontally.
+1. Rate limiting now supports deployment-grade shared counters, but rollout discipline still matters.
+   Evidence: [rate-limit.ts](/home/hartmut/Documents/Copilot/capakraken/packages/api/src/middleware/rate-limit.ts) now prefers Redis-backed counters when `REDIS_URL` is configured, while preserving an in-memory fallback for local development and degraded operation.
+   Risk: protections still depend on production actually wiring Redis for all instances instead of silently running on the fallback path.

 2. Performance hotspots are well understood but not yet structurally solved.
   Evidence: the current performance review identifies repeated in-memory filtering, broad invalidation, and heavyweight timeline/report derivations in [performance-optimization-review-2026-03-18.md](/home/hartmut/Documents/Copilot/capakraken/docs/performance-optimization-review-2026-03-18.md).
@@ -193,7 +193,7 @@ Goals:

 - complete the move to image-based deploys as the canonical path
 - document staging and production bootstrap as code, not tribal knowledge
-- replace in-memory rate limits with Redis-backed limits where appropriate
+- ensure staging and production run the Redis-backed rate-limit path intentionally and monitor fallback usage
 - define rollback drills and incident response playbooks

 Definition of done:
@@ -47,6 +47,7 @@
 - the remaining estimate search, planning lookup, self-service timeline read, and navigation assistant helpers now live in their own domain module, keeping another mixed read-only cluster out of the monolithic assistant router without changing the assistant contract
 - the country listing and country detail assistant helpers now live in their own domain module, keeping the remaining geo/readmodel lookups out of the monolithic assistant router without changing the assistant contract
 - the remaining vacation workflow and entitlement assistant helpers now live in their own domain module, leaving `packages/api/src/router/assistant-tools.ts` as an aggregator/composition layer instead of the last mixed monolithic execution block
+- API and auth rate limiting now prefer shared Redis-backed counters when `REDIS_URL` is configured, while retaining an in-memory fallback for local/degraded operation with focused regression coverage

 ## Next Up

@@ -61,9 +62,8 @@ The remaining work is now structural rather than another quick batch:

 1. secrets and runtime configuration policy
 2. oversized router decomposition
-3. production-grade rate limiting
-4. canonical image-based production delivery
-5. performance hotspot reduction
+3. canonical image-based production delivery
+4. performance hotspot reduction

 ## Working Rule

@@ -131,7 +131,8 @@ Configured in `next.config.ts`:

 - **Per-IP rate limiting**: via middleware on all API routes
 - **Per-user rate limiting**: configurable per-procedure
-- **Auth-specific rate limiting**: 5 attempts / 15 min per email (in-memory sliding window)
+- **Shared rate-limit backend**: Redis-backed counters when `REDIS_URL` is configured; in-memory fallback remains available for local development and degraded operation
+- **Auth-specific rate limiting**: 5 attempts / 15 min per email
 - **AI API call rate limits**: upstream provider limits surfaced as user-friendly errors

 ## 9. Error Handling