AI Excellence Due Diligence And Roadmap

Date: 2026-03-30
Purpose: Frank assessment of the current codebase plus a pragmatic roadmap to turn CapaKraken into a reference project for disciplined AI-assisted software engineering.

Executive Summary

CapaKraken is already well beyond a prototype. The repository shows a real domain model, a non-trivial bounded-context split, a meaningful automated test baseline, and active delivery discipline.

At the same time, the codebase still carries several risks that are typical of fast-moving AI-assisted development:

  1. Some critical cross-cutting concerns are only partially productized.
  2. Several files and routers have grown beyond a comfortable ownership size.
  3. Runtime secret handling is now materially cleaner, but the repo still needs to standardize the operational source of truth around that model.
  4. The broader operational model (deploy, rollback, incident response) is improving but not yet fully standardized.
  5. Production-grade multi-instance safeguards are not yet complete.

The project feels strong enough to build on, but it is not yet a showcase of "how AI-built software should look" without another cleanup and hardening pass.

Current Strengths

  • Clear monorepo and package split across api, application, db, engine, shared, staffing, ui, and web, with shared tooling through turbo and pnpm.
  • Product scope is substantial and business-oriented rather than CRUD-only: estimating, planning, demand/assignment, chargeability, import/export, dashboards, report building, and admin surfaces.
  • CI already enforces typecheck, lint, unit tests, build, and E2E with PostgreSQL and Redis in the loop.
  • Application-layer use cases exist and are not just thin router wrappers.
  • Documentation coverage is materially better than average for a fast-moving product.

Status Update Since Initial Review

The highest-risk quick wins from the original review are now closed:

  • SSE delivery is now audience-scoped with architecture guardrails in CI
  • browser-side spreadsheet parsing now has focused regression coverage in apps/web
  • the route access matrix is in place and the ready-now audience-hardening slices were completed
  • comment visibility is now entity-scoped across API policy, assistant metadata, web consumers, and mention autocomplete
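
The audience-scoped SSE delivery noted above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual types: the event shape, role names, and the `canReceive` helper are all assumptions made for the example.

```typescript
// Hypothetical sketch of audience-scoped SSE delivery.
// Event and subscriber shapes are illustrative assumptions.

type Audience = "admin" | "planner" | "member";

interface PlanningEvent {
  projectId: string;
  audience: Audience; // minimum audience allowed to receive it
  payload: unknown;
}

interface Subscriber {
  userId: string;
  role: Audience;
  projectIds: Set<string>; // projects the user may observe
}

const rank: Record<Audience, number> = { member: 0, planner: 1, admin: 2 };

// A subscriber receives an event only when both the project scope and the
// audience rank allow it -- never broadcast-by-default.
function canReceive(sub: Subscriber, event: PlanningEvent): boolean {
  return sub.projectIds.has(event.projectId) && rank[sub.role] >= rank[event.audience];
}
```

The value of this shape is that the delivery predicate is a single pure function, which is exactly what the CI architecture guardrails and contract tests can pin down.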

Due Diligence Findings

Critical

No currently open item in this review remains in the earlier "critical quick fix" class. The previously critical SSE and browser parser coverage issues were addressed during the hardening batch.

High

  1. Router and UI module size is now an operational risk. Evidence: assistant-tools.ts, resource.ts, allocation.ts, timeline.ts, vacation.ts, and large frontend files such as SystemSettingsClient.tsx and TimelineProjectPanel.tsx have each grown well past the size at which safe ownership remains easy. Risk: AI-generated changes become harder to review, humans lose local reasoning context, and regressions become more likely.

  2. Runtime secret policy is mostly corrected, but deploy standardization still has to catch up. Evidence: runtime resolution and admin flows now treat environment-backed secrets as the preferred source in settings.ts, system-settings-runtime.ts, and SystemSettingsClient.tsx. Risk: a strong secret policy is only fully effective once staging and production provisioning use one canonical deployment path and operators clear remaining legacy database copies. Update: the application no longer persists new operational secret values through admin settings; the remaining work is rollout discipline and cleanup completion.

  3. Least-privilege is now materially better documented, but it still needs long-lived enforcement rather than a one-off hardening batch. Evidence: the route audience model is explicit in route-access-matrix.md and backed by multiple focused auth tests, but the guarantee depends on continuing test coverage and architecture guardrails as new routes evolve. Risk: future feature work can slowly widen access again if the matrix and tests are not treated as an enforced contract.
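
The environment-first secret model described in finding 2 can be sketched as below. The function names, the source shape, and the key names are hypothetical; the real resolution lives in settings.ts and system-settings-runtime.ts.

```typescript
// Hypothetical sketch of environment-first secret resolution during migration.
// Names (resolveSecret, SecretSource) are illustrative, not the repo's API.

interface SecretSource {
  env: Record<string, string | undefined>;      // process.env in practice
  legacyDb: Record<string, string | undefined>; // residual database copies
}

// Environment wins; the database copy is a read-only fallback during
// migration and should eventually be cleared entirely.
function resolveSecret(key: string, src: SecretSource): string | undefined {
  return src.env[key] ?? src.legacyDb[key];
}

// Admin settings writes refuse to persist new secret material.
function assertNotPersistingSecret(key: string, secretKeys: Set<string>): void {
  if (secretKeys.has(key)) {
    throw new Error(`refusing to store secret '${key}' in the database; set it via environment`);
  }
}
```

The write-side guard is the part that makes the policy durable: even if an operator forgets the rollout doc, the application itself refuses to recreate database secret residue.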

Medium

  1. Rate limiting now supports deployment-grade shared counters, but rollout discipline still matters. Evidence: rate-limit.ts now prefers Redis-backed counters when REDIS_URL is configured, while preserving an in-memory fallback for local development and degraded operation. Risk: protections still depend on production actually wiring Redis for all instances instead of silently running on the fallback path.

  2. Performance hotspots are well understood but not yet structurally solved. Evidence: the current performance review identifies repeated in-memory filtering, broad invalidation, and heavyweight timeline/report derivations in performance-optimization-review-2026-03-18.md. Risk: user experience and infrastructure cost will degrade as data volume grows.

  3. Rollback and incident drills still need to be exercised, even though the deployment path is now standardized. Evidence: the canonical production path now runs through release-image.yml, deploy-staging.yml, deploy-prod.yml, and the single host compose file docker-compose.prod.yml. Risk: a clean architecture path still needs operator rehearsal before it becomes operationally boring under pressure.
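
The shared-counter model from Medium finding 1 can be sketched with a pluggable store. This is an assumption-laden sketch: the real rate-limit.ts store would be Redis-backed and asynchronous (an atomic INCR plus a window EXPIRE); the synchronous in-memory fallback here just keeps the example self-contained.

```typescript
// Hypothetical sketch of rate limiting behind a pluggable counter store.
// A production store would be Redis-backed and async; this in-memory
// fallback mirrors the degraded/local-development path.

interface CounterStore {
  increment(key: string, windowMs: number): number; // count within current window
}

class InMemoryCounterStore implements CounterStore {
  private counts = new Map<string, { n: number; resetAt: number }>();
  increment(key: string, windowMs: number): number {
    const now = Date.now();
    const entry = this.counts.get(key);
    if (!entry || entry.resetAt <= now) {
      this.counts.set(key, { n: 1, resetAt: now + windowMs });
      return 1;
    }
    entry.n += 1;
    return entry.n;
  }
}

// Prefer the shared store when REDIS_URL is configured, and surface the
// fallback decision so monitoring can catch instances silently degrading.
function selectStore(
  redisUrl: string | undefined,
  makeRedisStore: () => CounterStore,
): { store: CounterStore; usingFallback: boolean } {
  return redisUrl
    ? { store: makeRedisStore(), usingFallback: false }
    : { store: new InMemoryCounterStore(), usingFallback: true };
}

function isAllowed(store: CounterStore, key: string, limit: number, windowMs: number): boolean {
  return store.increment(key, windowMs) <= limit;
}
```

Returning `usingFallback` explicitly is the rollout-discipline hook: a metric or startup log on that flag turns "silently running on the fallback path" into an alertable condition.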

Overall Rating

Product Engineering Quality

8/10

This is materially better than a typical startup CRUD app and already has the bones of a serious internal platform or vertical SaaS.

Security Posture

7.5/10

There are good foundations, and the most obvious real-time, comment-visibility, and runtime-secret-policy gaps were closed, but long-lived least-privilege enforcement and operational standardization still need structural work.

Maintainability

6.5/10

The architecture is promising, but file size, router density, and compatibility residue will eventually slow everyone down unless addressed deliberately.

Operational Maturity

7.5/10

Good CI and improving deploy discipline are in place, but production standardization still needs one more step.

AI-Excellence Readiness

7/10

The project already proves that AI can help build serious software fast. It does not yet prove that AI-assisted development can stay consistently clean, minimal, and audit-friendly at scale.

What A Showcase AI Project Should Demonstrate

To be a true showcase for AI-assisted development, this repository should visibly demonstrate:

  • small, composable files with clear ownership boundaries
  • explicit security and permission models at every boundary
  • deterministic build and deploy flow
  • measurable quality gates beyond "tests pass"
  • strong documentation that explains not only what exists, but why the structure is this way
  • low-friction reviewability, so humans can still govern AI speed

Roadmap

Phase 1: Close the Dangerous Gaps

Status: substantially completed

Goals:

  • Keep SSE audience scoping under test and CI guardrails.
  • Keep hardened spreadsheet parser boundaries under regression coverage.
  • Treat the route access matrix and narrowed auth slices as maintained architecture contracts.
  • Enforce the environment-only runtime secret policy operationally and clear remaining legacy database secret residue. Status: mostly completed in code; runtime consumers prefer environment values and admin updates no longer store new secret material, so the remaining work is rollout/bootstrap documentation plus cleanup of old database copies.

Definition of done:

  • standard users cannot subscribe to unrelated real-time planning events
  • file import paths stay covered by focused regression tests
  • every sensitive router remains explicitly classified by audience
  • secret storage policy is documented and enforced
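
The "every sensitive router remains explicitly classified by audience" criterion can be made machine-checkable rather than documentation-only. The matrix entries and the `assertClassified` helper below are illustrative assumptions, not the contents of route-access-matrix.md.

```typescript
// Hypothetical sketch of the route access matrix as a testable contract.
// Procedure names and audiences are illustrative.

type Audience = "admin" | "planner" | "member";

// Single source of truth: every sensitive procedure must appear here.
const routeAudience: Record<string, Audience> = {
  "systemSettings.update": "admin",
  "allocation.assign": "planner",
  "timeline.read": "member",
};

// A guardrail test enumerates the live routers and fails CI when any
// procedure is missing from the matrix, so new routes cannot silently
// ship unclassified.
function assertClassified(routerProcedures: string[]): string[] {
  return routerProcedures.filter((p) => !(p in routeAudience));
}
```
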

Phase 2: Cut Down Complexity

Target window: 2 to 4 weeks

Goals:

  • Split oversized routers into bounded router modules by feature slice.
  • Split oversized React components into container, state, and presentational layers.
  • Introduce file-size and complexity guardrails for new code.
  • Create "AI review rules" for generated patches: max file growth, required tests, required docs for cross-cutting changes.
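
The file-size guardrail can be as small as a script over line counts. The 500-line budget matches this roadmap's definition of done; the helper names and the exception-list mechanism are assumptions for illustration (in CI the counts would come from walking the repo with node:fs).

```typescript
// Hypothetical sketch of a file-size guardrail for CI.
// Helper names are illustrative; counts would come from the real tree.

function countLines(source: string): number {
  return source.split("\n").length;
}

// Flag files over budget unless they carry an explicit, reviewed exception.
function oversized(
  fileLineCounts: Record<string, number>,
  maxLines: number,
  exceptions: Set<string>,
): string[] {
  return Object.entries(fileLineCounts)
    .filter(([path, lines]) => lines > maxLines && !exceptions.has(path))
    .map(([path]) => path);
}
```

Keeping the exception list in the repo makes every oversized file a deliberate, reviewable decision instead of silent drift.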

Priority candidates:

  • packages/api/src/router/assistant-tools.ts
  • packages/api/src/router/resource.ts
  • packages/api/src/router/allocation.ts
  • packages/api/src/router/timeline.ts
  • apps/web/src/components/admin/SystemSettingsClient.tsx
  • apps/web/src/components/timeline/*

Definition of done:

  • no new source file over 500 lines without an explicit exception
  • top 10 largest business-critical source files are materially reduced
  • patch reviews become narrower and easier to reason about

Phase 3: Make Quality Measurable

Target window: 2 to 3 weeks

Goals:

  • Add architecture fitness checks, not just lint/tests.
  • Add API authorization tests for all sensitive routers.
  • Add bundle-size and route-size monitoring for the web app.
  • Add mutation-path audit coverage checks where business-critical state changes occur.
  • Add a dependency and unsafe-library policy.

Suggested checks:

  • role/permission regression tests
  • SSE audience contract tests
  • import abuse tests with oversized files
  • max file size / max router size lint or CI checks
  • coverage thresholds for critical packages
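
An architecture fitness check of this kind can be as simple as a rule table over import edges. The package paths and the `violations` helper are illustrative assumptions; in CI the import edges would be extracted from the real source tree.

```typescript
// Hypothetical sketch of an import-boundary fitness check that fails CI
// when a package depends across a forbidden boundary.

const forbidden: Array<{ from: RegExp; to: RegExp }> = [
  // Example rule: shared code must never depend on a feature router.
  { from: /packages\/shared\//, to: /packages\/api\/src\/router\// },
];

// Given extracted import edges (file -> imported target), report every
// edge that crosses a forbidden boundary.
function violations(imports: Array<{ file: string; target: string }>): string[] {
  return imports
    .filter(({ file, target }) =>
      forbidden.some((rule) => rule.from.test(file) && rule.to.test(target)),
    )
    .map(({ file, target }) => `${file} -> ${target}`);
}
```

This is what lets CI fail for an architectural regression rather than only a syntax or unit failure: the rule table is versioned, reviewed, and enforced like any other test.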

Definition of done:

  • the repo can fail CI for architectural regressions, not only syntax or unit failures
  • critical security assumptions are test-backed

Phase 4: Standardize Operations

Target window: 1 to 2 weeks

Goals:

  • document staging and production bootstrap as code, not tribal knowledge
  • ensure staging and production run the Redis-backed rate-limit path intentionally and monitor fallback usage
  • define rollback drills and incident response playbooks

Definition of done:

  • one production deployment path
  • one rollback path
  • one source of truth for runtime configuration

Phase 5: Turn It Into A Reference Project

Target window: ongoing

Goals:

  • add a concise engineering doctrine for AI-assisted development in this repo
  • publish coding heuristics for humans and AI: file size limits, change budgets, ownership boundaries, review expectations
  • maintain a "why this is structured this way" architecture guide
  • log selected before/after refactors to demonstrate how AI was used responsibly

Artifacts to add:

  • docs/engineering-doctrine.md
  • docs/architecture-decision-records/
  • docs/ai-collaboration-standards.md
  • a small set of "reference slices" that show exemplary patterns end to end

Suggested Order Of Execution

  1. router/component decomposition
  2. architecture fitness checks in CI
  3. full operational standardization
  4. verified production rollout of Redis-backed rate limiting
  5. performance hotspot reduction

Success Criteria For The Next 60 Days

  • no critical or high-severity known security gap remains open without an owner and due date
  • no core router continues to grow unchecked
  • at least one major domain slice is refactored into a clear "reference implementation" pattern
  • production deployment uses the same artifact that passed CI
  • the repo gains explicit AI-development rules that improve reviewability instead of just increasing output

Bottom Line

CapaKraken is already good enough to justify further investment. It is not a cleanup disaster.

The opportunity is not to rebuild it. The opportunity is to harden the weak edges, reduce oversized ownership surfaces, and make the engineering standards visible enough that the repository becomes evidence that AI can accelerate serious software without normalizing architectural debt.