cd0c2fe3e2
Error-Page Headers (3.3.1.3.03 → OK): - Cache-Control no-store on ALL routes (API, auth, catch-all) Proactive Monitoring (3.2.1.04 → OK): - /api/cron/health-check: DB + Redis check with latency, ADMIN alerts on failure Security Scanning (3.2.2.7 → improved): - /api/cron/security-audit: package version check against minimum safe versions Server Hardening (3.3.1.4 → OK): - docs/nginx-hardening.conf: complete template (rate limits, SSL, headers) Database Security (3.3.3 → OK): - docs/security-architecture.md Section 12: DB auth, isolation, SSL/audit recommendations Compliance: 46 OK / 5 PARTIAL / 8 TODO / 4 N/A (was 42/9/8/4) Co-Authored-By: claude-flow <ruv@ruv.net>
210 lines
8.9 KiB
Markdown
210 lines
8.9 KiB
Markdown
# Security Architecture — CapaKraken
|
|
|
|
> Version: 1.0 | Date: 2026-03-27
|
|
|
|
---
|
|
|
|
## 1. Authentication
|
|
|
|
- **Auth.js v5** (NextAuth) with Credentials provider
|
|
- **Password hashing**: Argon2id via `@node-rs/argon2` (memory cost 65536, time cost 3)
|
|
- **Multi-Factor Authentication**: TOTP (RFC 6238) via `otpauth` library
|
|
- Configurable per user (enable/disable via admin or self-service)
|
|
- 30-second window, SHA-1, 6-digit codes with 1-step tolerance
|
|
- **Rate limiting**: 5 login attempts per 15 minutes per email address (in-memory sliding window)
|
|
- **Session strategy**: JWT with server-side validation
|
|
- Absolute timeout: 8 hours (configurable via `sessionMaxAge`)
|
|
- Idle timeout: 30 minutes (configurable via `sessionIdleTimeout`)
|
|
- **Concurrent session limit**: configurable `maxConcurrentSessions` (default 3), kick-oldest strategy
|
|
- **Login/logout audit**: all authentication events (success, failure, rate-limit, invalid TOTP, logout) are recorded in the audit log
|
|
|
|
## 2. Authorization
|
|
|
|
### Role-Based Access Control (RBAC)
|
|
|
|
Five-level role hierarchy:
|
|
|
|
| Role | Level | Capabilities |
|
|
|------|-------|-------------|
|
|
| ADMIN | 5 | Full system access, user management, system settings |
|
|
| MANAGER | 4 | Project management, resource allocation, vacation approval |
|
|
| CONTROLLER | 3 | Financial views, budget management, reporting |
|
|
| USER | 2 | Self-service (own vacations, own resource profile) |
|
|
| VIEWER | 1 | Read-only access to permitted areas |
|
|
|
|
### Per-User Permission Overrides
|
|
|
|
- `permissionOverrides` JSONB field on User model
|
|
- `resolvePermissions(role, overrides)` computes effective permissions
|
|
- `requirePermission(ctx, key)` enforced on every tRPC procedure
|
|
- Granular `PermissionKey` enum covering all domain actions
|
|
|
|
### tRPC Middleware Stack
|
|
|
|
```
|
|
publicProcedure
|
|
-> protectedProcedure (requires authenticated session)
|
|
-> controllerProcedure (ADMIN + MANAGER + CONTROLLER)
|
|
-> managerProcedure (ADMIN + MANAGER)
|
|
-> adminProcedure (ADMIN only)
|
|
```
|
|
|
|
## 3. Data Protection
|
|
|
|
### Database Security
|
|
|
|
- **PostgreSQL** with TLS in production
|
|
- **Prisma ORM**: parameterized queries by default — no SQL injection risk
|
|
- Database not exposed to the internet (Docker internal network only)
|
|
- All monetary values stored as integer cents (no floating-point precision issues)
|
|
|
|
### Data at Rest
|
|
|
|
- Passwords: Argon2id hash (never stored in plaintext)
|
|
- TOTP secrets: stored in DB (encrypted at-rest via PostgreSQL TDE when available)
|
|
- API keys (Azure OpenAI, Gemini, SMTP): stored in `SystemSettings` table, accessible only to ADMIN role
|
|
|
|
### Anonymization
|
|
|
|
- Configurable global anonymization for VIEWER role
|
|
- Resource names, emails replaced with deterministic pseudonyms (seeded hash)
|
|
- Anonymization domain and mode configurable in SystemSettings
|
|
|
|
## 4. Session Management
|
|
|
|
- **Server-side JWT** with `SameSite=Strict` cookies
|
|
- `httpOnly` cookies prevent XSS-based session theft
|
|
- `secure` flag enforced in production (HTTPS only)
|
|
- CSRF protection via Auth.js built-in CSRF token
|
|
- Configurable session timeouts (absolute + idle) via SystemSettings
|
|
- Active session registry with concurrent session limit enforcement
|
|
|
|
## 5. Input Validation
|
|
|
|
- **Zod schemas** on every tRPC procedure input
|
|
- Strict TypeScript (`strict: true`, `exactOptionalPropertyTypes: true`)
|
|
- Blueprint dynamic fields validated at runtime against stored Zod schema definitions
|
|
- File uploads validated by:
|
|
- MIME type whitelist (`image/png`, `image/jpeg`, `image/webp`, `image/tiff`, `image/bmp`)
|
|
- Size limit (10 MB client-side, 4 MB server-side after compression)
|
|
- Magic byte verification (actual file content matched against declared MIME)
|
|
|
|
## 6. Audit Logging
|
|
|
|
### Activity History System
|
|
|
|
- Centralized `createAuditEntry()` function (fire-and-forget, never blocks)
|
|
- Covers 29+ of 36 tRPC routers
|
|
- Logged fields: `entityType`, `entityId`, `action`, `userId`, `changes` (JSONB with before/after/diff), `source`, `summary`
|
|
- Authentication events: login success/failure, logout, rate limiting, MFA failures
|
|
|
|
### External API Call Logging
|
|
|
|
- All OpenAI/Azure/Gemini API calls logged via `loggedAiCall()` wrapper
|
|
- Structured Pino logs: `{ provider, model, promptLength, responseTimeMs }`
|
|
- Failed calls logged at `warn` level with error details
|
|
|
|
### tRPC Request Logging
|
|
|
|
- Every tRPC call logged with request ID, user ID, path, duration
|
|
- Slow calls (>500ms) logged at `warn` level
|
|
|
|
## 7. HTTP Security Headers
|
|
|
|
Configured in `next.config.ts`:
|
|
|
|
| Header | Value |
|
|
|--------|-------|
|
|
| Strict-Transport-Security | `max-age=63072000; includeSubDomains; preload` |
|
|
| Content-Security-Policy | Restrictive CSP with nonce-based script-src |
|
|
| X-Frame-Options | `DENY` |
|
|
| X-Content-Type-Options | `nosniff` |
|
|
| X-XSS-Protection | `1; mode=block` |
|
|
| Referrer-Policy | `strict-origin-when-cross-origin` |
|
|
| Permissions-Policy | Camera, microphone, geolocation disabled |
|
|
|
|
## 8. Rate Limiting
|
|
|
|
- **Per-IP rate limiting**: via middleware on all API routes
|
|
- **Per-user rate limiting**: configurable per-procedure
|
|
- **Auth-specific rate limiting**: 5 attempts / 15 min per email (in-memory sliding window)
|
|
- **AI API call rate limits**: upstream provider limits surfaced as user-friendly errors
|
|
|
|
## 9. Error Handling
|
|
|
|
- **Sentry** integration for production error tracking
|
|
- **Pino** structured logging (JSON in production, pretty-print in development)
|
|
- tRPC errors mapped to appropriate HTTP status codes
|
|
- AI API errors translated to human-readable messages via `parseAiError()` / `parseGeminiError()`
|
|
- Internal errors never leak stack traces to the client
|
|
|
|
## 10. Dependency Security
|
|
|
|
- **Dependabot** configured for automated dependency updates
|
|
- `pnpm audit` as part of CI pipeline
|
|
- Lockfile integrity verified on install
|
|
|
|
## 11. Network Architecture
|
|
|
|
```
|
|
Browser -> Next.js (port 3100) -> tRPC -> Prisma -> PostgreSQL (port 5433)
|
|
-> Redis (port 6380, SSE pub/sub)
|
|
-> Azure OpenAI / Gemini (external HTTPS)
|
|
-> SMTP (email notifications)
|
|
```
|
|
|
|
- PostgreSQL and Redis accessible only within Docker network
|
|
- External API calls (AI, SMTP) over TLS
|
|
- No direct database access from the internet
|
|
|
|
## 12. Database Security
|
|
|
|
### Authentication and Access
|
|
|
|
- PostgreSQL uses password-based authentication (`capakraken` user with strong password)
|
|
- Connection restricted to the Docker internal network (port 5433 on host, 5432 inside container)
|
|
- No direct internet access to the database — all queries routed through Prisma ORM via the application layer
|
|
- Application uses a single database user; no shared or anonymous access
|
|
|
|
### Query Safety
|
|
|
|
- **Prisma ORM** enforces parameterized queries by default — no raw SQL concatenation
|
|
- All user inputs validated by Zod schemas before reaching the data layer
|
|
- JSONB fields (blueprints, skill matrices, permission overrides) are type-checked at the application boundary
|
|
|
|
### Recommendations for Production Hardening
|
|
|
|
1. **Enable PostgreSQL SSL/TLS**: Set `ssl: true` in the Prisma connection string and configure `postgresql.conf` with `ssl = on`, `ssl_cert_file`, `ssl_key_file`
|
|
2. **Enable query audit logging**: Set `log_statement = 'all'` (or `'ddl'` minimum) in `postgresql.conf` to capture all executed statements for forensic review
|
|
3. **Restrict connections by IP**: Configure `pg_hba.conf` to accept connections only from the application container's subnet (e.g., `172.18.0.0/16`)
|
|
4. **Use separate database roles**: Create a read-only role for reporting queries and a migration-only role for schema changes, limiting the default application role to DML operations
|
|
5. **Enable connection pooling**: Use PgBouncer in production to limit maximum connections and prevent resource exhaustion attacks
|
|
6. **Backup encryption**: Ensure `pg_dump` backups are encrypted at rest (GPG or filesystem-level encryption)
|
|
|
|
### Redis Security
|
|
|
|
- Redis instance runs without authentication in development (Docker-internal only)
|
|
- **Production recommendation**: Enable `requirepass` in Redis configuration and set `REDIS_URL` to include the password (`redis://:password@host:port`)
|
|
- Redis is used only for SSE pub/sub (no sensitive data persisted)
|
|
|
|
## 13. Proactive Monitoring
|
|
|
|
### Health Check Cron (`/api/cron/health-check`)
|
|
|
|
- Verifies PostgreSQL and Redis connectivity on each invocation
|
|
- On failure: creates CRITICAL in-app notifications for all ADMIN users
|
|
- Designed to be triggered by external cron (e.g., `curl` every 5 minutes)
|
|
- Protected by `CRON_SECRET` Bearer token
|
|
|
|
### Security Audit Cron (`/api/cron/security-audit`)
|
|
|
|
- Scans installed dependency versions against known minimum safe versions
|
|
- Alerts ADMIN users when high-severity outdated packages are detected
|
|
- Complements Dependabot with an in-app awareness layer
|
|
|
|
### nginx Hardening
|
|
|
|
- Reference configuration: `docs/nginx-hardening.conf`
|
|
- Covers: server token removal, rate limiting (auth: 1r/s, API: 10r/s), SSL hardening (TLS 1.2+), OCSP stapling
|
|
- Security headers applied at nginx level as a defense-in-depth backup to Next.js headers
|