feat: Sprint 0 — CI/CD pipeline, production Docker, health checks

CI Pipeline (.github/workflows/ci.yml):
- 5 jobs: typecheck, lint, test, build, e2e (parallel where possible)
- PostgreSQL 16 + Redis 7 service containers for test/e2e
- pnpm store, Turborepo, Playwright browser caching
- Concurrency groups cancel in-progress runs

Production Docker:
- Dockerfile.prod: 3-stage build (deps → build → runtime ~150MB)
- docker-compose.prod.yml: postgres + redis + app with health checks
- .dockerignore for fast builds
- next.config.ts: output: "standalone" for minimal runtime

Health Check Endpoints:
- GET /api/health: liveness probe (200 OK, no deps)
- GET /api/ready: readiness probe (postgres + redis connectivity)

Documentation:
- docs/ci-cd-manual.md: full pipeline manual with troubleshooting
- plan.md: Product Owner strategic plan (bottlenecks, growth, automation)

Co-Authored-By: claude-flow <ruv@ruv.net>

@@ -0,0 +1,316 @@
# Planarchy CI/CD Manual

## Overview

Planarchy uses GitHub Actions for continuous integration and Docker for deployment. This document covers the full pipeline from code push to production.

---

## 1. CI Pipeline (Automatic on every PR and push to `main`)

### What triggers it

| Event | What runs |
|-------|-----------|
| Pull request to `main` | All CI jobs run |
| Push to `main` | All CI jobs run |

### Jobs and their purpose

```
PR opened / pushed
  │
  ├──→ typecheck  (tsc --noEmit, ~40s)
  ├──→ lint       (ESLint via Turborepo, ~20s)
  ├──→ test       (Vitest unit tests, ~60s, needs PostgreSQL + Redis)
  │
  └──→ build      (next build, ~90s, runs after typecheck)
         │
         └──→ e2e (Playwright, ~3-5min, runs after build)
```

**typecheck, lint, and test run in parallel** for speed. Build waits for typecheck. E2E waits for build.
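
The ordering described above can be modeled as a small dependency graph. The following is a minimal sketch, not the workflow itself: the `JobGraph` type, the `ciJobs` object, and the `executionWaves` helper are hypothetical names introduced for illustration, and only the job names and their "runs after" relations come from this manual.

```typescript
// Hypothetical model of the CI job graph: job name -> jobs it waits for.
type JobGraph = Record<string, string[]>;

const ciJobs: JobGraph = {
  typecheck: [],
  lint: [],
  test: [],
  build: ["typecheck"],
  e2e: ["build"],
};

// Group jobs into "waves": every job in a wave has all of its dependencies
// satisfied by earlier waves, so the jobs in one wave can run in parallel.
function executionWaves(graph: JobGraph): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = Object.keys(graph);

  while (remaining.length > 0) {
    const ready = remaining.filter((job) =>
      graph[job].every((dep) => done.has(dep)),
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle or unknown dependency in job graph");
    }
    waves.push(ready);
    ready.forEach((job) => done.add(job));
    remaining = remaining.filter((job) => !done.has(job));
  }
  return waves;
}

// First wave: typecheck, lint, test. Second: build. Third: e2e.
console.log(executionWaves(ciJobs));
```

Waves are a simplification: a real runner would likely start `build` as soon as `typecheck` finishes, without waiting for `lint` and `test`.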

### What each job checks

| Job | Command | What it catches |
|-----|---------|----------------|
| **typecheck** | `pnpm --filter @planarchy/web exec tsc --noEmit` | Type errors across the full web app |
| **lint** | `pnpm lint` | Code style violations, unused imports, etc. |
| **test** | `pnpm test:unit` | Unit test failures in engine, staffing, API, shared |
| **build** | `pnpm --filter @planarchy/web exec next build` | SSR errors, dynamic import issues, bundle problems |
| **e2e** | `pnpm test:e2e` | End-to-end user flow regressions |

### Required status checks

Before merging a PR, **all 5 jobs must pass**. Configure this in GitHub Settings > Branches > Branch protection rules > Require status checks.

### Caching

The pipeline caches these artifacts to speed up subsequent runs:

| Cache | Key | Saves |
|-------|-----|-------|
| pnpm store | `pnpm-lock.yaml` hash | ~30s install time |
| Turborepo | `.turbo` directory | ~60s on unchanged packages |
| Playwright browsers | Playwright version | ~45s browser download |

---

## 2. Local Development Quality Gates

Run these before pushing to catch issues early:

```bash
# Quick check: types and lint (< 2 min)
pnpm --filter @planarchy/web exec tsc --noEmit && pnpm lint

# Full check: unit tests (< 3 min, needs PostgreSQL + Redis)
pnpm test:unit

# Full check including build (< 5 min)
pnpm --filter @planarchy/web exec next build
```

### Pre-commit hook (optional)

You can add a Git pre-commit hook to run the quick check automatically:

```bash
# .husky/pre-commit
pnpm --filter @planarchy/web exec tsc --noEmit
pnpm lint
```

---

## 3. Health Check Endpoints

Two endpoints are available for monitoring:

### GET `/api/health`: Liveness Probe

Returns 200 if the Node.js process is running. No external dependencies are checked.

```json
{ "status": "ok", "timestamp": "2026-03-19T10:00:00.000Z" }
```

**Use for:** Kubernetes/Docker liveness probe, uptime monitoring.

### GET `/api/ready`: Readiness Probe

Checks PostgreSQL and Redis connectivity. Returns 200 if both services are reachable, 503 otherwise.

```jsonc
// Healthy (HTTP 200)
{ "status": "ready", "postgres": "ok", "redis": "ok" }

// Unhealthy (HTTP 503)
{ "status": "not_ready", "postgres": "ok", "redis": "error" }
```

**Use for:** Kubernetes/Docker readiness probe, load balancer health checks, nginx upstream checks.
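
As an illustration of the readiness contract described above, here is a minimal sketch of how the per-dependency results could be combined into the status code and JSON body. It is not the actual route handler: `buildReadinessResponse`, `probe`, `checkReadiness`, and the `pingPostgres` / `pingRedis` callbacks are hypothetical names, and only the response shape and the 200/503 rule come from this manual.

```typescript
type CheckResult = "ok" | "error";

interface ReadinessBody {
  status: "ready" | "not_ready";
  postgres: CheckResult;
  redis: CheckResult;
}

// Turn individual dependency results into the HTTP status and JSON body.
function buildReadinessResponse(
  postgresOk: boolean,
  redisOk: boolean,
): { httpStatus: 200 | 503; body: ReadinessBody } {
  const allOk = postgresOk && redisOk;
  return {
    httpStatus: allOk ? 200 : 503,
    body: {
      status: allOk ? "ready" : "not_ready",
      postgres: postgresOk ? "ok" : "error",
      redis: redisOk ? "ok" : "error",
    },
  };
}

// Run one check and convert a thrown error or rejection into `false`,
// so a failing dependency cannot crash the endpoint.
async function probe(check: () => Promise<unknown>): Promise<boolean> {
  try {
    await check();
    return true;
  } catch {
    return false;
  }
}

// `pingPostgres` and `pingRedis` are assumptions: in a real handler they
// might run `SELECT 1` and a Redis `PING` respectively.
async function checkReadiness(
  pingPostgres: () => Promise<unknown>,
  pingRedis: () => Promise<unknown>,
) {
  const [postgresOk, redisOk] = await Promise.all([
    probe(pingPostgres),
    probe(pingRedis),
  ]);
  return buildReadinessResponse(postgresOk, redisOk);
}
```

Keeping the mapping in a pure function separate from the network probes makes the 200/503 rule easy to unit test without a database or Redis instance.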

---

## 4. Production Docker Build

### Building the production image

```bash
# Build the image
docker build -f Dockerfile.prod -t planarchy:latest .

# Test it locally
docker compose -f docker-compose.prod.yml up -d
```

### Image details

| Property | Value |
|----------|-------|
| Base | `node:20-bookworm-slim` |
| Size | ~150-200 MB (vs ~1.5 GB dev image) |
| Output | Next.js standalone mode |
| Healthcheck | `curl -f http://localhost:3000/api/health` |
| Port | 3000 (internal), mapped to 3100 externally |

### Environment variables

The production image reads these environment variables:

```env
# Required
DATABASE_URL=postgresql://user:pass@host:5432/planarchy
REDIS_URL=redis://host:6379
NEXTAUTH_URL=https://planarchy.your-domain.com
NEXTAUTH_SECRET=<random-secret>

# Optional
SENTRY_DSN=https://xxx@sentry.io/xxx
SMTP_HOST=smtp.example.com
SMTP_PORT=587
SMTP_USER=notifications@example.com
SMTP_PASSWORD=<password>
SMTP_FROM=Planarchy <notifications@example.com>
```

Generate a secure `NEXTAUTH_SECRET` (32 random bytes, base64-encoded):

```bash
openssl rand -base64 32
```
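
A startup check for the required variables listed above could look like the sketch below. This is an assumption about how one might validate configuration, not code from the repository: `REQUIRED_ENV_VARS` and `missingEnvVars` are hypothetical names, and only the variable names come from this manual.

```typescript
// Names taken from the "Required" block of the environment variable list.
const REQUIRED_ENV_VARS = [
  "DATABASE_URL",
  "REDIS_URL",
  "NEXTAUTH_URL",
  "NEXTAUTH_SECRET",
] as const;

// Return the names of required variables that are unset or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with a plain object; at startup you would pass the real environment.
const missing = missingEnvVars({
  DATABASE_URL: "postgresql://user:pass@host:5432/planarchy",
  REDIS_URL: "redis://host:6379",
  NEXTAUTH_URL: "",
});

// NEXTAUTH_URL (blank) and NEXTAUTH_SECRET (unset) are reported.
console.log(missing);
```

Failing fast with the full list of missing names is usually easier to debug than a later connection error from the first dependency that happens to be used.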

---

## 5. Deployment

### docker-compose (simplest)

```bash
# On your server
git pull
docker compose -f docker-compose.prod.yml up -d --build

# Apply the Prisma schema to the database
# (db push syncs the schema directly; it does not create migration files)
docker compose -f docker-compose.prod.yml exec app \
  npx prisma db push --skip-generate

# Seed initial data (first deployment only)
docker compose -f docker-compose.prod.yml exec app \
  npx prisma db seed
```

### Manual deployment (current setup)

`planarchy.hartmut-noerenberg.com` currently runs behind nginx, so the app is built and restarted by hand:

```bash
# On the server
cd /home/hartmut/Documents/Copilot/planarchy
git pull origin main
pnpm install
pnpm --filter @planarchy/db exec prisma generate
pnpm --filter @planarchy/web exec next build
rm -rf apps/web/.next/cache  # clear stale cache

# Restart the app (systemd, pm2, or manual)
fuser -k 3100/tcp 2>/dev/null
PORT=3100 pnpm --filter @planarchy/web start &
```

### nginx configuration

The existing nginx reverse proxy should forward to port 3100:

```nginx
server {
    server_name planarchy.hartmut-noerenberg.com;

    location / {
        proxy_pass http://127.0.0.1:3100;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # SSE support (keep connection open)
        proxy_read_timeout 86400s;
        proxy_buffering off;
    }
}
```

---

## 6. Monitoring Setup

### Sentry (error tracking)

After creating a Sentry project, add the DSN to `.env.production`:

```env
SENTRY_DSN=https://xxx@sentry.io/xxx
```

Errors are automatically captured by the Sentry integration in Next.js.

### Uptime monitoring

Point an external monitor (UptimeRobot, Better Stack, etc.) at:

```
https://planarchy.hartmut-noerenberg.com/api/health
```

Alert if status code != 200 for more than 2 consecutive checks.
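
The alert rule above can be expressed as a small function. This is a sketch under the assumption that "more than 2 consecutive checks" means at least three failed checks in a row ending at the most recent one; `shouldAlert` and `maxTolerated` are hypothetical names, not part of any monitoring tool's API.

```typescript
// Decide whether to alert, given the status codes of recent checks
// (oldest first). `maxTolerated = 2` mirrors "more than 2 consecutive checks".
function shouldAlert(statusCodes: number[], maxTolerated = 2): boolean {
  let consecutiveFailures = 0;
  // Walk backwards from the most recent check until a success is found.
  for (let i = statusCodes.length - 1; i >= 0; i--) {
    if (statusCodes[i] === 200) break;
    consecutiveFailures++;
  }
  return consecutiveFailures > maxTolerated;
}

console.log(shouldAlert([200, 503, 503]));      // false: only 2 failures in a row
console.log(shouldAlert([200, 503, 503, 503])); // true: 3 failures in a row
console.log(shouldAlert([503, 503, 503, 200])); // false: latest check recovered
```

Tolerating a couple of failed checks avoids paging on a single dropped request or a brief restart during deployment.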

---

## 7. Troubleshooting

### CI job fails: "tsc --noEmit"

TypeScript error in the web app. Run locally:
```bash
pnpm --filter @planarchy/web exec tsc --noEmit
```

### CI job fails: "test:unit"

Unit test failure. Run locally:
```bash
pnpm test:unit
```

### CI job fails: "next build"

Build error (often `ssr: false` in Server Components, missing exports). Run locally:
```bash
pnpm --filter @planarchy/web exec next build
```

### CI job fails: "e2e"

Playwright test failure. Check the HTML report artifact in the GitHub Actions run.

### Production: 502 Bad Gateway

nginx cannot reach the Next.js process, usually because it is not running. Check:
```bash
ss -tlnp | grep 3100                 # Is anything listening?
tail -n 50 /tmp/planarchy-dev.log    # Check app logs
```

Restart:
```bash
fuser -k 3100/tcp 2>/dev/null
PORT=3100 pnpm --filter @planarchy/web start &  # production mode; use pnpm dev only for a dev server
```

### Production: 500 Internal Server Error

Usually a stale Prisma client after schema changes:
```bash
pnpm --filter @planarchy/db exec prisma generate
rm -rf apps/web/.next
pnpm --filter @planarchy/web exec next build
# Restart the server
```

### Database connection issues

Check the `/api/ready` endpoint:
```bash
curl -s https://planarchy.hartmut-noerenberg.com/api/ready | jq .
```

If the response contains `"postgres": "error"`, verify:
```bash
docker ps | grep postgres                              # Is the container running?
psql -h localhost -p 5433 -U planarchy -d planarchy    # Can you connect?
```