docs: consolidate all plans into ROADMAP.md

Merges PLAN.md (Phases A-F), PLAN_REFACTOR.md (Phases 1-8), plan.md (GMSH),
docs/rfcs/0001, and visual-audit-report findings into a single prioritized roadmap.

10 priorities with dependency graph and 'what to do next' decision options.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 14:42:53 +01:00
parent ca62319688
commit 208370628e
@@ -0,0 +1,260 @@
# Schaeffler Automat — Master Roadmap
> **Consolidated:** 2026-03-11
> **Branch:** `refactor/v2`
> **Sources merged:** `PLAN.md` (Phases A-F), `PLAN_REFACTOR.md` (Phases 1-8), `plan.md` (GMSH), `docs/rfcs/0001`, `visual-audit-report.md`
---
## ✅ What Is Done
| Area | Detail |
|---|---|
| Phase A | Flamenco removed, blender-renderer → render-worker Celery container, threejs-renderer removed, MinIO added |
| Phase B | Domain-driven project structure, Tenant model + RLS migrations 035/036, Tenant management UI |
| Phase C | WorkflowDefinition model, standard workflows seeded, React Flow Workflow Editor |
| Phase D | OCC mesh attributes extraction, Blender integration |
| Phase E | MediaAsset catalog (model + API + frontend) |
| Sharp Edges V02 | GCPnts curve sampling → 17,129 segment pairs in GLB extras → Blender KD-tree marks sharp+seam |
| Tessellation presets | Draft/Standard/Fine preset buttons in Admin UI, default deflections updated |
| Media cache-bust | `?v={file_size_bytes}` in download URLs + `Cache-Control: no-cache` headers |
| GPU activation order | Fix: `_activate_gpu()` called before AND after `open_mainfile` to survive engine reset |
| Material system | Aliases-first lookup, `get_material_library_path()` via AssetLibrary |
---
## 🗺 Open Work — Prioritized
### Priority 1 — Tessellation Quality (Blocking visual quality)
**Goal:** Eliminate fan triangles and faceting on cylindrical surfaces (rings, bearings).
**Plan:** `plan.md` (6 tasks, GMSH Frontal-Delaunay as BRepMesh replacement)
| Task | File | What |
|---|---|---|
| T1 | `render-worker/Dockerfile` | `pip install "gmsh>=4.15.0"` (quoted so the shell does not treat `>` as a redirect) |
| T2 | `export_step_to_gltf.py` | `--tessellation_engine occ\|gmsh` CLI arg |
| T3 | `export_step_to_gltf.py` | `_tessellate_with_gmsh()`: BREP → GMSH → Poly_Triangulation write-back |
| T4 | `admin.py` | `tessellation_engine` setting in SETTINGS_DEFAULTS + SettingsOut |
| T5 | `export_glb.py` | Read setting, pass `--tessellation_engine` to CLI |
| T6 | `Admin.tsx` | Dropdown: OCC vs. GMSH with description |
**Risk:** GMSH Surface-Tag ↔ OCC Face mapping must be verified experimentally.
**Status:** Not started. `/implement` to begin.
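
Task T2 can be sketched with plain `argparse`. This is a minimal illustration, not the actual contents of `export_step_to_gltf.py`: the positional `step_path` argument, the `occ` default, and the helper names `build_parser` and `select_engine` are assumptions made for the example. Only the flag name and its two values come from the table above.

```python
import argparse

# The two engine names come from task T2; everything else here is illustrative.
ENGINES = ("occ", "gmsh")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Export a STEP file to glTF")
    # Hypothetical positional argument, added only to make the sketch complete.
    parser.add_argument("step_path", help="input STEP file")
    parser.add_argument(
        "--tessellation_engine",
        choices=ENGINES,
        default="occ",  # assumed default: keep the existing BRepMesh path
        help="occ = OpenCASCADE BRepMesh, gmsh = GMSH Frontal-Delaunay",
    )
    return parser


def select_engine(argv: list[str]) -> str:
    """Parse argv and return the chosen tessellation engine."""
    return build_parser().parse_args(argv).tessellation_engine
```

Using `choices` makes an unknown engine fail at the CLI boundary, so a typo in the admin setting (T4/T5) would surface as an argument error rather than a silent fallback.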
---
### Priority 2 — Dead Code Deletion (Quick wins, no risk)
**Goal:** Remove code that is provably dead and confuses future readers.
| Item | Location | Why Dead |
|---|---|---|
| Pillow overlay block | `blender_render.py` lines 798-851 | `transparent_bg=True` always, `else:` branch never runs |
| STL workflow | `admin.py`, `cad.py`, multiple tasks | Pipeline is GLB-only; `stl_quality`, `VALID_STL_QUALITIES`, `stl_size_bytes` all orphaned |
| `render_order_line_task` in step_tasks | `step_tasks.py` lines 705-1050 | Duplicates `rendering/tasks.render_order_line_still_task` |
| `blender-renderer/` directory | repo root | Removed from docker-compose.yml already |
| `threejs-renderer/` directory | repo root | Migration 033 removed it from services |
| `flamenco/` directory | repo root | Migration 032 removed Flamenco |
| `renderproblems_tmp/` directory | repo root | Temp debugging screenshots, not code |
**Estimated effort per agent:** 1 session
**Can run in parallel with Priority 3.**
---
### Priority 3 — Celery Task Decomposition (Maintainability, parallel-safe)
**Goal:** Split `step_tasks.py` (1,170 lines, 8+ pipeline steps) into focused modules.
**Target structure:**
```
backend/app/tasks/
├── step_tasks.py → keep only: process_step_file (thin dispatch)
├── pipeline/
│ ├── extract.py → extract_cad_metadata (OCC parsing, 0.1s)
│ ├── thumbnail.py → render_step_thumbnail (Blender call)
│ ├── stills.py → render_order_line_still_task (Cycles still renders)
│ └── turntable.py → render_order_line_turntable_task
```
Also: split `render-worker/scripts/blender_render.py` (853 lines) into sub-modules:
```
render-worker/scripts/
├── blender_render.py → entry point only (~80 lines)
├── _blender_gpu.py → GPU probe + activation
├── _blender_import.py → GLB import, rotation, smooth shading
├── _blender_materials.py → material library application + fallback
├── _blender_camera.py → auto camera from bbox, clip planes
└── _blender_scene.py → scene setup (Mode A vs Mode B)
```
**Estimated effort:** 2 sessions
**Depends on:** Priority 2 (delete duplicate task first)
---
### Priority 4 — Render Job Tracking (Correctness bug)
**Goal:** Fix the broken render job cancellation (synthetic `render-{line_id}` ID never matches real Celery task ID → `revoke()` is a no-op).
**What to build:**
- `RenderJobDocument` Pydantic schema stored in `order_lines.render_job_doc` (JSONB)
- Fields: `celery_task_id`, `state` FSM, `steps[]` with timing, `gpu_info`
- Migration: `alembic revision` — add `render_job_doc JSONB` to `order_lines`
- Update render tasks to write real `self.request.id` into the document
- Fix `orders.py` cancel endpoint to read `celery_task_id` from the document
**Files:**
- New: `backend/app/domains/rendering/job_document.py`
- New migration `06x_render_job_document.py`
- Modified: `render_order_line.py`, `render_thumbnail.py`, `orders.py`
**Estimated effort:** 1 session
**Depends on:** Priority 3 (need clean task structure first)
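
A dependency-free sketch of what the job document could look like. The roadmap calls for a Pydantic schema. Standard-library dataclasses are used here only so the example runs without extra packages, and a Pydantic model would carry the same fields. The state names, the transition table, and the `StepTiming` helper are assumptions, since the roadmap only specifies "`state` FSM" and "`steps[]` with timing".

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field
from typing import Optional

# Hypothetical state machine: maps each state to the states it may move to.
TRANSITIONS = {
    "queued": {"running", "cancelled"},
    "running": {"done", "failed", "cancelled"},
    "done": set(),
    "failed": set(),
    "cancelled": set(),
}


@dataclass
class StepTiming:
    name: str
    duration_s: float


@dataclass
class RenderJobDocument:
    # The real Celery id (self.request.id), not a synthetic "render-{line_id}".
    celery_task_id: str
    state: str = "queued"
    steps: list[StepTiming] = field(default_factory=list)
    gpu_info: Optional[dict] = None

    def transition(self, new_state: str) -> None:
        """Move to new_state, rejecting transitions the FSM does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

    def to_jsonb(self) -> dict:
        """Plain dict, suitable for storing in a JSONB column."""
        return asdict(self)
```

With the real task id stored, the cancel endpoint could read `celery_task_id` from the document and pass it to Celery's `app.control.revoke(task_id, terminate=True)`, which is the call that is currently a no-op because of the synthetic id.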
---
### Priority 5 — Structured Logging (Observability)
**Goal:** Replace inconsistent `logger.info(f"...")` / `emit()` / `log_task_event()` mix with a `PipelineLogger` that writes to Python logging + Redis SSE + DB.
**What to build:**
- `backend/app/core/pipeline_logger.py`: `PipelineLogger` class
  - `step_start(step, context)` → `[STEP_NAME] starting — context`
  - `step_done(step, duration_s, result)` → `[STEP_NAME] done in 0.34s`
  - `step_error(step, error, exc)` → `[STEP_NAME] ERROR: ...`
- Optional new table: `pipeline_events(task_id, step_name, level, message, duration_s, context JSONB)`
- Migrate all task files to use PipelineLogger
**Estimated effort:** 1-2 sessions
**Can start immediately, but has broad blast radius (touches all task files).**
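
One possible shape for `PipelineLogger`, sketched with the standard `logging` module only. The three method names and their parameters follow the list above. The `sinks` hook, the `Sink` type, and the event dictionary keys are assumptions standing in for the Redis SSE and DB writers, which are not shown. Message formats approximate the ones listed above.

```python
from __future__ import annotations

import logging
from typing import Any, Callable, Optional

# Hypothetical hook type: one callable per extra destination (Redis SSE, DB).
Sink = Callable[[dict], None]


class PipelineLogger:
    def __init__(self, task_id: str, sinks: Optional[list[Sink]] = None) -> None:
        self.task_id = task_id
        self.sinks = list(sinks or [])
        self._log = logging.getLogger("pipeline")

    def _emit(self, level: int, step: str, message: str,
              exc: Optional[BaseException] = None, **fields: Any) -> None:
        # exc_info accepts an exception instance; None means no traceback.
        self._log.log(level, "[%s] %s", step, message, exc_info=exc)
        event = {
            "task_id": self.task_id,
            "step_name": step,
            "level": logging.getLevelName(level),
            "message": message,
            **fields,
        }
        for sink in self.sinks:
            sink(event)

    def step_start(self, step: str, context: Any) -> None:
        self._emit(logging.INFO, step, f"starting - {context}", context=context)

    def step_done(self, step: str, duration_s: float, result: Any = None) -> None:
        self._emit(logging.INFO, step, f"done in {duration_s:.2f}s",
                   duration_s=duration_s, result=result)

    def step_error(self, step: str, error: Any,
                   exc: Optional[BaseException] = None) -> None:
        self._emit(logging.ERROR, step, f"ERROR: {error}", exc=exc,
                   error=str(error))
```

Routing every destination through one `_emit` method keeps the three outputs in step, which is the inconsistency the goal above describes.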
---
### Priority 6 — Tenant Isolation Completion (Security/correctness)
**Goal:** Make RLS actually work. Currently `build_tenant_db_dep()` yields `db` without calling `set_tenant_context()`, so all tenant isolation is a silent no-op.
**What to build:**
- `TenantContextMiddleware` in `backend/app/core/middleware.py`
- Extracts `tenant_id` from JWT, stores in `request.state`
- After DB session acquired: `SET LOCAL app.current_tenant_id = '...'`
- Celery tasks: add `set_tenant_context(db, tenant_id)` call at start of each task (Celery bypasses HTTP middleware)
- Role hierarchy migration: `admin` → `global_admin`, add `tenant_admin`
**Files:**
- New: `backend/app/core/middleware.py`
- Migration: `06x_role_hierarchy.py`
- Modified: `backend/app/main.py` (register middleware), `utils/auth.py`, all routers
**Estimated effort:** 2 sessions
**Depends on:** Stable auth layer (Priority 3, which itself follows the Priority 2 dead code removal)
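
A sketch of what `set_tenant_context()` might do. PostgreSQL's `SET LOCAL` statement does not accept bind parameters, so the example uses the built-in `set_config(name, value, is_local)` function, which with `is_local = true` has the same transaction-scoped effect. The `SupportsExecute` protocol, the `:tenant_id` parameter style, and the assumption that tenant IDs are UUIDs are illustrative and not confirmed by the roadmap.

```python
import uuid
from typing import Any, Protocol


class SupportsExecute(Protocol):
    """Hypothetical minimal session interface used by this sketch."""

    def execute(self, statement: str, params: dict[str, Any]) -> Any: ...


def set_tenant_context(db: SupportsExecute, tenant_id: str) -> None:
    # Assumption: tenant ids are UUIDs. Parsing rejects malformed values
    # before they reach the database.
    tenant = str(uuid.UUID(tenant_id))
    # set_config(..., true) is transaction-local, like SET LOCAL, but it
    # takes the value as a bind parameter instead of string interpolation.
    db.execute(
        "SELECT set_config('app.current_tenant_id', :tenant_id, true)",
        {"tenant_id": tenant},
    )
```

With a SQLAlchemy 2.x session the statement string would need to be wrapped in `text()`. Because the setting is transaction-local, both the middleware and each Celery task would have to call this after the transaction has begun.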
---
### Priority 7 — UV Unwrap Workflow (New feature, user-requested)
**Goal:** Produce UV-unwrapped geometry GLBs with clean seams for downstream texture authoring.
**Depends on:** Priority 1 (GMSH tessellation must produce conforming seams first).
**What to build:**
- New Blender script `_blender_uvunwrap.py` called from `export_gltf.py` after sharp edge marking
- UV unwrap via `bpy.ops.uv.unwrap(method='ANGLE_BASED', margin=0.001)`
- Seams are already set by `_apply_sharp_edges_from_occ()` (edge.seam=True)
- UV coordinates embedded in the production GLB
- Admin toggle: `uv_unwrap_enabled` in system_settings
**Estimated effort:** 1 session
**Depends on:** Priority 1 (GMSH) for clean seams
---
### Priority 8 — UI/UX Polish (from visual-audit-report.md)
**Top actionable items from the audit:**
| Item | Where | Fix |
|---|---|---|
| Tooltip system | All settings pages | Add `title` or tooltip component to every input |
| Empty state messages | MediaBrowser, ProductLibrary | "No assets yet — upload a STEP file" |
| Notification batching | NotificationCenter | Group per-render noise into job summaries |
| Mobile navigation | Layout.tsx | Hamburger menu for viewport < 768px |
| Kanban rejection flow | OrderDetail | Drag-to-reject with reason field |
**Estimated effort:** 2 sessions
**Independent of all backend priorities.**
---
### Priority 9 — Phase F: Hash-based Conversion Caching (Performance)
**Goal:** Skip re-tessellation if STEP file hash hasn't changed. Cache geometry GLB by `sha256(step_file)`.
**What to build:**
- `cad_files.step_hash` column (SHA256, nullable)
- Before GLB generation: compute hash, check if cached GLB exists for same hash
- If hit: copy cached GLB, skip OCC+GMSH, skip Blender
- Migration: `06x_step_hash_column.py`
**Estimated effort:** 1 session
**Depends on:** Priority 1 (GMSH) stable first
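
The lookup described above can be sketched as follows. This is a file-system illustration under stated assumptions: the cache directory layout (`<sha256>.glb`), the function names, and the `build` callback standing in for the OCC/GMSH and Blender stages are all hypothetical. The roadmap itself stores the hash in `cad_files.step_hash`.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large STEP files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def get_or_build_glb(
    step_path: Path,
    cache_dir: Path,
    out_path: Path,
    build: Callable[[Path, Path], None],
) -> bool:
    """Return True on a cache hit, False when `build` had to run."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / f"{sha256_file(step_path)}.glb"
    if cached.exists():
        # Hit: copy the cached GLB and skip tessellation entirely.
        shutil.copyfile(cached, out_path)
        return True
    # Miss: run the expensive conversion, then remember its result.
    build(step_path, out_path)
    shutil.copyfile(out_path, cached)
    return False
```

A cache key built from the file hash alone would return stale geometry after a change to tessellation settings. Whether the key should also include the engine and deflection values is a design question the roadmap leaves open.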
---
### Priority 10 — USD Workflow RFC (Strategic, long-term)
**Document:** `docs/rfcs/0001-step-to-usd-workflow.md`
**Status:** Proposed only — not planned for implementation yet.
**Summary:** Replace dual-GLB pipeline (geometry GLB → production GLB) with a single USD canonical scene. Three.js has no USD loader, so this requires either switching viewers or waiting for Three.js USD support.
**Decision needed:** Is the viewer constraint acceptable? Deferred until Priorities 1-4 are complete.
---
## Dependency Graph
```
Priority 1 (GMSH)
├── Priority 7 (UV Unwrap)
└── Priority 9 (Hash Cache)
Priority 2 (Dead Code)
└── Priority 3 (Task Decomposition)
    ├── Priority 4 (Render Job Tracking)
    └── Priority 6 (Tenant Isolation)
Priority 5 (Logging) — independent, can start anytime
Priority 8 (UI/UX) — independent, can start anytime
Priority 10 (USD) — deferred
```
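
The same dependencies can be checked mechanically. The sketch below encodes the "Depends on" lines of each priority as a mapping and uses the standard-library `graphlib` module to list what can start immediately and to produce one valid execution order. Priority 10 is left out because it is deferred. The mapping is a reading of this document, not an authoritative schedule, and the function names are illustrative.

```python
from graphlib import TopologicalSorter

# priority -> set of priorities it depends on, as read from this roadmap.
DEPENDS_ON = {
    1: set(),
    2: set(),
    3: {2},
    4: {3},
    5: set(),
    6: {3},
    7: {1},
    8: set(),
    9: {1},
}


def startable_now(deps: dict[int, set[int]]) -> list[int]:
    """Priorities with no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    return sorted(sorter.get_ready())


def one_valid_order(deps: dict[int, set[int]]) -> list[int]:
    """One ordering in which every priority follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())
```

`startable_now(DEPENDS_ON)` returns priorities 1, 2, 5 and 8, which are the starting points offered under the options that follow.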
---
## What To Do Next
**Option A — Fix visual quality first:**
→ `/implement` on `plan.md` (Priority 1: GMSH tessellation)
**Option B — Clean up dead code first (low risk, fast wins):**
→ Start Priority 2 dead code deletion, then Priority 3 task decomposition
**Option C — Parallel sprint (2 agents):**
→ Agent 1: Priority 1 (GMSH) in worktree
→ Agent 2: Priority 2+3 (dead code + task split) in separate worktree
**Option D — UI/UX sprint:**
→ Priority 8 audit items, completely independent of backend
---
## Archive
Old planning files are kept for reference but superseded by this document:
- `PLAN.md` — original Phase A-F plan (Phases A-E complete, Phase F = Priority 9 here)
- `PLAN_REFACTOR.md` — 1,173-line architectural plan (Phases 1-8 mapped to Priorities 2-8 above)
- `plan.md` — active GMSH implementation plan (Priority 1)
- `docs/rfcs/0001-step-to-usd-workflow.md` — USD RFC (Priority 10)
- `review-report.md` — latest code review results
- `visual-audit-report.md` — UX audit results