Hartmut cbffcfbf8b docs: record usd-core decision, add Dockerfile task 1.0
- Mark USD library question as decided: usd-core>=24.11 (pxr module)
- Add Task 1.0 to USD implementation plan: Dockerfile install step
- Add usd-core to Priority 2 file targets in ROADMAP

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 15:13:10 +01:00
# Schaeffler Automat — Master Roadmap
> **Consolidated:** 2026-03-11
> **Branch:** `refactor/v2`
> **Sources merged:** `PLAN.md` (Phases A–F), `PLAN_REFACTOR.md` (Phases 1–8), `plan.md` (GMSH), `docs/rfcs/0001`, `visual-audit-report.md`
---
## ✅ What Is Done
| Area | Detail |
|---|---|
| Phase A | Flamenco removed, blender-renderer → render-worker Celery container, threejs-renderer removed, MinIO added |
| Phase B | Domain-driven project structure, Tenant model + RLS migrations 035/036, Tenant management UI |
| Phase C | WorkflowDefinition model, standard workflows seeded, React Flow Workflow Editor |
| Phase D | OCC mesh attributes extraction, Blender integration |
| Phase E | MediaAsset catalog (model + API + frontend) |
| Sharp Edges V02 | GCPnts curve sampling → 17,129 segment pairs in GLB extras → Blender KD-tree marks sharp+seam |
| Tessellation presets | Draft/Standard/Fine preset buttons in Admin UI, default deflections updated |
| Media cache-bust | `?v={file_size_bytes}` in download URLs + `Cache-Control: no-cache` headers |
| GPU activation order | Fix: `_activate_gpu()` called before AND after `open_mainfile` to survive engine reset |
| Material system | Aliases-first lookup, `get_material_library_path()` via AssetLibrary |
---
## 🗺 Open Work — Reconciled Priorities
This roadmap now treats the USD refactor as an implementation workstream, not as a blocked strategic idea.
The key architectural clarification from [docs/rfcs/0001-step-to-usd-workflow.md](/home/hartmut/Documents/Copilot/schaefflerautomat/docs/rfcs/0001-step-to-usd-workflow.md#L139) is:
- USD becomes the canonical persisted scene asset
- the browser does not need to render USD directly
- the current 3D viewer workflow is preserved through a derived preview asset plus canonical `partKey`
That removes the old assumption that USD work must wait for a Three.js USD loader.
### Priority 1 — Pipeline Cleanup Foundation
**Goal:** Reduce refactor risk by simplifying the current pipeline before introducing canonical scene concepts.
This priority combines dead-code deletion and task decomposition because both are prerequisites for a controlled cut-over to USD.
**Milestones:**
- M1: Dead code deleted — Pillow block, STL settings, orphaned directories
- M2: `step_tasks.py` decomposed into `backend/app/tasks/pipeline/` submodules
- M3: `blender_render.py` decomposed into `render-worker/scripts/_blender_*.py` submodules
**File targets:**
| Action | Path |
|---|---|
| DELETE lines 798–851 | `render-worker/scripts/blender_render.py` (Pillow overlay, `else:` branch never runs) |
| DELETE | `blender-renderer/` directory |
| DELETE | `threejs-renderer/` directory |
| DELETE | `flamenco/` directory |
| DELETE | `renderproblems_tmp/` |
| DELETE lines 705–1050 | `backend/app/tasks/step_tasks.py` (`render_order_line_task`, duplicates `rendering/tasks`) |
| REMOVE settings | `admin.py`: `VALID_STL_QUALITIES`, `stl_quality`, `generate-missing-stls` endpoint |
| REMOVE endpoint | `cad.py`: `POST /cad/{id}/generate-stl/{quality}` |
| CREATE | `backend/app/tasks/pipeline/extract.py` — metadata extraction (OCC parsing, < 2s) |
| CREATE | `backend/app/tasks/pipeline/thumbnail.py` — Blender thumbnail render |
| CREATE | `backend/app/tasks/pipeline/stills.py` — still render task |
| CREATE | `backend/app/tasks/pipeline/turntable.py` — turntable render task |
| THIN (< 80 lines) | `backend/app/tasks/step_tasks.py` — dispatch only |
| CREATE | `render-worker/scripts/_blender_gpu.py` |
| CREATE | `render-worker/scripts/_blender_import.py` |
| CREATE | `render-worker/scripts/_blender_materials.py` |
| CREATE | `render-worker/scripts/_blender_camera.py` |
| CREATE | `render-worker/scripts/_blender_scene.py` |
| THIN (< 80 lines) | `render-worker/scripts/blender_render.py` — entry point only |
**Acceptance gates:**
- `grep -r "VALID_STL_QUALITIES\|stl_quality\|from PIL\|from Pillow" backend/ render-worker/` → 0 matches
- `wc -l backend/app/tasks/step_tasks.py` → < 80 lines
- `wc -l render-worker/scripts/blender_render.py` → < 80 lines
- Upload `81113-l_cut.stp`, trigger thumbnail → still renders correctly (no regression)
- `ls blender-renderer/ threejs-renderer/ flamenco/` → all return "No such file"
### Priority 2 — USD Foundation Without Viewer Regression
**Goal:** Introduce canonical part identity and the three-layer material assignment model while keeping the current GLB-based browser UX working end-to-end.
**Milestones:**
- M1: `export_step_to_usd.py` produces valid USD with part hierarchy and `schaeffler:partKey` on every prim
- M2: `usd_master` MediaAsset type exists in DB and is stored after each export
- M3: `GET /api/cad/{id}/scene-manifest` returns partKey list with effective assignments
- M4: `PUT /api/cad/{id}/part-materials` accepts `{partKey → materialName}` map and persists it
- M5: Browser ThreeDViewer saves material overrides keyed by `partKey`, survives page reload
**File targets:**
| Action | Path |
|---|---|
| ADD pip install | `render-worker/Dockerfile`: `usd-core>=24.11` (provides `pxr` module) ✅ decided |
| CREATE | `render-worker/scripts/export_step_to_usd.py` — XCAF → USD, hierarchy + metadata + partKey |
| ADD enum value | `backend/app/domains/media/models.py`: `usd_master` to `MediaAssetType` |
| CREATE migration | `backend/alembic/versions/060_usd_master_asset_type.py` |
| ADD JSONB columns | `backend/app/domains/products/models.py`: `CadFile.source_material_assignments`, `resolved_material_assignments`, `manual_material_overrides` |
| CREATE migration | `backend/alembic/versions/061_material_assignment_layers.py` |
| CREATE | `backend/app/services/part_key_service.py`: `generate_part_key(xcaf_label)`, `build_scene_manifest()` |
| CREATE | `backend/app/domains/products/schemas.py`: `SceneManifest`, `PartEntry` Pydantic models |
| ADD endpoint | `backend/app/api/routers/cad.py`: `GET /cad/{id}/scene-manifest` |
| MODIFY endpoint | `backend/app/api/routers/cad.py`: `GET/PUT /cad/{id}/part-materials` → partKey-keyed |
| ADD task | `backend/app/domains/pipeline/tasks/export_glb.py`: `generate_usd_master_task` (dual-writes beside GLB) |
| CREATE | `frontend/src/api/sceneManifest.ts`: `SceneManifest` interface, `fetchSceneManifest()` |
**Open questions to decide before M1:**
- USD authoring library: ✅ decided, `usd-core>=24.11` pip package (provides the `pxr` module); the alternatives considered were the full OpenUSD Python SDK and plain USDA text templating
- seam/sharp payload encoding: custom primvars (`primvars:schaeffler:seamEdgeVertexPairs`) or a separate JSON sidecar
**Acceptance gates:**
- `python3 export_step_to_usd.py --step_path 81113-l_cut.stp` → valid `.usd` file, 25 part prims, each has `schaeffler:partKey` attribute
- `GET /api/cad/{id}/scene-manifest` returns `parts[]` array with `part_key`, `source_name`, `effective_material`, `is_unassigned`
- Click part in ThreeDViewer → assign material → reload page → material still assigned (persisted via `partKey`, not mesh name)
- CAD file with mismatched Excel names: UI shows `unmatched_source_rows` count > 0 and unassigned parts highlighted
- No regression: existing click/select/isolate/ghost/hide still works in browser
**References:**
- RFC: `docs/rfcs/0001-step-to-usd-workflow.md`
- Execution checklist: `docs/plans/0001-step-to-usd-implementation.md`
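The three-layer assignment model and the scene-manifest gate above can be illustrated with a small resolver. This is a minimal sketch, not the planned implementation: it assumes manual overrides take precedence over resolved assignments, which in turn take precedence over source assignments (the roadmap does not state the precedence; the RFC is authoritative). It uses dataclasses instead of the planned Pydantic models to stay dependency-free, and the function signatures and material names are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class PartEntry:
    part_key: str
    source_name: str
    effective_material: Optional[str]
    is_unassigned: bool


def resolve_effective_material(
    part_key: str,
    source: Dict[str, str],
    resolved: Dict[str, str],
    manual: Dict[str, str],
) -> Optional[str]:
    """Return the first material found, checking the most specific layer first."""
    # Assumed precedence: manual override > resolved > source.
    for layer in (manual, resolved, source):
        material = layer.get(part_key)
        if material:
            return material
    return None


def build_scene_manifest(
    parts: Dict[str, str],
    source: Dict[str, str],
    resolved: Dict[str, str],
    manual: Dict[str, str],
) -> dict:
    """Build a manifest dict; `parts` maps partKey to the source (mesh) name."""
    entries = []
    for part_key, source_name in parts.items():
        material = resolve_effective_material(part_key, source, resolved, manual)
        entries.append(PartEntry(part_key, source_name, material, material is None))
    return {"parts": [asdict(entry) for entry in entries]}


manifest = build_scene_manifest(
    parts={"ring_outer": "RingOuter_AF0", "cage": "Cage_AF1"},
    source={"ring_outer": "steel_raw"},
    resolved={"ring_outer": "steel_brushed"},
    manual={"ring_outer": "steel_polished"},
)
```

In this example `ring_outer` resolves to the manual override, while `cage` has no entry in any layer and is reported with `is_unassigned` set to true.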
### Priority 3 — Tessellation and Topology Quality
**Goal:** Eliminate fan triangles on cylindrical surfaces (rings, bearings) and produce clean seams for UV unwrap.
**Milestones:**
- M1: GMSH 4.15+ installed in render-worker container
- M2: `export_step_to_gltf.py --tessellation_engine gmsh` produces fan-free GLB
- M3: `tessellation_engine` system setting wired through CLI → Admin UI dropdown
**File targets:**
| Action | Path |
|---|---|
| ADD pip install | `render-worker/Dockerfile`: `gmsh>=4.15.0` |
| ADD arg + function | `render-worker/scripts/export_step_to_gltf.py`: `--tessellation_engine`, `_tessellate_with_gmsh()` |
| ADD setting | `backend/app/api/routers/admin.py`: `tessellation_engine` in `SETTINGS_DEFAULTS` + `SettingsOut` |
| MODIFY task | `backend/app/domains/pipeline/tasks/export_glb.py` — read setting, pass to CLI |
| ADD UI | `frontend/src/pages/Admin.tsx` — dropdown: OCC vs. GMSH |
*(Full task breakdown in `plan.md`)*
**Acceptance gates:**
- `docker compose exec render-worker python3 -c "import gmsh; print(gmsh.__version__)"` → `4.15.x`
- `python3 export_step_to_gltf.py --step_path 81113-l_cut.stp --tessellation_engine gmsh` → no vertex with valence > 10 at cylinder seam edges (inspect via Blender Mesh Analysis overlay)
- Standard preset output size with GMSH ≤ 2× size with OCC at same deflection
- Sharp edge pairs still extracted and injected into GLB extras after GMSH tessellation
**Note:** Better tessellation directly benefits Priority 2 (USD seam/sharp payload) and later UV-unwrap work.
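The valence gate above is specified as a manual Blender inspection. As a complement, a scripted check could count neighbours per vertex directly from the triangle index list. The sketch below assumes triangles are available as tuples of vertex indices; the helper names and the default threshold argument are hypothetical, and it checks every vertex rather than only those on cylinder seam edges.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triangle = Tuple[int, int, int]


def vertex_valences(triangles: Iterable[Triangle]) -> Dict[int, int]:
    """Count the distinct neighbouring vertices of each vertex index."""
    neighbours = defaultdict(set)
    for a, b, c in triangles:
        neighbours[a].update((b, c))
        neighbours[b].update((a, c))
        neighbours[c].update((a, b))
    return {vertex: len(adjacent) for vertex, adjacent in neighbours.items()}


def fan_vertices(triangles: Iterable[Triangle], max_valence: int = 10) -> List[int]:
    """Return vertex indices whose valence exceeds the allowed maximum."""
    valences = vertex_valences(triangles)
    return sorted(v for v, count in valences.items() if count > max_valence)


# A closed fan of 12 triangles around vertex 0 with rim vertices 1..12.
fan = [(0, i, i % 12 + 1) for i in range(1, 13)]
offenders = fan_vertices(fan)
```

Here the hub vertex 0 has 12 neighbours and is flagged, while each rim vertex has only 3.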
### Priority 4 — Viewer Migration to Canonical Part Identity
**Goal:** Move browser interactions from raw GLB mesh-name matching to canonical `partKey` without any UX regression.
**Milestones:**
- M1: Preview GLB derivation embeds `partKey` as mesh `extras.partKey` on every selectable object
- M2: `ThreeDViewer` reads `partKey` from scene manifest + mesh extras on click, no longer uses raw `mesh.name`
- M3: `MaterialPanel` shows `partKey`, source name, assignment provenance; saves overrides by `partKey`
- M4: Unmatched source rows and unassigned parts surfaced in `MaterialPanel` reconciliation section
**File targets:**
| Action | Path |
|---|---|
| CREATE | `frontend/src/api/sceneManifest.ts`: `SceneManifest` + `PartEntry` interfaces |
| MODIFY | `frontend/src/components/cad/ThreeDViewer.tsx` — use `partKey` from scene manifest for selection, isolation, ghost |
| MODIFY | `frontend/src/components/cad/MaterialPanel.tsx` — show provenance, unmatched/unassigned sections |
| MODIFY | `frontend/src/api/cad.ts` — update `PartMaterialsMap` interface to `{ [partKey: string]: string }` |
| MODIFY | `backend/app/api/routers/cad.py`: `GET/PUT /cad/{id}/part-materials` keyed by partKey with provenance |
| ADD util | `render-worker/scripts/export_step_to_gltf.py` — embed `partKey` into mesh extras during GLB export |
**Acceptance gates:**
- `mesh.userData.partKey` exists on every mesh object in the Three.js scene after GLB load
- Select a part → DevTools shows the assignment payload contains `part_key: "ring_outer"`, not `mesh_name: "RingOuter_AF0"`
- Upload file with 3 unmatched Excel rows → MaterialPanel shows "3 unmatched source rows"
- After full page reload: all manual material assignments are restored correctly
- Isolation, hide, and ghost still work as before (no regression)
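The first gate can also be checked on the export side, before the GLB reaches the browser. The sketch below assumes the exporter writes `partKey` into `meshes[].extras` of the glTF JSON, as the M1 wording suggests, and that the JSON chunk has already been parsed into a dict. The helper name is hypothetical.

```python
from typing import List


def meshes_missing_part_key(gltf: dict) -> List[str]:
    """List meshes in a parsed glTF document that lack a non-empty extras.partKey."""
    missing = []
    for index, mesh in enumerate(gltf.get("meshes", [])):
        extras = mesh.get("extras") or {}
        if not extras.get("partKey"):
            # Fall back to a positional label when the mesh has no name.
            missing.append(mesh.get("name", f"mesh[{index}]"))
    return missing


document = {
    "meshes": [
        {"name": "RingOuter_AF0", "extras": {"partKey": "ring_outer"}},
        {"name": "Cage_AF1"},
    ]
}
problems = meshes_missing_part_key(document)
```

An empty result means every mesh carries a `partKey`; in the example only `Cage_AF1` is reported.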
### Priority 5 — Canonical USD Export and Render Migration
**Goal:** Switch Blender still/turntable renders to consume the canonical USD stage, retiring the production GLB as an intermediate render artifact.
**Milestones:**
- M1: `render-worker/scripts/import_usd.py` — Blender can import USD + restore seam/sharp from primvars
- M2: Still render from USD matches current production GLB render quality (side-by-side comparison)
- M3: Turntable render from USD works end-to-end
- M4: `generate_production_glb_task` bypassed; renders consume `usd_master` directly
**File targets:**
| Action | Path |
|---|---|
| CREATE | `render-worker/scripts/export_step_to_usd.py` — STEP→USD exporter (seam/sharp payload on mesh prims) |
| CREATE | `render-worker/scripts/import_usd.py` — Blender USD import helper: reads `primvars:schaeffler:seamEdgeVertexPairs`, marks seam+sharp |
| MODIFY | `render-worker/scripts/blender_render.py` — accept `--usd_path` flag alongside `--glb_path` |
| MODIFY | `backend/app/services/render_blender.py` — pass `usd_master` asset path when available |
| MODIFY | `backend/app/domains/pipeline/tasks/export_glb.py` — retire `generate_gltf_production_task` once USD path validated |
| KEEP (compat) | `render-worker/scripts/export_gltf.py` — retained as fallback until USD path confirmed stable |
**Acceptance gates:**
- `render_still --usd_path usd_master.usd` → PNG output visually identical to current production GLB render (SSIM ≥ 0.95, i.e. less than 5% difference)
- Blender log shows `[USD_IMPORT] 25 parts imported, 5044 seam/sharp edges restored` (not reconstructed by angle)
- `GET /api/media?asset_type=gltf_production` returns 0 new entries after switch (old records preserved)
- Turntable MP4 plays without texture or material pop artifacts
### Priority 6 — Admin and Product Surface Simplification
**Goal:** Remove the geometry-GLB vs production-GLB mental model from admin, product detail, and repair flows.
**Milestones:**
- M1: Admin tessellation settings collapsed from 4 knobs to `scene_*` + `preview_*` (2 or 3 knobs max)
- M2: Bulk actions renamed from `generate-missing-geometry-glbs` to `generate-missing-canonical-scenes`
- M3: ProductDetail shows single canonical scene status card, not dual-GLB
**File targets:**
| Action | Path |
|---|---|
| MODIFY | `backend/app/api/routers/admin.py` — rename/collapse tessellation settings; new bulk action labels |
| CREATE migration | `backend/alembic/versions/06x_rename_tessellation_settings.py` — UPDATE system_settings SET key = 'scene_linear_deflection' WHERE key = 'gltf_production_linear_deflection' |
| MODIFY | `frontend/src/pages/Admin.tsx` — simplified tessellation panel: scene quality + optional preview override |
| MODIFY | `frontend/src/pages/ProductDetail.tsx` — single canonical scene card (status, regenerate button) |
| MODIFY | `backend/app/domains/media/models.py` — deprecate `gltf_geometry` / `gltf_production` (keep values, add `deprecated=True` metadata) |
**Acceptance gates:**
- Admin Settings page has max 3 tessellation quality fields (down from 4)
- ProductDetail page: one "Canonical Scene" status section, no separate geometry/production rows
- `GET /api/admin/settings` no longer exposes `gltf_preview_linear_deflection` key (replaced by `scene_linear_deflection`)
### Priority 7 — Render Job Tracking and Structured Logging
**Goal:** Fix broken render job cancellation (synthetic `render-{line_id}` ID never matches real Celery task ID) and establish structured per-step logging.
**Milestones:**
- M1: `RenderJobDocument` schema + migration; tasks write real `self.request.id` to DB
- M2: Cancel endpoint reads `celery_task_id` from job doc and calls `revoke()` — actually stops task
- M3: `PipelineLogger` integrated in all task files; every step emits `[STEP] start/done/error` with duration
**File targets:**
| Action | Path |
|---|---|
| CREATE | `backend/app/domains/rendering/job_document.py`: `RenderJobDocument` Pydantic model, `update_step()`, `set_state()` |
| CREATE | `backend/app/core/pipeline_logger.py`: `PipelineLogger(step_start/done/error)` writing to logging + Redis SSE |
| CREATE migration | `backend/alembic/versions/062_render_job_document.py` — add `render_job_doc JSONB` to `order_lines` |
| MODIFY | `backend/app/domains/pipeline/tasks/render_order_line.py` — write `celery_task_id` + step events to job doc |
| MODIFY | `backend/app/domains/pipeline/tasks/render_thumbnail.py` — same |
| MODIFY | `backend/app/api/routers/orders.py` — cancel reads `render_job_doc.celery_task_id`, calls `celery.control.revoke()` |
**Acceptance gates:**
- Start a 60s render task → click Cancel → `celery inspect active` shows task is gone within 15s
- `GET /api/orders/{id}/lines/{line_id}` response includes `render_job_doc.steps[]` with per-step `duration_s`
- Worker log shows `[THUMBNAIL] done in 34.2s` format (not bare f-strings)
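The per-step logging format in M3 and the `duration_s` gate could be produced by one small helper. This is a sketch under assumptions: it wraps `step_start/done/error` in a single context manager (a design choice of this example, not something the roadmap specifies), writes only to the standard `logging` module, and leaves out the Redis SSE channel. The `steps` list stands in for what would be written to `render_job_doc.steps[]`.

```python
import logging
import time
from contextlib import contextmanager


class PipelineLogger:
    """Emit [STEP] start/done/error lines and record per-step durations."""

    def __init__(self, logger=None):
        self._log = logger or logging.getLogger("pipeline")
        self.steps = []  # hypothetical stand-in for render_job_doc.steps[]

    def _record(self, name, state, started):
        duration = time.monotonic() - started
        self.steps.append(
            {"step": name, "state": state, "duration_s": round(duration, 1)}
        )
        return duration

    @contextmanager
    def step(self, name):
        tag = name.upper()
        started = time.monotonic()
        self._log.info("[%s] start", tag)
        try:
            yield
        except Exception as exc:
            duration = self._record(name, "error", started)
            self._log.error("[%s] error after %.1fs: %s", tag, duration, exc)
            raise  # the task still fails; the logger only observes
        else:
            duration = self._record(name, "done", started)
            self._log.info("[%s] done in %.1fs", tag, duration)


pipeline_log = PipelineLogger()
with pipeline_log.step("thumbnail"):
    pass  # the actual render call would go here
```

Using a context manager means a step cannot be started without also being closed as either done or error, which keeps the recorded durations complete.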
### Priority 8 — Tenant Isolation Completion
**Goal:** Make PostgreSQL RLS enforcement real. Currently `build_tenant_db_dep()` yields `db` without calling `SET LOCAL app.current_tenant_id`, making all tenant isolation a silent no-op.
**Milestones:**
- M1: `TenantContextMiddleware` registered; all HTTP requests set RLS context from JWT
- M2: All Celery tasks call `set_tenant_context(db, tenant_id)` at task start
- M3: `global_admin` + `tenant_admin` roles in DB; `require_admin()` → `require_global_admin()`
**File targets:**
| Action | Path |
|---|---|
| CREATE | `backend/app/core/middleware.py`: `TenantContextMiddleware(BaseHTTPMiddleware)` |
| MODIFY | `backend/app/main.py`: `app.add_middleware(TenantContextMiddleware)` |
| MODIFY | `backend/app/utils/auth.py`: `create_access_token()` embeds `tenant_id` in JWT claims |
| MODIFY | `backend/app/tasks/pipeline/thumbnail.py`, `extract.py`, `stills.py`, `turntable.py`: `set_tenant_context()` at start |
| CREATE migration | `backend/alembic/versions/063_role_hierarchy.py` — rename `admin` → `global_admin`, add `tenant_admin` |
| MODIFY | All routers using `require_admin()` → `require_global_admin()` |
**Acceptance gates:**
- Login as tenant A user → `GET /api/products` returns 0 results when tenant A has no products, even if tenant B has 50
- Verify via: `SELECT count(*) FROM products WHERE tenant_id != '<tenantA_id>'` returns 0 from within tenant A session
- Celery task logs show `[TENANT] context set: tenant_id=<uuid>` at start
- `GET /api/admin/users` returns 403 for `tenant_admin` role (only `global_admin` can list all users)
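The missing `SET LOCAL app.current_tenant_id` call described in the goal could be implemented along the following lines. This is a sketch, not the project's `set_tenant_context(db, tenant_id)`: it assumes a DB-API cursor with psycopg-style `%s` placeholders rather than the SQLAlchemy session the real helper would likely receive. It uses PostgreSQL's `set_config(name, value, is_local)` because `SET LOCAL` does not accept bound parameters.

```python
import logging

logger = logging.getLogger("tenant")

# set_config(..., true) scopes the value to the current transaction,
# which matches the behaviour of SET LOCAL.
_SET_TENANT_SQL = "SELECT set_config('app.current_tenant_id', %s, true)"


def set_tenant_context(cursor, tenant_id):
    """Bind the RLS tenant id for the current transaction only."""
    cursor.execute(_SET_TENANT_SQL, (str(tenant_id),))
    logger.info("[TENANT] context set: tenant_id=%s", tenant_id)
```

Because the setting is transaction-scoped, it has to be applied inside the same transaction as the queries it protects, at the start of each request or Celery task. With a SQLAlchemy session the same statement would be wrapped in `text()` with a named parameter.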
### Priority 9 — Hash-Based Scene Conversion Caching
**Goal:** Skip re-tessellation when the STEP file has not changed. Cache canonical scene + preview derivatives by `SHA256(step_file)`.
**Milestones:**
- M1: `cad_files.step_hash` column in DB; hash computed and stored on each export
- M2: Export task checks hash before processing — returns cached asset UUID on hit
- M3: Hash invalidated correctly when admin forces reprocess or deflection settings change
**File targets:**
| Action | Path |
|---|---|
| ADD column | `backend/app/domains/products/models.py`: `CadFile.step_hash: str \| None` |
| CREATE migration | `backend/alembic/versions/064_step_hash.py`: `ADD COLUMN step_hash VARCHAR(64)` |
| MODIFY | `backend/app/domains/pipeline/tasks/export_glb.py` (or future USD task) — hash check before subprocess call |
| ADD util | `backend/app/services/step_processor.py`: `compute_step_hash(file_path) -> str` |
**Acceptance gates:**
- Upload same STEP file twice → second task completes in < 2s (cache hit logged: `[CACHE] hash match, skipping tessellation`)
- Change deflection setting → force reprocess → new export runs fresh (hash same but settings changed, cache bypassed)
- `GET /api/cad/{id}` response includes `step_hash` field
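The `compute_step_hash()` utility maps directly onto `hashlib`. The sketch below streams the file in chunks so large STEP files are not loaded into memory, and the hex digest is 64 characters, matching the `VARCHAR(64)` column. The `cache_key()` helper is an assumption of this example, not part of the roadmap: it shows one possible way to satisfy the gate that a changed deflection setting must bypass the cache even though the file hash is unchanged. The setting name in the usage lines is hypothetical.

```python
import hashlib
import json


def compute_step_hash(file_path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(step_hash, settings):
    """Hypothetical composite key: file hash plus tessellation settings."""
    # sort_keys makes the key independent of dict insertion order.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{step_hash}:{canonical}".encode("utf-8")).hexdigest()


key_fine = cache_key("0" * 64, {"linear_deflection": 0.05})
key_draft = cache_key("0" * 64, {"linear_deflection": 0.1})
```

With a composite key, the same STEP file exported at two deflection values produces two different cache entries, so no explicit invalidation step is needed when settings change.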
### Priority 10 — UI/UX Polish
**Goal:** Address independent UI items from `visual-audit-report.md` that require no backend changes.
**Milestones:**
- M1: Tooltip/help text on every Admin settings input
- M2: Empty state messages in MediaBrowser, ProductLibrary, Orders
- M3: Notification batching — group per-render noise into job summaries
- M4: Mobile navigation — hamburger menu at < 768px
- M5: Kanban rejection flow — drag-to-reject with reason field
**File targets:**
| Action | Path |
|---|---|
| MODIFY | `frontend/src/pages/Admin.tsx`: `title` attributes on all inputs; help text below complex settings |
| MODIFY | `frontend/src/pages/MediaBrowser.tsx` — empty state: "No assets yet — upload a STEP file to get started" |
| MODIFY | `frontend/src/pages/ProductLibrary.tsx` — empty state |
| MODIFY | `frontend/src/pages/Orders.tsx` — empty state |
| MODIFY | `frontend/src/components/layout/Layout.tsx` — hamburger + slide-in nav for mobile |
| MODIFY | `frontend/src/pages/OrderDetail.tsx` — reject button with reason modal |
| CREATE | `frontend/src/components/shared/Tooltip.tsx` — reusable tooltip wrapper |
**Acceptance gates:**
- All Admin settings inputs have visible help text or tooltip (manual check: no input label without explanation)
- Mobile viewport (375px): no horizontal scroll, nav accessible via hamburger
- Submit a render → NotificationCenter shows one "Render complete (3 files)" summary, not 3 individual toasts
---
## Dependency Graph
```
Priority 1 (Pipeline Cleanup Foundation)
└── Priority 2 (USD Foundation Without Viewer Regression)
└── Priority 4 (Viewer Migration to Canonical Part Identity)
└── Priority 5 (Canonical USD Export and Render Migration)
└── Priority 6 (Admin and Product Surface Simplification)
└── Priority 9 (Hash-Based Scene Conversion Caching)
Priority 3 (Tessellation and Topology Quality)
└── Priority 5 (Canonical USD Export and Render Migration)
Priority 7 (Render Job Tracking and Structured Logging) — can run in parallel
Priority 8 (Tenant Isolation Completion) — can run in parallel
Priority 10 (UI/UX Polish) — independent
```
---
## What To Do Next
**Recommended execution path:**
1. Do Priority 1 first: clean up and split the current pipeline.
2. Start Priority 2 immediately after: add `partKey`, assignment-layer semantics, and scene manifest without changing the browser UX.
3. Run Priority 3 in parallel or immediately after, depending on whether current tessellation quality blocks scene-authoring confidence.
4. Use the implementation plan in `docs/plans/0001-step-to-usd-implementation.md` as the execution checklist for the USD workstream.
**Parallel sprint option (2 agents):**
- Agent 1: Priority 1 (pipeline cleanup foundation)
- Agent 2: Priority 3 (tessellation and topology quality)
**Parallel sprint option (3 agents):**
- Agent 1: Priority 1
- Agent 2: Priority 3
- Agent 3: Priority 7 or Priority 8
**Do not defer anymore:**
- canonical `partKey`
- part-keyed browser material overrides
- scene manifest / preview contract
These are now considered implementation prerequisites for the long-term refactor, not optional strategy work.
---
## Archive
Old planning files are kept for reference but superseded by this document:
- `PLAN.md` — original Phase A–F plan (Phases A–E complete, Phase F = Priority 9 here)
- `PLAN_REFACTOR.md` — 1,173-line architectural plan (Phases 1–8 mapped to Priorities 2–8 above)
- `plan.md` — active GMSH implementation plan (Priority 3)
- `docs/rfcs/0001-step-to-usd-workflow.md` — USD RFC (Priority 2)
- `docs/plans/0001-step-to-usd-implementation.md` — actionable USD implementation plan
- `review-report.md` — latest code review results
- `visual-audit-report.md` — UX audit results