Schaeffler Automat — Master Roadmap

Consolidated: 2026-03-11
Branch: refactor/v2
Sources merged: PLAN.md (Phases A–F), PLAN_REFACTOR.md (Phases 1–8), plan.md (GMSH), docs/rfcs/0001, visual-audit-report.md


What Is Done

| Area | Detail |
| --- | --- |
| Phase A runtime | Flamenco removed from the active pipeline, render-worker replaced the old Blender service path, MinIO added; legacy blender-renderer/ and threejs-renderer/ directories still remain in the repo |
| Phase B | Domain-driven project structure, Tenant model + RLS migrations 035/036, Tenant management UI |
| Phase C | WorkflowDefinition model, standard workflows seeded, React Flow Workflow Editor |
| Phase D | OCC mesh attributes extraction, Blender integration |
| Phase E | MediaAsset catalog (model + API + frontend) |
| Sharp Edges V02 | GCPnts curve sampling → 17,129 segment pairs in GLB extras → Blender KD-tree marks sharp+seam |
| Tessellation presets | Draft/Standard/Fine preset buttons in Admin UI, default deflections updated |
| Media cache-bust | ?v={file_size_bytes} in download URLs + Cache-Control: no-cache headers |
| GPU activation order | Fix: _activate_gpu() called before AND after open_mainfile to survive engine reset |
| Material system | Aliases-first lookup, get_material_library_path() via AssetLibrary |
| Pipeline task split | backend/app/tasks/step_tasks.py is now a 23-line compatibility shim; active task implementations live in backend/app/domains/pipeline/tasks/ |
| Render job tracking | RenderJobDocument, PipelineLogger, and cancel-via-real-celery_task_id are already wired into the render pipeline |
| Tenant isolation baseline | TenantContextMiddleware, JWT tenant_id, and the global_admin / tenant_admin role hierarchy are in place for HTTP requests |
| Hash groundwork | compute_step_hash() exists and CadFile.step_file_hash is already persisted during thumbnail processing |
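
The media cache-bust row can be illustrated with a minimal sketch. It assumes the download URL is a plain string and that the version parameter is named v, as in the row above; the helper name cache_busted_url and the example paths are hypothetical and not taken from the repository.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Header from the roadmap row; "no-cache" makes the browser revalidate before reuse.
NO_CACHE_HEADERS = {"Cache-Control": "no-cache"}


def cache_busted_url(download_url: str, file_size_bytes: int) -> str:
    """Append (or replace) a v=<file_size_bytes> query parameter.

    Hypothetical helper: the real pipeline may build its URLs differently.
    """
    parts = urlsplit(download_url)
    # Drop any stale v= parameter, keep every other query parameter as-is.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "v"]
    query.append(("v", str(file_size_bytes)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Two different files with the same byte size would map to the same URL under this scheme, which is one reason to pair it with the Cache-Control: no-cache header.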

🔎 Status Snapshot

Verified against the repository on 2026-03-11.

| Priority | Status | Re-evaluated state |
| --- | --- | --- |
| 1. Pipeline Cleanup Foundation | In progress | step_tasks.py decomposition is done; dead-code cleanup and blender_render.py decomposition are still open |
| 2. USD Foundation Without Viewer Regression | Not started in code | Decisions are documented, but there is no export_step_to_usd.py, usd_master, scene-manifest, or partKey implementation yet |
| 3. Tessellation and Topology Quality | Not started in code | No gmsh install/wiring, no tessellation_engine setting, no Admin dropdown yet |
| 7. Render Job Tracking and Structured Logging | Done | RenderJobDocument, migration 048, PipelineLogger, and revoke-by-real-task-id are present |
| 8. Tenant Isolation Completion | In progress | HTTP-side RLS context is wired; Celery task-side set_tenant_context() propagation still needs to be added |
| 9. Hash-Based Scene Conversion Caching | Partial foundation | Existing step_file_hash and STL-cache utilities should be extended, not rebuilt from scratch |
| 10. UI/UX Polish | Partial | Admin help tooltips, mobile nav, and some empty states exist; notification batching and remaining polish items are still open |

🗺 Open Work — Reconciled Priorities

This roadmap now treats the USD refactor as an implementation workstream, not as a blocked strategic idea.

The key architectural clarification from docs/rfcs/0001-step-to-usd-workflow.md is:

  • USD becomes the canonical persisted scene asset
  • the browser does not need to render USD directly
  • the current 3D viewer workflow is preserved through a derived preview asset plus canonical partKey

That removes the old assumption that USD work must wait for a Three.js USD loader.

Priority 1 — Pipeline Cleanup Foundation

Goal: Reduce refactor risk by simplifying the current pipeline before introducing canonical scene concepts.

This priority combines dead-code deletion and task decomposition because both are prerequisites for a controlled cut-over to USD.

Status: In progress. M2 is complete; M1 and M3 remain open.

Milestones:

  • M1: Dead code deleted — Pillow block, STL settings, orphaned directories
  • M2: step_tasks.py decomposed into backend/app/domains/pipeline/tasks/ submodules
  • M3: blender_render.py decomposed into render-worker/scripts/_blender_*.py submodules

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| DELETE lines 798–851 | render-worker/scripts/blender_render.py | Pillow overlay, else: branch never runs |
| DELETE | blender-renderer/ | directory |
| DELETE | threejs-renderer/ | directory |
| DELETE | flamenco/ | directory |
| DELETE | renderproblems_tmp/ | |
| DONE | backend/app/tasks/step_tasks.py | reduced to compatibility shim only |
| REMOVE settings | admin.py | VALID_STL_QUALITIES, stl_quality, generate-missing-stls endpoint |
| REMOVE endpoint | cad.py | POST /cad/{id}/generate-stl/{quality} |
| DONE | backend/app/domains/pipeline/tasks/extract_metadata.py | metadata extraction |
| DONE | backend/app/domains/pipeline/tasks/render_thumbnail.py | thumbnail render task |
| DONE (combined) | backend/app/domains/pipeline/tasks/render_order_line.py | still/dispatch pipeline entry |
| OPEN | | turntable-specific pipeline task split still needs to be carved out explicitly if kept as a separate concern |
| THIN (< 80 lines) | backend/app/tasks/step_tasks.py | dispatch only |
| CREATE | render-worker/scripts/_blender_gpu.py | |
| CREATE | render-worker/scripts/_blender_import.py | |
| CREATE | render-worker/scripts/_blender_materials.py | |
| CREATE | render-worker/scripts/_blender_camera.py | |
| CREATE | render-worker/scripts/_blender_scene.py | |
| THIN (< 80 lines) | render-worker/scripts/blender_render.py | entry point only |

Acceptance gates:

  • grep -r "VALID_STL_QUALITIES\|stl_quality\|from PIL\|from Pillow" backend/ render-worker/ → 0 matches
  • wc -l backend/app/tasks/step_tasks.py → < 100 lines
  • wc -l render-worker/scripts/blender_render.py → < 80 lines
  • Upload 81113-l_cut.stp, trigger thumbnail → still renders correctly (no regression)
  • ls blender-renderer/ threejs-renderer/ flamenco/ → all return "No such file"
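
The grep, wc, and ls gates above could be bundled into one check. The following is a minimal sketch, assuming it runs from the repository root; the function names (find_forbidden, line_count, check_gates) are hypothetical, and line_count only approximates wc -l because it also counts a final line without a trailing newline.

```python
import re
from pathlib import Path

# Same alternatives as the grep gate above.
FORBIDDEN = re.compile(r"VALID_STL_QUALITIES|stl_quality|from PIL|from Pillow")
LINE_LIMITS = {
    "backend/app/tasks/step_tasks.py": 100,
    "render-worker/scripts/blender_render.py": 80,
}
REMOVED_DIRS = ("blender-renderer", "threejs-renderer", "flamenco")


def find_forbidden(roots, pattern=FORBIDDEN):
    """Return (path, line number, line) for every match under the given roots."""
    hits = []
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits


def line_count(path):
    """Count lines in a file (approximately what wc -l reports)."""
    with open(path, "rb") as fh:
        return sum(1 for _ in fh)


def check_gates(repo_root="."):
    """Return a list of human-readable gate failures; empty means all gates pass."""
    root = Path(repo_root)
    failures = []
    hits = find_forbidden([root / "backend", root / "render-worker"])
    if hits:
        failures.append(f"forbidden references remain: {len(hits)} match(es)")
    for rel, limit in LINE_LIMITS.items():
        path = root / rel
        if not path.is_file():
            failures.append(f"{rel} is missing")
        elif line_count(path) >= limit:
            failures.append(f"{rel} has {line_count(path)} lines (limit < {limit})")
    for name in REMOVED_DIRS:
        if (root / name).exists():
            failures.append(f"{name}/ still exists")
    return failures
```

The thumbnail regression gate still needs a real upload and render; this sketch only covers the static checks.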

Priority 2 — USD Foundation Without Viewer Regression

Goal: Introduce canonical part identity and the three-layer material assignment model while keeping the current GLB-based browser UX working end-to-end.

Status: Not started in code. Architecture decisions are documented, but repo work has not begun.

Milestones:

  • M1: export_step_to_usd.py produces valid USD with part hierarchy and schaeffler:partKey on every prim
  • M2: usd_master MediaAsset type exists in DB and is stored after each export
  • M3: GET /api/cad/{id}/scene-manifest returns partKey list with effective assignments
  • M4: PUT /api/cad/{id}/part-materials accepts {partKey → materialName} map and persists it
  • M5: Browser ThreeDViewer saves material overrides keyed by partKey, survives page reload

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| ADD pip install | render-worker/Dockerfile | usd-core>=24.11 (provides pxr module); decided |
| CREATE | render-worker/scripts/export_step_to_usd.py | XCAF → USD, hierarchy + metadata + partKey |
| ADD enum value | backend/app/domains/media/models.py | usd_master to MediaAssetType |
| CREATE migration | backend/alembic/versions/060_usd_master_asset_type.py | |
| ADD JSONB columns | backend/app/domains/products/models.py | CadFile.source_material_assignments, resolved_material_assignments, manual_material_overrides |
| CREATE migration | backend/alembic/versions/061_material_assignment_layers.py | |
| CREATE | backend/app/services/part_key_service.py | generate_part_key(xcaf_label), build_scene_manifest() |
| CREATE | backend/app/domains/products/schemas.py | SceneManifest, PartEntry Pydantic models |
| ADD endpoint | backend/app/api/routers/cad.py | GET /cad/{id}/scene-manifest |
| MODIFY endpoint | backend/app/api/routers/cad.py | GET/PUT /cad/{id}/part-materials → partKey-keyed |
| ADD task | backend/app/domains/pipeline/tasks/export_glb.py | generate_usd_master_task (dual-writes beside GLB) |
| CREATE | frontend/src/api/sceneManifest.ts | SceneManifest interface, fetchSceneManifest() |

Open questions to decide before M1:

  • None blocking at the architecture level. The roadmap decisions for usd-core and index-space seam/sharp primvars are already captured in docs/plans/0001-step-to-usd-implementation.md.

Acceptance gates:

  • python3 export_step_to_usd.py --step_path 81113-l_cut.stp → valid .usd file, 25 part prims, each has schaeffler:partKey attribute
  • GET /api/cad/{id}/scene-manifest returns parts[] array with part_key, source_name, effective_material, is_unassigned
  • Click part in ThreeDViewer → assign material → reload page → material still assigned (persisted via partKey, not mesh name)
  • CAD file with mismatched Excel names: UI shows unmatched_source_rows count > 0 and unassigned parts highlighted
  • No regression: existing click/select/isolate/ghost/hide still works in browser
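
To make the manifest gates concrete, here is a minimal sketch of how the three assignment layers might collapse into one effective material per part. It rests on assumptions not stated in the roadmap: the precedence manual over resolved over source, all three layers keyed by partKey, and unmatched_source_rows counted as source-layer keys with no matching part. Plain dataclasses stand in for the planned Pydantic models.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class PartEntry:
    part_key: str
    source_name: str
    effective_material: Optional[str]
    is_unassigned: bool


def build_scene_manifest(
    parts: Dict[str, str],      # part_key -> source_name (e.g. the XCAF label)
    source: Dict[str, str],     # layer 1: assignments from source data
    resolved: Dict[str, str],   # layer 2: assignments after alias resolution
    manual: Dict[str, str],     # layer 3: overrides saved from the viewer
) -> dict:
    """Assumed precedence: manual > resolved > source (not confirmed by the roadmap)."""
    entries = []
    for part_key, source_name in parts.items():
        material = manual.get(part_key) or resolved.get(part_key) or source.get(part_key)
        entries.append(PartEntry(part_key, source_name, material, material is None))
    # Source rows that point at no known part would need reconciliation in the UI.
    unmatched = sorted(key for key in source if key not in parts)
    return {
        "parts": [asdict(entry) for entry in entries],
        "unmatched_source_rows": len(unmatched),
    }
```

Because every layer is keyed by partKey rather than by mesh name, a manual override saved this way would survive a re-export that renames meshes.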

References:

  • RFC: docs/rfcs/0001-step-to-usd-workflow.md
  • Execution checklist: docs/plans/0001-step-to-usd-implementation.md

Priority 3 — Tessellation and Topology Quality

Goal: Eliminate fan triangles on cylindrical surfaces (rings, bearings) and produce clean seams for UV unwrap.

Status: Not started in code. This is still a pure planning workstream at the moment.

Milestones:

  • M1: GMSH 4.15+ installed in render-worker container
  • M2: export_step_to_gltf.py --tessellation_engine gmsh produces fan-free GLB
  • M3: tessellation_engine system setting wired through CLI → Admin UI dropdown

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| ADD pip install | render-worker/Dockerfile | gmsh>=4.15.0 |
| ADD arg + function | render-worker/scripts/export_step_to_gltf.py | --tessellation_engine, _tessellate_with_gmsh() |
| ADD setting | backend/app/api/routers/admin.py | tessellation_engine in SETTINGS_DEFAULTS + SettingsOut |
| MODIFY task | backend/app/domains/pipeline/tasks/export_glb.py | read setting, pass to CLI |
| ADD UI | frontend/src/pages/Admin.tsx | dropdown: OCC vs. GMSH |

(Full task breakdown in plan.md)

Acceptance gates:

  • docker compose exec render-worker python3 -c "import gmsh; print(gmsh.__version__)" → 4.15.x
  • python3 export_step_to_gltf.py --step_path 81113-l_cut.stp --tessellation_engine gmsh → no vertex with valence > 10 at cylinder seam edges (inspect via Blender Mesh Analysis overlay)
  • Standard preset output size with GMSH ≤ 2× size with OCC at same deflection
  • Sharp edge pairs still extracted and injected into GLB extras after GMSH tessellation
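
The valence gate is specified as a manual Blender inspection, but the same number can be computed from a triangle index list. The sketch below assumes triangles arrive as triples of vertex indices; the function names are hypothetical, and it checks every vertex rather than only those on cylinder seam edges.

```python
from collections import defaultdict


def vertex_valences(triangles):
    """Map each vertex index to its number of distinct edge-connected neighbors."""
    neighbors = defaultdict(set)
    for a, b, c in triangles:
        for vertex, others in ((a, (b, c)), (b, (a, c)), (c, (a, b))):
            # Ignore degenerate triangles that repeat a vertex.
            neighbors[vertex].update(o for o in others if o != vertex)
    return {vertex: len(adjacent) for vertex, adjacent in neighbors.items()}


def fan_vertices(triangles, max_valence=10):
    """Return the vertices whose valence exceeds the gate threshold."""
    return sorted(v for v, n in vertex_valences(triangles).items() if n > max_valence)
```

A triangle fan with 12 rim vertices around one center gives the center a valence of 12, so it would be flagged, while each rim vertex stays at 3.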

Note: Better tessellation directly benefits Priority 2 (USD seam/sharp payload) and downstream UV-unwrap work.

Priority 4 — Viewer Migration to Canonical Part Identity

Goal: Move browser interactions from raw GLB mesh-name matching to canonical partKey without any UX regression.

Milestones:

  • M1: Preview GLB derivation embeds partKey as mesh extras.partKey on every selectable object
  • M2: ThreeDViewer reads partKey from scene manifest + mesh extras on click, no longer uses raw mesh.name
  • M3: MaterialPanel shows partKey, source name, assignment provenance; saves overrides by partKey
  • M4: Unmatched source rows and unassigned parts surfaced in MaterialPanel reconciliation section

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| CREATE | frontend/src/api/sceneManifest.ts | SceneManifest + PartEntry interfaces |
| MODIFY | frontend/src/components/cad/ThreeDViewer.tsx | use partKey from scene manifest for selection, isolation, ghost |
| MODIFY | frontend/src/components/cad/MaterialPanel.tsx | show provenance, unmatched/unassigned sections |
| MODIFY | frontend/src/api/cad.ts | update PartMaterialsMap interface to { [partKey: string]: string } |
| MODIFY | backend/app/api/routers/cad.py | GET/PUT /cad/{id}/part-materials keyed by partKey with provenance |
| ADD util | render-worker/scripts/export_step_to_gltf.py | embed partKey into mesh extras during GLB export |

Acceptance gates:

  • mesh.userData.partKey exists on every mesh object in the Three.js scene after GLB load
  • Select a part → DevTools shows the assignment payload contains part_key: "ring_outer", not mesh_name: "RingOuter_AF0"
  • Upload file with 3 unmatched Excel rows → MaterialPanel shows "3 unmatched source rows"
  • After full page reload: all manual material assignments are restored correctly
  • Isolation, hide, and ghost still work as before (no regression)

Priority 5 — Canonical USD Export and Render Migration

Goal: Switch Blender still/turntable renders to consume the canonical USD stage, retiring the production GLB as an intermediate render artifact.

Milestones:

  • M1: render-worker/scripts/import_usd.py — Blender can import USD + restore seam/sharp from primvars
  • M2: Still render from USD matches current production GLB render quality (side-by-side comparison)
  • M3: Turntable render from USD works end-to-end
  • M4: generate_production_glb_task bypassed; renders consume usd_master directly

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| CREATE | render-worker/scripts/export_step_to_usd.py | STEP→USD exporter (seam/sharp payload on mesh prims) |
| CREATE | render-worker/scripts/import_usd.py | Blender USD import helper: reads primvars:schaeffler:seamEdgeVertexPairs, marks seam+sharp |
| MODIFY | render-worker/scripts/blender_render.py | accept --usd_path flag alongside --glb_path |
| MODIFY | backend/app/services/render_blender.py | pass usd_master asset path when available |
| MODIFY | backend/app/domains/pipeline/tasks/export_glb.py | retire generate_gltf_production_task once USD path validated |
| KEEP (compat) | render-worker/scripts/export_gltf.py | retained as fallback until USD path confirmed stable |

Acceptance gates:

  • render_still --usd_path usd_master.usd → PNG output visually matches the current production GLB render (SSIM difference below 5%, i.e. SSIM ≥ 0.95)
  • Blender log shows [USD_IMPORT] 25 parts imported, 5044 seam/sharp edges restored (not reconstructed by angle)
  • GET /api/media?asset_type=gltf_production returns 0 new entries after switch (old records preserved)
  • Turntable MP4 plays without texture or material pop artifacts

Priority 6 — Admin and Product Surface Simplification

Goal: Remove the geometry-GLB vs production-GLB mental model from admin, product detail, and repair flows.

Milestones:

  • M1: Admin tessellation settings collapsed from 4 knobs to scene_* + preview_* (2 or 3 knobs max)
  • M2: Bulk actions renamed from generate-missing-geometry-glbs → generate-missing-canonical-scenes
  • M3: ProductDetail shows single canonical scene status card, not dual-GLB

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| MODIFY | backend/app/api/routers/admin.py | rename/collapse tessellation settings; new bulk action labels |
| CREATE migration | backend/alembic/versions/06x_rename_tessellation_settings.py | UPDATE system_settings SET key = 'scene_linear_deflection' WHERE key = 'gltf_production_linear_deflection' |
| MODIFY | frontend/src/pages/Admin.tsx | simplified tessellation panel: scene quality + optional preview override |
| MODIFY | frontend/src/pages/ProductDetail.tsx | single canonical scene card (status, regenerate button) |
| MODIFY | backend/app/domains/media/models.py | deprecate gltf_geometry / gltf_production (keep values, add deprecated=True metadata) |

Acceptance gates:

  • Admin Settings page has max 3 tessellation quality fields (down from 4)
  • ProductDetail page: one "Canonical Scene" status section, no separate geometry/production rows
  • GET /api/admin/settings no longer exposes gltf_preview_linear_deflection key (replaced by scene_linear_deflection)

Priority 7 — Render Job Tracking and Structured Logging

Goal: Fix broken render job cancellation (synthetic render-{line_id} ID never matches real Celery task ID) and establish structured per-step logging.

Status: Done, aside from any follow-up polish.

Milestones:

  • M1: RenderJobDocument schema + migration; tasks write real self.request.id to DB
  • M2: Cancel endpoint reads celery_task_id from job doc and calls revoke() — actually stops task
  • M3: PipelineLogger integrated in all task files; every step emits [STEP] start/done/error with duration

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| CREATE | backend/app/domains/rendering/job_document.py | RenderJobDocument Pydantic model, update_step(), set_state() |
| CREATE | backend/app/core/pipeline_logger.py | PipelineLogger(step_start/done/error) writing to logging + Redis SSE |
| DONE | backend/alembic/versions/048_render_job_document.py | adds render_job_doc JSONB to order_lines |
| MODIFY | backend/app/domains/pipeline/tasks/render_order_line.py | write celery_task_id + step events to job doc |
| MODIFY | backend/app/domains/pipeline/tasks/render_thumbnail.py | same |
| MODIFY | backend/app/api/routers/orders.py | cancel reads render_job_doc.celery_task_id, calls celery.control.revoke() |

Acceptance gates:

  • Start a 60s render task → click Cancel → celery inspect active shows task is gone within 15s
  • GET /api/orders/{id}/lines/{line_id} response includes render_job_doc.steps[] with per-step duration_s
  • Worker log shows [THUMBNAIL] done in 34.2s format (not bare f-strings)
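
The step-logging format from the last gate can be sketched with the standard logging module alone. This is not the repository's PipelineLogger: the class name SketchPipelineLogger is hypothetical, the Redis SSE sink is left out, and the injectable clock exists only to make durations testable.

```python
import logging
import time
from contextlib import contextmanager


class SketchPipelineLogger:
    """Illustrative stand-in that emits [STEP] start/done/error lines with durations."""

    def __init__(self, logger=None, clock=time.monotonic):
        self._log = logger or logging.getLogger("pipeline")
        self._clock = clock
        self._started = {}

    def step_start(self, step):
        self._started[step] = self._clock()
        self._log.info("[%s] start", step.upper())

    def step_done(self, step):
        duration = self._clock() - self._started.pop(step)
        self._log.info("[%s] done in %.1fs", step.upper(), duration)
        return duration

    def step_error(self, step, exc):
        duration = self._clock() - self._started.pop(step)
        self._log.error("[%s] error after %.1fs: %s", step.upper(), duration, exc)
        return duration

    @contextmanager
    def step(self, name):
        """Wrap a block so that start, done, or error is always logged."""
        self.step_start(name)
        try:
            yield
        except Exception as exc:
            self.step_error(name, exc)
            raise
        else:
            self.step_done(name)
```

A task body would then wrap each stage in a with block such as `with pipeline_log.step("thumbnail"):`, and the returned durations could feed the per-step duration_s values in the job document.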

Priority 8 — Tenant Isolation Completion

Goal: Finish tenant isolation hardening, especially for non-HTTP execution paths.

Status: In progress. HTTP-side RLS enforcement is now real; task-side propagation is the remaining gap.

Milestones:

  • M1: TenantContextMiddleware registered; all HTTP requests set RLS context from JWT
  • M2: All Celery tasks call set_tenant_context(db, tenant_id) at task start
  • M3: global_admin + tenant_admin roles in DB; require_admin() → require_global_admin()

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| CREATE | backend/app/core/middleware.py | TenantContextMiddleware(BaseHTTPMiddleware) |
| MODIFY | backend/app/main.py | app.add_middleware(TenantContextMiddleware) |
| MODIFY | backend/app/utils/auth.py | create_access_token() embeds tenant_id in JWT claims |
| MODIFY | backend/app/tasks/pipeline/thumbnail.py, extract.py, stills.py, turntable.py | set_tenant_context() at start |
| DONE | backend/alembic/versions/049_role_hierarchy.py | adds global_admin and tenant_admin |
| PARTIAL | | Routers are mixed between new require_global_admin() usage and backward-compatible require_admin() aliases |

Acceptance gates:

  • Login as tenant A user → GET /api/products returns 0 results when tenant A has no products, even if tenant B has 50
  • Verify via: SELECT count(*) FROM products WHERE tenant_id != '<tenantA_id>' returns 0 from within tenant A session
  • Celery task logs show [TENANT] context set: tenant_id=<uuid> at start
  • GET /api/admin/users returns 403 for tenant_admin role (only global_admin can list all users)
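
One way to guarantee M2 across all tasks is a decorator that sets the tenant context before the task body runs. The sketch below is an assumption-heavy illustration: the body of set_tenant_context, the setting name app.current_tenant_id, the db.execute(sql, params) interface, and the decorator with_tenant_context are all hypothetical, and Celery itself is not imported.

```python
import functools
import logging

log = logging.getLogger("tenant")


def set_tenant_context(db, tenant_id):
    """Hypothetical body: the real function and its setting name may differ.

    set_config(name, value, is_local) is a PostgreSQL built-in; passing true
    scopes the value to the current transaction.
    """
    db.execute("SELECT set_config('app.current_tenant_id', %s, true)", (str(tenant_id),))


def with_tenant_context(get_db, set_context=set_tenant_context):
    """Decorator sketch: open a DB handle, set tenant context, then run the task."""

    def decorator(task_fn):
        @functools.wraps(task_fn)
        def wrapper(*args, tenant_id, **kwargs):
            if tenant_id is None:
                # Fail loudly instead of running a task without RLS context.
                raise ValueError("tenant_id is required for tenant-scoped tasks")
            db = get_db()
            set_context(db, tenant_id)
            log.info("[TENANT] context set: tenant_id=%s", tenant_id)
            return task_fn(db, *args, **kwargs)

        return wrapper

    return decorator
```

Requiring tenant_id as a keyword-only argument means the task caller has to pass it explicitly, which makes a missing propagation visible at dispatch time rather than as an empty query result.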

Priority 9 — Hash-Based Scene Conversion Caching

Goal: Extend the existing STEP-hash plumbing so canonical scene + preview derivatives can skip unnecessary re-tessellation.

Status: Partial foundation already exists.

Milestones:

  • M1: Existing cad_files.step_file_hash is reused or renamed for canonical-scene caching
  • M2: Export task checks hash before processing — returns cached asset UUID on hit
  • M3: Hash invalidated correctly when admin forces reprocess or deflection settings change

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| DONE | backend/app/domains/products/models.py | CadFile.step_file_hash: str \| None already exists |
| DONE | backend/app/domains/products/cache_service.py | compute_step_hash(file_path) already exists |
| OPEN | backend/app/domains/pipeline/tasks/export_glb.py (or future USD task) | hash check before subprocess call |
| OPEN | | optional migration to rename step_file_hash → step_hash only if naming consistency is worth the churn |

Acceptance gates:

  • Upload same STEP file twice → second task completes in < 2s (cache hit logged: [CACHE] hash match, skipping tessellation)
  • Change deflection setting → force reprocess → new export runs fresh (hash same but settings changed, cache bypassed)
  • GET /api/cad/{id} response includes step_hash field
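
The first two gates imply that the cache key has to cover both the file content and the tessellation settings. The sketch below shows one way to do that. The SHA-256 body of compute_step_hash is an assumption (the repository function exists but is not shown here), and cache_key, export_with_cache, and the dict-based cache are hypothetical.

```python
import hashlib
import json


def compute_step_hash(file_path, chunk_size=1 << 20):
    """Assumed implementation: stream the file through SHA-256."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(step_hash, settings):
    """Combine content hash and settings so a deflection change misses the cache."""
    payload = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{step_hash}:{payload}".encode("utf-8")).hexdigest()


def export_with_cache(file_path, settings, cache, export_fn, force=False):
    """Return (asset_id, cache_hit). export_fn stands in for the tessellation subprocess."""
    key = cache_key(compute_step_hash(file_path), settings)
    if not force and key in cache:
        return cache[key], True
    asset_id = export_fn(file_path, settings)
    cache[key] = asset_id
    return asset_id, False
```

With this layout, uploading the same file twice with unchanged settings hits the cache, while a changed deflection value or an admin-forced reprocess runs the export again.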

Priority 10 — UI/UX Polish

Goal: Address independent UI items from visual-audit-report.md that require no backend changes.

Status: Partial. Some milestones are already shipped and should be removed from the active queue.

Milestones:

  • M1: Tooltip/help text on every Admin settings input — mostly done
  • M2: Empty state messages in MediaBrowser, ProductLibrary, Orders — partially done
  • M3: Notification batching — group per-render noise into job summaries
  • M4: Mobile navigation — hamburger menu at < 768px — done
  • M5: Kanban rejection flow — drag-to-reject with reason field

File targets:

| Action | Path | Notes |
| --- | --- | --- |
| MODIFY | frontend/src/pages/Admin.tsx | title attributes on all inputs; help text below complex settings |
| MODIFY | frontend/src/pages/MediaBrowser.tsx | empty state: "No assets yet, upload a STEP file to get started" |
| MODIFY | frontend/src/pages/ProductLibrary.tsx | empty state |
| MODIFY | frontend/src/pages/Orders.tsx | empty state |
| MODIFY | frontend/src/components/layout/Layout.tsx | hamburger + slide-in nav for mobile |
| MODIFY | frontend/src/pages/OrderDetail.tsx | reject button with reason modal |
| CREATE | frontend/src/components/shared/Tooltip.tsx | reusable tooltip wrapper |

Acceptance gates:

  • All Admin settings inputs have visible help text or tooltip (manual check: no input label without explanation)
  • Mobile viewport (375px): no horizontal scroll, nav accessible via hamburger
  • Submit a render → NotificationCenter shows one "Render complete (3 files)" summary, not 3 individual toasts
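
The batching rule in the last gate is simple grouping logic. The real change would live in the frontend NotificationCenter; the Python sketch below only illustrates the grouping, and the event shape (job_id, status) and the function name are assumptions.

```python
# Assumed status values; unknown statuses are ignored by the sketch.
LABELS = {"complete": "Render complete", "failed": "Render failed"}


def summarize_render_events(events):
    """Collapse per-file events into one summary per (job, status), in arrival order."""
    counts = {}
    for event in events:
        if event["status"] not in LABELS:
            continue
        key = (event["job_id"], event["status"])
        counts[key] = counts.get(key, 0) + 1  # dicts preserve first-insertion order
    summaries = []
    for (job_id, status), n in counts.items():
        noun = "file" if n == 1 else "files"
        summaries.append({"job_id": job_id, "message": f"{LABELS[status]} ({n} {noun})"})
    return summaries
```

Three completion events for the same job would then surface as a single "Render complete (3 files)" entry.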

Dependency Graph

Priority 1 (Pipeline Cleanup Foundation)
  └── Priority 2 (USD Foundation Without Viewer Regression)
        └── Priority 4 (Viewer Migration to Canonical Part Identity)
              └── Priority 5 (Canonical USD Export and Render Migration)
                    └── Priority 6 (Admin and Product Surface Simplification)
                          └── Priority 9 (Hash-Based Scene Conversion Caching)

Priority 3 (Tessellation and Topology Quality)
  └── Priority 5 (Canonical USD Export and Render Migration)

Priority 8 remaining work (Celery tenant context)      — can run in parallel
Priority 10 remaining polish                           — independent
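
The graph above can be turned into execution waves with the standard library's graphlib module (Python 3.9+). This sketch encodes only the edges drawn above, treats Priorities 8 and 10 as having no prerequisites, and omits Priority 7 because it is done; the function name execution_waves is hypothetical.

```python
from graphlib import TopologicalSorter

# priority -> set of priorities it depends on, transcribed from the graph above
DEPENDS_ON = {
    1: set(),
    2: {1},
    3: set(),
    4: {2},
    5: {3, 4},
    6: {5},
    8: set(),
    9: {6},
    10: set(),
}


def execution_waves(depends_on):
    """Group priorities into waves; everything in one wave can run in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# execution_waves(DEPENDS_ON) → [[1, 3, 8, 10], [2], [4], [5], [6], [9]]
```

The waves reflect hard ordering constraints only; whether Priority 3 actually runs in the first wave is a scheduling decision rather than something the graph dictates.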

What To Do Next

Recommended execution path:

  1. Finish the remaining Priority 1 work first: remove STL-era dead code and split blender_render.py.
  2. Start Priority 2 immediately after that cleanup baseline is stable: add partKey, assignment layers, and scene manifest without changing browser UX.
  3. Run Priority 3 in parallel only if cylinder tessellation is actively blocking confidence in seam/sharp payload work; otherwise keep it behind Priority 2.
  4. Treat Priority 8 as a short parallel hardening task: add Celery-side tenant context propagation.
  5. Use docs/plans/0001-step-to-usd-implementation.md as the execution checklist for the USD workstream.

Parallel sprint option (2 agents):

  • Agent 1: Priority 1 remainder (dead-code cleanup + blender_render.py split)
  • Agent 2: Priority 8 remainder or Priority 3, depending on whether tessellation quality is currently blocking work

Parallel sprint option (3 agents):

  • Agent 1: Priority 1 remainder
  • Agent 2: Priority 2 groundwork (usd_master, part_key_service, scene-manifest)
  • Agent 3: Priority 8 remainder or targeted Priority 10 polish

Do not defer anymore:

  • canonical partKey
  • part-keyed browser material overrides
  • scene manifest / preview contract

These are now considered implementation prerequisites for the long-term refactor, not optional strategy work.


Archive

Old planning files are kept for reference but superseded by this document:

  • PLAN.md — original Phase A–F plan (Phases A–E complete, Phase F = Priority 9 here)
  • PLAN_REFACTOR.md — 1,173-line architectural plan (Phases 1–8 mapped to Priorities 2–8 above)
  • plan.md — GMSH tessellation implementation plan (Priority 3)
  • docs/rfcs/0001-step-to-usd-workflow.md — USD RFC (Priorities 2, 4, and 5)
  • docs/plans/0001-step-to-usd-implementation.md — actionable USD implementation plan
  • review-report.md — latest code review results
  • visual-audit-report.md — UX audit results