# Plan: Deduplication for GLB/USD Generation + Two Review Fixes ## Context Two problems to solve: **1. Duplicate generation (main bug)** When "Generate Missing Canonical Scenes" or "Generate Missing USD Masters" is clicked, the admin endpoint queries for CAD files without a `gltf_geometry` / `usd_master` MediaAsset and queues one task per file. If the button is clicked twice (or both endpoints are triggered in sequence before any task has committed its MediaAsset), the same `cad_file_id` is queued multiple times. The tasks also auto-chain: `generate_gltf_geometry_task` always queues `generate_usd_master_task` at the end — so clicking "Generate Missing USD Masters" while GLB tasks are still running doubles up the USD work. The existing cache check (`step_file_hash`) only short-circuits tessellation when a MediaAsset already exists — it does not prevent two concurrent tasks from both starting the expensive subprocess on the same file. Two processes writing to `_geometry.glb` simultaneously causes corruption / wasted compute. **Solution**: Apply the same Redis `SET NX EX` dedup lock that `process_step_file` uses (lock key `step_processing_lock:{id}`, released in `finally`). Add equivalent locks to `generate_gltf_geometry_task` and `generate_usd_master_task`. **2. Review fix A — `eng.dispose()` inside `with Session` block** `export_glb.py` line 89: `eng.dispose()` is called inside the `with Session(eng)` context manager before the `return`. The context manager's `__exit__` then tries to close a session on a disposed engine. Safe in practice (no exception raised) but fragile and misleading. Move `eng.dispose()` to after the `with` block exits. **3. Review fix B — `save_manual_material_overrides` missing `updated_at`** `cad.py` line 537: `cad.manual_material_overrides = body.overrides` is committed without updating `cad.updated_at`. The parallel endpoint `save_part_materials` (line 430) does call `cad.updated_at = datetime.utcnow()`. Add the same line to `save_manual_material_overrides`. ## Affected Files | File | Change | |---|---| | `backend/app/domains/pipeline/tasks/export_glb.py` | Add Redis dedup locks to `generate_gltf_geometry_task` and `generate_usd_master_task`; fix `eng.dispose()` placement | | `backend/app/api/routers/cad.py` | Add `cad.updated_at = datetime.utcnow()` in `save_manual_material_overrides` | ## Tasks (in order) ### [x] Task 1: Add Redis dedup locks to `generate_gltf_geometry_task` - **File**: `backend/app/domains/pipeline/tasks/export_glb.py` - **What**: At the top of `generate_gltf_geometry_task`, after `pl.step_start(...)`, acquire a Redis lock using the same pattern as `extract_metadata.py`: ```python import redis as _redis_lib _lock_key = f"glb_geometry_lock:{cad_file_id}" _r = _redis_lib.from_url(app_settings.redis_url) _acquired = _r.set(_lock_key, "1", nx=True, ex=1800) # 30-min TTL if not _acquired: logger.warning("generate_gltf_geometry_task: %s already in-flight — skipping duplicate", cad_file_id) pl.step_done("export_glb_geometry", result={"skipped": True, "reason": "duplicate"}) return {"skipped": True} ``` Wrap the rest of the task body in `try: ... finally: _r.delete(_lock_key)`. Note: `app_settings` is already imported inside the function. Import `redis` at the top of the `try` block as `import redis as _redis_lib` (same pattern as `extract_metadata.py` which imports it locally). - **Acceptance gate**: Trigger "Generate Missing Canonical Scenes" twice in quick succession — worker logs show `"already in-flight — skipping duplicate"` for the second batch; no file ends up being tessellated twice. - **Dependencies**: none - **Risk**: Low — same pattern as `process_step_file`, TTL 30min covers worst-case tessellation time. ### [x] Task 2: Add Redis dedup lock to `generate_usd_master_task` - **File**: `backend/app/domains/pipeline/tasks/export_glb.py` - **What**: Same pattern at the top of `generate_usd_master_task`, after `pl.step_start(...)`: ```python import redis as _redis_lib _lock_key = f"usd_master_lock:{cad_file_id}" _r = _redis_lib.from_url(app_settings.redis_url) _acquired = _r.set(_lock_key, "1", nx=True, ex=1800) # 30-min TTL if not _acquired: logger.warning("generate_usd_master_task: %s already in-flight — skipping duplicate", cad_file_id) pl.step_done("usd_master", result={"skipped": True, "reason": "duplicate"}) return {"skipped": True} ``` Wrap the rest of the function body in `try: ... finally: _r.delete(_lock_key)`. - **Acceptance gate**: Trigger "Generate Missing USD Masters" while GLB tasks are still running — worker logs show USD tasks skipping duplicates instead of starting a second tessellation. - **Dependencies**: Task 1 - **Risk**: Low ### [x] Task 3: Fix `eng.dispose()` placement in cache-hit early-return path - **File**: `backend/app/domains/pipeline/tasks/export_glb.py` - **What**: In `generate_gltf_geometry_task`, the cache-hit path (lines 86–95) calls `eng.dispose()` at line 89 while still inside the `with Session(eng)` block, then returns. Move `eng.dispose()` to *after* the `with` block exits. Current (broken): ```python with Session(eng) as session: ... if existing_geo: pl.step_done(...) eng.dispose() # ← inside with block try: generate_usd_master_task.delay(cad_file_id) ... return {"cached": True, ...} eng.dispose() # normal path ``` Fixed: remove the `eng.dispose()` at line 89, and move the `generate_usd_master_task.delay()` + `return` to after the `with` block: ```python _cache_hit_asset_id: str | None = None with Session(eng) as session: ... if existing_geo: logger.info("[CACHE] hash match — skipping geometry GLB tessellation for %s", cad_file_id) pl.step_done("export_glb_geometry", result={"cached": True, "asset_id": str(existing_geo.id)}) _cache_hit_asset_id = str(existing_geo.id) eng.dispose() if _cache_hit_asset_id is not None: try: generate_usd_master_task.delay(cad_file_id) except Exception: logger.debug("Could not queue generate_usd_master_task from cache-hit path (non-fatal)") return {"cached": True, "asset_id": _cache_hit_asset_id} # ... rest of function (tessellation path) ``` - **Acceptance gate**: `docker compose exec render-worker python3 -c "import app"` (no import errors); cache-hit path still skips tessellation and chains USD master. - **Dependencies**: none - **Risk**: Low — pure refactor, no logic change. ### [x] Task 4: Add `updated_at` in `save_manual_material_overrides` - **File**: `backend/app/api/routers/cad.py` - **What**: In `save_manual_material_overrides` (around line 537), add `cad.updated_at = datetime.utcnow()` before `await db.commit()`: ```python cad.manual_material_overrides = body.overrides cad.updated_at = datetime.utcnow() # ← add this line await db.commit() ``` - **Acceptance gate**: `PUT /api/cad/{id}/manual-material-overrides` → `GET /api/cad/{id}` shows updated `updated_at` timestamp. - **Dependencies**: none - **Risk**: None ## Migration Check No migration required — no new columns or tables. ## Order Recommendation Tasks 3 and 4 are independent cleanup items — implement first (low risk). Tasks 1 and 2 are the core dedup fix — implement after. Order: Task 4 → Task 3 → Task 1 → Task 2 ## Risks / Open Questions - Redis TTL of 30 minutes: if a task crashes hard (OOM, SIGKILL) without running `finally`, the lock stays for 30 minutes. This is the same tradeoff as `process_step_file`. Acceptable. - `generate_usd_master_task` is also queued by the cache-hit path in `generate_gltf_geometry_task` — that chained call will be deduplicated by the lock too if the primary USD task is already running. Correct behaviour. - The auto-chain from `generate_gltf_geometry_task → generate_usd_master_task` is still desirable (keeps canonical scene up-to-date after a fresh GLB). The lock prevents the *duplicate*, not the *legitimate* chain.