- Add per-file Redis SET NX EX 1800 locks to generate_gltf_geometry_task and generate_usd_master_task — concurrent duplicates (e.g. double-click of bulk action buttons) now log a warning and return immediately instead of running two expensive OCC tessellation subprocesses on the same file - Fix eng.dispose() called inside with Session() block in cache-hit path of both tasks — moved to after the with block exits (Tasks 3+4 from plan) - Add cad.updated_at = datetime.utcnow() in save_manual_material_overrides (was missing vs parallel save_part_materials endpoint) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8.0 KiB
Plan: Deduplication for GLB/USD Generation + Two Review Fixes
Context
Two problems to solve:
1. Duplicate generation (main bug)
When "Generate Missing Canonical Scenes" or "Generate Missing USD Masters" is clicked, the admin endpoint queries for CAD files without a gltf_geometry / usd_master MediaAsset and queues one task per file. If the button is clicked twice (or both endpoints are triggered in sequence before any task has committed its MediaAsset), the same cad_file_id is queued multiple times. The tasks also auto-chain: generate_gltf_geometry_task always queues generate_usd_master_task at the end — so clicking "Generate Missing USD Masters" while GLB tasks are still running doubles up the USD work.
The existing cache check (step_file_hash) only short-circuits tessellation when a MediaAsset already exists — it does not prevent two concurrent tasks from both starting the expensive subprocess on the same file. Two processes writing to _geometry.glb simultaneously causes corruption / wasted compute.
Solution: Apply the same Redis SET NX EX dedup lock that process_step_file uses (lock key step_processing_lock:{id}, released in finally). Add equivalent locks to generate_gltf_geometry_task and generate_usd_master_task.
2. Review fix A — eng.dispose() inside with Session block
export_glb.py line 89: eng.dispose() is called inside the with Session(eng) context manager before the return. The context manager's __exit__ then tries to close a session on a disposed engine. Safe in practice (no exception raised) but fragile and misleading. Move eng.dispose() to after the with block exits.
3. Review fix B — save_manual_material_overrides missing updated_at
cad.py line 537: cad.manual_material_overrides = body.overrides is committed without updating cad.updated_at. The parallel endpoint save_part_materials (line 430) does call cad.updated_at = datetime.utcnow(). Add the same line to save_manual_material_overrides.
Affected Files
| File | Change |
|---|---|
backend/app/domains/pipeline/tasks/export_glb.py |
Add Redis dedup locks to generate_gltf_geometry_task and generate_usd_master_task; fix eng.dispose() placement |
backend/app/api/routers/cad.py |
Add cad.updated_at = datetime.utcnow() in save_manual_material_overrides |
Tasks (in order)
[x] Task 1: Add Redis dedup locks to generate_gltf_geometry_task
-
File:
backend/app/domains/pipeline/tasks/export_glb.py -
What: At the top of
generate_gltf_geometry_task, afterpl.step_start(...), acquire a Redis lock using the same pattern asextract_metadata.py:import redis as _redis_lib _lock_key = f"glb_geometry_lock:{cad_file_id}" _r = _redis_lib.from_url(app_settings.redis_url) _acquired = _r.set(_lock_key, "1", nx=True, ex=1800) # 30-min TTL if not _acquired: logger.warning("generate_gltf_geometry_task: %s already in-flight — skipping duplicate", cad_file_id) pl.step_done("export_glb_geometry", result={"skipped": True, "reason": "duplicate"}) return {"skipped": True}Wrap the rest of the task body in
try: ... finally: _r.delete(_lock_key).Note:
app_settingsis already imported inside the function. Importredisat the top of thetryblock asimport redis as _redis_lib(same pattern asextract_metadata.pywhich imports it locally). -
Acceptance gate: Trigger "Generate Missing Canonical Scenes" twice in quick succession — worker logs show
"already in-flight — skipping duplicate"for the second batch; no file ends up being tessellated twice. -
Dependencies: none
-
Risk: Low — same pattern as
process_step_file, TTL 30min covers worst-case tessellation time.
[x] Task 2: Add Redis dedup lock to generate_usd_master_task
-
File:
backend/app/domains/pipeline/tasks/export_glb.py -
What: Same pattern at the top of
generate_usd_master_task, afterpl.step_start(...):import redis as _redis_lib _lock_key = f"usd_master_lock:{cad_file_id}" _r = _redis_lib.from_url(app_settings.redis_url) _acquired = _r.set(_lock_key, "1", nx=True, ex=1800) # 30-min TTL if not _acquired: logger.warning("generate_usd_master_task: %s already in-flight — skipping duplicate", cad_file_id) pl.step_done("usd_master", result={"skipped": True, "reason": "duplicate"}) return {"skipped": True}Wrap the rest of the function body in
try: ... finally: _r.delete(_lock_key). -
Acceptance gate: Trigger "Generate Missing USD Masters" while GLB tasks are still running — worker logs show USD tasks skipping duplicates instead of starting a second tessellation.
-
Dependencies: Task 1
-
Risk: Low
[x] Task 3: Fix eng.dispose() placement in cache-hit early-return path
-
File:
backend/app/domains/pipeline/tasks/export_glb.py -
What: In
generate_gltf_geometry_task, the cache-hit path (lines 86–95) callseng.dispose()at line 89 while still inside thewith Session(eng)block, then returns. Moveeng.dispose()to after thewithblock exits.Current (broken):
with Session(eng) as session: ... if existing_geo: pl.step_done(...) eng.dispose() # ← inside with block try: generate_usd_master_task.delay(cad_file_id) ... return {"cached": True, ...} eng.dispose() # normal pathFixed: remove the
eng.dispose()at line 89, and move thegenerate_usd_master_task.delay()+returnto after thewithblock:_cache_hit_asset_id: str | None = None with Session(eng) as session: ... if existing_geo: logger.info("[CACHE] hash match — skipping geometry GLB tessellation for %s", cad_file_id) pl.step_done("export_glb_geometry", result={"cached": True, "asset_id": str(existing_geo.id)}) _cache_hit_asset_id = str(existing_geo.id) eng.dispose() if _cache_hit_asset_id is not None: try: generate_usd_master_task.delay(cad_file_id) except Exception: logger.debug("Could not queue generate_usd_master_task from cache-hit path (non-fatal)") return {"cached": True, "asset_id": _cache_hit_asset_id} # ... rest of function (tessellation path) -
Acceptance gate:
docker compose exec render-worker python3 -c "import app"(no import errors); cache-hit path still skips tessellation and chains USD master. -
Dependencies: none
-
Risk: Low — pure refactor, no logic change.
[x] Task 4: Add updated_at in save_manual_material_overrides
-
File:
backend/app/api/routers/cad.py -
What: In
save_manual_material_overrides(around line 537), addcad.updated_at = datetime.utcnow()beforeawait db.commit():cad.manual_material_overrides = body.overrides cad.updated_at = datetime.utcnow() # ← add this line await db.commit() -
Acceptance gate:
PUT /api/cad/{id}/manual-material-overrides→GET /api/cad/{id}shows updatedupdated_attimestamp. -
Dependencies: none
-
Risk: None
Migration Check
No migration required — no new columns or tables.
Order Recommendation
Tasks 3 and 4 are independent cleanup items — implement first (low risk). Tasks 1 and 2 are the core dedup fix — implement after.
Order: Task 4 → Task 3 → Task 1 → Task 2
Risks / Open Questions
- Redis TTL of 30 minutes: if a task crashes hard (OOM, SIGKILL) without running
finally, the lock stays for 30 minutes. This is the same tradeoff asprocess_step_file. Acceptable. generate_usd_master_taskis also queued by the cache-hit path ingenerate_gltf_geometry_task— that chained call will be deduplicated by the lock too if the primary USD task is already running. Correct behaviour.- The auto-chain from
generate_gltf_geometry_task → generate_usd_master_taskis still desirable (keeps canonical scene up-to-date after a fresh GLB). The lock prevents the duplicate, not the legitimate chain.