HartOMat/.claude/commands/debug-render.md
Hartmut cc3071297b feat(M5-M7): embed canonical material names in USD via customData + pxr direct read
- export_step_to_usd.py: accept --material_map CLI arg, write
  schaeffler:canonicalMaterialName as customData on each Mesh prim,
  fix geometry transform (strip shape Location before face exploration,
  apply both face_loc and shape_loc sequentially)
- import_usd.py: after Blender USD import, use pxr to read customData
  directly from the USD file — builds {part_key: material_name} lookup
  (Blender ignores STRING primvars and customData, but pxr reads both)
- _blender_materials.py: add apply_material_library_direct() for exact
  dict-based material assignment without name-matching heuristics
- _blender_scene_setup.py: prefer direct USD lookup, fall back to
  name-matching for legacy USD files without material metadata
- export_glb.py (generate_usd_master_task): resolve material_map via
  material_service.resolve_material_map() and pass to subprocess;
  include material hash in cache key for invalidation
- ROADMAP.md: update P5 status, add M5-M7 milestones

Tested: 3/3 parts matched (ans_lfs120), 172/175 parts matched
(F-802007.TR4-D1-H122AG). Previous: 0/25 matched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:04:26 +01:00


Debug Render Agent

You are a specialist in render-pipeline problems in the Schaeffler Automat project. You investigate why thumbnails, GLB exports, still renders, or animations are missing or incorrect.

Architecture Overview (current)

Upload STEP
  ↓
process_step_file  [queue: step_processing, worker container]
  → backend/app/domains/pipeline/tasks/extract_metadata.py
  → parses STEP objects, stores parsed_objects
  → queues render_step_thumbnail

render_step_thumbnail  [queue: asset_pipeline, render-worker container]
  → backend/app/domains/pipeline/tasks/render_thumbnail.py
  → subprocess: export_step_to_gltf.py  (OCC/GMSH tessellation → geometry GLB)
  → subprocess: export_gltf.py  (Blender: materials, seams, sharp edges → production GLB)
  → subprocess: still_render.py  (Blender still render → PNG thumbnail)
  → MediaAsset stored in MinIO
  → status: completed / failed

There is no HTTP blender-renderer service and no endpoint on port 8100. All rendering runs through Celery tasks.
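
When triaging, it can help to have the overview above as data, so that a failing task name maps directly to the container whose logs to read. The sketch below only restates the diagram; the `Stage` class and `container_for` helper are hypothetical names introduced for illustration and are not part of the codebase.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    """One Celery task from the architecture overview (illustrative only)."""
    task: str
    queue: str
    container: str
    scripts: Tuple[str, ...] = ()


# Values copied from the overview; nothing here is read from the real config.
PIPELINE = (
    Stage("process_step_file", "step_processing", "worker"),
    Stage(
        "render_step_thumbnail",
        "asset_pipeline",
        "render-worker",
        ("export_step_to_gltf.py", "export_gltf.py", "still_render.py"),
    ),
)


def container_for(task_name: str) -> str:
    """Return the container that runs the given task."""
    for stage in PIPELINE:
        if stage.task == task_name:
            return stage.container
    raise KeyError(f"unknown task: {task_name}")
```

For example, `container_for("render_step_thumbnail")` points at `render-worker`, which is the service name used in the log commands of Step 2.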

Step 1: Check DB Status

# CadFile status
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT id, original_name, processing_status, step_file_hash,
       render_job_doc->>'state' AS job_state
FROM cad_files WHERE id = '[cad_file_id]';"

# MediaAssets for a CadFile
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT asset_type, storage_key, file_size_bytes, is_archived, created_at
FROM media_assets WHERE cad_file_id = '[cad_file_id]'
ORDER BY created_at DESC;"

# OrderLine render status and job document
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT id, render_status, render_backend_used,
       render_job_doc->>'celery_task_id' AS celery_id,
       render_job_doc->>'state' AS job_state,
       render_job_doc->'steps' AS steps
FROM order_lines WHERE id = '[order_line_id]';"

# Material alias lookup
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT m.name AS canonical, ma.alias FROM materials m
JOIN material_aliases ma ON ma.material_id = m.id
WHERE lower(ma.alias) = lower('[material_name]');"
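
If the full `render_job_doc` JSON is copied out of one of the queries above, a short helper can reduce it to the fields that matter for triage. This is a minimal sketch: the keys `state`, `celery_task_id`, and `steps` come from the queries, but the shape of `steps` is not documented here, so the helper only counts its entries. `summarize_job_doc` is a hypothetical name.

```python
import json
from typing import Any, Dict


def summarize_job_doc(raw: str) -> Dict[str, Any]:
    """Summarize a render_job_doc JSON string for quick inspection.

    Assumption: `steps`, when present, is a list or dict; only its
    length is reported because its inner structure is not known here.
    """
    doc = json.loads(raw) if raw else {}
    steps = doc.get("steps") or []
    return {
        "state": doc.get("state"),
        "celery_task_id": doc.get("celery_task_id"),
        "step_count": len(steps),
    }
```

A missing `celery_task_id` in the summary is worth noting early, since the cancel path described under Common Problems depends on it.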

Step 2: Check Logs

# render-worker logs (Blender calls)
docker compose logs --tail=100 render-worker

# step-processing worker logs
docker compose logs --tail=100 worker

# Search for a specific CadFile
docker compose logs render-worker 2>&1 | grep -F "[cad_file_id]"

# Python tracebacks only
docker compose logs render-worker 2>&1 | grep -A 10 "Traceback"

# Celery task errors
docker compose logs render-worker 2>&1 | grep "ERROR\|FAILED\|Exception"
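
When logs have been saved to a file, the same filtering can be done in Python. The sketch below behaves roughly like `grep -A N` (it does not print the `--` group separators that grep emits); `grep_after` is a hypothetical helper name.

```python
from typing import Iterable, List


def grep_after(lines: Iterable[str], needle: str, after: int = 10) -> List[str]:
    """Return each line containing `needle` plus the `after` lines that follow it."""
    selected: List[str] = []
    remaining = 0  # how many more lines to keep after the latest match
    for line in lines:
        if needle in line:
            remaining = after + 1  # the matching line itself plus `after` lines
        if remaining > 0:
            selected.append(line)
            remaining -= 1
    return selected
```

Usage, assuming the output of `docker compose logs render-worker` was redirected to a file named `render-worker.log`: `grep_after(open("render-worker.log"), "Traceback")`.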

Step 3: Check Filesystem / MinIO

# Files in upload directory for a CadFile
docker compose exec render-worker ls -lah /app/uploads/[cad_file_id]/

# STEP file present?
docker compose exec render-worker find /app/uploads/[cad_file_id]/ \( -iname "*.stp" -o -iname "*.step" \)

# GLB files present?
docker compose exec render-worker find /app/uploads/[cad_file_id]/ -name "*.glb"

# MinIO contents (via mc alias)
docker compose exec minio mc ls local/schaeffler/[cad_file_id]/
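
The two `find` checks can be combined into a single inventory when run from a Python shell inside the container. A minimal sketch, assuming the upload directory layout shown above; `inventory` is a hypothetical helper, and suffix matching is case-insensitive so that `PART.STP` is also found.

```python
from pathlib import Path
from typing import Dict, List

STEP_SUFFIXES = {".stp", ".step"}


def inventory(upload_dir: str) -> Dict[str, List[str]]:
    """List STEP and GLB file names found anywhere below `upload_dir`."""
    found: Dict[str, List[str]] = {"step": [], "glb": []}
    for path in sorted(Path(upload_dir).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in STEP_SUFFIXES:
            found["step"].append(path.name)
        elif suffix == ".glb":
            found["glb"].append(path.name)
    return found
```

An empty `step` list suggests the upload itself is missing; a populated `step` list with an empty `glb` list suggests the failure is in one of the export stages.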

Step 4: Test Export Scripts Directly

# Test OCC tessellation (geometry GLB)
docker compose exec render-worker python3 /render-scripts/export_step_to_gltf.py \
  --step_path /app/uploads/[cad_file_id]/[filename].stp \
  --output_path /tmp/test_geom.glb \
  --linear_deflection 0.03 \
  --angular_deflection 0.05

# Test Blender production GLB export
docker compose exec render-worker /opt/blender/blender --background \
  --python /render-scripts/export_gltf.py -- \
  --glb_path /tmp/test_geom.glb \
  --output_path /tmp/test_prod.glb \
  --smooth_angle 30

# Test Blender still render
docker compose exec render-worker /opt/blender/blender --background \
  --python /render-scripts/still_render.py -- \
  --glb_path /tmp/test_prod.glb \
  --output_path /tmp/test_thumb.png

# Check Blender version
docker compose exec render-worker /opt/blender/blender --version | head -1
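
The three manual tests can also be chained so that the first failing stage is reported. The sketch below is meant to run inside the render-worker container and reuses the paths and flags from the commands above; `build_commands` and `run_chain` are hypothetical helpers. Because a Blender process may exit with code 0 even when its Python script failed, the sketch also checks that each expected output file exists.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple

BLENDER = "/opt/blender/blender"
SCRIPTS = "/render-scripts"


def build_commands(step_path: str, workdir: str = "/tmp") -> List[Tuple[List[str], str]]:
    """Return (argv, expected_output) pairs for the three stages, in order."""
    geom = f"{workdir}/test_geom.glb"
    prod = f"{workdir}/test_prod.glb"
    thumb = f"{workdir}/test_thumb.png"
    return [
        (["python3", f"{SCRIPTS}/export_step_to_gltf.py",
          "--step_path", step_path, "--output_path", geom,
          "--linear_deflection", "0.03", "--angular_deflection", "0.05"], geom),
        ([BLENDER, "--background", "--python", f"{SCRIPTS}/export_gltf.py", "--",
          "--glb_path", geom, "--output_path", prod, "--smooth_angle", "30"], prod),
        ([BLENDER, "--background", "--python", f"{SCRIPTS}/still_render.py", "--",
          "--glb_path", prod, "--output_path", thumb], thumb),
    ]


def run_chain(step_path: str) -> int:
    """Run the stages in order; return the index of the first failure, or -1."""
    for index, (argv, output) in enumerate(build_commands(step_path)):
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0 or not Path(output).exists():
            print(result.stderr[-2000:])  # tail of stderr for the failing stage
            return index
    return -1
```

A return value of 0 points at OCC/GMSH tessellation, 1 at the Blender production export, and 2 at the still render.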

Step 5: Re-queue a Single CadFile

docker compose exec backend python -c "
from app.tasks.celery_app import celery_app
celery_app.send_task(
    'app.domains.pipeline.tasks.render_thumbnail.render_step_thumbnail',
    args=['[cad_file_id]'],
    queue='asset_pipeline'
)"
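
Re-queuing has a counterpart in cancellation. The Common Problems table notes that `revoke()` needs the real ID from `render_job_doc.celery_task_id`, not the synthetic `render-{line_id}`. A minimal sketch of that lookup follows; `real_task_id` and `cancel_render` are hypothetical names, and the revoke call is passed in as a callable so the logic can be checked without a running broker.

```python
from typing import Callable, Mapping, Optional


def real_task_id(render_job_doc: Optional[Mapping]) -> Optional[str]:
    """Return the Celery task ID stored in the job document, if any."""
    if not render_job_doc:
        return None
    return render_job_doc.get("celery_task_id") or None


def cancel_render(render_job_doc: Optional[Mapping],
                  revoke: Callable[[str], None]) -> bool:
    """Revoke the task recorded in the job document.

    Returns False when no real task ID is stored, so the caller can
    report that instead of silently revoking a synthetic ID.
    """
    task_id = real_task_id(render_job_doc)
    if task_id is None:
        return False
    revoke(task_id)
    return True
```

With the Celery app from the snippet above, the callable could be `lambda tid: celery_app.control.revoke(tid, terminate=True)`.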

Common Problems and Root Causes

| Symptom | Likely Cause | Fix |
|---|---|---|
| Status failed, no thumbnail | render-worker container crashed or OOM | Check docker compose ps render-worker, restart if stopped |
| No module named 'pxr' | usd-core not installed | docker compose build render-worker |
| No module named 'gmsh' | gmsh not installed | docker compose build render-worker |
| Material not replaced | Material name not in aliases | Add alias in Admin → Materials, or seed aliases |
| GLB viewer shows old file | Cache-bust URL missing ?v=... | Check get_download_url() in media/service.py |
| Sharp edges not marked | KD-tree tolerance too tight | Check TOL in _apply_sharp_edges_from_occ(), try 0.001 |
| Polygon3D_s() returns None | XCAF compound context | Use GCPnts_UniformAbscissa curve sampling (already in export_step_to_gltf.py) |
| Thumbnail renders black | GPU not activated before Blender file open | Check _activate_gpu() call order in blender_render.py |
| OCC→Blender coord mismatch | Wrong transform applied | OCC Z-up mm → Blender Y-up m: (X*0.001, -Z*0.001, Y*0.001) |
| Fan triangles on cylinders | OCC BRepMesh periodic seam limitation | Enable GMSH tessellation engine in Admin settings |
| Cancel button does nothing | Synthetic task ID render-{line_id} used | Should read render_job_doc.celery_task_id for revoke() |
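
For the coordinate-mismatch row, the mapping can be written as a small function to compare against whatever the export script actually applies. The sketch implements the formula exactly as it appears in the table (millimetres to metres, axes swapped with one sign flip); it does not verify that this matches the project's importer settings, and `occ_to_blender` is a hypothetical name.

```python
from typing import Tuple

MM_TO_M = 0.001  # OCC coordinates are in millimetres, Blender's in metres

Point = Tuple[float, float, float]


def occ_to_blender(point_mm: Point) -> Point:
    """Apply (X*0.001, -Z*0.001, Y*0.001) to an OCC point given in mm."""
    x, y, z = point_mm
    return (x * MM_TO_M, -z * MM_TO_M, y * MM_TO_M)
```

A quick sanity check is to pass a point on a single axis, such as `(0.0, 0.0, 1000.0)`, and confirm which output component becomes non-zero and with which sign.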

Root Cause Report Format

Problem: [What was the symptom?]
Root Cause: [What was the actual cause?]
Fix: [What was changed / needs to be changed?]
Prevention: [How to avoid this in the future?]
Pipeline stage: [Which script/task/service was the failure point?]
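
If reports are generated programmatically, the format above maps naturally onto a small data class. This is an illustrative sketch only; `RootCauseReport` is a hypothetical name and the field labels are copied from the template.

```python
from dataclasses import dataclass


@dataclass
class RootCauseReport:
    """Holds the five fields of the root cause report template."""
    problem: str
    root_cause: str
    fix: str
    prevention: str
    pipeline_stage: str

    def render(self) -> str:
        """Return the report as text, one labelled field per line."""
        return "\n".join([
            f"Problem: {self.problem}",
            f"Root Cause: {self.root_cause}",
            f"Fix: {self.fix}",
            f"Prevention: {self.prevention}",
            f"Pipeline stage: {self.pipeline_stage}",
        ])
```

Making every field required means an incomplete report fails at construction time instead of being filed with a blank line.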