HartOMat

Author	SHA1	Message	Date
Hartmut	3e810c74a3	chore: snapshot workflow migration progress	2026-04-12 11:49:04 +02:00
Hartmut	f13cb489c1	fix: migrate runtime data to native hartomat storage	2026-04-06 18:09:51 +02:00
Hartmut	8990b80abf	fix: restore existing runtime data volumes	2026-04-06 18:01:15 +02:00
Hartmut	448996b546	fix: stabilize HartOMat runtime startup	2026-04-06 13:10:51 +02:00
Hartmut	b795f0e6d6	refactor: rebrand project to HartOMat	2026-04-06 12:45:47 +02:00
Hartmut	daad2c64f3	fix: revert dual queue to single GPU — light worker caused 2x regression Root cause: render-worker and render-worker-light shared the same GPU, causing contention. Complex TRB renders went from 17s → 36s (2x slower). Changes: - Thumbnails back to asset_pipeline queue (not asset_pipeline_light) - Dispatch routing always uses asset_pipeline (no queue splitting) - render-worker-light gated behind "multi-gpu" profile — only starts with: docker compose --profile multi-gpu up -d - For single-GPU setups: all rendering is sequential on one worker The dual queue approach is correct for multi-GPU machines where each worker gets its own GPU. On single-GPU, serial execution is faster than concurrent GPU contention. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-15 12:33:26 +01:00
Hartmut	5a148554c0	perf: dual queue, GLB caching, WebP output, persistent BVH Task 4: Dual render queue - render-worker: heavy (asset_pipeline, concurrency=1) — HQ 2048x2048, animations - render-worker-light: light (asset_pipeline_light, concurrency=2) — thumbnails, <=1024 - Thumbnails routed to light queue automatically - Order line renders routed by resolution at dispatch time Task 5: GLB caching (skip re-tessellation) - Before tessellating, check if gltf_geometry MediaAsset exists for the cad_file_id - If found, copy to expected path — render_blender.py finds it and skips tessellation - Saves 7-11s per re-render of the same product Task 6: WebP output format - New 'webp' option in output_format (OutputType admin) - Blender renders PNG intermediate, Pillow converts to WebP (quality=90, method=4) - 50-70% smaller files with no visible quality loss - Correct MIME type (image/webp) in MediaAsset Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-15 12:07:12 +01:00
Hartmut	8c65c52271	fix: persist OptiX kernel cache — 9x faster first render after restart The OptiX cache was mounted at /root/.nv but NVIDIA writes to /var/tmp/OptixCache_root/optix7cache.db (28MB). Fixed volume mount. Before: first render after container restart = 181s (OptiX recompilation) After: first render after container restart = 20s (cached kernels) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-15 09:57:11 +01:00
Hartmut	1321ef2bd4	refactor: rename thumbnail_rendering queue to asset_pipeline The queue handles far more than thumbnails: OCC tessellation, USD master generation, GLB production, order line renders, and workflow renders. asset_pipeline better reflects its role as the render-worker's primary queue. Updated all references in: task decorators, celery_app.py, beat_tasks.py, docker-compose.yml worker command, worker.py MONITORED_QUEUES, admin.py, CLAUDE.md, LEARNINGS.md, Dockerfile, helpTexts.ts, test files, and all .claude/commands/*.md skill files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 22:28:38 +01:00
Hartmut	ac48d359e6	fix(render): persist OptiX BVH cache across render-worker rebuilds Mount named volume optix-cache:/root/.nv so the OptiX ComputeCache survives docker compose rebuild. Without this every rebuild wiped the BVH acceleration structure, causing the first render of any complex scene (~175 parts) to take 130–150s instead of 22s while OptiX recompiles kernels and rebuilds the BVH from scratch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 21:15:45 +01:00
Hartmut	34f89cc225	feat(gpu): GPU health check + RENDER_DEVICE_USED token + strict mode - gpu_probe.py: Blender script that probes OPTIX/CUDA/HIP/ONEAPI and exits 1 on no GPU — used at startup + on-demand from Admin UI - blender_render.py, still_render.py, turntable_render.py: emit RENDER_DEVICE_USED: engine=CYCLES device=GPU|CPU compute_type=... after GPU activation; exit 2 when CYCLES_DEVICE=gpu and CPU fallback - render_blender.py: parse RENDER_DEVICE_USED token into render_log (device_used, compute_type, gpu_fallback); handle exit code 2 as explicit GPU strict-mode failure - check_version.py: check_gpu() runs gpu_probe.py at container startup; CYCLES_DEVICE=gpu aborts startup if no GPU found - docker-compose.yml: CYCLES_DEVICE=${CYCLES_DEVICE:-auto} env var - gpu_tasks.py: probe_gpu Celery task on thumbnail_rendering queue; saves result to system_settings.gpu_probe_last_result; beat every 30min - worker.py: POST /probe/gpu (trigger) + GET /probe/gpu/result (last result) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 20:57:36 +01:00
Hartmut	07e3d1e026	feat(phase8.1-8.2): dynamic worker concurrency via worker_configs - Migration 054: worker_configs table (queue_name PK, max/min_concurrency, enabled, updated_at); seeds step_processing(8/2), thumbnail_rendering(1/1), ai_validation(4/1) - WorkerConfig SQLAlchemy model - apply_worker_concurrency beat task: reads enabled configs, broadcasts pool_grow to all Celery workers every 5min - GET/PUT /api/worker/configs (admin): list + update per-queue concurrency - docker-compose.yml: worker uses --autoscale=${MAX_CONCURRENCY:-8},${MIN_CONCURRENCY:-2}; render-worker uses --autoscale=1,1 --concurrency=1 - WorkerManagement.tsx: "Concurrency Settings" section with +/- steppers and Save button per queue Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 20:41:57 +01:00
Hartmut	a70cb55d01	feat(N): workflow pipeline, 3D viewer, worker management, QC tests - workflow_builder.py: fix broken stubs, add render_order_line_still_task (resolves step_path from DB instead of passing order_line_id as step_path) - domains/rendering/tasks.py: add render_order_line_still_task, export_gltf_for_order_line_task, export_blend_for_order_line_task, generate_gltf_geometry_task (trimesh STL→GLB, no Blender needed) - tasks/step_tasks.py: add generate_gltf_geometry_task for CadFile GLB export - cad router: POST /{id}/generate-gltf-geometry endpoint (admin/PM) - worker router: GET /celery-workers + POST /scale (docker compose subprocess) - Dockerfile: pip install -e "[dev]" to enable pytest - docker-compose.yml: docker socket + compose file mount on backend - ThreeDViewer.tsx: mode toggle (geometry/production), wireframe, env presets, download buttons (GLB + .blend) - CadPreview.tsx: load gltf_geometry/gltf_production/blend_production assets from MediaAsset table and pass URLs to ThreeDViewer - ProductDetail.tsx: "View 3D" button → /cad/:id, "Generate GLB" button - media router/service: cad_file_id filter on GET /api/media - WorkerManagement.tsx: new page with worker status, queue depth, scale controls - App.tsx + Layout.tsx: /workers route + sidebar link (admin/PM) - tests: test_rendering_service.py, test_orders_service.py (backend) - tests: WorkerActivity.test.tsx, WorkerManagement.test.tsx (frontend) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 22:56:53 +01:00
Hartmut	ab3f9c734a	fix: render pipeline + multi-tenancy bugs (B-Fix-1 through B-Fix-9) - Remove worker-thumbnail (no Blender, was competing on thumbnail_rendering) - Move render_order_line_task to thumbnail_rendering queue (render-worker) - Restore template_service.py real implementation (fix circular import shim) - Thread tenant_id through STEP upload, Excel import, product create - Make system tables (output_types, materials, etc.) tenant_id nullable - Fix tenants frontend 307-redirect: use trailing slash /tenants/ - Remove Flamenco + Three.js from Admin UI (unsupported) - Set all output_types render_backend to celery (was flamenco) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 19:34:20 +01:00
Hartmut	f19a6ccde8	feat(F-G-H-I): STL cache, invoices, import validation, notification settings Phase F — STL Hash Cache: - Migration 041: step_file_hash column on cad_files - cache_service.py: SHA256 hash + MinIO-backed STL cache (check/store) - render_step_thumbnail: compute+persist hash before render - generate_stl_cache: check MinIO cache before cadquery conversion, store after Phase G — Invoices: - Migration 042: invoices + invoice_lines tables with RLS - Invoice/InvoiceLine models + schemas - billing service: generate_invoice_number (INV-YYYY-NNNN), create/list/get/delete/PDF - WeasyPrint PDF generation; backend Dockerfile + pyproject.toml deps - invoice_router with 6 endpoints; registered in main.py - frontend: Billing.tsx page + api/billing.ts; route + nav link Phase H — Import Sanity Check: - Migration 043: import_validations table - ImportValidation model + schemas - run_sanity_check: material fuzzy-match (cutoff=0.8), STEP availability, duplicate detection - validate_excel_import Celery task (queue: step_processing) - uploads.py: create ImportValidation on /excel, fire task, expose GET /validations/{id} - frontend: Upload.tsx polling ValidationDialog with traffic-light status indicators Phase I — Notification Settings: - Migration 044: notification_configs table (user×event×channel toggles) - NotificationConfig model + seeds (in_app=true, email=false) - get/upsert/reset config endpoints on /notifications/config - frontend: NotificationSettings.tsx page + api/notifications.ts extensions Infrastructure: - docker-compose.yml: add worker-thumbnail service (concurrency=1, Q=thumbnail_rendering) - Fix Dockerfile: libgdk-pixbuf-2.0-0 (correct Debian bookworm package name) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 18:05:01 +01:00
Hartmut	7706c514c8	fix(deploy): fix render-worker build context + migration 040 idempotency - docker-compose.yml: change render-worker build context from ./render-worker to . (project root) so pyproject.toml is accessible; update dockerfile path - render-worker/Dockerfile: update COPY paths for new build context; install Python 3.11 via deadsnakes PPA (Ubuntu 22.04 ships 3.10 which fails the >=3.11 requirement in pyproject.toml) - 040_media_assets.py: rewrite upgrade() with raw idempotent SQL (CREATE TYPE inside DO $$ EXCEPTION WHEN duplicate_object $$; CREATE TABLE IF NOT EXISTS; CREATE INDEX IF NOT EXISTS) to handle pre-existing enum from partial runs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 17:32:42 +01:00
Hartmut	c728358fb6	feat(A4): add MinIO service + storage abstraction (core/storage.py) - Add MinIO service to docker-compose.yml (port 9000 API, 9001 console) - Add minio-data volume for persistent object storage - Create backend/app/core/storage.py: MinIOStorage + LocalStorage abstraction - MinIOStorage: boto3-based, auto-creates bucket, upload/download/exists/delete/presign - LocalStorage: fallback for dev (UPLOAD_DIR filesystem, backward compat) - get_storage() singleton: auto-selects based on MINIO_URL env var - Add MINIO_URL/USER/PASSWORD/BUCKET env vars to all service definitions - backend/pyproject.toml: docker>=6.1.0 → boto3>=1.34.0 - Add docker-compose.worker.yml: external render-worker for remote machines - Fix .gitignore: 'core' rule was too broad, now only matches root /core dump - Update .env.example: MinIO connection vars documented Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 15:51:06 +01:00
Hartmut	9d1a820295	refactor(A2): replace blender-renderer HTTP service with render-worker Celery container - Create render-worker/ with Dockerfile (Ubuntu + cadquery + Blender via host mount) - Add render-worker/check_version.py: verifies Blender >= 5.0.1 at startup, Exit 1 on failure - Add render-worker/scripts/: blender_render.py, still_render.py, turntable_render.py - Create backend/app/services/render_blender.py: direct subprocess rendering - convert_step_to_stl() and export_per_part_stls() using cadquery - render_still(): STEP → STL → PNG via Blender subprocess - is_blender_available(): detects BLENDER_BIN env for render-worker context - Create backend/app/domains/rendering/tasks.py: render_still_task + render_turntable_task - Update step_processor.py: use subprocess path when BLENDER_BIN env is set (render-worker) - Update step_tasks.py: generate_stl_cache uses direct cadquery instead of HTTP - Remove blender-renderer and threejs-renderer from docker-compose.yml - Replace worker-thumbnail with render-worker (Ubuntu + cadquery + Blender mount) - Remove Docker SDK from backend Dockerfile (was only for flamenco scaling) - Update .env.example: BLENDER_VERSION=5.0.1 documented - Update celery_app.py: include domains.rendering.tasks in autodiscover Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 15:48:46 +01:00
Hartmut	1d6864fb64	refactor(A1): remove Flamenco, simplify render pipeline to Celery-only - Remove flamenco-manager and flamenco-worker from docker-compose.yml - Delete flamenco_client.py, flamenco_tasks.py, docker_scaler.py - Simplify render_dispatcher.py to Celery-only (removes ~300 lines) - Remove Flamenco beat schedule from celery_app.py - Clean admin.py: remove flamenco settings, endpoints, threejs validation - Clean orders.py cancel-render: Celery revoke only - Clean worker.py: remove flamenco_job_id from activity response - Migration 032: cancel lingering flamenco jobs, remove flamenco settings - PLAN.md: mark all decisions confirmed, status IN UMSETZUNG (in progress) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 15:38:37 +01:00
Hartmut	bce762a783	feat: initial commit	2026-03-05 22:12:38 +01:00

20 Commits
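
Note on commit 34f89cc225: the message describes render scripts emitting a `RENDER_DEVICE_USED: engine=CYCLES device=GPU|CPU compute_type=...` line, which `render_blender.py` parses into `render_log` (`device_used`, `compute_type`, `gpu_fallback`), with exit code 2 signalling a strict-mode failure. The repository code is not shown in this log, so the following is only a minimal sketch of how such a token might be parsed. The function name `parse_render_device`, the regular expression, and the rule used to set `gpu_fallback` are assumptions for illustration, not the project's implementation.

```python
import re
from typing import Optional

# Exit code named in the commit message for "GPU requested, CPU used".
GPU_STRICT_EXIT_CODE = 2

# Assumed token layout: a single line of space-separated key=value pairs.
_TOKEN_RE = re.compile(r"^RENDER_DEVICE_USED:\s*(?P<fields>.+)$", re.MULTILINE)


def parse_render_device(stdout: str, requested_device: str = "auto") -> Optional[dict]:
    """Extract device info from Blender stdout; return None if no token is present."""
    match = _TOKEN_RE.search(stdout)
    if match is None:
        return None
    # Split "engine=CYCLES device=GPU ..." into a dict, ignoring malformed pairs.
    fields = dict(
        pair.split("=", 1) for pair in match.group("fields").split() if "=" in pair
    )
    device = fields.get("device", "").upper()
    return {
        "device_used": device,
        "compute_type": fields.get("compute_type"),
        # Assumption: "fallback" means a GPU was requested but the CPU rendered.
        "gpu_fallback": requested_device.lower() == "gpu" and device == "CPU",
    }
```

A caller could combine the parsed result with the subprocess return code, treating `GPU_STRICT_EXIT_CODE` as an explicit failure rather than a generic render error.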