feat: render health endpoint + test script + pipeline fixes

- GET /api/worker/health/render: checks render-worker (thumbnail_rendering queue), Blender availability via active_queues inspect, queue depth, last render recency; returns ok/degraded/down status
- scripts/test_render_pipeline.py: integration test for full pipeline (--health, --sample, --full modes)
- PLAN.md: appended Render Pipeline Fixes section with all B-Fixes
- LEARNINGS.md: documented 5 new learnings (queue mismatch, circular import, 307 redirect, worker capability detection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -1,7 +1,7 @@
# Refactor Plan: Schaeffler Automat v2

**Created:** 2026-03-05
-**Updated:** 2026-03-06, phases A, B, C, D, E completed
+**Updated:** 2026-03-06, phases A, B, C, D, E completed + render pipeline fixes
**Status:** IN PROGRESS, phase F is next
**Branch:** `refactor/render-pipeline` → target: new branch `refactor/v2`

@@ -1405,3 +1405,51 @@ curl http://localhost:8888/api/media?product_id={product_id} | jq length # →
- [x] Start of phase A confirmed
- [x] Git tag `v1-stable` created on main
- [x] Git branch `refactor/v2` created

---

## Render Pipeline Fixes (2026-03-06)

### Context

After multi-tenancy was enabled (migrations 035/036), several bugs blocked the entire render pipeline. All of them have been fixed.

### Fixes Applied

| Fix | Problem | Solution | File |
|---|---|---|---|
| B-Fix-1 | `worker-thumbnail` (no Blender) competed on `thumbnail_rendering` → 50% silent failures | Removed `worker-thumbnail` from docker-compose.yml | `docker-compose.yml` |
| B-Fix-2 | `render_order_line_task` ran on the `step_processing` queue → `worker` without Blender → Pillow fallback | Changed queue to `thumbnail_rendering` | `step_tasks.py:247` |
| B-Fix-3 | Circular import `template_service.py` ↔ `domains/rendering/service.py` → `resolve_template()` could never be called | Restored the full sync SQLAlchemy implementation in `template_service.py` | `services/template_service.py` |
| B-Fix-4 | `audit_log.tenant_id NOT NULL` → broadcast notifications failed → order submit returned 500 | `ALTER TABLE audit_log ALTER COLUMN tenant_id DROP NOT NULL` | Directly in DB |
| B-Fix-5 | Shared system tables (`output_types`, `materials`, etc.) had `tenant_id NOT NULL` → create endpoints failed | `tenant_id DROP NOT NULL` for all system tables | Directly in DB |
| B-Fix-6 | STEP upload + Excel import set `tenant_id=NULL` | Passed `user.tenant_id` through all create paths | `uploads.py`, `excel_import.py`, `products/service.py` |
| B-Fix-7 | `GET /api/tenants` → 307 redirect → axios loses the Authorization header → 401 → empty tenant list | Trailing slash in API call: `/tenants/` | `frontend/src/api/tenants.ts` |
| B-Fix-8 | Admin UI still showed Flamenco + Three.js options | Removed Flamenco section + Three.js picker | `Admin.tsx`, `OutputTypeTable.tsx` |
| B-Fix-9 | 5 output types were still on `render_backend='flamenco'` | `UPDATE output_types SET render_backend='celery'` | Directly in DB |

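B-Fix-1 came down to a worker without Blender consuming the render queue. The following is a minimal sketch of how such a mismatch could be detected from the mapping that Celery's `inspect().active_queues()` returns (worker name → list of queue dicts with a `name` key). The worker hostnames, the `allowed_prefix` parameter, and both helper names are hypothetical; nothing here is taken from the project's actual code.

```python
from typing import Mapping, Optional, Sequence

RENDER_QUEUE = "thumbnail_rendering"

# Shape of Celery's inspect().active_queues() result, or None if no worker replied.
ActiveQueues = Optional[Mapping[str, Sequence[Mapping[str, object]]]]


def consumers_of(active_queues: ActiveQueues, queue: str) -> list:
    """Return the sorted names of all workers that consume `queue`."""
    if not active_queues:
        return []
    return sorted(
        worker
        for worker, queues in active_queues.items()
        if any(q.get("name") == queue for q in queues)
    )


def unexpected_consumers(
    active_queues: ActiveQueues,
    queue: str = RENDER_QUEUE,
    allowed_prefix: str = "celery@render-worker",  # hypothetical hostname
) -> list:
    """Return workers on `queue` whose name does not start with `allowed_prefix`."""
    return [
        worker
        for worker in consumers_of(active_queues, queue)
        if not worker.startswith(allowed_prefix)
    ]


# Hypothetical snapshot resembling the situation before B-Fix-1.
sample = {
    "celery@render-worker": [{"name": "thumbnail_rendering"}],
    "celery@worker-thumbnail": [{"name": "thumbnail_rendering"}],
    "celery@worker": [{"name": "step_processing"}, {"name": "ai_validation"}],
}
```

With this snapshot, `unexpected_consumers(sample)` flags `celery@worker-thumbnail`, the worker that B-Fix-1 removed.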
### New Testing Infrastructure (DONE)

**`GET /api/worker/health/render`** (render health endpoint) checks:
- Render worker connected (Celery inspect)
- Blender reachable (HTTP GET blender-renderer:8100/health)
- `thumbnail_rendering` queue depth < 10
- Last render less than 30 min old and successful
- Response: `{ status: "ok"|"degraded"|"down", render_worker_connected, blender_available, thumbnail_queue_depth, last_render_at, ... }`

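One way the individual checks could be folded into the `status` field is sketched below. The thresholds (queue depth < 10, last render < 30 min) come from the list above; which failed check maps to `"down"` versus `"degraded"` is not stated in the plan, so that mapping, the function name, and the `last_render_ok` parameter are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_QUEUE_DEPTH = 10                     # threshold from the plan
MAX_RENDER_AGE = timedelta(minutes=30)   # threshold from the plan


def derive_status(
    render_worker_connected: bool,
    blender_available: bool,
    thumbnail_queue_depth: int,
    last_render_at: Optional[datetime],
    last_render_ok: bool,
    now: Optional[datetime] = None,
) -> str:
    """Combine the four checks into "ok", "degraded", or "down".

    Assumed mapping (not confirmed by the source):
    - no render worker or no Blender -> "down" (nothing can render)
    - deep queue or stale/failed last render -> "degraded"
    """
    now = now or datetime.now(timezone.utc)
    if not render_worker_connected or not blender_available:
        return "down"
    queue_ok = thumbnail_queue_depth < MAX_QUEUE_DEPTH
    recent_ok = (
        last_render_at is not None
        and last_render_ok
        and now - last_render_at < MAX_RENDER_AGE
    )
    return "ok" if queue_ok and recent_ok else "degraded"
```

Keeping the decision in a pure function like this makes the thresholds testable without a running broker or Blender container.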
**`scripts/test_render_pipeline.py`** (integration test script):

```bash
python scripts/test_render_pipeline.py --health   # Health check only
python scripts/test_render_pipeline.py --sample   # 1 STEP + 1 output type (fast)
python scripts/test_render_pipeline.py --full     # All output types (slow)
```

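The three modes could be exposed through `argparse` roughly as follows. This is only a sketch of the command-line surface: the diff does not show the script's implementation, so the `build_parser` helper, the `mode` attribute, and the choice to require exactly one flag are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with three mutually exclusive mode flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="test_render_pipeline.py",
        description="Integration test for the render pipeline.",
    )
    # Assumption: exactly one mode must be given per run.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument(
        "--health", dest="mode", action="store_const", const="health",
        help="health check only",
    )
    group.add_argument(
        "--sample", dest="mode", action="store_const", const="sample",
        help="1 STEP file + 1 output type (fast)",
    )
    group.add_argument(
        "--full", dest="mode", action="store_const", const="full",
        help="all output types (slow)",
    )
    return parser


def parse_mode(argv: list) -> str:
    """Return the selected mode name for the given argument list."""
    return build_parser().parse_args(argv).mode
```

Storing all three flags in one `mode` attribute lets the caller dispatch on a single string instead of checking three booleans.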
### Celery Queue Architecture (after fixes)

| Queue | Worker | Concurrency | Tasks |
|---|---|---|---|
| `step_processing` | `worker` | 8 | `process_step_file`, `dispatch_order_line_render` |
| `thumbnail_rendering` | `render-worker` (Blender 5.0.1) | 1 | `render_step_thumbnail`, `regenerate_thumbnail`, `render_order_line_task`, `generate_stl_cache` |
| `ai_validation` | `worker` | 8 | Azure AI validation |

**Key principle**: Everything that calls Blender → `thumbnail_rendering` queue → only `render-worker` → no timeouts caused by parallel requests.

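The queue table can be expressed as a routing map in the shape that Celery's `task_routes` setting accepts (task name → `{"queue": ...}`). The sketch below uses the bare task names from the table; in a real Celery app the registered names are usually fully qualified module paths, and the project may instead set the queue per task (as the `step_tasks.py:247` reference suggests), so treat the names and both helpers as illustrative assumptions.

```python
# Tasks that invoke Blender; per the key principle they must go to the render queue.
BLENDER_TASKS = (
    "render_step_thumbnail",
    "regenerate_thumbnail",
    "render_order_line_task",
    "generate_stl_cache",
)

# Tasks that do not need Blender and can run on the general worker.
STEP_TASKS = (
    "process_step_file",
    "dispatch_order_line_render",
)


def build_task_routes() -> dict:
    """Return a mapping usable as Celery's `task_routes` setting."""
    routes = {name: {"queue": "thumbnail_rendering"} for name in BLENDER_TASKS}
    routes.update({name: {"queue": "step_processing"} for name in STEP_TASKS})
    return routes


def queue_for(task_name: str, routes: dict, default: str = "celery") -> str:
    """Look up the queue for a task; "celery" is Celery's default queue name."""
    return routes.get(task_name, {}).get("queue", default)
```

Centralizing the mapping makes a regression like B-Fix-2 (a Blender task landing on `step_processing`) easy to assert against in a unit test.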