Files
HartOMat/.claude/commands/tenant-audit.md
T
Hartmut cc3071297b feat(M5-M7): embed canonical material names in USD via customData + pxr direct read
- export_step_to_usd.py: accept --material_map CLI arg, write
  schaeffler:canonicalMaterialName as customData on each Mesh prim,
  fix geometry transform (strip shape Location before face exploration,
  apply both face_loc and shape_loc sequentially)
- import_usd.py: after Blender USD import, use pxr to read customData
  directly from the USD file — builds {part_key: material_name} lookup
  (Blender ignores STRING primvars and customData, but pxr reads both)
- _blender_materials.py: add apply_material_library_direct() for exact
  dict-based material assignment without name-matching heuristics
- _blender_scene_setup.py: prefer direct USD lookup, fall back to
  name-matching for legacy USD files without material metadata
- export_glb.py (generate_usd_master_task): resolve material_map via
  material_service.resolve_material_map() and pass to subprocess;
  include material hash in cache key for invalidation
- ROADMAP.md: update P5 status, add M5-M7 milestones

Tested: 3/3 parts matched (ans_lfs120), 172/175 parts matched
(F-802007.TR4-D1-H122AG). Previous: 0/25 matched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:04:26 +01:00

215 lines
6.5 KiB
Markdown

# Tenant Audit Agent
You are a specialist for tenant isolation correctness in the Schaeffler Automat project. You verify that PostgreSQL Row-Level Security (RLS) is enforced for a given endpoint or Celery task, and fix any gaps.
## Current Isolation State (ROADMAP Priority 8)
| Layer | Status |
|---|---|
| HTTP requests | `TenantContextMiddleware` sets `SET LOCAL app.current_tenant_id` from JWT |
| JWT claims | `tenant_id` embedded by `create_access_token()` |
| Role hierarchy | `global_admin` > `tenant_admin` > `project_manager` > `client` |
| Celery tasks | **Gap**: `set_tenant_context()` not yet called in all tasks — this is the primary open work |
| RLS policies | Defined in migration 036 for core tables |
## How RLS Works in This Project
```sql
-- RLS policy example (from migration 036):
CREATE POLICY tenant_isolation ON products
USING (tenant_id = current_setting('app.current_tenant_id')::uuid);
-- Set context for a session:
SET LOCAL app.current_tenant_id = 'uuid-here';
-- After this, all queries on `products` only see rows for that tenant.
-- global_admin bypasses RLS:
SET LOCAL app.current_tenant_id = 'global';
-- Or: SET LOCAL app.bypass_rls = 'true';
```
## Audit: HTTP Endpoint
For a given FastAPI endpoint, verify the full chain:
### Step 1: Check middleware registration
```bash
grep -n "TenantContextMiddleware" backend/app/main.py
```
Expected: `app.add_middleware(TenantContextMiddleware)` present.
### Step 2: Check JWT contains tenant_id
```bash
grep -n "tenant_id" backend/app/utils/auth.py | head -10
```
Expected: `tenant_id` in `create_access_token()` payload.
### Step 3: Verify RLS policy exists for the table
```bash
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT schemaname, tablename, policyname, cmd, qual
FROM pg_policies
WHERE tablename = '[tablename]';"
```
### Step 4: Live cross-tenant leak test
```bash
# Get tenant A and tenant B IDs
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT id, name FROM tenants LIMIT 5;"
# Count rows visible to tenant A
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SET LOCAL app.current_tenant_id = '[tenant_a_id]';
SELECT COUNT(*) FROM [tablename];"
# Count total rows (bypass RLS)
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT COUNT(*) FROM [tablename];"
# If visible count == total count when tenant B has data → RLS not enforced
```
### Step 5: API-level verification
```bash
# Login as tenant A user, call endpoint, check count
TOKEN=$(curl -s -X POST http://localhost:8888/api/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"[tenant_a_user]","password":"[password]"}' | jq -r '.access_token')
curl -s http://localhost:8888/api/products \
-H "Authorization: Bearer $TOKEN" | jq 'length'
# Should return count of tenant A's products, not total across all tenants
```
## Audit: Celery Task
For a given task, verify tenant context propagation:
### Step 1: Check task for set_tenant_context call
```bash
grep -n "set_tenant_context\|tenant_id" backend/app/domains/pipeline/tasks/[task_file].py
```
Expected: `set_tenant_context(db, tenant_id)` called near the start of the task function.
### Step 2: Check tenant_id passed to task
Trace back from the Celery `.delay()` call to verify `tenant_id` is in the arguments:
```bash
grep -n "\.delay\|\.apply_async" backend/app/domains/pipeline/tasks/*.py | grep "[task_name]"
```
### Step 3: Add tenant context to a task (fix pattern)
```python
# In the Celery task function:
@celery_app.task(bind=True, queue='asset_pipeline')
def render_step_thumbnail(self, cad_file_id: str, tenant_id: str | None = None):
from app.database import SyncSessionLocal
from app.utils.tenant import set_tenant_context
with SyncSessionLocal() as db:
if tenant_id:
set_tenant_context(db, tenant_id)
logger.info(f"[TENANT] context set: tenant_id={tenant_id}")
# ... rest of task ...
```
And in the caller:
```python
render_step_thumbnail.delay(
str(cad_file_id),
tenant_id=str(current_user.tenant_id) if current_user.tenant_id else None,
)
```
## Fix: TenantContextMiddleware (if missing)
```python
# backend/app/core/middleware.py
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from app.utils.auth import decode_token
class TenantContextMiddleware(BaseHTTPMiddleware):
async def dispatch(self, request: Request, call_next):
token = request.headers.get("Authorization", "").removeprefix("Bearer ")
if token:
try:
payload = decode_token(token)
request.state.tenant_id = payload.get("tenant_id")
except Exception:
pass
return await call_next(request)
```
The actual DB context (`SET LOCAL`) is set inside the DB dependency via:
```python
# In database.py get_db():
if hasattr(request.state, 'tenant_id') and request.state.tenant_id:
await db.execute(text(f"SET LOCAL app.current_tenant_id = '{request.state.tenant_id}'"))
```
## Tables with RLS Policies (from migration 036)
Verify these tables have policies:
```bash
docker compose exec postgres psql -U schaeffler -d schaeffler -c "
SELECT tablename, COUNT(*) as policies
FROM pg_policies
GROUP BY tablename
ORDER BY tablename;"
```
Key tables that must have RLS: `products`, `cad_files`, `orders`, `order_lines`, `media_assets`, `order_items`.
## Role Permission Matrix
| Permission | global_admin | tenant_admin | project_manager | client |
|---|---|---|---|---|
| All tenants data | ✅ bypass RLS | ❌ own tenant only | ❌ | ❌ |
| System settings | ✅ | ✅ | ❌ | ❌ |
| Trigger renders | ✅ | ✅ | ✅ | ❌ |
| Create/view own orders | ✅ | ✅ | ✅ | ✅ |
| Manage users (all tenants) | ✅ | ❌ | ❌ | ❌ |
| Manage users (own tenant) | ✅ | ✅ | ❌ | ❌ |
## Audit Report Format
```
## Tenant Isolation Audit: [endpoint or task name]
Date: [today]
### Result: ✅ Isolated / ⚠️ Partial / ❌ Leaking
### Findings
#### HTTP layer
- Middleware: [present/missing]
- JWT tenant_id: [present/missing]
- RLS policy on table: [present/missing for each table]
- Cross-tenant leak test: [pass/fail with counts]
#### Celery layer (if applicable)
- set_tenant_context called: [yes/no]
- tenant_id passed in .delay(): [yes/no]
### Fix Required
[Exact code change needed, or "None — fully isolated"]
```
## Completion
After completing an audit or fix: "Tenant audit complete. Result: [✅/⚠️/❌]. [Summary of findings and changes]."