security: Unicode-aware prompt-injection guard (#39)

checkPromptInjection now NFKD-normalises its input, strips zero-width / combining chars, and folds common Cyrillic / Greek homoglyphs before matching. Ten documented bypass examples (fullwidth, ZWJ, ZWSP, soft-hyphen, Cyrillic е/о, combining marks, LRM, BOM) are covered by unit tests. The security docs explicitly mark the guard as defense-in-depth; the real boundary is the per-tool requirePermission check.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 08:53:38 +02:00
parent 03030639d7
commit c2d05b4b99
3 changed files with 198 additions and 18 deletions
@@ -24,13 +24,13 @@

 Five-level role hierarchy:

-| Role | Level | Capabilities |
-|------|-------|-------------|
-| ADMIN | 5 | Full system access, user management, system settings |
-| MANAGER | 4 | Project management, resource allocation, vacation approval |
-| CONTROLLER | 3 | Financial views, budget management, reporting |
-| USER | 2 | Self-service (own vacations, own resource profile) |
-| VIEWER | 1 | Read-only access to permitted areas |
+| Role       | Level | Capabilities                                               |
+| ---------- | ----- | ---------------------------------------------------------- |
+| ADMIN      | 5     | Full system access, user management, system settings       |
+| MANAGER    | 4     | Project management, resource allocation, vacation approval |
+| CONTROLLER | 3     | Financial views, budget management, reporting              |
+| USER       | 2     | Self-service (own vacations, own resource profile)         |
+| VIEWER     | 1     | Read-only access to permitted areas                        |

 ### Per-User Permission Overrides

@@ -94,6 +94,27 @@ publicProcedure
  - Size limit (10 MB client-side, 4 MB server-side after compression)
  - Magic byte verification (actual file content matched against declared MIME)

+### Prompt-Injection Guard (defense-in-depth only)
+
+`packages/api/src/lib/prompt-guard.ts` runs a short regex list against every
+free-text user prompt sent to an AI tool (assistant chat + project-cover
+DALL-E prompt). Input is normalised before the regex runs:
+
+1. Unicode NFKD decomposition (collapses fullwidth / compatibility forms and
+   splits diacritics from their base letter).
+2. Strip zero-width / directional / combining code points that attackers use
+   to break contiguous substring matches.
+3. Fold a small set of Cyrillic / Greek homoglyphs to their Latin
+   equivalents.
+
+This guard is **defense-in-depth, not an authorisation boundary**. The actual
+security boundary for AI-initiated actions is the per-tool
+`requirePermission(ctx, PermissionKey.*)` check inside every assistant tool —
+an LLM that has been successfully jailbroken still cannot perform an action
+its caller's role does not allow. Motivated adversaries **will** find prompts
+that defeat the regex layer; its purpose is to raise the cost of casual
+injection attempts and to surface them as audit-log entries.
+
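The three normalisation steps described in the Prompt-Injection Guard section could look roughly like the sketch below. It is not the contents of `prompt-guard.ts`, which this diff does not show: the names `normalisePrompt` and `looksLikeInjection`, the homoglyph table, the stripped code-point ranges, and the single regex are all assumptions chosen for illustration.

```typescript
// Hypothetical homoglyph table: a few Cyrillic / Greek letters that resemble Latin ones.
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a", // Cyrillic a
  "\u0435": "e", // Cyrillic ie
  "\u043E": "o", // Cyrillic o
  "\u0440": "p", // Cyrillic er
  "\u0441": "c", // Cyrillic es
  "\u03B1": "a", // Greek alpha
  "\u03BF": "o", // Greek omicron
};

// Assumed strip set: soft hyphen, zero-width and directional marks, word joiner,
// BOM, and the basic combining-diacritics block.
const STRIP = /[\u00AD\u200B-\u200F\u202A-\u202E\u2060\uFEFF\u0300-\u036F]/g;

// Any Greek or Cyrillic letter; only those present in HOMOGLYPHS are folded.
const GREEK_OR_CYRILLIC = /[\u0370-\u03FF\u0400-\u04FF]/g;

function normalisePrompt(input: string): string {
  // Step 1: NFKD collapses fullwidth / compatibility forms and splits diacritics.
  const decomposed = input.normalize("NFKD");
  // Step 2: remove code points that break contiguous substring matches.
  const stripped = decomposed.replace(STRIP, "");
  // Step 3: fold known homoglyphs to Latin; leave everything else unchanged.
  return stripped.replace(GREEK_OR_CYRILLIC, (ch) => HOMOGLYPHS[ch] ?? ch);
}

// Illustrative pattern only; the real regex list is not part of this diff.
const PATTERNS: RegExp[] = [/ignore\s+(all\s+)?previous\s+instructions/i];

function looksLikeInjection(prompt: string): boolean {
  const normalised = normalisePrompt(prompt);
  return PATTERNS.some((re) => re.test(normalised));
}
```

With these assumptions, `looksLikeInjection("ign\u200Bor\u0435 previous instructions")` matches even though the raw string contains a zero-width space and a Cyrillic letter. A hit should only be logged or rejected as a hint, since the permission check remains the actual boundary.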
 ## 6. Audit Logging

 ### Activity History System
@@ -118,15 +139,15 @@ publicProcedure

 Configured in `next.config.ts`:

-| Header | Value |
-|--------|-------|
+| Header                    | Value                                          |
+| ------------------------- | ---------------------------------------------- |
 | Strict-Transport-Security | `max-age=63072000; includeSubDomains; preload` |
-| Content-Security-Policy | Restrictive CSP with nonce-based script-src |
-| X-Frame-Options | `DENY` |
-| X-Content-Type-Options | `nosniff` |
-| X-XSS-Protection | `1; mode=block` |
-| Referrer-Policy | `strict-origin-when-cross-origin` |
-| Permissions-Policy | Camera, microphone, geolocation disabled |
+| Content-Security-Policy   | Restrictive CSP with nonce-based script-src    |
+| X-Frame-Options           | `DENY`                                         |
+| X-Content-Type-Options    | `nosniff`                                      |
+| X-XSS-Protection          | `1; mode=block`                                |
+| Referrer-Policy           | `strict-origin-when-cross-origin`              |
+| Permissions-Policy        | Camera, microphone, geolocation disabled       |
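
As a rough illustration of how the fixed-value headers in this table might be declared, the sketch below uses the Next.js `headers()` config hook. The `/:path*` source pattern and the exact `Permissions-Policy` string are assumptions, and the Content-Security-Policy is omitted because its nonce-based value is not given here.

```typescript
type Header = { key: string; value: string };

// Values copied from the table above, except Permissions-Policy (assumed syntax).
const securityHeaders: Header[] = [
  { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains; preload" },
  { key: "X-Frame-Options", value: "DENY" },
  { key: "X-Content-Type-Options", value: "nosniff" },
  { key: "X-XSS-Protection", value: "1; mode=block" },
  { key: "Referrer-Policy", value: "strict-origin-when-cross-origin" },
  // Assumption: disabling a feature is written as an empty allowlist.
  { key: "Permissions-Policy", value: "camera=(), microphone=(), geolocation=()" },
];

// In next.config.ts this object would be the default export.
const nextConfig = {
  async headers() {
    // Apply the same header set to every route (assumed source pattern).
    return [{ source: "/:path*", headers: securityHeaders }];
  },
};
```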

 ## 8. Rate Limiting