# Dispo Import Implementation **Date:** 2026-03-14 **Purpose:** Canonical implementation document for replacing the current CapaKraken planning dataset with a clean-slate import from the Dispo v2 Excel workbooks. ## Scope This document defines how CapaKraken should ingest and normalize the following source workbooks: - `/samples/Dispov2/MandatoryDispoCategories_V3.xlsx` - `/samples/Dispov2/DISPO_2026.xlsx` - `/samples/Dispov2/20260309_Bi-Weekly_Chargeability_Reporting_Content_Production_V0.943_4Hartmut.xlsx` - `/samples/Dispov2/MV_DispoRoster.xlsx` - `/samples/Dispov2/Resource Roster_MASTER_FY26_CJ_20251201.xlsx` The goal is not a raw workbook archive. The goal is a normalized CapaKraken dataset that: - wipes existing database data and starts from a clean baseline - imports canonical reference data first - stages operational Dispo data before commit - creates real projects, assignments, vacations, availability rules, and reporting inputs - does not create fake bookings for unassigned time - handles public holidays through the vacation planner ## Agreed Business Rules These rules are fixed unless superseded by a later decision record: - There is only one canonical person identifier. `EID` and `Enterprise ID` are the same source identity. - Existing database data should be wiped completely before the new import. - `[BMW]` is the client token. - `[11035763]` is the stable WBS/project key. - `{CH80}` means utilization category `CH` and `winProbability = 80`. - `_HB` and `_SB` suffixes can be ignored. - `TBD` means unassigned time. It must not create a fake project or fake booking. - `[tbd]` means unresolved project identity and should remain staged until resolved. - `2D` and `3D` map to chapter `Digital Content Production`. - `PM` maps to chapter `Project Management`. - `AD` maps to chapter `Art Direction`. - People booked on projects need the corresponding project role assigned on the booking. - Internal work categories should be mapped as normalized internal project/utilization buckets. - Public holidays should be modeled properly in the vacation planner. - Project start and end dates should be inferred from earliest and latest imported assignment dates. - Assignment granularity should normalize 50% to 4 hours and 100% to 8 hours. - Part-time import must reduce weekday availability on the days the person is not available. ## Source Workbook Roles ### 1. `MandatoryDispoCategories_V3.xlsx` Use as the source of: - reference/master-data vocabulary - client and project attribute glossary - calendar and SAH rules - validation guidance for resource attributes Do not treat it as the primary source of transactional planning rows. ### 2. `DISPO_2026.xlsx` Use as the primary source of: - operational planning matrix - project bookings - internal work bookings - absences - public-holiday source hints - part-time hints - unassigned capacity This workbook requires a parser. It is not directly importable as row-based CRUD input. ### 3. `Bi-Weekly Chargeability Reporting...xlsx` Use as the source of: - target and forecast reconciliation - resource enrichment when missing elsewhere - aggregate validation after commit Do not treat PTD/MTD/YTD outputs as canonical source-of-truth records when CapaKraken can derive them from normalized data. ### 4. `MV_DispoRoster.xlsx` Use as the source of: - operational resource master rows keyed by `EID` - `SAP_data` enrichment for real display names and email addresses - chapter, department, client-unit, FTE, and activity-window enrichment - pseudo-demand row filtering before any resource commit Do not import `Demand_*` rows as real resources. ## Verified Source Constraint The workbook set now includes both a row-based resource master source and a cost-rate source. What is actually present: - `MandatoryDispoCategories_V3.xlsx` `EID-Attr` is a glossary/reference sheet - `Bi-Weekly Chargeability Reporting...xlsx` provides roster identity, FTE, management group, chapter, metro city, and client unit - `MV_DispoRoster.xlsx` `DispoRoster` provides operational roster rows and `SAP_data` provides real email/display-name coverage for many resources - `Resource Roster_MASTER_FY26_CJ_20251201.xlsx` provides per-person `LCR` / `UCR` rows plus management-level averages for fallback resolution - `DISPO_2026.xlsx` provides transactional planning, vacation, holiday, and part-time signals Implementation consequence: - staging can be completed from the current workbook set - strict source-data blockers are cleared for the supplied workbook set - resources missing a source email are imported with fallback email `@accenture.com` - `demand_*` planning identities are ignored during staging and readiness checks - unresolved `[tbd]` project rows still remain staged by design and must not be auto-committed as final projects ## Current Implementation Status Implemented in the application layer: - reference workbook parser and stager - chargeability parser and stager - planning parser and stager - project resolution staging - roster parser and stager with `DispoRoster` + `SAP_data` merge logic - cost-rate parser and roster-rate merge logic with exact and management-level fallback resolution - readiness assessment over merged resource sources - batch staging orchestration across all current workbook inputs Remaining import blocker for a strict production-grade commit: - unresolved `[tbd]` project references that are intentionally kept out of final project commit ## Target Domain Model The import commits into the existing planning model: - `Country` - `MetroCity` - `OrgUnit` - `UtilizationCategory` - `Client` - `ManagementLevelGroup` - `ManagementLevel` - `Role` - `Resource` - `ResourceRole` - `Project` - `DemandRequirement` - `Assignment` - `Vacation` - `VacationEntitlement` Relevant current schema anchors: - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L178) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L235) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L334) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L372) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L460) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L754) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L780) - [schema.prisma](/home/hartmut/Documents/Copilot/capakraken/packages/db/prisma/schema.prisma#L815) ## Required Implementation Changes ### 1. Canonical Person Identity CapaKraken currently stores both `eid` and `enterpriseId` on `Resource`. The import should operate on a single canonical identity. Recommendation: - choose `enterpriseId` as the canonical external person key - keep `eid` synchronized to the same value during transition, or remove its operational significance in the import path - reject imports that produce conflicting person rows for the same canonical identity This avoids duplicate matching logic throughout staging and commit. ### 2. Staging Layer Add dedicated staging tables or equivalent durable import records. The staging layer is required. Recommended staging entities: - `ImportBatch` - `StagedResource` - `StagedClient` - `StagedProject` - `StagedAssignment` - `StagedVacation` - `StagedAvailabilityRule` - `StagedUnresolvedRecord` Each staged record should retain: - source workbook name - sheet name - row number - original raw value - normalized parsed fields - parser warnings - resolution status The staging layer is the review boundary between workbook parsing and final commit. ### 3. Clean-Slate Reset Before final import, wipe the existing database contents. Recommended reset scope: - business data - auth/session data - derived snapshots - previous import artifacts After reset, reseed immediately: - admin user - access roles and permissions required to sign in - essential platform defaults - import reference seed data only when not supplied from the workbooks Implementation note: - take a full backup before reset - make the reset command idempotent in non-production development environments - require an explicit force flag in scripts ## Two-Step Import Flow ### Step A. Stage ### Goals - parse all source files - normalize tokens - build deterministic staging rows - surface conflicts before any planning data is committed ### Responsibilities 1. ingest workbook files 2. parse reference/master data 3. parse planning matrix cells into typed staging records 4. detect conflicts and unresolved cases 5. generate a review report ### Stage Output - staged resources - staged roles - staged chapters/org mapping - staged clients - staged projects - staged assignments - staged vacations/public holidays - staged part-time availability rules - unresolved `[tbd]` project references - unresolved identity conflicts ### Step B. Commit ### Goals - write only validated data - preserve traceability to staging - block unresolved project identities ### Responsibilities 1. create reference data 2. create resources and roles 3. create projects and internal buckets 4. create assignments 5. create vacations/public holidays 6. apply availability overrides 7. run reconciliation against chargeability workbook aggregates ### Commit Rules - unresolved `[tbd]` project rows must not silently create final projects - unassigned time must not create assignments - weekend markers must not create vacation rows - all project bookings must resolve to a role and utilization category - all resources must resolve to a canonical person ID ## Field Mapping ### Reference Data Mapping | Source | Field/Token | Target | Notes | | ---------------------------------- | ------------------------ | --------------------------------------------------------------------- | -------------------------------------------- | | `MandatoryDispoCategories_V3.xlsx` | `Country/Territory` | `Country` | source for country master data and SAH rules | | `MandatoryDispoCategories_V3.xlsx` | `Metro City` | `MetroCity` | child of country | | `MandatoryDispoCategories_V3.xlsx` | `Chapter` / org labels | `OrgUnit` and `Resource.chapter` | normalize level mapping during stage | | `MandatoryDispoCategories_V3.xlsx` | `Management Level Group` | `ManagementLevelGroup` | includes target percentage reference | | `MandatoryDispoCategories_V3.xlsx` | `Management Level` | `ManagementLevel` | child of management level group | | `MandatoryDispoCategories_V3.xlsx` | `WBS Master Client` | `Client` parent | master client node | | `MandatoryDispoCategories_V3.xlsx` | `WBS Client Name` | `Client` child | legal/sub-client node | | `MandatoryDispoCategories_V3.xlsx` | SAH rules | `Country.dailyWorkingHours`, schedule logic, holiday generation rules | not all rules map 1:1 yet | ### Resource Mapping | Source | Field | Target | Notes | | ----------------------------------------- | ------------------------- | ----------------------------------------------------- | ------------------------------------------------------ | | `EID-Attr`, `ChgFC`, `Dispo` row metadata | `Enterprise ID` / `EID` | canonical resource key | one identity only | | `ChgFC` | `FTE` | `Resource.fte` | baseline contract capacity | | `ChgFC` | `Management Level Group` | `Resource.managementLevelGroupId` | reference lookup | | `EID-Attr` | `Management Level` | `Resource.managementLevelId` | reference lookup | | `ChgFC` | `Metro City` | `Resource.metroCityId` | resolve via master data | | `EID-Attr` | `Country/Territory` | `Resource.countryId` | resolve via master data | | `ChgFC` | `MV Org Unit 1 / Chapter` | `Resource.chapter` / `Resource.orgUnitId` | normalize by agreed mapping | | `EID-Attr` | `Unit (Client Unit)` | `Resource.clientUnitId` | lookup to client tree if modeled there | | `EID-Attr` | `Ressource Type` | `Resource.resourceType` | enum mapping required | | `EID-Attr` | `LCR` | `Resource.lcrCents` | normalize currency to cents | | `EID-Attr` | `UCR` | `Resource.ucrCents` | normalize currency to cents | | `ChgFC` | `Target (per Level)` | `Resource.chargeabilityTarget` or group target source | use management group target as primary when consistent | Rate resolution during staging: - `Resource Roster_MASTER_FY26_CJ_20251201.xlsx` is the primary cost-rate source. - Exact `EID`/enterprise matches populate `lcrCents` and `ucrCents` directly. - If a person is missing in the cost workbook, the importer falls back to same-management-level averages from that workbook. - Only resources with neither an exact row nor a usable management-level fallback remain readiness blockers for LCR/UCR. ### Chapter and Role Mapping Resource chapter mapping: | Token | Resource chapter | | ----- | ---------------------------- | | `2D` | `Digital Content Production` | | `3D` | `Digital Content Production` | | `PM` | `Project Management` | | `AD` | `Art Direction` | Assignment role mapping: | Token | Assignment role | | ----- | ----------------- | | `2D` | `2D Artist` | | `3D` | `3D Artist` | | `PM` | `Project Manager` | | `AD` | `Art Director` | Implementation guidance: - chapter is a resource-organizational dimension - role is an assignment-level delivery dimension - assign both when both are known ### Project Mapping | Source token | Target | Notes | | ------------------------ | -------------------------------------------------------------- | ------------------------------------------ | | `[BMW]` | `Project.clientId` | resolve to client | | `[11035763]` | stable WBS key | recommended source for `Project.shortCode` | | `{CH80}` | `Project.utilizationCategoryId`, `Project.winProbability = 80` | parser splits prefix and numeric suffix | | project text body | `Project.name` | cleaned of helper suffixes | | earliest assignment date | `Project.startDate` | inferred | | latest assignment date | `Project.endDate` | inferred | | `[tbd]` | unresolved staged project | do not auto-commit as final project | ### Utilization and Planning Token Mapping | Token family | Meaning | Commit target | | ------------ | ---------------------------------- | --------------------------------------------------------------------- | | `CH` | chargeable client work | project assignment | | `MO` | management and operations | internal project/bucket assignment | | `MD` | market development / initiative | internal project/bucket assignment | | `PD` | personal development / recruitment | internal project/bucket assignment | | `AB` | absence | vacation row | | `NA` | non-bookable / calendar state | usually availability or derived calendar rule, not project assignment | | `UN` | unassigned | no assignment commit | ### Assignment Mapping Assignments should be written only when a project or internal bucket is resolved. | Source | Target | Notes | | --------------------- | --------------------------------------- | --------------------------------------------------------- | | project booking cell | `Assignment` | resource, project, role, start/end, hours/day, percentage | | internal `MO` cell | `Assignment` to internal project/bucket | utilization category `M&O` | | internal `MD` cell | `Assignment` to internal project/bucket | utilization category `MD&I` | | internal `PD` cell | `Assignment` to internal project/bucket | utilization category `PD&R` | | `[_UN] unassign {UN}` | no final assignment | capacity remains free | ### Vacation and Holiday Mapping | Source token | Target | Notes | | ------------------------------- | ------------------------------- | -------------------------------------------------------- | | `[_AB] ... {AB}` | `Vacation` | approved absence row | | `[_NA] Public Holiday ... {NA}` | `Vacation(type=PUBLIC_HOLIDAY)` | preferred source of truth is geography-driven generation | | `[_NA] Weekend {NA}` | no vacation row | derive from calendar | Public holiday implementation should integrate with the existing vacation planner and batch holiday support in [vacation.ts](/home/hartmut/Documents/Copilot/capakraken/packages/api/src/router/vacation.ts#L425). ### Availability and Part-Time Mapping Part-time markers must alter resource availability instead of creating fake bookings. Rules: - `100%` maps to 8 available hours on a standard full working day - `50%` maps to 4 available hours on a standard full working day - part-time markers should reduce weekday availability on non-working or reduced-working days Recommended commit target: - `Resource.fte` remains the contractual baseline - `Resource.availability` stores actual weekday available hours Examples: - 75% with one day fully off per week: - availability sets one weekday to `0` - 80% with reduced time across five weekdays: - availability distributes reduced hours across weekdays If the workbook does not encode the weekday pattern explicitly: - stage the record as `availabilityPattern = unresolved` - commit only the `fte` - require manual review before applying weekday reductions ## Parser Design The `DISPO_2026.xlsx` parser should operate at calendar-slot level and emit normalized staging records. ### Parser Inputs - workbook name - sheet name - row identity - day/column identity - source token text - resource roster metadata from the row ### Parser Outputs - normalized booking type - resource canonical ID - date - slot fraction - hours per day - percentage - client token - WBS - utilization category - win probability - role token - chapter token - ignore flags - unresolved reason when parsing fails ### Token Normalization Rules 1. strip ignored suffixes like `_HB`, `_SB` 2. extract client from `[CLIENT]` 3. extract WBS from `[12345678]` when numeric 4. extract unresolved project marker from `[tbd]` 5. extract utilization and win probability from `{CH80}` 6. extract role prefix from leading `2D`, `3D`, `PM`, `AD` 7. classify special bracket tokens `[_AB]`, `[_NA]`, `[_UN]`, `[_MO]`, `[_MD]`, `[_PD]` ## Internal Project Strategy Internal work should not be left as free text. Recommendation: - seed canonical internal projects or internal planning buckets for: - `M&O` - `MD&I` - `PD&R` - assign them consistent utilization categories - keep source token text in assignment metadata for traceability This allows planning, reporting, and chargeability to remain normalized. ## `[tbd]` Resolution Strategy `[tbd]` is not a valid final project identity. Recommended behavior: - stage as unresolved project demand - do not auto-create final project rows during commit - allow reviewer resolution by: - matching to existing WBS - creating a new real project - converting the row into intentional unassigned capacity - converting the row into a `DemandRequirement` when a role need is known but no confirmed project exists This is the only area where a manual review gate is required by default. ## Public Holiday Strategy Public holidays should be driven by geography and stored in the vacation planner. Recommended approach: 1. resolve each resource to country and, where relevant, metro city/federal state 2. generate public holidays for the applicable calendar year 3. commit them as approved `Vacation(type=PUBLIC_HOLIDAY)` rows 4. reconcile workbook holiday markers against generated rows Known implementation gap: - the chargeability forecast currently passes an empty `publicHolidays` list into SAH calculation in [chargeability-report.ts](/home/hartmut/Documents/Copilot/capakraken/packages/api/src/router/chargeability-report.ts#L167) Required follow-up: - make forecast logic consume public-holiday vacation rows or geography-derived holiday dates ## Reconciliation and Acceptance Checks After commit, run reconciliation against the chargeability workbook. Required checks: - resource count matches expected staged resource count - FTE totals by org unit match workbook aggregates within tolerance - chargeability targets by management group match expected values - project/client/WBS relationships are complete for all committed assignments - unassigned capacity is visible only as free capacity, not as fake bookings - vacations and public holidays are visible in the vacation planner - part-time resources show reduced weekday availability where resolved ## Suggested Implementation Order 1. add import staging schema and storage 2. add full reset + reseed command 3. add reference-data importer for `MandatoryDispoCategories_V3.xlsx` 4. add canonical resource importer and identity normalization 5. add role and chapter normalization 6. add `DISPO_2026.xlsx` parser 7. add staged project resolver for WBS and `[tbd]` 8. add assignment commit flow 9. add vacation and public-holiday import flow 10. add part-time availability overlay logic 11. add reconciliation report against `ChgFC` and aggregate sheets 12. add rollback/replay support for repeated dry runs ## Operator Commands For the clean-slate import workflow, use dedicated DB scripts instead of ad-hoc SQL: - reset and bootstrap a disposable environment: - `pnpm --filter @capakraken/db db:reset:dispo -- --force` - reset without `pg_dump` backup only in an intentionally disposable environment: - `pnpm --filter @capakraken/db db:reset:dispo -- --force --skip-backup` - seed Dispo v2 reference vocabulary after reset: - `pnpm --filter @capakraken/db db:seed:dispo-v2` The reset command: - backs up the database with `pg_dump` by default - truncates all public tables except Prisma migration history - recreates a bootstrap admin user and baseline `SystemSettings` - leaves workbook-derived reference and transactional data to the import pipeline ## Decision Log The following decisions are locked for implementation and should be treated as the canonical baseline for all downstream tickets. ### D1. Canonical Resource Identity - `enterpriseId` is the canonical external person key for all staged and committed import logic. - `eid` remains in the schema as a compatibility alias during the transition period. - the importer must mirror the canonical identity into both `enterpriseId` and `eid` on committed `Resource` rows unless a later cleanup ticket removes `eid` - conflicting source rows for the same canonical identity must remain staged and unresolved ### D2. Reset And Bootstrap Scope - the clean-slate reset wipes business data, auth/session data, notifications, audit logs, estimate data, and previous import artifacts - the reset must reseed a bootstrap admin user, baseline platform permissions, and required singleton settings immediately after wipe - imported workbook data is not the source of truth for platform access bootstrapping - reset remains an explicit operator action with backup-first workflow ### D3. `[tbd]` Commit Policy - `[tbd]` rows never auto-create final `Project` rows - `[tbd]` rows remain staged and unresolved by default - a reviewer may explicitly resolve a `[tbd]` row into: - an existing real project - a newly created real project - intentional unassigned capacity - a `DemandRequirement` - automatic commit logic must skip unresolved `[tbd]` rows ### D4. Default Part-Time Fallback - when the workbook encodes an explicit weekday pattern, the importer must commit that exact reduced availability pattern - when only an FTE/percentage is known, the importer should distribute the reduced weekly capacity evenly across Monday to Friday - any such fallback must be marked with a staging warning so operators can refine the weekday pattern later - examples: - `100%` => `8/8/8/8/8` - `50%` => `4/4/4/4/4` - `80%` => `6.4/6.4/6.4/6.4/6.4` ### D5. Internal Project Buckets Seed canonical internal planning projects/buckets as follows: | Source token | Project short code | Project name | Utilization category | | ------------ | ------------------ | ---------------------------------- | -------------------- | | `MO` | `INT-MO` | `Management & Operations` | `M&O` | | `MD` | `INT-MD` | `Market Development & Initiatives` | `MD&I` | | `PD` | `INT-PD` | `People Development & Recruitment` | `PD&R` | The importer should preserve the original source token in staged and committed metadata for traceability. ### D6. Seeded Role Vocabulary The baseline role seed list required for Dispo import is: - `2D Artist` - `3D Artist` - `Project Manager` - `Art Director` These roles are assignment-level delivery roles. They do not replace chapter/org ownership on the resource profile. ## Recommendation Proceed with a staged importer and treat this document as the implementation baseline. The critical architectural choices are: - clean-slate reset - canonical single-ID resource matching - staged parsing before commit - no fake unassigned bookings - explicit assignment roles separate from resource chapters - geography-driven public holidays in the vacation planner - manual resolution gate only for `[tbd]` and unresolved availability patterns