Files
CapaKraken/docs/dispo-import-implementation.md
T

28 KiB

Dispo Import Implementation

Date: 2026-03-14 Purpose: Canonical implementation document for replacing the current CapaKraken planning dataset with a clean-slate import from the Dispo v2 Excel workbooks.

Scope

This document defines how CapaKraken should ingest and normalize the following source workbooks:

  • /samples/Dispov2/MandatoryDispoCategories_V3.xlsx
  • /samples/Dispov2/DISPO_2026.xlsx
  • /samples/Dispov2/20260309_Bi-Weekly_Chargeability_Reporting_Content_Production_V0.943_4Hartmut.xlsx
  • /samples/Dispov2/MV_DispoRoster.xlsx
  • /samples/Dispov2/Resource Roster_MASTER_FY26_CJ_20251201.xlsx

The goal is not a raw workbook archive. The goal is a normalized CapaKraken dataset that:

  • wipes existing database data and starts from a clean baseline
  • imports canonical reference data first
  • stages operational Dispo data before commit
  • creates real projects, assignments, vacations, availability rules, and reporting inputs
  • does not create fake bookings for unassigned time
  • handles public holidays through the vacation planner

Agreed Business Rules

These rules are fixed unless superseded by a later decision record:

  • There is only one canonical person identifier. EID and Enterprise ID are the same source identity.
  • Existing database data should be wiped completely before the new import.
  • [BMW] is the client token.
  • [11035763] is the stable WBS/project key.
  • {CH80} means utilization category CH and winProbability = 80.
  • _HB and _SB suffixes can be ignored.
  • TBD means unassigned time. It must not create a fake project or fake booking.
  • [tbd] means unresolved project identity and should remain staged until resolved.
  • 2D and 3D map to chapter Digital Content Production.
  • PM maps to chapter Project Management.
  • AD maps to chapter Art Direction.
  • People booked on projects need the corresponding project role assigned on the booking.
  • Internal work categories should be mapped as normalized internal project/utilization buckets.
  • Public holidays should be modeled properly in the vacation planner.
  • Project start and end dates should be inferred from earliest and latest imported assignment dates.
  • Assignment granularity should normalize 50% to 4 hours and 100% to 8 hours.
  • Part-time import must reduce weekday availability on the days the person is not available.

Source Workbook Roles

1. MandatoryDispoCategories_V3.xlsx

Use as the source of:

  • reference/master-data vocabulary
  • client and project attribute glossary
  • calendar and SAH rules
  • validation guidance for resource attributes

Do not treat it as the primary source of transactional planning rows.

2. DISPO_2026.xlsx

Use as the primary source of:

  • operational planning matrix
  • project bookings
  • internal work bookings
  • absences
  • public-holiday source hints
  • part-time hints
  • unassigned capacity

This workbook requires a parser. It is not directly importable as row-based CRUD input.

3. Bi-Weekly Chargeability Reporting...xlsx

Use as the source of:

  • target and forecast reconciliation
  • resource enrichment when missing elsewhere
  • aggregate validation after commit

Do not treat PTD/MTD/YTD outputs as canonical source-of-truth records when CapaKraken can derive them from normalized data.

4. MV_DispoRoster.xlsx

Use as the source of:

  • operational resource master rows keyed by EID
  • SAP_data enrichment for real display names and email addresses
  • chapter, department, client-unit, FTE, and activity-window enrichment
  • pseudo-demand row filtering before any resource commit

Do not import Demand_* rows as real resources.

Verified Source Constraint

The workbook set now includes both a row-based resource master source and a cost-rate source.

What is actually present:

  • MandatoryDispoCategories_V3.xlsx EID-Attr is a glossary/reference sheet
  • Bi-Weekly Chargeability Reporting...xlsx provides roster identity, FTE, management group, chapter, metro city, and client unit
  • MV_DispoRoster.xlsx DispoRoster provides operational roster rows and SAP_data provides real email/display-name coverage for many resources
  • Resource Roster_MASTER_FY26_CJ_20251201.xlsx provides per-person LCR / UCR rows plus management-level averages for fallback resolution
  • DISPO_2026.xlsx provides transactional planning, vacation, holiday, and part-time signals

Implementation consequence:

  • staging can be completed from the current workbook set
  • strict source-data blockers are cleared for the supplied workbook set
  • resources missing a source email are imported with fallback email <EID>@accenture.com
  • demand_* planning identities are ignored during staging and readiness checks
  • unresolved [tbd] project rows still remain staged by design and must not be auto-committed as final projects

Current Implementation Status

Implemented in the application layer:

  • reference workbook parser and stager
  • chargeability parser and stager
  • planning parser and stager
  • project resolution staging
  • roster parser and stager with DispoRoster + SAP_data merge logic
  • cost-rate parser and roster-rate merge logic with exact and management-level fallback resolution
  • readiness assessment over merged resource sources
  • batch staging orchestration across all current workbook inputs

Remaining import blocker for a strict production-grade commit:

  • unresolved [tbd] project references that are intentionally kept out of final project commit

Target Domain Model

The import commits into the existing planning model:

  • Country
  • MetroCity
  • OrgUnit
  • UtilizationCategory
  • Client
  • ManagementLevelGroup
  • ManagementLevel
  • Role
  • Resource
  • ResourceRole
  • Project
  • DemandRequirement
  • Assignment
  • Vacation
  • VacationEntitlement

Relevant current schema anchors:

Required Implementation Changes

1. Canonical Person Identity

CapaKraken currently stores both eid and enterpriseId on Resource. The import should operate on a single canonical identity.

Recommendation:

  • choose enterpriseId as the canonical external person key
  • keep eid synchronized to the same value during transition, or remove its operational significance in the import path
  • reject imports that produce conflicting person rows for the same canonical identity

This avoids duplicate matching logic throughout staging and commit.

2. Staging Layer

Add dedicated staging tables or equivalent durable import records. The staging layer is required.

Recommended staging entities:

  • ImportBatch
  • StagedResource
  • StagedClient
  • StagedProject
  • StagedAssignment
  • StagedVacation
  • StagedAvailabilityRule
  • StagedUnresolvedRecord

Each staged record should retain:

  • source workbook name
  • sheet name
  • row number
  • original raw value
  • normalized parsed fields
  • parser warnings
  • resolution status

The staging layer is the review boundary between workbook parsing and final commit.

3. Clean-Slate Reset

Before final import, wipe the existing database contents.

Recommended reset scope:

  • business data
  • auth/session data
  • derived snapshots
  • previous import artifacts

After reset, reseed immediately:

  • admin user
  • access roles and permissions required to sign in
  • essential platform defaults
  • import reference seed data only when not supplied from the workbooks

Implementation note:

  • take a full backup before reset
  • make the reset command idempotent in non-production development environments
  • require an explicit force flag in scripts

Two-Step Import Flow

Step A. Stage

Goals

  • parse all source files
  • normalize tokens
  • build deterministic staging rows
  • surface conflicts before any planning data is committed

Responsibilities

  1. ingest workbook files
  2. parse reference/master data
  3. parse planning matrix cells into typed staging records
  4. detect conflicts and unresolved cases
  5. generate a review report

Stage Output

  • staged resources
  • staged roles
  • staged chapters/org mapping
  • staged clients
  • staged projects
  • staged assignments
  • staged vacations/public holidays
  • staged part-time availability rules
  • unresolved [tbd] project references
  • unresolved identity conflicts

Step B. Commit

Goals

  • write only validated data
  • preserve traceability to staging
  • block unresolved project identities

Responsibilities

  1. create reference data
  2. create resources and roles
  3. create projects and internal buckets
  4. create assignments
  5. create vacations/public holidays
  6. apply availability overrides
  7. run reconciliation against chargeability workbook aggregates

Commit Rules

  • unresolved [tbd] project rows must not silently create final projects
  • unassigned time must not create assignments
  • weekend markers must not create vacation rows
  • all project bookings must resolve to a role and utilization category
  • all resources must resolve to a canonical person ID

Field Mapping

Reference Data Mapping

Source Field/Token Target Notes
MandatoryDispoCategories_V3.xlsx Country/Territory Country source for country master data and SAH rules
MandatoryDispoCategories_V3.xlsx Metro City MetroCity child of country
MandatoryDispoCategories_V3.xlsx Chapter / org labels OrgUnit and Resource.chapter normalize level mapping during stage
MandatoryDispoCategories_V3.xlsx Management Level Group ManagementLevelGroup includes target percentage reference
MandatoryDispoCategories_V3.xlsx Management Level ManagementLevel child of management level group
MandatoryDispoCategories_V3.xlsx WBS Master Client Client parent master client node
MandatoryDispoCategories_V3.xlsx WBS Client Name Client child legal/sub-client node
MandatoryDispoCategories_V3.xlsx SAH rules Country.dailyWorkingHours, schedule logic, holiday generation rules not all rules map 1:1 yet

Resource Mapping

Source Field Target Notes
EID-Attr, ChgFC, Dispo row metadata Enterprise ID / EID canonical resource key one identity only
ChgFC FTE Resource.fte baseline contract capacity
ChgFC Management Level Group Resource.managementLevelGroupId reference lookup
EID-Attr Management Level Resource.managementLevelId reference lookup
ChgFC Metro City Resource.metroCityId resolve via master data
EID-Attr Country/Territory Resource.countryId resolve via master data
ChgFC MV Org Unit 1 / Chapter Resource.chapter / Resource.orgUnitId normalize by agreed mapping
EID-Attr Unit (Client Unit) Resource.clientUnitId lookup to client tree if modeled there
EID-Attr Ressource Type Resource.resourceType enum mapping required
EID-Attr LCR Resource.lcrCents normalize currency to cents
EID-Attr UCR Resource.ucrCents normalize currency to cents
ChgFC Target (per Level) Resource.chargeabilityTarget or group target source use management group target as primary when consistent

Rate resolution during staging:

  • Resource Roster_MASTER_FY26_CJ_20251201.xlsx is the primary cost-rate source.
  • Exact EID/enterprise matches populate lcrCents and ucrCents directly.
  • If a person is missing in the cost workbook, the importer falls back to same-management-level averages from that workbook.
  • Only resources with neither an exact row nor a usable management-level fallback remain readiness blockers for LCR/UCR.

Chapter and Role Mapping

Resource chapter mapping:

Token Resource chapter
2D Digital Content Production
3D Digital Content Production
PM Project Management
AD Art Direction

Assignment role mapping:

Token Assignment role
2D 2D Artist
3D 3D Artist
PM Project Manager
AD Art Director

Implementation guidance:

  • chapter is a resource-organizational dimension
  • role is an assignment-level delivery dimension
  • assign both when both are known

Project Mapping

Source token Target Notes
[BMW] Project.clientId resolve to client
[11035763] stable WBS key recommended source for Project.shortCode
{CH80} Project.utilizationCategoryId, Project.winProbability = 80 parser splits prefix and numeric suffix
project text body Project.name cleaned of helper suffixes
earliest assignment date Project.startDate inferred
latest assignment date Project.endDate inferred
[tbd] unresolved staged project do not auto-commit as final project

Utilization and Planning Token Mapping

Token family Meaning Commit target
CH chargeable client work project assignment
MO management and operations internal project/bucket assignment
MD market development / initiative internal project/bucket assignment
PD personal development / recruitment internal project/bucket assignment
AB absence vacation row
NA non-bookable / calendar state usually availability or derived calendar rule, not project assignment
UN unassigned no assignment commit

Assignment Mapping

Assignments should be written only when a project or internal bucket is resolved.

Source Target Notes
project booking cell Assignment resource, project, role, start/end, hours/day, percentage
internal MO cell Assignment to internal project/bucket utilization category M&O
internal MD cell Assignment to internal project/bucket utilization category MD&I
internal PD cell Assignment to internal project/bucket utilization category PD&R
[_UN] unassign {UN} no final assignment capacity remains free

Vacation and Holiday Mapping

Source token Target Notes
[_AB] ... {AB} Vacation approved absence row
[_NA] Public Holiday ... {NA} Vacation(type=PUBLIC_HOLIDAY) preferred source of truth is geography-driven generation
[_NA] Weekend {NA} no vacation row derive from calendar

Public holiday implementation should integrate with the existing vacation planner and batch holiday support in vacation.ts.

Availability and Part-Time Mapping

Part-time markers must alter resource availability instead of creating fake bookings.

Rules:

  • 100% maps to 8 available hours on a standard full working day
  • 50% maps to 4 available hours on a standard full working day
  • part-time markers should reduce weekday availability on non-working or reduced-working days

Recommended commit target:

  • Resource.fte remains the contractual baseline
  • Resource.availability stores actual weekday available hours

Examples:

  • 75% with one day fully off per week:
    • availability sets one weekday to 0
  • 80% with reduced time across five weekdays:
    • availability distributes reduced hours across weekdays

If the workbook does not encode the weekday pattern explicitly:

  • stage the record as availabilityPattern = unresolved
  • commit only the fte
  • require manual review before applying weekday reductions

Parser Design

The DISPO_2026.xlsx parser should operate at calendar-slot level and emit normalized staging records.

Parser Inputs

  • workbook name
  • sheet name
  • row identity
  • day/column identity
  • source token text
  • resource roster metadata from the row

Parser Outputs

  • normalized booking type
  • resource canonical ID
  • date
  • slot fraction
  • hours per day
  • percentage
  • client token
  • WBS
  • utilization category
  • win probability
  • role token
  • chapter token
  • ignore flags
  • unresolved reason when parsing fails

Token Normalization Rules

  1. strip ignored suffixes like _HB, _SB
  2. extract client from [CLIENT]
  3. extract WBS from [12345678] when numeric
  4. extract unresolved project marker from [tbd]
  5. extract utilization and win probability from {CH80}
  6. extract role prefix from leading 2D, 3D, PM, AD
  7. classify special bracket tokens [_AB], [_NA], [_UN], [_MO], [_MD], [_PD]

Internal Project Strategy

Internal work should not be left as free text.

Recommendation:

  • seed canonical internal projects or internal planning buckets for:
    • M&O
    • MD&I
    • PD&R
  • assign them consistent utilization categories
  • keep source token text in assignment metadata for traceability

This allows planning, reporting, and chargeability to remain normalized.

[tbd] Resolution Strategy

[tbd] is not a valid final project identity.

Recommended behavior:

  • stage as unresolved project demand
  • do not auto-create final project rows during commit
  • allow reviewer resolution by:
    • matching to existing WBS
    • creating a new real project
    • converting the row into intentional unassigned capacity
    • converting the row into a DemandRequirement when a role need is known but no confirmed project exists

This is the only area where a manual review gate is required by default.

Public Holiday Strategy

Public holidays should be driven by geography and stored in the vacation planner.

Recommended approach:

  1. resolve each resource to country and, where relevant, metro city/federal state
  2. generate public holidays for the applicable calendar year
  3. commit them as approved Vacation(type=PUBLIC_HOLIDAY) rows
  4. reconcile workbook holiday markers against generated rows

Known implementation gap:

  • the chargeability forecast currently passes an empty publicHolidays list into SAH calculation in chargeability-report.ts

Required follow-up:

  • make forecast logic consume public-holiday vacation rows or geography-derived holiday dates

Reconciliation and Acceptance Checks

After commit, run reconciliation against the chargeability workbook.

Required checks:

  • resource count matches expected staged resource count
  • FTE totals by org unit match workbook aggregates within tolerance
  • chargeability targets by management group match expected values
  • project/client/WBS relationships are complete for all committed assignments
  • unassigned capacity is visible only as free capacity, not as fake bookings
  • vacations and public holidays are visible in the vacation planner
  • part-time resources show reduced weekday availability where resolved

Suggested Implementation Order

  1. add import staging schema and storage
  2. add full reset + reseed command
  3. add reference-data importer for MandatoryDispoCategories_V3.xlsx
  4. add canonical resource importer and identity normalization
  5. add role and chapter normalization
  6. add DISPO_2026.xlsx parser
  7. add staged project resolver for WBS and [tbd]
  8. add assignment commit flow
  9. add vacation and public-holiday import flow
  10. add part-time availability overlay logic
  11. add reconciliation report against ChgFC and aggregate sheets
  12. add rollback/replay support for repeated dry runs

Operator Commands

For the clean-slate import workflow, use dedicated DB scripts instead of ad-hoc SQL:

  • reset and bootstrap a disposable environment:
    • pnpm --filter @capakraken/db db:reset:dispo -- --force
  • reset without pg_dump backup only in an intentionally disposable environment:
    • pnpm --filter @capakraken/db db:reset:dispo -- --force --skip-backup
  • seed Dispo v2 reference vocabulary after reset:
    • pnpm --filter @capakraken/db db:seed:dispo-v2

The reset command:

  • backs up the database with pg_dump by default
  • truncates all public tables except Prisma migration history
  • recreates a bootstrap admin user and baseline SystemSettings
  • leaves workbook-derived reference and transactional data to the import pipeline

Decision Log

The following decisions are locked for implementation and should be treated as the canonical baseline for all downstream tickets.

D1. Canonical Resource Identity

  • enterpriseId is the canonical external person key for all staged and committed import logic.
  • eid remains in the schema as a compatibility alias during the transition period.
  • the importer must mirror the canonical identity into both enterpriseId and eid on committed Resource rows unless a later cleanup ticket removes eid
  • conflicting source rows for the same canonical identity must remain staged and unresolved

D2. Reset And Bootstrap Scope

  • the clean-slate reset wipes business data, auth/session data, notifications, audit logs, estimate data, and previous import artifacts
  • the reset must reseed a bootstrap admin user, baseline platform permissions, and required singleton settings immediately after wipe
  • imported workbook data is not the source of truth for platform access bootstrapping
  • reset remains an explicit operator action with backup-first workflow

D3. [tbd] Commit Policy

  • [tbd] rows never auto-create final Project rows
  • [tbd] rows remain staged and unresolved by default
  • a reviewer may explicitly resolve a [tbd] row into:
    • an existing real project
    • a newly created real project
    • intentional unassigned capacity
    • a DemandRequirement
  • automatic commit logic must skip unresolved [tbd] rows

D4. Default Part-Time Fallback

  • when the workbook encodes an explicit weekday pattern, the importer must commit that exact reduced availability pattern
  • when only an FTE/percentage is known, the importer should distribute the reduced weekly capacity evenly across Monday to Friday
  • any such fallback must be marked with a staging warning so operators can refine the weekday pattern later
  • examples:
    • 100% => 8/8/8/8/8
    • 50% => 4/4/4/4/4
    • 80% => 6.4/6.4/6.4/6.4/6.4

D5. Internal Project Buckets

Seed canonical internal planning projects/buckets as follows:

Source token Project short code Project name Utilization category
MO INT-MO Management & Operations M&O
MD INT-MD Market Development & Initiatives MD&I
PD INT-PD People Development & Recruitment PD&R

The importer should preserve the original source token in staged and committed metadata for traceability.

D6. Seeded Role Vocabulary

The baseline role seed list required for Dispo import is:

  • 2D Artist
  • 3D Artist
  • Project Manager
  • Art Director

These roles are assignment-level delivery roles. They do not replace chapter/org ownership on the resource profile.

Recommendation

Proceed with a staged importer and treat this document as the implementation baseline.

The critical architectural choices are:

  • clean-slate reset
  • canonical single-ID resource matching
  • staged parsing before commit
  • no fake unassigned bookings
  • explicit assignment roles separate from resource chapters
  • geography-driven public holidays in the vacation planner
  • manual resolution gate only for [tbd] and unresolved availability patterns