feat(categories): categoryMappingService 4-pass algo (#119) #128

Merged
maximus merged 1 commit from issue-119-category-mapping-service into main 2026-04-21 01:07:16 +00:00
Owner

Fixes #119

Summary

Pure function computeMigrationPlan(profileData): MigrationPlan that computes a v2 → v1 category migration plan using a 4-pass algorithm.

Algorithm

| Pass | Input | Confidence | Reason |
|------|-------|------------|--------|
| 1 | Keyword on v2 category matches a KEYWORD_TO_V1 rule | high | keyword |
| 2 | Transaction description or supplier name matches a KEYWORD_TO_V1 rule | medium | supplier |
| 3 | DEFAULT_MAPPINGS entry for the v2 id (single or split) | high / medium / low | default |
| 4 | Nothing matched | none | review |

  • Structural v2 parents (ids 1–6) are skipped.
  • Custom v2 categories (not in v2 seed) go into plan.preserved[].
  • Splits (e.g. Transport en commun → Autobus + Train) are exposed via splits[] when Pass 3 triggers.
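
The pass order above can be sketched roughly as follows. This is a minimal illustration, not the code in categoryMappingService.ts: the `RowSketch` and `DefaultMappingSketch` shapes, the `firstRuleMatch` and `resolveRow` helpers, and the sample table contents (including the confidence and target order for id 28) are all assumptions made up for the example.

```typescript
type Confidence = "high" | "medium" | "low" | "none";
type Reason = "keyword" | "supplier" | "default" | "review";

// Hypothetical row shape; the real MappingRow may differ.
interface RowSketch {
  v2Id: number;
  target: string | null; // primary v1 leaf, null when review is needed
  confidence: Confidence;
  reason: Reason;
  splits: string[]; // only filled when Pass 3 hits a split entry
}

interface DefaultMappingSketch {
  targets: string[]; // first entry is treated as the primary target
  confidence: Confidence;
}

// Invented sample data, not the real mapping tables.
const KEYWORD_TO_V1: Record<string, string> = { stm: "Autobus" };
const DEFAULT_MAPPINGS: Record<number, DefaultMappingSketch> = {
  28: { targets: ["Autobus", "Train"], confidence: "medium" },
};

// Returns the v1 leaf of the first rule whose keyword occurs in any text.
function firstRuleMatch(texts: string[]): string | null {
  for (const text of texts) {
    const lowered = text.toLowerCase();
    for (const keyword of Object.keys(KEYWORD_TO_V1)) {
      const leaf = KEYWORD_TO_V1[keyword];
      if (leaf !== undefined && lowered.includes(keyword)) return leaf;
    }
  }
  return null;
}

function resolveRow(v2Id: number, keywords: string[], supplierTexts: string[]): RowSketch {
  // Pass 1: keywords attached to the v2 category.
  const byKeyword = firstRuleMatch(keywords);
  if (byKeyword !== null) {
    return { v2Id, target: byKeyword, confidence: "high", reason: "keyword", splits: [] };
  }
  // Pass 2: transaction descriptions and supplier names.
  const bySupplier = firstRuleMatch(supplierTexts);
  if (bySupplier !== null) {
    return { v2Id, target: bySupplier, confidence: "medium", reason: "supplier", splits: [] };
  }
  // Pass 3: static default for the v2 id; splits are exposed only here.
  const fallback = DEFAULT_MAPPINGS[v2Id];
  if (fallback !== undefined) {
    const splits = fallback.targets.length > 1 ? fallback.targets : [];
    return {
      v2Id,
      target: fallback.targets[0] ?? null,
      confidence: fallback.confidence,
      reason: "default",
      splits,
    };
  }
  // Pass 4: nothing matched, flag for manual review.
  return { v2Id, target: null, confidence: "none", reason: "review", splits: [] };
}
```

In this sketch, `resolveRow(28, ["STM"], [])` stops at Pass 1 and returns no splits, while `resolveRow(28, [], [])` falls through to Pass 3 and exposes both split targets.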

Files

  • src/services/categoryMappingService.ts — service + types.
  • src/services/categoryMappingService.test.ts — 20 unit tests (one per pass, custom, splits, stats, pass priority).

Checklist

  • Pure function, no DB, no Tauri I/O
  • Types: MigrationPlan, MappingRow, ConfidenceBadge
  • Mapping table encoded from mapping-old-to-new.md
  • Custom categories → preserved bucket
  • Splits detection (Transport, Voiture, Assurances, Voyage, Sports, Électroménagers & Meubles, Jeux/Films/Livres, Internet & Télécom)
  • Unit tests covering each pass
  • npx tsc --noEmit clean
  • npm test clean (168/168, +20 new)
  • npm run build clean

Test plan

  • Pass 1 priority: keyword wins over default
  • Pass 2: supplier propagation from description + from supplier name
  • Pass 3: direct / medium / low / split with primary target
  • Pass 4: Projets (73) with no keyword → review
  • Custom: 9001 → preserved
  • Structural parents (1–6) skipped
  • Stats aggregate correctly across a mixed profile
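
For the stats item, one plausible aggregation is a count of rows per confidence badge. The sketch below assumes a `StatsSketch` shape and a `summarize` helper that are hypothetical; the actual stats fields on MigrationPlan are not described in this PR.

```typescript
type Confidence = "high" | "medium" | "low" | "none";

// Hypothetical stats shape, for illustration only.
interface StatsSketch {
  total: number;
  byConfidence: Record<Confidence, number>;
}

// Counts rows per confidence badge in a single pass over the rows.
function summarize(rows: { confidence: Confidence }[]): StatsSketch {
  const byConfidence: Record<Confidence, number> = { high: 0, medium: 0, low: 0, none: 0 };
  for (const row of rows) {
    byConfidence[row.confidence] += 1;
  }
  return { total: rows.length, byConfidence };
}
```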
maximus added 1 commit 2026-04-21 01:02:20 +00:00
feat(categories): add categoryMappingService (4-pass algo) (#119)
All checks were successful
PR Check / rust (push) Successful in 22m33s
PR Check / frontend (push) Successful in 2m18s
PR Check / rust (pull_request) Successful in 21m36s
PR Check / frontend (pull_request) Successful in 2m13s
be3cda1556
Pure function that computes a v2 → v1 category migration plan from a
snapshot of the profile data. The 4-pass algorithm (keyword → supplier
propagation → default fallback → needs review) produces a MigrationPlan
with confidence badges (high/medium/low/none) per row, exposes split
targets for categories that ventilate across multiple v1 leaves (e.g.
Transport en commun → Autobus + Train), and preserves user-custom
categories in a dedicated bucket for later placement under
"Catégories personnalisées (migration)".

- Mapping tables encoded from
  .spikes/archived/seed-standard/code/mapping-old-to-new.md
- No DB I/O: the caller hands us categories, keywords, transactions and
  optional suppliers; the service stays testable and side-effect-free.
- 20 unit tests cover every pass, custom preservation, split exposure,
  stats aggregation and pass priority.

Prepares the ground for #121 (migration writer UI).
Author
Owner

Self-review — categoryMappingService

Note: Forgejo blocks formal self-approval; posting as a comment.

Security

  • Pure TS service: no DB, no SQL, no Tauri IPC, no file I/O, no regex compiled from user input. Mapping tables are static constants.
  • normalizeForMatch mirrors the proven normalizeDescription in categorizationService. No ReDoS surface.
  • No PII / secret exposure.
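
To illustrate the "no regex compiled from user input" point, a matcher can normalize both sides with fixed patterns and then compare with a plain substring test. This is a sketch under assumptions: the real normalizeForMatch is not shown in this PR, so `normalizeForMatchSketch` and `matchesKeyword` are hypothetical names and the exact normalization steps are a guess.

```typescript
// Hypothetical normalization; the real normalizeForMatch may differ.
function normalizeForMatchSketch(input: string): string {
  return input
    .normalize("NFD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks (static pattern)
    .toLowerCase()
    .replace(/\s+/g, " ") // collapse whitespace runs (static pattern)
    .trim();
}

// Substring comparison only: the keyword is never compiled into a RegExp,
// so regex metacharacters in user keywords are matched literally.
function matchesKeyword(text: string, keyword: string): boolean {
  return normalizeForMatchSketch(text).includes(normalizeForMatchSketch(keyword));
}
```

Because the keyword is treated as a literal string, a user keyword such as `.*` only matches text that actually contains those two characters.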

Correctness

  • 4 passes applied in required order per issue body: keyword → supplier → default → review.
  • Pass 1 checks user keywords against KEYWORD_TO_V1 (high confidence) and falls back to v1 leaf name lookup.
  • Pass 2 checks transaction descriptions AND supplier names (medium confidence).
  • Pass 3 applies DEFAULT_MAPPINGS entries (single / split / none) with confidence from the mapping-old-to-new source.
  • Pass 4 marks unresolved rows with confidence: none, reason: review.
  • Splits are exposed ONLY when Pass 3 fires: Pass 1/2 resolve to a single v1 leaf and skip splits (verified by test Pass 1 wins over Pass 3 on split categories).
  • Structural v2 parents (1–6) skipped. Custom v2 categories routed to plan.preserved[].
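
The routing of structural parents and custom categories could look roughly like this. The `V2CategorySketch` shape, the `partition` helper, and the contents of `V2_SEED_IDS` are assumptions for the example; only the 1–6 range, id 28, id 73 and the custom id 9001 come from this PR.

```typescript
// Hypothetical category shape, for illustration only.
interface V2CategorySketch {
  id: number;
  name: string;
}

// Invented subset of the v2 seed; the real seed list is larger.
const V2_SEED_IDS = new Set<number>([1, 2, 3, 4, 5, 6, 28, 73]);

// Structural v2 parents are ids 1 through 6.
function isStructuralParent(id: number): boolean {
  return id >= 1 && id <= 6;
}

function partition(categories: V2CategorySketch[]): {
  toMap: V2CategorySketch[];
  preserved: V2CategorySketch[];
} {
  const toMap: V2CategorySketch[] = [];
  const preserved: V2CategorySketch[] = [];
  for (const category of categories) {
    if (isStructuralParent(category.id)) continue; // skipped entirely
    if (!V2_SEED_IDS.has(category.id)) {
      preserved.push(category); // custom category, kept aside
      continue;
    }
    toMap.push(category); // seed category, goes through the 4 passes
  }
  return { toMap, preserved };
}
```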

Quality

  • Public types exported: MigrationPlan, MappingRow, ConfidenceBadge, MappingReason, ProfileData, V1Target.
  • Test helper __resetMappingServiceCachesForTests follows the double-underscore convention.
  • toV1Target throws loudly if DEFAULT_MAPPINGS drifts away from categoryTaxonomyV1.json — prevents silent corruption.
  • All comments in English, per project convention.
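
A fail-loud lookup of the kind described for toV1Target might be sketched as below. The `V1TargetSketch` shape, the `V1_LEAVES` stand-in for categoryTaxonomyV1.json, the ids, and the error message are all hypothetical.

```typescript
// Hypothetical target shape; the exported V1Target type may differ.
interface V1TargetSketch {
  id: number;
  name: string;
}

// Stand-in for the leaves loaded from categoryTaxonomyV1.json (invented ids).
const V1_LEAVES: V1TargetSketch[] = [
  { id: 1001, name: "Autobus" },
  { id: 1002, name: "Train" },
];

// Throws instead of returning undefined, so a mapping entry that no longer
// exists in the taxonomy surfaces immediately rather than producing a bad plan.
function toV1TargetSketch(leafName: string): V1TargetSketch {
  const leaf = V1_LEAVES.find((candidate) => candidate.name === leafName);
  if (leaf === undefined) {
    throw new Error(`Mapping references unknown v1 leaf: ${leafName}`);
  }
  return leaf;
}
```

Throwing at lookup time means a drift between the two tables is caught by the first test that touches the stale entry.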

Data integrity

  • Mapping encoded 1:1 from .spikes/archived/seed-standard/code/mapping-old-to-new.md:
    • Confidence badges (🟢 high / 🟡 medium / 🟠 low / 🔴 none) preserved.
    • Splits covered: 26 (Jeux/Films/Livres), 28 (Transport en commun), 29 (Télécom), 31 (Assurances), 40 (Voiture), 47 (Voyage), 48 (Sports), 53 (Électroménagers & Meubles).
    • Level-3 Assurance children (310/311/312) also mapped for profiles that already split them.
  • Primary split target matches the "reste → X par défaut" ("remainder → X by default") rationale.

Tests

  • 20 unit tests, one per pass + priority + custom + structural + stats + splits.
  • Full suite: 168/168 passing (+20 new, 148 unchanged).
  • npx tsc --noEmit clean; npm run build clean.

Scope compliance

  • No DB writes. No UI. No i18n changes. No CHANGELOG entry (infrastructure for #121).

Verdict: APPROVE

maximus merged commit 1640a73499 into main 2026-04-21 01:07:16 +00:00
maximus deleted branch issue-119-category-mapping-service 2026-04-21 01:07:17 +00:00
Reference: maximus/Simpl-Resultat#128