Phase 8 — Domain Engines and Discovery: Changelog
Running record of Phase 8 changes, ordered by merge date. Phase 7 (Enterprise Ground) is COMPLETE. Phase 8 builds on Subject Polymorphism, Department/Capacity, and Classification foundations.
2026-05-08 — Phase 8 Initial Implementation
Stream A — Contract Clause Engine (DONE)
- ADR 0004 (development/adr/0004-contract-clause-engine.md): Accepted. Documents parser, comparator, aggregator, evaluation modes, and API design.
- adapters/contract_parser.py: Parses contracts from DOCX (via python-docx), PDF (via Gemini Files API), and plain text. Delegates to clause segmenter and classifier.
- adapters/contract_compare.py: Semantic clause-level diffing with embedding-based matching and similarity scoring.
- services/evaluation/clause_aggregator.py: Aggregates clause-by-clause verdicts to contract-level verdict. DENY on any MUST/MUST_NOT clause propagates to contract level.
- Prompt templates: services/evaluation/prompts/clause_set/ (evaluate_clause.txt, compare_clauses.txt, risk_score_clause.txt).
- api/v1/contract.py: POST /api/v1/evaluate/contract endpoint with review_type parameter (self_conformance, cross_contract, regulatory_compliance, risk_scoring).
- Unit tests: test_contract_parser.py, test_contract_compare.py, test_clause_aggregator.py.
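The contract-level propagation rule for clause_aggregator.py can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the ClauseVerdict dataclass, the Verdict and Modality enums, the WARN verdict, the SHOULD modality, and the aggregate() function are all hypothetical names. Only the rule that a DENY on a MUST/MUST_NOT clause propagates to the contract comes from the entry above.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    ALLOW = "ALLOW"
    WARN = "WARN"  # assumption: not named in the changelog
    DENY = "DENY"


class Modality(str, Enum):
    MUST = "MUST"
    MUST_NOT = "MUST_NOT"
    SHOULD = "SHOULD"  # assumption: stands in for any non-binding modality


@dataclass(frozen=True)
class ClauseVerdict:
    clause_id: str
    modality: Modality
    verdict: Verdict


BINDING = {Modality.MUST, Modality.MUST_NOT}


def aggregate(clauses: list[ClauseVerdict]) -> Verdict:
    """Roll clause-level verdicts up to one contract-level verdict."""
    # A DENY on any binding clause propagates to the contract level.
    if any(c.verdict is Verdict.DENY and c.modality in BINDING for c in clauses):
        return Verdict.DENY
    # Assumption: any other non-ALLOW verdict degrades the contract to WARN.
    if any(c.verdict is not Verdict.ALLOW for c in clauses):
        return Verdict.WARN
    return Verdict.ALLOW
```

How non-binding clauses are handled is a design choice this sketch makes on its own; the real aggregator may treat them differently.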
Stream B — Event Engine Temporal Modes (DONE)
- ADR 0005 (development/adr/0005-event-engine-temporal-modes.md): Accepted. Documents single/sequence/calendar modes.
- domain/event_sequence.py: EventEvaluationMode enum, EventRecord, EventWindow, CalendarContext, SequenceContext domain types.
- Prompt templates: services/evaluation/prompts/event/ (evaluate_sequence.txt, evaluate_calendar.txt).
- api/v1/event.py: POST /api/v1/evaluate/event endpoint with evaluation_mode parameter.
- Unit tests: test_event_sequence.py.
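To make the three temporal modes concrete, here is a small sketch of how the domain types might fit together. The type names EventEvaluationMode, EventRecord, and EventWindow come from the entry above, but every field, the contains() method, and the events_in_window() helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EventEvaluationMode(str, Enum):
    SINGLE = "single"
    SEQUENCE = "sequence"
    CALENDAR = "calendar"


@dataclass(frozen=True)
class EventRecord:
    event_id: str          # hypothetical field
    occurred_at: datetime  # hypothetical field


@dataclass(frozen=True)
class EventWindow:
    start: datetime  # inclusive (assumption)
    end: datetime    # exclusive (assumption)

    def contains(self, event: EventRecord) -> bool:
        return self.start <= event.occurred_at < self.end


def events_in_window(events: list[EventRecord], window: EventWindow) -> list[EventRecord]:
    """Return the events inside the window, oldest first, for sequence evaluation."""
    return sorted(
        (e for e in events if window.contains(e)),
        key=lambda e: e.occurred_at,
    )
```

A sequence-mode evaluation would presumably pass the ordered result to the evaluate_sequence.txt prompt, while single mode would look at one EventRecord at a time.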
Stream C — Document Discovery (Partial)
- services/discovery/sources/base.py: DocumentSource and IncrementalSource protocols for multi-source discovery.
- services/discovery/analyzers/contract_corpus.py: Clusters historical contracts by semantic similarity, extracts de facto standard clauses, drafts candidate rules.
- services/discovery/sources/contract_docx.py: Upgraded to full DocumentSource implementation with clause extraction.
- Unit tests: test_document_discovery.py.
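The two source protocols could look something like the sketch below. Only the names DocumentSource and IncrementalSource appear in this changelog; the method names fetch_all() and fetch_since(), the SourceDocument record, and the InMemorySource class are hypothetical and exist only to show the structural-typing idea.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Protocol, runtime_checkable


@dataclass(frozen=True)
class SourceDocument:
    """Hypothetical document record; field names are illustrative."""
    doc_id: str
    text: str
    modified_at: datetime


@runtime_checkable
class DocumentSource(Protocol):
    def fetch_all(self) -> Iterator[SourceDocument]: ...


@runtime_checkable
class IncrementalSource(DocumentSource, Protocol):
    # Assumption: incremental sources accept a timestamp cursor.
    def fetch_since(self, cursor: datetime) -> Iterator[SourceDocument]: ...


class InMemorySource:
    """Toy source that satisfies both protocols without inheriting from them."""

    def __init__(self, docs: Iterable[SourceDocument]) -> None:
        self._docs = list(docs)

    def fetch_all(self) -> Iterator[SourceDocument]:
        return iter(self._docs)

    def fetch_since(self, cursor: datetime) -> Iterator[SourceDocument]:
        # Only documents modified strictly after the cursor are returned.
        return (d for d in self._docs if d.modified_at > cursor)
```

With protocols, a source such as contract_docx.py only has to provide the right methods; it does not need a shared base class.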
Stream D — Domain-Aware UX (Partial)
- Route groups: app/(legal)/, app/(hr)/, app/(finance)/, app/(marketing)/ with dedicated shell layouts (LegalShell, HrShell, FinanceShell, MarketingShell).
- /contracts/review/[id]: Clause-by-clause verdict view with standard-clause overlay for Legal persona.
- /events/[id]: Event submission view with applicable rules, evaluation modes, and verdict display for HR persona.
- /transactions/[id]: Transaction compliance review page placeholder for Finance persona.
- /creatives/review/[id]: Creative compliance review page placeholder for Marketing persona.
Remaining Phase 8 Work
- B6: HR system adapter stubs (Workday, SmartHR, freee HR)
- C2: Regulation PDF source upgrade
- C5: Regulation feed (e-Gov API / FSA notices) with derives_from linkage
- C9: Incremental polling worker for regulation feeds
- D4: Department-aware home dashboard
- D5: No-code rule editor wizard
- D6: Intent-first search on home page
- D7: pnpm lint and pnpm typecheck verification
Phases 8–13 Expansion (2026-05-09)
Surface Abstraction
Introduced services/evaluation/surfaces/ with 7 surface implementations (code, contract, document, generic, human_action, message, transaction). Each surface provides a SurfaceAdapter with domain-specific prompt hints. The EvaluationService resolves surfaces from EvaluateRequest.surface.
Domain Packs
Added domain_packs/ with 9 bundled packs: code, contract, hr_attendance, expense, communication, legal, sales, it_security, governance. Each includes pack.yaml (metadata, scopes, UI routes), rules/, samples/, and prompts/.
Norm Lineage
Added services/norm_lineage/walker.py for upstream/downstream norm derivation chain traversal. API at GET /api/v1/lineage/{rule_id}/upstream|downstream. Worker norm_lineage_propagation.py propagates upstream amendments.
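Upstream and downstream traversal of a derivation chain can be illustrated with a breadth-first walk over a derives_from mapping. This is a sketch under stated assumptions, not the code in walker.py: the walk() and invert() functions, the dictionary representation, and the rule IDs are all hypothetical.

```python
from collections import deque


def walk(start: str, edges: dict[str, list[str]]) -> list[str]:
    """Breadth-first traversal returning every rule reachable from start."""
    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt not in seen:  # guards against cycles in the lineage graph
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flip child -> parents edges into parent -> children edges."""
    inverted: dict[str, list[str]] = {}
    for child, parents in edges.items():
        for parent in parents:
            inverted.setdefault(parent, []).append(child)
    return inverted


# Hypothetical lineage: each rule lists the norms it derives from.
DERIVES_FROM = {
    "company-policy": ["statute"],
    "team-guideline": ["company-policy"],
}

upstream = walk("team-guideline", DERIVES_FROM)
downstream = walk("statute", invert(DERIVES_FROM))
```

Here upstream lists the norms a rule ultimately derives from, and downstream lists the rules a propagation worker would need to revisit when the statute is amended.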
New Frontend Pages
- /compliance/audit-packets, /compliance/exceptions, /compliance/regulatory
- /finance/expenses, /finance/controls, /finance/audit
- /hr/lifecycle, /hr/policies, /hr/violations
- /legal/lineage, /legal/redlines
- NormLineageViewer and LocaleSwitcher components
New CLI Tools
- rulerepo-check-action: evaluate human actions against applicable rules
- rulerepo-review-contract: evaluate contracts against clause rules
New Workers
- norm_lineage_propagation.py: propagate norm changes downstream
- translation_drift.py: detect translation locale drift
Removals (over-engineering cleanup)
- Marketplace subsystem removed (rules now ship as domain packs)
- Prometheus metrics collection removed
- Jaeger distributed tracing removed
- OpenTelemetry instrumentation removed
- core/metrics.py and core/telemetry.py deleted
Japan-Specific Rules
Added sample rules under sample_rules/:
- hr_rules/jp/labor_standards.yaml, childcare_leave.yaml
- legal_rules/jp/civil_code.yaml, privacy_law.yaml
- finance_rules/jp/tax_law.yaml
2026-05-10 — Surface-Based Evaluation & Domain Parity
Surface-Based Batch Template Routing
- EvaluationContext.surface field: Added to route batch evaluation to surface-specific prompt templates instead of branching on the if context.diff: check.
- _select_template() function: New routing logic in batch_evaluator.py that selects evaluate_batch_{surface}.txt dynamically, falling back to evaluate_batch_generic.txt.
- 7 batch prompt templates: evaluate_batch_code.txt, evaluate_batch_contract.txt, evaluate_batch_transaction.txt, evaluate_batch_document.txt, evaluate_batch_message.txt, evaluate_batch_human_action.txt, evaluate_batch_generic.txt. Non-code templates do not reference code concepts (file paths, line numbers, function names).
- Callers updated: EvaluationService.evaluate() infers surface from input shape; evaluate_subject() passes the surface string; EventIngestionService maps subject kind to surface.
- Tests: test_batch_template_routing.py (22 tests covering template selection, backward compatibility, and content validation).
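The template routing with generic fallback can be sketched as below. The template file names are the ones listed above; the select_template() signature and the idea of passing the available names as a collection are assumptions, since the real _select_template() in batch_evaluator.py presumably inspects the prompt directory itself.

```python
from typing import Iterable, Optional

GENERIC_TEMPLATE = "evaluate_batch_generic.txt"

TEMPLATES = [
    "evaluate_batch_code.txt",
    "evaluate_batch_contract.txt",
    "evaluate_batch_transaction.txt",
    "evaluate_batch_document.txt",
    "evaluate_batch_message.txt",
    "evaluate_batch_human_action.txt",
    GENERIC_TEMPLATE,
]


def select_template(surface: Optional[str], available: Iterable[str]) -> str:
    """Pick evaluate_batch_{surface}.txt if it exists, else the generic template."""
    names = set(available)
    if surface:
        candidate = f"evaluate_batch_{surface}.txt"
        if candidate in names:
            return candidate
    # Unknown or missing surface: fall back instead of raising.
    return GENERIC_TEMPLATE
```

For example, select_template("contract", TEMPLATES) returns evaluate_batch_contract.txt, while an unrecognized surface or None resolves to evaluate_batch_generic.txt, which keeps older callers that never set a surface working.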
Per-Rule Prompt Equalization
Expanded non-code per-rule prompts to match the depth of evaluate_code_change.txt (66 lines):
- evaluate_hr_event.txt (26 → 87 lines): Overtime decision tree, precondition/exception handling, remediation criteria, Labor Standards Act references.
- evaluate_contract_clause.txt (27 → 90 lines): Risk classification tree, multi-party risk awareness, governing law context, auto-applicability criteria.
- evaluate_expense_claim.txt (26 → 95 lines): Threshold decision tree, qualified invoice system compliance, approval chain validation.
- evaluate_message.txt (new, 94 lines): Content violation tree, channel-specific rules, PMDA/FIEA guidance, recipient-aware context.
- Registered evaluate_message.txt in _SUBJECT_KIND_PROMPT_MAP under "creative" subject kind.
Domain-Aware SDK and MCP Tools
MCP Tools (18 → 24):
- 3 domain-specific rule retrieval tools: get_rules_for_contract_review, get_rules_for_transaction, get_rules_for_communication.
- 3 domain-specific evaluation tools: evaluate_contract, evaluate_transaction, evaluate_communication.
- Updated discover_rules description to be domain-neutral ("organizational artifacts" instead of "codebase").
Agentic Client (packages/agentic-client/):
- Added get_applicable_rules_for_surface() generic method.
- Added 6 convenience methods: get_rules_for_contract(), get_rules_for_transaction(), get_rules_for_communication(), evaluate_contract(), evaluate_transaction(), evaluate_communication().
Rule Client (packages/rule-client/):
- Added 3 new resource sub-objects: client.contracts (ContractsResource), client.transactions (TransactionsResource), client.communications (CommunicationsResource).
- Each provides list_rules(), search(), and evaluate() methods with domain-specific parameters.
Context Delivery Service:
- Extended get_formatted_rules() with subject_types, department, language parameters.
- Added _query_rules_by_domain() for non-file-path-based rule retrieval using direct DB queries.
Tests: test_mcp_domain_tools.py (18 tests), test_domain_resources.py (18 tests), test_domain_methods.py (14 tests).
Frontend Domain Dashboard Parity
Replaced mock data in all department dashboards with real API integration:
| Dashboard | Before | After | Key Features |
|---|---|---|---|
| Finance | 89 LOC | 505 LOC | KPIs, violation sparkline, verdict distribution, top violated rules, filterable evaluation table with expandable drill-down, active rules list |
| Marketing | 74 LOC | 681 LOC | KPIs, verdict distribution, compliance by content type, creative reviews with expanded details |
| HR | 135 LOC | 649 LOC | Period selector, attendance compliance, overtime sparkline, department breakdown, filterable evaluations |
| Legal | 145 LOC | 926 LOC | Contract review queue, risk distribution, clause compliance rate, regulatory impact cards, redline remediation preview |
New sub-pages:
- finance/expenses (270 LOC): Expense rules + evaluations with category/status filters.
- finance/controls (217 LOC): Rules with search, severity distribution bars.
- finance/audit (244 LOC): Real audit log with hash chain verification.
- marketing/creative-reviews (318 LOC): Evaluation list with inline text_rewrite diff preview.
- marketing/guidelines (254 LOC): Marketing rules with keyword search.
API layer: Added getDepartmentDashboard(), getDepartmentEvaluations(), getDepartmentRules() to lib/api.ts.
All pages pass pnpm typecheck and pnpm lint with zero new warnings.
2026-05-14 — Hybrid Evaluation, Structured Scope, Translations, Schema Cleanup
Hybrid Evaluation & Kind Dispatch
- services/evaluation/dispatcher.py: New EvaluationDispatcher class routing subjects to handlers per the CLAUDE.md contract.
- Migration 034 (add_rule_kind_column): Added kind column (normative/computational/procedural/definitional/principle) to rules table.
- Migration 035 (add_constraints_column): Added constraints JSONB column for deterministic evaluation expressions (NumericConstraint, DateConstraint, EnumConstraint).
- services/evaluation/kind_dispatch.py: Kind-based routing: NORMATIVE→LLM, COMPUTATIONAL→deterministic layer, PROCEDURAL→state check, DEFINITIONAL/PRINCIPLE→always ALLOW.
- services/evaluation/deterministic/: DeterministicEvaluator checks numeric thresholds, date comparisons, and enum validations without LLM calls.
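The kind-based routing above can be sketched as a single dispatch function. The kind values and the routing table come from the entries above; the Rule and NumericConstraint field names, the required_state field, the string verdicts, and the fallback of a constraint-less computational rule to the LLM are assumptions made for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Mapping, Optional


class RuleKind(str, Enum):
    NORMATIVE = "normative"
    COMPUTATIONAL = "computational"
    PROCEDURAL = "procedural"
    DEFINITIONAL = "definitional"
    PRINCIPLE = "principle"


@dataclass(frozen=True)
class NumericConstraint:
    key: str          # hypothetical: subject field to check
    max_value: float  # hypothetical: inclusive upper bound


@dataclass(frozen=True)
class Rule:
    rule_id: str
    kind: RuleKind
    constraint: Optional[NumericConstraint] = None
    required_state: Optional[str] = None  # hypothetical procedural check


LlmEvaluator = Callable[[Rule, Mapping[str, Any]], str]


def dispatch(rule: Rule, subject: Mapping[str, Any], llm: LlmEvaluator) -> str:
    """Route a rule to the cheapest evaluator that can decide it."""
    if rule.kind in (RuleKind.DEFINITIONAL, RuleKind.PRINCIPLE):
        return "ALLOW"  # never blocks on its own
    if rule.kind is RuleKind.COMPUTATIONAL and rule.constraint is not None:
        # Deterministic layer: no LLM call is made for threshold checks.
        value = subject[rule.constraint.key]
        return "ALLOW" if value <= rule.constraint.max_value else "DENY"
    if rule.kind is RuleKind.PROCEDURAL:
        return "ALLOW" if subject.get("state") == rule.required_state else "DENY"
    # NORMATIVE rules (and, by assumption, anything undecided) go to the LLM.
    return llm(rule, subject)
```

Passing the LLM evaluator in as a callable keeps the deterministic paths testable without any model access.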
Structured Scope Backfill
- Migration 032 (backfill_applicable_subject_types): Backfills subject type support on existing rules.
- Migration 033 (backfill_structured_scope): Populates scope_structured JSONB with domain, org_unit, subject_type dimensions from legacy scope strings.
Multilingual Translations
- Migration 036 (create_rule_translations_table): New rule_translations table for per-locale rule content (statement, rationale, preconditions, exceptions, examples).
- services/translation/service.py: Translation management with polyglot verification.
- Workers: translation_drift.py (daily 3:30), polyglot_validator.py (weekly Sunday 6:00).
Schema Reorganization
- Migration 037 (move_frozen_tables_to_schema): Moved frozen feature tables (marketplace, gateway webhooks) to the frozen PostgreSQL schema per feature flag discipline.
Additional Domain Packs
- Added 4 new domain packs: legal, sales, it_security, governance, bringing the total to 9.
- Each pack includes pack.yaml, scoped rules, evaluation prompts, and sample data.
New Sample Rule Templates
- sample_rules/templates/finance-expense-jp.yaml: Japan-specific expense policy rules.
- sample_rules/templates/legal-contracts-jp.yaml: Japan-specific contract review rules.
Worker Expansion (7 → 9 Cron Jobs)
- Added detect_verdict_drift (daily 4:30): Weekly replay of canary inputs to detect LLM behavior changes.
- Added validate_polyglot_equivalence (weekly Sunday 6:00): Verify multilingual rule consistency across locales.
Structured Scope Performance
- Migration 038 (add_structured_scope_gin_indexes): Added GIN indexes on the scope_structured JSONB column for fast multi-axis scope queries (domain, org_unit, subject_type).
2026-05-15 — Cross-Organizational Refocus Completion
Connector & Integration Cleanup
The refocus commit removed over-engineered external integrations that were not core to the cross-organizational rule platform:
- Removed connectors: DocuSign, Email, GitHub, Kintone, Salesforce, SAP, Slack, Teams, Webhook, Workday adapters deleted from adapters/connectors/.
- Removed business system integrations: Attendance, contract, and expense business system adapters removed from integrations/business_systems/.
integrations/business_systems/. - Removed discovery connectors: Confluence, e-Gov, EUR-Lex, Google Drive, Notion, SharePoint connectors removed.
- Removed domain protocol implementations: Moved to domain pack architecture.
- Removed observability: Prometheus metrics, Jaeger tracing, and OpenTelemetry instrumentation removed.
Domain Pack Architecture (Final)
Six domain packs finalized under packages/domain-packs/:
| Pack | Subject Types | Key Templates |
|---|---|---|
| engineering | code_file, python_source, typescript_source, react_component, api_endpoint, test_file | python-fastapi, typescript-react |
| legal | contract_draft, clause, statute_reference, legal_opinion | legal-contracts-jp, legal-contracts-en-us |
| hr | employee_event, attendance_record, leave_request, conduct_report | hr-attendance-jp, hr-conduct |
| finance | expense_report, purchase_order, invoice, journal_entry | finance-expense-jp, finance-procurement |
| sales | deal_proposal, discount_request, pricing_change | sales-pricing-jp |
| communication | marketing_copy, advertisement, press_release, internal_memo | communication-marketing-jp |
Each pack includes pack.yaml manifest, prompts/ (evaluate, extract, infer_metadata), analyzers/, and templates/.
Deterministic Evaluation Expansion
- services/evaluation/deterministic/: Full module with runner, numeric_evaluator, schema_evaluator, lookup_evaluator, and constraint definitions.
- services/evaluation/subjects/: Six subject context assemblers (business_event, code_change, communication, decision_request, document_artifact, transaction).
Extraction Pipeline Expansion
New extractors added under services/extraction/extractors/:
- contract.py, email_archive.py, handbook.py, minutes.py, regulation.py, tabular.py
New API Routers (38 → 40)
Added: submissions.py (universal intake endpoint) and scim.py (SCIM 2.0 identity provisioning).
New Schemas
schemas/submissions.py: Pydantic models for the universal submissions endpoint.
Test Suite Updates
- Acceptance tests added: contract_review, cross_department_rbac, expense_roundtrip, hr_attendance, multilingual_rule, sales_email.
- New unit tests: deterministic_evaluator, evaluation_subject, extraction_pipeline, governance, lookup_evaluator, schema_evaluator, submissions_schema.
- Integration tests: cross_domain_coverage, domain_pack_scaffolding, domain_packs_all, multilingual, scope.
- Removed: connector-related tests (test_connectors, test_connector_hub, test_business_connectors).
- Total: 117 test files.
Updated Metrics
| Metric | Before Refocus | After Refocus |
|---|---|---|
| API Routers | 38 | 40 |
| ORM Models | 37 | 37 |
| Alembic Migrations | 38 | 37 |
| Test Files | ~100 | 117 |
| Domain Packs (packages/) | 9 (mixed) | 6 (clean) |
| MCP Tools | 24 | 24 |
| Frontend Pages | 61 | 61 |