Correction-to-Rule Flywheel
The Rule Repository learns from human behavior. Every time a human corrects agent-generated code, the system captures the correction, analyzes the pattern, clusters similar corrections, and auto-drafts rule proposals for one-click approval.
How It Works
- Agent writes code — evaluated against rules, gets ALLOW
- Human reviews — corrects something the rules missed
- Correction captured — automatically from PR merge or manually submitted
- Analyzer classifies — is this a new rule, an improvement, or a scope gap?
- Daily worker clusters —
cluster_correctionsgroups similar corrections by embedding similarity - Rule auto-drafted — Gemini drafts a rule statement from the cluster evidence
- Maintainer approves — one-click approval creates a rule in experimental maturity (shadow mode)
- Rule graduates — as the rule proves accurate, it auto-promotes to stable, then proven
- Agent receives rule — via MCP, writes compliant code from now on
Capture Methods
| Method | Trigger | Effort |
|---|---|---|
| Auto PR capture | PR merged with changes beyond evaluated diff | Zero (automatic) |
| Manual submission | User fills in original vs corrected diff on /feedback |
Low |
| Agent hook | rulerepo-hook posthoc detects human edits after agent writes |
Zero (if hooks configured) |
Correction Analysis
Each correction is classified:
new_rule— No existing rule covers this pattern. A new rule candidate is generated.improve_existing— A rule exists but its wording is ambiguous. A rewrite is suggested.adjust_scope— A rule exists but wasn't delivered to the agent. The scope mapping is widened.
Clustering and Auto-Drafting
The cluster_corrections background worker (runs daily at 5am) implements the flywheel:
- Fetch: loads all pending corrections from the last 14 days
- Embed: generates embeddings for each correction's delta summary using Gemini
- Cluster: groups corrections by cosine similarity (threshold 0.8)
- Filter: keeps clusters with 3+ corrections and average confidence > 0.8
- Draft: calls Gemini to generate a structured rule (statement, modality, severity, scope, rationale)
- Store: creates
DraftRuleProposalModelentries with status "pending"
Proposal Review
Proposals are available at:
GET /api/v1/feedback/proposals— list pending proposals with evidencePOST /api/v1/feedback/proposals/{id}/approve— creates a rule withmaturity_level=experimentalPOST /api/v1/feedback/proposals/{id}/dismiss— marks as dismissed
Approved proposals create rules in shadow mode (experimental maturity), meaning they observe but don't block until they prove accurate.
Effectiveness Tracking
After a rule is created from corrections, the system tracks:
- Did corrections in that pattern decrease?
- What's the false positive rate? (tracked by false_positive_count / true_positive_count on RuleModel)
- Is the rule actually being evaluated? (health scoring activity dimension)
The auto_promote_rules worker automatically graduates rules from experimental → stable → proven based on accuracy.
Data Reconciliation
If Elasticsearch or Neo4j fall behind PostgreSQL:
uv run python scripts/reindex_elasticsearch.py # Rebuild ES index
uv run python scripts/reconcile_graph.py # Rebuild Neo4j graph
PostgreSQL is always the source of truth.
See Also
- Feedback Loop -- correction submission and analysis
- Health Scoring -- rule health dimensions including activity
- Background Workers -- cron job details