Testing Strategy¶

Three-Layer Testing¶

Layer 1: Unit Tests (< 5s total)¶

Pure logic tests. No filesystem, no network, no subprocess.

What to test: - internal/model/: type validation, sorting, ParseFailure helper - internal/score/: scoring algorithm with various rule/finding combinations - internal/score/metrics.go: metric computations with property-based tests - internal/config/: config loading and validation - internal/rule/: registry operations, engine evaluation with fake FactStore - internal/fix/: plan generation, change application, rollback logic - internal/policy/: policy enforcement with table tests - Individual resolvers against in-memory FactStore

Conventions: - Table-driven tests with ≥ 3 cases per function - Subtests named descriptively: t.Run("empty_repo_produces_no_findings", ...) - No testify or assertion libraries — standard testing package only - Property-based tests for scoring and metrics (generate random inputs, assert invariants)

Layer 2: Pack Tests (< 20s total)¶

Each rule runs against its fixture repo. Output is diffed against expected.json.

Structure:

packs/core/
├── fixtures/
│   ├── P1.LOC.001/
│   │   ├── input/          # minimal repo that triggers the rule
│   │   │   └── (no CLAUDE.md or AGENTS.md)
│   │   └── expected.json   # expected finding shape
│   ├── P1.LOC.002/
│   │   ├── input/
│   │   │   ├── services/
│   │   │   │   └── billing/
│   │   │   │       └── main.go
│   │   │   └── CLAUDE.md
│   │   └── expected.json
│   └── ...
└── pack_test.go            # table tests iterating fixtures

How pack tests work: 1. Build a FactStore from the fixture's input/ directory 2. Run the rule's resolver against that FactStore 3. Canonicalize findings (sort, strip non-deterministic fields) 4. Compare against expected.json

Golden test updates:

make update-golden    # regenerate expected.json files
git diff              # review EVERY change carefully

Never commit golden updates and code changes in the same commit without a review note.

Layer 3: End-to-End (< 60s, CI only by default)¶

Full archfit scan on controlled repos in testdata/e2e/.

Structure:

testdata/e2e/
├── golden_clean/       # repo that passes all rules
│   ├── input/
│   └── expected.json
├── golden_findings/    # repo with known findings
│   ├── input/
│   └── expected.json
└── e2e_test.go

Asserts: full JSON output shape, overall score, exit code.

Test Helpers¶

Fake FactStore¶

// Used in unit tests for resolvers
type fakeFactStore struct {
    repo    model.RepoFacts
    git     model.GitFacts
    gitOK   bool
    schemas model.SchemaFacts
}

Build with helper functions:

facts := newFakeFactStore(model.RepoFacts{
    Root: "/fake",
    ByBase: map[string][]string{
        "claude.md": {"CLAUDE.md"},
    },
})

Fake exec.Runner¶

runner := exec.NewFake(map[string]exec.FakeResult{
    "git log": {Stdout: "abc123 feat: something\n", ExitCode: 0},
})

Fake llm.Client¶

client := llm.NewFake(llm.FakeConfig{
    Response: "This is a fake LLM response",
    Model:    "fake-model",
})

What NOT to Test¶

Do not test Go standard library behavior
Do not test exact string formatting of terminal output (test structure instead)
Do not pin LLM output as golden (it's non-deterministic by design)
Do not test with real API keys, real git remotes, or real network

Running Tests¶

make test          # unit + pack tests with -race
make test-short    # fast subset
make e2e           # end-to-end (CI default)
make self-scan     # archfit on itself

All tests must pass with -race flag. Non-determinism is a bug.

Adding Tests for a New Rule¶

Create packs/<pack>/fixtures/<rule-id>/input/ with a minimal repo that triggers the rule
Run the resolver manually to get the expected output
Save as packs/<pack>/fixtures/<rule-id>/expected.json
Add a row to the table test in pack_test.go
Verify: go test -run TestPack/<rule-id> ./packs/<pack>/

AST-Dependent Rules¶

Rules that consume ASTFacts require an additional parse-failure fixture. This fixture contains a file that is syntactically invalid (e.g., truncated Go source) and asserts that the resolver produces a severity: warn / evidence_strength: strong finding via model.ParseFailure rather than silently returning zero findings. This prevents AST-dependent rules from appearing to pass on repos where the collector could not parse the source.

Adding Tests for a New Fixer¶

Create a FactStore that produces a finding for the target rule
Call fixer.Plan(ctx, finding, facts) and assert the proposed changes
Apply changes to an in-memory filesystem (map of path → content)
Re-run the resolver against the updated facts
Assert the finding is gone