Harness & Safety
The harness system is AgenticAPI's core safety layer. Every piece of LLM-generated code passes through a multi-stage pipeline that checks it before execution, runs it in a sandbox, and validates and records the result afterwards.
Pipeline
Generated Code
    -> Policy Evaluation (CodePolicy, DataPolicy, ResourcePolicy, RuntimePolicy)
    -> Static AST Analysis (check_code_safety)
    -> Approval Check (ApprovalWorkflow, if configured)
    -> Sandbox Execution (ProcessSandbox)
    -> Post-Execution Monitors (ResourceMonitor, OutputSizeMonitor)
    -> Post-Execution Validators (OutputTypeValidator, ReadOnlyValidator)
    -> Audit Recording (AuditRecorder + ExecutionTrace)
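The staged flow above can be pictured as a chain of checks in which the first rejecting stage stops everything that follows. The sketch below is not AgenticAPI's implementation: `StageResult`, `PipelineOutcome`, `run_pipeline`, and the two toy stages are hypothetical names, used only to illustrate the fail-fast ordering.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StageResult:
    """Hypothetical verdict returned by one stage."""
    ok: bool
    reason: str = ""


# A stage takes the generated source and returns a verdict.
Stage = Callable[[str], StageResult]


@dataclass
class PipelineOutcome:
    allowed: bool
    passed: list = field(default_factory=list)
    failed_stage: Optional[str] = None
    reason: str = ""


def run_pipeline(code: str, stages: list) -> PipelineOutcome:
    """Run (name, stage) pairs in order; stop at the first rejection."""
    outcome = PipelineOutcome(allowed=True)
    for name, stage in stages:
        result = stage(code)
        if not result.ok:
            outcome.allowed = False
            outcome.failed_stage = name
            outcome.reason = result.reason
            return outcome
        outcome.passed.append(name)
    return outcome


def max_lines_stage(code: str) -> StageResult:
    # Toy stand-in for a policy check: reject overly long code.
    if len(code.splitlines()) > 500:
        return StageResult(False, "too many lines")
    return StageResult(True)


def syntax_stage(code: str) -> StageResult:
    # Toy stand-in for static analysis: reject code that does not compile.
    try:
        compile(code, "<generated>", "exec")
    except SyntaxError as exc:
        return StageResult(False, f"syntax error: {exc.msg}")
    return StageResult(True)


STAGES = [("policy", max_lines_stage), ("static_analysis", syntax_stage)]
```

Calling `run_pipeline("x = (", STAGES)` stops at the second stage, so a sandbox stage placed later in the list would never see the rejected code.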
HarnessEngine
from agenticapi import HarnessEngine, CodePolicy, DataPolicy

harness = HarnessEngine(
    policies=[
        CodePolicy(denied_modules=["os", "subprocess"]),
        DataPolicy(deny_ddl=True),
    ],
)
See Policies, Approval, and Security for details on each stage.
Four Policy Types
CodePolicy
Controls what Python constructs are allowed in generated code:
CodePolicy(
    denied_modules=["os", "subprocess", "sys", "shutil"],  # Blocked imports
    deny_eval_exec=True,  # No eval() or exec()
    deny_dynamic_import=True,  # No __import__()
    allow_network=False,  # No socket/urllib/requests
    max_code_lines=500,  # Max generated code size
)
DataPolicy
Controls SQL data access patterns:
DataPolicy(
    readable_tables=["orders", "products"],
    writable_tables=["orders", "cart"],
    restricted_columns=["password_hash", "ssn"],
    deny_ddl=True,
    max_result_rows=1000,
)
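To make the effect of these fields concrete, here is a deliberately naive sketch of how such rules could be applied to a SQL string. It is an illustration under assumptions, not how DataPolicy is implemented: `check_sql` and the regular expressions are hypothetical, and a real checker would more likely use a proper SQL parser than pattern matching.

```python
import re

# Hypothetical rule set mirroring the DataPolicy fields shown above.
READABLE = {"orders", "products"}
WRITABLE = {"orders", "cart"}
RESTRICTED_COLUMNS = {"password_hash", "ssn"}

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
DDL_RE = re.compile(r"^\s*(?:create|alter|drop|truncate)\b", re.IGNORECASE)
READ_RE = re.compile(rf"\b(?:from|join)\s+({IDENT})", re.IGNORECASE)
WRITE_RE = re.compile(
    rf"^\s*(?:insert\s+into|update|delete\s+from)\s+({IDENT})", re.IGNORECASE
)


def check_sql(sql: str) -> list:
    """Return a list of violations; an empty list means the query passes."""
    violations = []
    if DDL_RE.search(sql):
        violations.append("DDL statements are denied")

    write = WRITE_RE.search(sql)
    write_target = write.group(1).lower() if write else None
    if write_target is not None and write_target not in WRITABLE:
        violations.append(f"table not writable: {write_target}")

    for table in READ_RE.findall(sql):
        name = table.lower()
        if name == write_target:
            continue  # "DELETE FROM t" is covered by the write check
        if name not in READABLE:
            violations.append(f"table not readable: {name}")

    # Flag restricted columns wherever they appear as identifiers.
    identifiers = {tok.lower() for tok in re.findall(IDENT, sql)}
    for column in sorted(RESTRICTED_COLUMNS & identifiers):
        violations.append(f"restricted column: {column}")
    return violations
```

With these rules, `SELECT id FROM orders` passes, while `SELECT ssn FROM users` is reported twice: once for the unreadable table and once for the restricted column.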
ResourcePolicy
Limits computational resources.
RuntimePolicy
Limits code complexity.
Static Analysis
Before sandbox execution, code is parsed into an AST and checked for:
- Forbidden module imports
- eval()/exec() calls
- __import__() calls
- Dangerous builtins (compile, globals, locals, vars, getattr, setattr, delattr)
- File I/O (open())
- Syntax errors
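The checks listed above map naturally onto Python's standard `ast` module. The following is a minimal sketch of that technique, not the library's `check_code_safety`: the function name `find_violations` and the two sets are assumptions chosen to match the list.

```python
import ast

# Assumed defaults for illustration; the real check may use other sets.
DENIED_MODULES = {"os", "subprocess", "sys", "shutil"}
DANGEROUS_CALLS = {
    "eval", "exec", "__import__", "compile", "globals", "locals",
    "vars", "getattr", "setattr", "delattr", "open",
}


def find_violations(source: str) -> list:
    """Parse source into an AST and list the constructs that would be flagged."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os.path" is denied because its root module is "os".
            for alias in node.names:
                if alias.name.split(".")[0] in DENIED_MODULES:
                    violations.append(f"forbidden import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""  # None for "from . import x"
            if module.split(".")[0] in DENIED_MODULES:
                violations.append(f"forbidden import: {module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by name are caught here.
            if node.func.id in DANGEROUS_CALLS:
                violations.append(f"forbidden call: {node.func.id}()")
    return violations
```

A name-based check like this misses aliased calls such as `e = eval; e("1")`, which is one reason static analysis is followed by sandboxed execution rather than trusted on its own.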
Audit Trail
Every execution is recorded as an ExecutionTrace:
records = harness.audit_recorder.get_records(endpoint_name="orders", limit=50)
for trace in records:
    print(f"[{trace.timestamp}] {trace.intent_raw} -> {trace.execution_duration_ms}ms")
Choosing a recorder
Two recorders ship in the box, both satisfying the same AuditRecorderProtocol:
| Recorder | Storage | When to use |
|---|---|---|
| InMemoryAuditRecorder | Process memory, bounded ring buffer | Dev, tests, single-process demos |
| SqliteAuditRecorder | Local SQLite file (stdlib sqlite3) | Production, dashboards, post-mortems |
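The "bounded ring buffer" in the first row means that old traces are discarded once a fixed capacity is reached. A rough sketch of that behavior using `collections.deque` follows; `Trace` and `RingBufferRecorder` are hypothetical stand-ins, and the newest-first ordering of `get_records` is an assumption rather than documented behavior.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trace:
    """Hypothetical stand-in for ExecutionTrace with two of its fields."""
    endpoint_name: str
    intent_raw: str


class RingBufferRecorder:
    """Illustrative in-memory recorder; not the library's implementation."""

    def __init__(self, max_traces: int) -> None:
        # deque(maxlen=...) silently drops the oldest item when full.
        self._traces = deque(maxlen=max_traces)

    def record(self, trace: Trace) -> None:
        self._traces.append(trace)

    def get_records(self, endpoint_name: Optional[str] = None, limit: int = 100) -> list:
        if limit <= 0:
            return []
        matches = [
            t for t in self._traces
            if endpoint_name is None or t.endpoint_name == endpoint_name
        ]
        # Assumed ordering: newest first, capped at `limit`.
        return matches[-limit:][::-1]
```

Because the buffer lives in process memory, its contents disappear on restart and are not shared between worker processes, which matches the "dev, tests, single-process demos" guidance in the table.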
from agenticapi.harness import HarnessEngine
from agenticapi.harness.audit import SqliteAuditRecorder
recorder = SqliteAuditRecorder(path="./audit.sqlite", max_traces=100_000)
harness = HarnessEngine(audit_recorder=recorder, policies=[...])
SqliteAuditRecorder serializes writes behind an asyncio.Lock and dispatches blocking SQLite work through asyncio.to_thread, so it is safe to share across concurrent requests without blocking the event loop. On first use it creates two indexes, on (timestamp DESC) and (endpoint_name), which keep the standard dashboard queries fast.
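The lock-plus-thread pattern described above can be sketched with only the standard library. This is an illustration of the general technique, not SqliteAuditRecorder's source: the class `AsyncSqliteLog`, the table layout, and the index names are all assumptions, and the sketch creates its indexes in the constructor for brevity.

```python
import asyncio
import sqlite3
import time


class AsyncSqliteLog:
    """Hypothetical recorder illustrating the lock + to_thread pattern."""

    def __init__(self, path: str) -> None:
        # check_same_thread=False lets worker threads use the connection;
        # the asyncio.Lock ensures only one of them does so at a time.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = asyncio.Lock()
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS traces "
            "(timestamp REAL, endpoint_name TEXT, intent_raw TEXT)"
        )
        # Assumed index definitions matching the two described above.
        self._conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_traces_ts ON traces (timestamp DESC)"
        )
        self._conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_traces_endpoint ON traces (endpoint_name)"
        )
        self._conn.commit()

    def _insert(self, endpoint_name: str, intent_raw: str) -> None:
        # Blocking work: runs in a worker thread, never on the event loop.
        self._conn.execute(
            "INSERT INTO traces VALUES (?, ?, ?)",
            (time.time(), endpoint_name, intent_raw),
        )
        self._conn.commit()

    def _count(self) -> int:
        return self._conn.execute("SELECT COUNT(*) FROM traces").fetchone()[0]

    async def record(self, endpoint_name: str, intent_raw: str) -> None:
        async with self._lock:
            await asyncio.to_thread(self._insert, endpoint_name, intent_raw)

    async def count(self) -> int:
        async with self._lock:
            return await asyncio.to_thread(self._count)

    def close(self) -> None:
        self._conn.close()


async def demo(n: int) -> int:
    # Construct inside the running loop so the lock belongs to it.
    log = AsyncSqliteLog(":memory:")
    await asyncio.gather(*(log.record("orders", f"intent {i}") for i in range(n)))
    total = await log.count()
    log.close()
    return total
```

Running `asyncio.run(demo(20))` issues twenty concurrent writes; the lock queues them one at a time while the event loop stays free to serve other coroutines.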
Additional helpers beyond the shared protocol: iter_since(datetime) for streaming recent traces to a dashboard, vacuum_older_than(cutoff) for retention policies, and count() / clear() / close() for administrative workflows.
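As a rough picture of what streaming and retention helpers of this kind do underneath, the sketch below runs the equivalent queries directly against an in-memory SQLite table. The functions `rows_since` and `delete_older_than` are hypothetical analogues of iter_since and vacuum_older_than, and the schema and return values are assumptions, not the library's signatures.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def make_store() -> sqlite3.Connection:
    # Assumed schema: timestamps stored as epoch seconds.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE traces (timestamp REAL, endpoint_name TEXT)")
    return conn


def add_trace(conn: sqlite3.Connection, when: datetime, endpoint_name: str) -> None:
    conn.execute("INSERT INTO traces VALUES (?, ?)", (when.timestamp(), endpoint_name))


def rows_since(conn: sqlite3.Connection, since: datetime):
    """Yield (timestamp, endpoint_name) rows at or after `since`, oldest first."""
    cursor = conn.execute(
        "SELECT timestamp, endpoint_name FROM traces "
        "WHERE timestamp >= ? ORDER BY timestamp",
        (since.timestamp(),),
    )
    yield from cursor


def delete_older_than(conn: sqlite3.Connection, cutoff: datetime) -> int:
    """Delete rows older than `cutoff` and return how many were removed."""
    cursor = conn.execute(
        "DELETE FROM traces WHERE timestamp < ?", (cutoff.timestamp(),)
    )
    conn.commit()
    return cursor.rowcount


# Example: keep one week of history.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
store = make_store()
for days_ago in (9, 5, 1):
    add_trace(store, now - timedelta(days=days_ago), "orders")
removed = delete_older_than(store, now - timedelta(days=7))
```

A retention job would typically call the delete helper on a schedule, while a dashboard polls the streaming helper with the timestamp of the last row it has already displayed.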
See API Reference → Audit for the full signatures.