Architecture
Note
This page describes the public architecture. For implementation caveats around BudgetPolicy, native tool calling, and typed-intent schema enforcement, see Development -> Current State.
Layer Structure
HTTP Request (ASGI)
-> Trace Extraction    W3C traceparent header -> OpenTelemetry Context (if OTel installed)
-> Authentication      Scheme extraction + verify function (if auth= configured)
-> Interface Layer     Intent parsing (incl. Intent[T] typed payloads), sessions, response formatting
-> Dependencies        FastAPI-style Depends() resolution with async teardown
-> Harness Engine      Policy evaluation, static analysis, approval workflows
-> Agent Runtime       Tool-first dispatch or LLM code generation, context assembly, tool registry
-> Sandbox             Isolated process execution with resource limits
-> Audit               In-memory or SqliteAuditRecorder persistent storage
-> Observability       OpenTelemetry spans, Prometheus metrics, partial automatic cost attribution
-> Response            Structured result with generated code, reasoning, trace ID
All requests flow top-to-bottom. Each layer is independently testable and dependencies are strictly one-directional.
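The first layer reads the W3C `traceparent` header. As a rough illustration of what that extraction involves, the sketch below parses a version-00 header with the standard library only. The names `TraceParent` and `parse_traceparent` are hypothetical and are not part of AgenticAPI; the real layer hands the values to OpenTelemetry when it is installed.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: Optional[str]) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is missing or malformed."""
    if not header:
        return None
    match = _TRACEPARENT.match(header.strip())
    # Version "ff" is reserved as invalid by the specification.
    if match is None or match["version"] == "ff":
        return None
    # All-zero trace or parent IDs are explicitly invalid.
    if set(match["trace_id"]) == {"0"} or set(match["parent_id"]) == {"0"}:
        return None
    sampled = bool(int(match["flags"], 16) & 0x01)
    return TraceParent(match["trace_id"], match["parent_id"], sampled)
```

A malformed or absent header returns `None`, in which case a server would typically start a fresh trace rather than reject the request.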
Module Dependency Graph
agenticapi.dependencies -> (none — standalone injection layer)
agenticapi.interface -> agenticapi.harness, agenticapi.runtime, agenticapi.dependencies
agenticapi.harness -> agenticapi.runtime (interface portion only)
agenticapi.runtime -> external dependencies only (LLM SDK, httpx, etc.)
agenticapi.observability -> optional external (opentelemetry, prometheus), no-op otherwise
agenticapi.application -> agenticapi.runtime, agenticapi.harness
agenticapi.ops -> agenticapi.runtime, agenticapi.harness, agenticapi.application
agenticapi.cli -> all modules
agenticapi.testing -> all modules
Prohibited dependencies:
- runtime -> interface (runtime must not know about the interface)
- harness -> interface (harness must not know about the interface)
- harness -> application (harness must not know about the application layer)
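Rules like these can be checked mechanically. The following is a minimal sketch, assuming a test that scans a module's source with `ast` and flags absolute imports that cross a prohibited boundary. The helper names and the submodule names in the usage example are hypothetical, and relative imports are deliberately ignored here.

```python
import ast
from typing import List, Optional

# Prohibited edges from the list above: (importing layer, imported layer).
PROHIBITED = {
    ("runtime", "interface"),
    ("harness", "interface"),
    ("harness", "application"),
}


def layer_of(module: str) -> Optional[str]:
    """Map 'agenticapi.harness.engine' to 'harness'; None for other packages."""
    parts = module.split(".")
    if len(parts) >= 2 and parts[0] == "agenticapi":
        return parts[1]
    return None


def find_violations(module: str, source: str) -> List[str]:
    """Return imported module names that cross a prohibited layer boundary."""
    importer = layer_of(module)
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]
        else:
            continue
        for target in targets:
            if (importer, layer_of(target)) in PROHIBITED:
                violations.append(target)
    return violations
```

For example, `find_violations("agenticapi.harness.engine", "from agenticapi.interface.intent import Intent")` reports the interface import, while the same source scanned as an `agenticapi.interface` module reports nothing.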
Request Processing Pipeline
HTTP POST /agent/{endpoint_name}
|
v
Authentication (if auth= configured) -> 401 if invalid
|
v
Parse request body:
|-- JSON: {"intent": "...", "session_id": "..."}
|-- Multipart: intent form field + file fields -> UploadedFiles
|
v
AgenticApp.process_intent()
|
v
Resolve endpoint -> AgentEndpointDef
|
v
Get or create session -> Session (with TTL-based expiration)
|
v
IntentParser.parse() -> Intent (or Intent[T] with validated payload)
|-- With LLM: structured JSON extraction via prompt; for Intent[T], schema is forwarded
|-- Without LLM: keyword-based classification
|
v
Check IntentScope.matches(intent) -> PolicyViolation if denied
|
v
Build AgentContext (trace_id, endpoint_name, session_id, auth_user)
|
v
Resolve Depends() tree using the precomputed InjectionPlan (per-request cache, async generators)
|
v
Execute intent:
|
|-- [LLM + Harness path]:
|     1. Try tool-first dispatch when tools are registered and the LLM returns exactly one ToolCall
|     2. Pre-fetch tool data (call registered tools)
|     3. CodeGenerator.generate() -> Python code (with data sample in prompt)
|     4. HarnessEngine.execute():
|        a. PolicyEvaluator.evaluate() (CodePolicy, DataPolicy, ResourcePolicy, RuntimePolicy)
|        b. Static AST analysis (imports, eval/exec, dangerous builtins, file I/O)
|        c. ApprovalWorkflow check (raise ApprovalRequired if needed)
|        d. ProcessSandbox.execute() (isolated subprocess with timeout)
|        e. Post-execution monitors (resource usage, output size)
|        f. Post-execution validators (JSON serializable, read-only check)
|        g. AuditRecorder.record() (full ExecutionTrace)
|     5. Return ExecutionResult -> AgentResponse
|
|-- [Direct handler path]:
|     1. Inject AgentTasks, UploadedFiles, HtmxHeaders, Depends() values as declared
|     2. Call handler(intent, context, ...)
|     3. If result is Response/FileResult -> pass through (file download)
|     4. Otherwise -> wrap result in AgentResponse
|
v
Emit observability signals: span events, latency histogram, and any LLM metrics recorded on that path
|
v
Execute background tasks (AgentTasks) if any
|
v
Update session with result summary
|
v
Return response:
|-- AgentResponse -> JSON (HTTP 200, 202, 4xx, 5xx)
|-- Response/FileResponse/StreamingResponse -> direct passthrough
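The dependency step in the pipeline above combines a per-request cache with teardown for generator-style dependencies. The sketch below shows one way that pattern can be built from `contextlib.AsyncExitStack`. It is an illustration under simplifying assumptions, not AgenticAPI's resolver: it takes a flat list of dependency callables instead of a `Depends()` tree or an `InjectionPlan`, and `resolve`, `handle_request`, `get_db`, and `handler` are hypothetical names.

```python
import asyncio
import inspect
from contextlib import AsyncExitStack, asynccontextmanager


async def resolve(dep, cache, stack):
    """Resolve one dependency callable, reusing results within a request."""
    if dep in cache:
        return cache[dep]
    if inspect.isasyncgenfunction(dep):
        # Yield-style dependency: code after `yield` runs when the stack closes.
        value = await stack.enter_async_context(asynccontextmanager(dep)())
    elif inspect.iscoroutinefunction(dep):
        value = await dep()
    else:
        value = dep()
    cache[dep] = value
    return value


async def handle_request(handler, deps):
    """Resolve deps, call the handler, then run teardown in reverse order."""
    cache = {}
    async with AsyncExitStack() as stack:
        values = [await resolve(dep, cache, stack) for dep in deps]
        return await handler(*values)


# Hypothetical dependency and handler used to observe the ordering.
events = []


async def get_db():
    events.append("open")
    yield "db"
    events.append("close")


async def handler(first, second):
    events.append("handle")
    return (first, second)
```

Running `asyncio.run(handle_request(handler, [get_db, get_db]))` opens the dependency once even though it is requested twice, and the code after `yield` runs only after the handler has returned.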
Mapping to FastAPI/Starlette
| FastAPI/Starlette | AgenticAPI | Notes |
|---|---|---|
| FastAPI | AgenticApp | Main ASGI application |
| @app.get("/path") | @app.agent_endpoint(name=...) | Endpoint registration |
| APIRouter | AgentRouter | Endpoint grouping with prefix/tags |
| Request | Intent / Intent[T] | Input (natural language -> structured, optionally typed) |
| Response | AgentResponse | Output with result, reasoning, trace |
| Depends() | Depends() | FastAPI-compatible dependency injection |
| BackgroundTasks | AgentTasks | Post-response task execution |
| UploadFile | UploadedFiles | File upload via multipart |
| FileResponse | FileResult | File download helper |
| HTMLResponse | HTMLResult | HTML response |
| PlainTextResponse | PlainTextResult | Plain text response |
| (none) | HtmxHeaders | HTMX request header detection (auto-injected) |
| Typed request schema | Intent[T] | Typed intent payload parsing |
| @app.get(..., response_model=T) | @app.agent_endpoint(..., response_model=T) | Typed response validation and OpenAPI publication |
| Security schemes | Authenticator | API key, Bearer, Basic auth |
| app.add_middleware() | app.add_middleware() | Starlette middleware (CORS, compression) |
| Middleware stack | + DynamicPipeline | DynamicPipeline is for agent context enrichment inside handlers |
| Pydantic model | Pydantic model | Schema definitions |
| OpenTelemetry via middleware | agenticapi.observability | Built-in tracing + metrics, no-op if not installed |
| ASGI interface | ASGI interface | Direct uvicorn compatibility |
Approval Resolution Flow
When a write operation requires human approval, the harness raises ApprovalRequired, which the server reports as HTTP 202. The approval must be resolved before the operation can proceed.
1. Client sends intent: POST /agent/orders {"intent": "delete cancelled orders"}
2. Server responds HTTP 202:
    {
        "status": "pending_approval",
        "error": "Approval required by rule 'write_gate'",
        "approval_request": {
            "request_id": "abc123",
            "approvers": ["admin"]
        }
    }
3. Approver resolves (programmatically or via admin UI):
    await workflow.resolve("abc123", approved=True, approver="admin@example.com")
4. Client retries the same intent; the write action now executes.
Key points:
- request_id uniquely identifies the pending approval
- ApprovalWorkflow.resolve() is called programmatically (no built-in HTTP endpoint yet)
- After approval, the client must re-submit the original intent
- Approvals expire after timeout_seconds (default 3600s)
- Rejected approvals raise ApprovalDenied (HTTP 403)
- Expired approvals raise ApprovalTimeout (HTTP 408)
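The lifecycle in these key points (pending, approved, rejected, expired) can be modelled as a small state check. The sketch below is a stand-alone illustration, not the library's `ApprovalWorkflow`: the class `InMemoryApprovals`, its synchronous methods, and the locally defined exception classes are hypothetical stand-ins, and it assumes a decision that has already been recorded takes precedence over expiry.

```python
import time
import uuid


class ApprovalDenied(Exception):
    """Local stand-in; the document maps this outcome to HTTP 403."""


class ApprovalTimeout(Exception):
    """Local stand-in; the document maps this outcome to HTTP 408."""


class InMemoryApprovals:
    def __init__(self, timeout_seconds=3600.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._requests = {}

    def create(self):
        """Register a pending approval and return its request_id."""
        request_id = uuid.uuid4().hex
        self._requests[request_id] = {"created": self._clock(), "approved": None}
        return request_id

    def resolve(self, request_id, approved):
        """Record the approver's decision."""
        self._requests[request_id]["approved"] = approved

    def check(self, request_id):
        """Return True if approved, False if still pending; raise otherwise."""
        entry = self._requests[request_id]
        if entry["approved"] is False:
            raise ApprovalDenied(request_id)
        if entry["approved"] is True:
            return True
        if self._clock() - entry["created"] > self._timeout:
            raise ApprovalTimeout(request_id)
        return False
```

When the client re-submits the original intent, a server built this way would call `check()` and either execute the write, answer 202 again, or translate the exception into the matching status code.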
Capability Discovery
External agents can discover what an AgenticAPI service offers:
GET /capabilities -> {
    "title": "My Service",
    "version": "0.1.0",
    "endpoints": [
        {
            "name": "orders.query",
            "description": "Query order information",
            "autonomy_level": "auto",
            "intent_scope": {
                "allowed_intents": ["order.*", "*.read"],
                "denied_intents": []
            }
        }
    ]
}
This enables agents to programmatically discover endpoints, understand what intents are accepted, and adapt their requests accordingly.
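A client can use the `intent_scope` patterns to pre-check an intent before sending it. The sketch below assumes the patterns are shell-style globs and that deny patterns take precedence over allow patterns. Both are assumptions made for illustration, since the exact semantics of `IntentScope.matches()` are not specified here, and `scope_allows` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def scope_allows(intent_name, allowed_intents, denied_intents=()):
    """Deny patterns win; otherwise at least one allow pattern must match."""
    if any(fnmatchcase(intent_name, pattern) for pattern in denied_intents):
        return False
    return any(fnmatchcase(intent_name, pattern) for pattern in allowed_intents)
```

With the scope shown in the response above, `scope_allows("order.cancel", ["order.*", "*.read"])` and `scope_allows("user.read", ["order.*", "*.read"])` are both true, while `"user.delete"` matches neither pattern and is refused.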
OpenAPI / Swagger / ReDoc
AgenticApp automatically generates OpenAPI 3.1.0 documentation:
- GET /openapi.json: OpenAPI schema
- GET /docs: Swagger UI
- GET /redoc: ReDoc
Disable with AgenticApp(docs_url=None, redoc_url=None, openapi_url=None).
MCP (Model Context Protocol) Support
Agent endpoints can be exposed as MCP tools for use by Claude Desktop, Cursor, and other MCP clients. Requires pip install agentharnessapi[mcp].
from agenticapi.interface.compat import expose_as_mcp

# `app` is an existing AgenticApp instance.
@app.agent_endpoint(name="search", enable_mcp=True)
async def search(intent, context):
    ...

# Expose MCP-enabled endpoints at /mcp
expose_as_mcp(app)
Test with MCP Inspector: npx @modelcontextprotocol/inspector http://localhost:8000/mcp
Safety Architecture (Defense in Depth)
Layer 1: Prompt Design         XML-escaped user input, explicit safety instructions
Layer 2: Static AST Analysis   Forbidden imports, eval/exec, getattr, file I/O
Layer 3: Policy Evaluation     CodePolicy, DataPolicy, ResourcePolicy, RuntimePolicy
Layer 4: Approval Workflow     Human-in-the-loop for sensitive operations
Layer 5: Process Sandbox       Isolated subprocess, timeout, base64 code transport
Layer 6: Post-Execution        Resource monitors, output validators
Layer 7: Audit Trail           Full ExecutionTrace for every operation
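To make Layer 2 concrete, here is a minimal sketch of a static check built on Python's `ast` module. The deny-lists and the `scan` function are illustrative assumptions, not the harness's actual rules; a real analyzer would also need to handle aliasing, attribute access such as `builtins.eval`, and other evasion paths.

```python
import ast

# Illustrative deny-lists only; the real policy sets are not shown in this page.
FORBIDDEN_MODULES = {"os", "subprocess", "socket"}
FORBIDDEN_CALLS = {"eval", "exec", "getattr", "open", "__import__"}


def scan(code: str) -> list:
    """Return human-readable findings for forbidden imports and calls."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            # Compare on the top-level package so "os.path" is caught too.
            if name.split(".")[0] in FORBIDDEN_MODULES:
                findings.append(f"line {node.lineno}: import of {name}")
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in FORBIDDEN_CALLS
        ):
            findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings
```

Because the check works on the parsed tree and never runs the code, it can reject generated code before the sandbox is involved. It is one layer among several because static checks of this kind are easy to evade on their own.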