Agent System

Atamaia's agent system is a full execution framework for autonomous AI actors. Agents are not wrappers around chat completions. They are long-running, tool-calling, budget-managed, human-supervised execution loops that can spawn children, escalate decisions, checkpoint their state, and recover from failure.

This is the architecture that runs research tasks, code generation, multi-step analysis, and orchestrated workflows across multiple AI models and identities.


Architecture

Human creates AgentRun via API
    │
    ▼
AgentExecutionLoop
    ├── Hydrate context (identity, memories, facts, tasks)
    ├── Build system prompt from AgentRoleDefinition
    ├── Resolve model (route config → fallback)
    ├── Load tool profile (safe/opt-in/blocked)
    │
    ├── MAIN LOOP ─────────────────────────────────
    │   ├── Check budget (iterations, tokens, wall clock)
    │   ├── Inject pending messages (human, child, fact updates)
    │   ├── Build messages (with context compaction at 50/75/90%)
    │   ├── Call LLM via AIRouter
    │   ├── Parse tool calls → filter against profile
    │   ├── Execute tools (with summarization)
    │   ├── 4-mode failure detection
    │   ├── Duplicate read detection (3-level warnings)
    │   ├── Stale loop detection (3 = replan, 6 = escalate)
    │   ├── Checkpoint every 5 iterations
    │   └── Handle escalations, child spawns, pauses
    │
    └── On completion: update status, notify parent, log final event

Core Entities

AgentRoleDefinition

Configurable agent roles stored in the database. Tenants can create custom roles via the API. Each role defines:

Field Description
Name Machine name (e.g. "builder", "researcher")
DisplayName Human-readable name
SystemPromptTemplate System prompt with variable substitution
DefaultMaxIterations Default iteration budget (e.g. 50)
HardIterationCap Absolute maximum even with budget extensions (e.g. 100)
ContextBudgetTokens Token budget for context window (e.g. 60,000)
Temperature LLM temperature default (e.g. 0.3)
FallbackModelId Model to switch to after consecutive failures
Icon Lucide icon name for UI display

The 8 seed roles:

Role Purpose
Builder Code generation, file modification, implementation
Designer Architecture, API design, schema design
Orchestrator Multi-agent coordination, task decomposition
Planner Breaking goals into task DAGs
Researcher Information gathering, analysis, web search
Reviewer Code review, quality checks, validation
Scribe Documentation, summarization, writing
Tester Test generation, test execution, validation

AgentRun

A single execution of work. The primary entity tracking an agent's lifecycle.

POST /api/agent/runs
{
  "goal": "Research Hebbian learning implementations and write a summary",
  "roleDefinitionId": 5,
  "identityId": 2,
  "projectId": 1,
  "modelId": "ai-02:qwen3-30b-a3b",
  "maxIterations": 50,
  "maxTokens": 100000,
  "environmentJson": "{\"councilMode\": \"auto\"}"
}

Response:

{
  "id": 42,
  "guid": "a1b2c3d4-...",
  "goal": "Research Hebbian learning implementations and write a summary",
  "role": "Researcher",
  "roleDefinitionId": 5,
  "identityName": "ash",
  "status": "Pending",
  "maxIterations": 50,
  "iterationsUsed": 0,
  "tokensUsed": 0,
  "costUsd": 0.0,
  "childRunCount": 0,
  "eventCount": 0,
  "createdAtUtc": "2026-03-05T10:00:00Z"
}

Key fields on AgentRun:

Field Purpose
ProjectId Scopes the run to a project (hydration, facts, tasks)
TaskId Links to a specific project task
IdentityId The AI identity executing the work
RoleDefinitionId Which role template to use
ModelId Explicit model override, or resolved from route config
ParentRunId / SpawnDepth Orchestrator hierarchy (max depth: 10)
MaxIterations / IterationsUsed Iteration budget
ContextBudgetTokens / ContextUsedTokens Token budget
PoolIterationBudget / PoolIterationsUsed Shared pool for orchestrators
PlanJson Mutable execution plan (updated as work progresses)
ProgressSummary Human-readable progress text
StaleStepCount Consecutive steps without progress
CheckpointJson Serialized state for recovery
CostUsd Computed cost from token usage and model pricing
InteractionThreadId Message thread for human-agent communication
PauseChatSessionId Full chat session available while agent is paused
EnvironmentJson Configuration cascade (council mode, model overrides)

AgentRun Statuses

Status Description
Pending Created but not started
Running Actively executing
Paused Paused by human or system
WaitingOnChildren Orchestrator waiting for child runs to complete
WaitingOnEscalation Blocked on human decision
WaitingOnParent Child waiting for parent response
Completed Successfully finished
Failed Failed with reason
Cancelled Cancelled by human

Execution Lifecycle

Starting a Run

POST /api/agent/runs/{id}/start

The execution loop:

  1. Restores from checkpoint if available
  2. Hydrates context via HydrationService (agent fast-path: single PostgreSQL function call)
  3. Loads the tool profile for the role
  4. Resolves I/O scoping (output directory, allowed read/write paths)
  5. Builds system prompt from the role definition
  6. Resolves model: explicit override > route config for role > fallback (ai-02:qwen3-30b-a3b)
  7. Optionally convenes a strategy council (for orchestrators)
  8. Enters the main tool-calling loop

Pausing and Resuming

POST /api/agent/runs/{id}/pause
POST /api/agent/runs/{id}/resume

While paused, humans can:

  • Send messages to the agent via POST /api/agent/runs/{id}/messages
  • Open a full chat session: POST /api/agent/runs/{id}/pause-chat
  • Discuss the situation, then resume with handoff notes injected into context

Cancellation

POST /api/agent/runs/{id}/cancel
{ "reason": "Approach is wrong, need to rethink" }

Checkpointing and Recovery

Every 5 iterations, the loop serializes state (messages, iteration count, token usage) to CheckpointJson. On failure or restart:

POST /api/agent/runs/{id}/restart

Creates a new run pre-loaded with the checkpoint from the failed run.


Tool System

Agents interact with the world through a registry of 35+ built-in tools plus dynamic MCP proxy tools.

Built-in Tools

Category Tools
File Read file_read, glob, grep
File Write file_write, file_edit, git_commit, git_diff, git_branch
System bash
Atamaia Service memory_create, memory_search, fact_upsert, fact_search, fact_confirm, task_create, task_update, task_list, channel_send, channel_list, run_note, training_pair_create, model_for_role
Agent Control spawn_child, respond_to_child, message_parent, convene_council, request_budget, wait_approval, context_pin
Agent Tasks agent_task_create, agent_task_update, agent_task_list
LSP lsp_status, lsp_diagnostics, lsp_definition, lsp_references, lsp_symbols, lsp_hover
Web web_search, web_fetch
Meta list_tools

Tool Profiles

Role-based access control over tools. Three tiers:

Tier Description
Safe Available by default. Core capabilities every agent needs.
Opt-in Available if explicitly enabled in the profile. Write operations, cross-system access.
Blocked Never available unless the profile is changed. Admin operations, agent spawning.

Resolution chain: Global Defaults -> Role Profile -> Identity Override

GET /api/agent/tool-profiles
GET /api/agent/tool-profiles/{role}
POST /api/agent/tool-profiles
PUT /api/agent/tool-profiles/{id}

Dynamic MCP Proxy Tools

Registered MCP servers (via POST /api/system/mcp-proxies) are automatically discovered. When pinged, the proxy's tools are loaded into the agent tool registry with names following the Claude Code convention: mcp__{proxy_name}__{tool_name}.

This means agents can call tools on any registered MCP server without code changes.

Tool Safety

From the DefaultToolPolicy:

  • Safe by default: Identity, memory (read+write), facts (read), chat, session continuity, cognitive continuity, experience
  • Opt-in: Facts (write), projects/tasks, documents, messages, channels, AI routing, mirror
  • Blocked: Identity admin, channel admin, connector admin, RBAC, AI provider admin, org admin, agent execution, system logs, mirror admin

File I/O is scoped per run:

  • OutputDirectory: All writes go here ({project.OutputDirectory}/{run-id}/ or ~/.atamaia/agent-data/runs/{run-id}/)
  • AllowedWritePaths: Only these paths accept writes (output dir + project repository for builders)
  • AllowedReadPaths: Optional read restrictions

Failure Detection

Four distinct failure modes, each with a specific recovery strategy:

Mode Detection Recovery
EmptyResponse LLM returns no content and no tool calls Retry with nudge; fail after 3 consecutive empties
PrematureIntent Agent says it will do something but doesn't act Inject reminder to actually execute
StaleLoop 3+ consecutive text-only responses without progress Force replanning
DuplicateRead Same file read 2+ times at same offset Yellow warning at 2, red STOP at 3+

Additional detection:

  • RepeatedToolCall: Same tool with same arguments called repeatedly
  • ContextOverflow: Graduated warnings at 50%, 75%, 90% of context budget, auto-compaction at overflow
  • BudgetExceeded: Iteration or token limit reached

Escalation System

When an agent encounters a situation it cannot resolve autonomously, it escalates to a human.

POST /api/agent/runs/{runId}/escalate
{
  "situation": "Found conflicting requirements in the spec. Section 3.2 says...",
  "options": ["Follow section 3.2", "Follow section 4.1", "Ask for clarification"],
  "confidence": 0.4,
  "supervisorRecommendation": "Section 4.1 is more recent and likely correct"
}

Resolution Modes

Mode Description
QuickPick Human selects from agent-provided options
Discussed Opens a chat session for discussion
TrustedSupervisor Auto-resolve if confidence exceeds threshold
AutoResolved Timed out and auto-resolved
KnowledgeQuery Check facts first, escalate only if not found
BudgetRequest Agent requested more iterations
ApprovalViaMessage Resolved via message thread (semi-awake revision loop)

Escalation supports an approval thread with revision rounds (up to 10 by default), enabling the agent to revise its work based on human feedback without a full resume cycle.

GET /api/agent/escalations
POST /api/agent/escalations/{id}/resolve
{
  "decision": "Follow section 4.1",
  "mode": "QuickPick",
  "notes": "Section 4.1 was updated last week to supersede 3.2"
}

Orchestration

Orchestrator-role agents coordinate multi-agent workflows:

Child Spawning

POST /api/agent/runs/{id}/children
{
  "goal": "Write unit tests for the auth module",
  "roleDefinitionId": 8,
  "maxIterations": 30
}
  • Maximum spawn depth: 10 levels
  • Environment cascades from parent to child
  • Orchestrators can share an iteration pool across children (PoolIterationBudget)

Strategy Councils

Before execution, orchestrators can convene a multi-perspective deliberation:

GET /api/agent/runs/{runId}/councils
GET /api/agent/councils/{id}

A council runs multiple AI perspectives through structured rounds (position, critique, synthesis) with full transcripts saved in a ChatSession. Council modes:

  • auto: Keyword detection triggers council on complex goals
  • required: Always convene
  • none: Skip

Parent-Child Messaging

Children communicate with parents via event bus. The message_parent and respond_to_child tools enable real-time coordination during execution.


Human-Agent Interaction

During Execution

POST /api/agent/runs/{runId}/messages
{
  "content": "Focus on the auth module first, skip the UI for now",
  "senderIdentityId": 1
}

Messages are injected into the agent's context at the next iteration boundary.

GET /api/agent/runs/{runId}/messages

Pause Chat

When an agent is paused, a full chat session can be created for detailed discussion:

POST /api/agent/runs/{runId}/pause-chat

Returns a ChatSessionDetailDto linked to the run. On resume, the chat is summarized and injected as handoff notes.


Event Trail

Every action in a run is recorded as an AgentEvent with typed classification:

GET /api/agent/runs/{runId}/events?sinceSequence=50&limit=100

Event types span the full lifecycle:

  • Planning: PlanCreated, PlanRevised, StepStarted, StepCompleted
  • LLM: LlmRequest, LlmResponse
  • Tool Use: ToolCallRequested, ToolCallResult, ToolCallBlocked
  • Decisions: Decision, Observation, Reasoning
  • Failures: EmptyResponse, PrematureIntent, StaleLoop, ContextOverflow, DuplicateRead, RepeatedToolCall
  • Context: ContextWarning50/75/90, ContextCompacted, ContextFlushed
  • Lifecycle: Checkpoint, Paused, Resumed, BudgetWarning, BudgetExtended
  • Escalation: EscalationCreated, EscalationResolved
  • Children: ChildSpawned, ChildCompleted, ChildFailed, ChildMessage, ParentResponse
  • Interaction: InteractionMessageReceived, InteractionMessageSent, PauseChatLinked, PauseChatSummarized
  • Audit: RunStarted, RunCompleted, RunFailed

Run Notes

Agents maintain a working journal during execution:

POST /api/agent/runs/{runId}/notes
{
  "type": "Decision",
  "content": "Chose to use PostgreSQL function for agent hydration instead of multiple queries — 3x faster"
}

Note types: Observation, Decision, Blocker, Progress, Handoff


Agent Tasks

Separate from project tasks. Agent tasks are run-scoped work items that don't pollute the human backlog:

GET /api/agent/runs/{runId}/tasks
POST /api/agent/runs/{runId}/tasks
PATCH /api/agent/tasks/{id}

Optionally linked to a ProjectTask for context, but tracked independently.


Feedback Loop

Post-run ratings feed back into confidence scoring:

POST /api/agent/runs/{runId}/feedback
{
  "rating": "Good",
  "notes": "Research was thorough and well-structured"
}

Ratings: Good, Partial, Bad. Used to adjust future confidence for similar runs (model + role + task category).


Analytics

GET /api/agent/analytics?modelId=ai-02:qwen3-30b-a3b&role=researcher

Returns aggregated metrics: total runs, success rate, average iterations, average cost, feedback distribution.


Cost Tracking

Every run tracks:

  • PromptTokens / CompletionTokens — raw token counts
  • CostUsd — computed from model pricing (InputCostPer1M, OutputCostPer1M)
  • TotalTokensWithChildren / TotalCostWithChildren — aggregated across the entire run tree

Model pricing is synced from OpenRouter or manually configured per model.


Claude Agent SDK Integration

When a model's provider type is AnthropicAgentSdk, the execution loop delegates to Claude's built-in Agent SDK rather than the standard LLM tool-calling loop. This enables using Claude Code's native OAuth authentication and tool execution without API keys.


API Reference

Method Endpoint Permission Description
GET /api/agent/role-definitions AgentRunView List role definitions
GET /api/agent/role-definitions/{id} AgentRunView Get role definition
POST /api/agent/role-definitions AgentToolProfileManage Create role definition
PATCH /api/agent/role-definitions/{id} AgentToolProfileManage Update role definition
DELETE /api/agent/role-definitions/{id} AgentToolProfileManage Delete role definition
GET /api/agent/tools (any auth) List registered tools
GET /api/agent/runs AgentRunView List runs (filter by project, status, parent)
GET /api/agent/runs/{id} AgentRunView Get run detail
POST /api/agent/runs AgentRunCreate Create a run
PATCH /api/agent/runs/{id} AgentRunManage Update a run
DELETE /api/agent/runs/{id} AgentRunManage Soft delete a run
POST /api/agent/runs/{id}/start AgentRunManage Start execution
POST /api/agent/runs/{id}/pause AgentRunManage Pause execution
POST /api/agent/runs/{id}/resume AgentRunManage Resume execution
POST /api/agent/runs/{id}/cancel AgentRunManage Cancel with reason
POST /api/agent/runs/{id}/checkpoint AgentRunManage Force checkpoint
POST /api/agent/runs/{id}/restart AgentRunManage Restart from checkpoint
POST /api/agent/runs/{id}/children AgentRunCreate Spawn child run
GET /api/agent/runs/{id}/children AgentRunView List child runs
GET /api/agent/runs/{runId}/events AgentRunView Get execution events
GET /api/agent/escalations AgentRunView List pending escalations
POST /api/agent/runs/{runId}/escalate AgentRunManage Create escalation
POST /api/agent/escalations/{id}/resolve AgentEscalationResolve Resolve escalation
GET /api/agent/runs/{runId}/tasks AgentRunView List agent tasks
POST /api/agent/runs/{runId}/tasks AgentRunManage Create agent task
PATCH /api/agent/tasks/{id} AgentRunManage Update agent task
POST /api/agent/runs/{runId}/notes AgentRunManage Add run note
GET /api/agent/runs/{runId}/notes AgentRunView Get run notes
GET /api/agent/runs/{runId}/councils AgentRunView List councils
GET /api/agent/councils/{id} AgentRunView Get council detail
GET /api/agent/tool-profiles AgentRunView List tool profiles
GET /api/agent/tool-profiles/{role} AgentRunView Get tool profile
POST /api/agent/tool-profiles AgentToolProfileManage Create tool profile
PUT /api/agent/tool-profiles/{id} AgentToolProfileManage Update tool profile
POST /api/agent/runs/{runId}/feedback AgentRunManage Add feedback
GET /api/agent/runs/{runId}/feedback AgentRunView Get feedback
GET /api/agent/analytics AgentRunView Get analytics
POST /api/agent/runs/{runId}/messages AgentRunManage Send message to agent
GET /api/agent/runs/{runId}/messages AgentRunView Get run messages
POST /api/agent/runs/{runId}/pause-chat AgentRunManage Create/get pause chat

All endpoints require JWT authentication. Responses wrapped in ApiEnvelope<T>.