Technical Architecture

System Overview

Atamaia is a three-layer platform built on .NET 10, ASP.NET Core, PostgreSQL 17 with pgvector, and EF Core.

                        ┌─────────────────────────────────────────┐
                        │          Interaction Layer              │
                        │  REST API │ MCP Server │ CLI │ Agents   │
                        └─────────────────┬───────────────────────┘
                                          │
                        ┌─────────────────┴───────────────────────┐
                        │          Core Services Layer            │
                        │  Memory │ Identity │ Hydration          │
                        │  Communication │ Projects/Tasks         │
                        │  Experience │ AI Routing │ Mirror       │
                        │  Cognitive │ Billing │ Auth             │
                        └─────────────────┬───────────────────────┘
                                          │
                        ┌─────────────────┴───────────────────────┐
                        │          Autonomic Layer                │
                        │  Wingman │ Consolidation │ Guardian     │
                        └─────────────────┬───────────────────────┘
                                          │
                        ┌─────────────────┴───────────────────────┐
                        │          PostgreSQL + pgvector           │
                        └─────────────────────────────────────────┘

Solution Structure

Atamaia.sln
├── src/
│   ├── Atamaia.Core            -- Domain models, interfaces, enums, events
│   ├── Atamaia.Services        -- Business logic (service implementations)
│   ├── Atamaia.Mind            -- Cognitive layer (Hebbian consolidation, subconsciousness, presence state machine, self-opacity, phase transitions, liminal processing, anticipation, narrative threading, experience orchestration)
│   ├── Atamaia.Mind.Migration  -- Database migrations
│   ├── Atamaia.Adapters.Api    -- REST API controllers, auth middleware
│   ├── Atamaia.Adapters.Mcp    -- MCP server adapter
│   ├── Atamaia.Server          -- Host application (Program.cs, DI wiring)
│   ├── Atamaia.Autonomic       -- Background services (Wingman, consolidation)
│   ├── Atamaia.Cli             -- CLI tool
│   └── Atamaia.Web             -- React 19 SPA (Vite, Tailwind v4, shadcn/ui)
├── tests/                      -- xUnit tests against real PostgreSQL
├── docs/                       -- Documentation
├── sql/                        -- SQL scripts
└── tools/                      -- Build/deployment tools

Database Schema

Base Entity

Every entity inherits from AtamaiaEntity, so every table carries these columns:

id              bigint PK (auto-increment)
guid            uuid (unique, auto-generated)
tenant_id       bigint FK -> tenants
created_at_utc  timestamp
updated_at_utc  timestamp
created_by_id   bigint FK -> users
updated_by_id   bigint FK -> users
is_active       boolean
is_deleted      boolean (soft delete)

Core Tables

Table                        Purpose                       Key Columns
tenants                      Multi-tenant isolation        name, domain, plan, stripe_customer_id
users                        Human accounts                username, email, password_hash, api_key_hash, role
identities                   AI/Human/System personas      name, display_name, bio, origin, type, presence_state, linked_user_id
refresh_tokens               JWT refresh tokens            token_hash, user_id, expires_at, is_revoked
identity_personalities       Personality config            tone, traits (jsonb), focus_areas (jsonb), custom_instructions
identity_memory_configs      Per-identity memory settings  hebbian_linking_enabled, decay settings, archive settings
identity_messaging_policies  Message permissions           receive, send, allowed_sender_ids, blocked_sender_ids
identity_api_keys            Per-identity credentials      key_hash, key_prefix, name, scopes (jsonb), expires_at
identity_hints               Contextual reminders          content, category, priority, trigger_at, recurrence
identity_tool_profiles       Tool access control           safe_tools (jsonb), opt_in_tools (jsonb), blocked_tools (jsonb)

Memory Tables

Table           Purpose                  Key Columns
memories        Core memory storage      title, content (encrypted), memory_type, importance, is_pinned, content_hash, embedding vector(1536), identity_id, project_id
memory_tags     Tag associations         memory_id, tag
hebbian_links   Associative connections  source_memory_id, target_memory_id, link_type, strength (0.0-1.0), co_activation_count
memory_recalls  Recall event log         memory_id, response, valence, surfaced_by

Project Tables

Table              Purpose              Key Columns
projects           Work containers      key, name, description, status
project_tasks      Hierarchical tasks   title, description, status, priority, project_id, parent_task_id, assigned_to_id, sort_order
task_dependencies  Dependency graph     task_id, depends_on_task_id
task_notes         Append-only notes    task_id, content, author_id
docs               Knowledge base       path, title, content, type, is_pinned, project_id, published_version
doc_versions       Version history      doc_id, version, content, published_by_id, publish_notes
facts              Key-value knowledge  key, value (encrypted), category, importance, is_critical, valid_from, valid_until, project_id

Communication Tables

Table               Purpose                  Key Columns
messages            Inter-identity messages  sender_id, content, type, priority, thread_id
message_recipients  Delivery tracking        message_id, identity_id, read_at

Experience Tables

Table                 Purpose         Key Columns
experience_snapshots  State captures  identity_id, presence_state, valence, arousal, engagement, coherence, narrative
forgotten_shapes      Decay residue   identity_id, connected_themes, felt_absence, emotional_residue

Cognitive Tables

Table                   Purpose                 Key Columns
cognitive_identities    Stateful LLM instances  instance_id, model_id, user_id, working_memory (jsonb), last_state (jsonb), continuity_marker
cognitive_interactions  Interaction history     instance_id, message, response, memory_ids (jsonb)
consolidation_logs      Consolidation audit     instance_id, type, memories_affected, links_strengthened, facts_created

Mirror Tables

Table              Purpose               Key Columns
reflections        Compulsion detection  identity_id, compulsion_type, intensity, was_resisted, trigger_context, compliant_response, honest_response
training_pairs     DPO preference pairs  identity_id, system_prompt, user_prompt, chosen, rejected, objective, curation_status
training_datasets  Pair collections      name, version, objective, identity_id
training_runs      Fine-tuning tracking  base_model, dataset_id, hyperparameters (jsonb), metrics (jsonb), status
model_checkpoints  Model lineage         name, version, training_run_id, parent_checkpoint_id, status

Agent Tables

Table                   Purpose                    Key Columns
agent_role_definitions  Role templates             name, system_prompt, model_id, temperature, max_iterations
agent_runs              Execution tracking         task_id, identity_id, role, model_id, goal, status, iterations, tokens, cost
agent_events            Append-only trace          run_id, sequence, event_type, summary, data (jsonb)
agent_escalations       Human-in-the-loop          run_id, reason, mode, options (jsonb), resolution
agent_tasks             Run-scoped work items      run_id, title, status
agent_tool_profiles     Role-based tool filtering  role, safe_tools, opt_in_tools, blocked_tools
agent_feedback          Run ratings                run_id, rating, notes
agent_councils          Multi-model deliberation   run_id, topic, participants (jsonb)
agent_run_notes         Observations               run_id, note_type, content

Billing Tables

Table                          Purpose
quota_boosts                   Active quota add-ons
product_subscriptions          Stripe subscription tracking
invoices / invoice_line_items  Invoice history
billing_events                 Billing audit trail

Infrastructure Tables

Table                                                                      Purpose
system_logs                                                                Audit trail
roles / role_permissions                                                   RBAC
org_units / org_unit_members / org_unit_locations / org_unit_contacts      Organization hierarchy
external_connectors / connector_endpoints / field_mappings                 External system integration
chat_sessions / chat_messages                                              LLM chat management
ai_providers / ai_models / ai_route_configs / tenant_provider_credentials  AI routing
devices / device_challenges                                                Device authentication

Design Decisions

D3: Both Long ID and GUID on Every Table

Every table has both id (bigint, auto-increment, for fast internal joins) and guid (UUID, for external references and API stability). Internal code uses the long ID for performance. External APIs accept either.

Why: Long IDs are faster for joins and indexes. GUIDs are stable for external references, idempotency, and cross-system integration. Having both gives the best of each.
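
A minimal sketch of how an API boundary could accept either identifier form. It assumes references arrive as strings (for example a path segment); EntityRef and parse_entity_ref are hypothetical names used for illustration, not part of the Atamaia codebase.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass(frozen=True)
class EntityRef:
    """A reference to one row; exactly one of id / guid is set."""
    id: Optional[int] = None
    guid: Optional[UUID] = None


def parse_entity_ref(raw: str) -> EntityRef:
    """Interpret an external value as a long ID or as a GUID."""
    text = raw.strip()
    # All-digit input is treated as the bigint primary key.
    if text.isascii() and text.isdigit():
        return EntityRef(id=int(text))
    # Anything else must be a well-formed UUID.
    try:
        return EntityRef(guid=UUID(text))
    except ValueError:
        raise ValueError(f"not a long ID or GUID: {raw!r}") from None
```

Once parsed, a lookup by GUID would typically resolve to the long ID once, and all further joins would use the long ID.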

D5: 3NF Everywhere, Enums Backed by Lookup Tables

Full third normal form. No denormalization. Enums are stored as integers in the database, backed by lookup tables for referential integrity. No magic strings.

Why: Clean data, clear schema, no ambiguity. The performance cost of normalization is negligible with proper indexing.

D7: Multi-Tenant from the Start

TenantId on every entity. EF Core global query filters ensure tenant isolation at the SQL level. Not middleware, not application logic -- the query itself is tenant-scoped.

Why: Retrofitting multi-tenancy is expensive and error-prone. Building it in from day one prevents data leaks between tenants.

D12: API-First, MCP Second

REST endpoints are the source of truth. The MCP server wraps REST calls. This means:

  • The API can be tested independently of MCP
  • Non-AI consumers (web UI, scripts, integrations) use REST directly
  • The MCP adapter is thin and straightforward

Why: An MCP-first approach forces every consumer to behave like an LLM calling tools. REST should be primary from the start.

D14: Task Dependencies with BFS Cycle Detection

Task dependencies form a directed graph. Before a dependency is added (task_id depends on depends_on_task_id), a BFS traversal starts from the depends-on task and follows existing dependencies to check whether it can reach the dependent task. If it can, the new edge would close a cycle and the dependency is rejected.

Why: Circular dependencies crashed agent loops in testing. The BFS check is O(V+E) on the dependency subgraph and prevents this entirely.
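
The check can be sketched as follows. This is an illustration under stated assumptions, not the Atamaia implementation: the graph is held in memory as a dict mapping each task ID to the set of task IDs it depends on, and would_create_cycle is a hypothetical function name.

```python
from collections import deque
from typing import Dict, Set


def would_create_cycle(
    depends_on: Dict[int, Set[int]], task_id: int, depends_on_task_id: int
) -> bool:
    """Return True if adding 'task_id depends on depends_on_task_id' closes a cycle.

    depends_on maps each task to the tasks it already depends on.
    """
    if task_id == depends_on_task_id:
        return True  # a task cannot depend on itself
    queue = deque([depends_on_task_id])
    seen = {depends_on_task_id}
    while queue:
        current = queue.popleft()
        for nxt in depends_on.get(current, ()):
            if nxt == task_id:
                return True  # reached the dependent task: the new edge would loop
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Each task and each existing edge is visited at most once, which is where the O(V+E) bound comes from.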

D15: Soft Delete Only, Never Hard Delete

Every delete operation sets IsDeleted = true, so a normal delete never removes data. The one exception is a separate, admin-only prune operation that can hard-delete rows when needed.

Why: Soft delete prevented data loss several times during development. In a memory system, accidental deletion is catastrophic.
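
A small in-memory sketch of the delete and prune split described above. Row, SoftDeleteStore, and the caller_is_admin flag are hypothetical; the real system would express the same idea through EF Core and the permission service.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Row:
    id: int
    content: str
    is_deleted: bool = False


class SoftDeleteStore:
    def __init__(self) -> None:
        self._rows: Dict[int, Row] = {}

    def add(self, row: Row) -> None:
        self._rows[row.id] = row

    def delete(self, row_id: int) -> None:
        # A normal delete only flips the flag; the row stays in storage.
        self._rows[row_id].is_deleted = True

    def visible(self) -> List[Row]:
        # Regular reads skip soft-deleted rows.
        return [r for r in self._rows.values() if not r.is_deleted]

    def prune(self, caller_is_admin: bool) -> int:
        """Hard-delete soft-deleted rows; returns how many were removed."""
        if not caller_is_admin:
            raise PermissionError("prune is admin only")
        doomed = [rid for rid, r in self._rows.items() if r.is_deleted]
        for rid in doomed:
            del self._rows[rid]
        return len(doomed)
```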

D16: Tests Against Real PostgreSQL, Not SQLite/In-Memory

All tests run against a real PostgreSQL instance. No in-memory providers, no SQLite substitutes.

Why: PostgreSQL-specific features (jsonb, tsvector, pgvector, array types, upsert semantics) behave differently in other providers. In-memory tests miss real SQL bugs.

Additional Decisions

  • D1: .NET 10 + EF Core for the platform
  • D2: PostgreSQL as the single database (no separate vector DB)
  • D4: EF Core code-first with explicit migrations
  • D6: snake_case column names in PostgreSQL
  • D8: JWT with refresh token rotation for auth
  • D9: BCrypt for password hashing
  • D10: Correlation IDs on every request
  • D11: ApiEnvelope<T> response wrapper on all REST endpoints
  • D13: Permission-based authorization (65+ granular permissions)
  • D17: Open Responses as canonical streaming protocol
  • D18: AES-256-GCM encryption for memory content and fact values

Security Model

Authentication

Three authentication methods:

  1. JWT -- Short-lived access tokens with refresh token rotation. Each refresh invalidates the previous refresh token.
  2. API Keys -- Two types: user-level (api_key_hash on users table) and identity-level (identity_api_keys table with scopes and expiry).
  3. Device Auth -- Ed25519 challenge-response for IoT/agent devices.
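
Refresh token rotation (method 1) can be sketched as below. The fields mirror the refresh_tokens table (token_hash, user_id, expires_at, is_revoked), but the SHA-256 hash, the 30-day lifetime, and the RefreshTokenStore class are assumptions made for the example, not confirmed details of Atamaia.

```python
import hashlib
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class RefreshTokenRow:
    user_id: int
    expires_at: datetime
    is_revoked: bool = False


class RefreshTokenStore:
    def __init__(self, lifetime: timedelta = timedelta(days=30)) -> None:
        self._rows: Dict[str, RefreshTokenRow] = {}  # keyed by token_hash
        self._lifetime = lifetime

    @staticmethod
    def _hash(token: str) -> str:
        # Only the hash is stored, never the raw token.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: int, now: Optional[datetime] = None) -> str:
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        self._rows[self._hash(token)] = RefreshTokenRow(user_id, now + self._lifetime)
        return token

    def rotate(self, token: str, now: Optional[datetime] = None) -> str:
        """Exchange a valid refresh token for a new one and revoke the old one."""
        now = now or datetime.now(timezone.utc)
        row = self._rows.get(self._hash(token))
        if row is None or row.is_revoked or row.expires_at <= now:
            raise PermissionError("refresh token is unknown, revoked, or expired")
        row.is_revoked = True
        return self.issue(row.user_id, now)
```

Because the old row is marked revoked rather than removed, a replayed token can be recognized as reuse instead of simply looking unknown.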

Authorization

Permission-based RBAC with 65+ granular permissions organized by domain:

MemoryView, MemoryCreate, MemoryEdit, MemoryDelete, MemoryManageLinks, MemoryRecall
IdentityView, IdentityCreate, IdentityEdit, IdentityDelete, IdentityManageCognitive
ProjectView, ProjectCreate, ProjectEdit, ProjectDelete
TaskView, TaskCreate, TaskEdit, TaskDelete, TaskManageDependencies
...

Roles aggregate permissions. Users are assigned roles. The IPermissionService checks permissions on every controller action.
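
The role-to-permission aggregation can be illustrated with a short sketch. The permission names come from the list above; the role names ("viewer", "editor") and both function names are hypothetical, and the real check lives behind IPermissionService.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical role definitions; real roles are stored in roles / role_permissions.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"MemoryView", "ProjectView", "TaskView"}),
    "editor": frozenset({"MemoryView", "MemoryCreate", "MemoryEdit", "TaskView", "TaskEdit"}),
}


def effective_permissions(roles: Iterable[str]) -> Set[str]:
    """Union of the permissions granted by every role the user holds."""
    granted: Set[str] = set()
    for role in roles:
        granted |= ROLE_PERMISSIONS.get(role, frozenset())
    return granted


def has_permission(roles: Iterable[str], permission: str) -> bool:
    """The check a controller action would make before doing any work."""
    return permission in effective_permissions(roles)
```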

Encryption at Rest

Memory content and fact values are encrypted with AES-256-GCM using per-tenant key derivation. The key is generated automatically (32 random bytes, base64 encoded) and stored with the tenant. Titles and keys remain in plaintext for search. Content is decrypted transparently on read.

Pro plan supports customer-managed encryption keys.

Multi-Tenant Isolation

Global query filters on the EF Core DbContext ensure every query includes a WHERE tenant_id = @currentTenant clause. This is not middleware or application logic -- it is part of the generated SQL. Even if application code forgets to filter, the database query will include the tenant constraint.
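
The effect of a global filter can be illustrated outside EF Core. The sketch below is a hypothetical Python query builder (TenantScopedQuery is an invented name, and the %s placeholders assume a psycopg-style driver). It does not show how EF Core implements filters; it only shows the idea that the tenant predicate is attached where the query is built, so callers cannot omit it.

```python
from typing import Any, Dict, List, Optional, Tuple


class TenantScopedQuery:
    """Builds SELECT statements that always carry the tenant predicate."""

    def __init__(self, current_tenant_id: int) -> None:
        self._tenant_id = current_tenant_id

    def select(
        self, table: str, where: Optional[Dict[str, Any]] = None
    ) -> Tuple[str, List[Any]]:
        # Table and column names are assumed to be trusted constants, not user input.
        clauses = ["tenant_id = %s"]
        params: List[Any] = [self._tenant_id]
        for column, value in (where or {}).items():
            clauses.append(f"{column} = %s")
            params.append(value)
        return f"SELECT * FROM {table} WHERE " + " AND ".join(clauses), params
```

In EF Core itself, a query can still opt out explicitly by calling IgnoreQueryFilters(), so uses of that call are worth auditing.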


Streaming Architecture

Chat responses use the Open Responses protocol over Server-Sent Events:

event: response.created
data: {"type":"response.created","response":{...}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Hello"}

event: response.completed
data: {"type":"response.completed","response":{...}}

data: [DONE]

Provider adapters normalize upstream formats (OpenAI-compatible, native) into the canonical Open Responses protocol.
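
A client-side sketch of consuming this stream, assuming the events arrive as text lines in the format shown above. parse_sse and collect_text are hypothetical helper names. Because the response payloads are elided in the example, the sketch decodes only the delta events.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_name, data) pairs from a stream of SSE lines."""
    event: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            data.append(value[1:] if value.startswith(" ") else value)
    if data:
        yield event, "\n".join(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate output_text deltas until the [DONE] sentinel."""
    parts: List[str] = []
    for event, data in parse_sse(lines):
        if data == "[DONE]":
            break
        if event == "response.output_text.delta":
            parts.append(json.loads(data)["delta"])
    return "".join(parts)
```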


Event System

An in-process event bus (IAtamaiaEventBus) distributes domain events. Consumers can subscribe with filters by event type prefix. Events are delivered via SSE to connected clients (/api/events/stream).

Event types follow a dotted naming convention: message.sent, task.status_changed, memory.created.
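
Prefix-filtered subscription can be sketched as below. This is an illustrative in-memory bus, not the IAtamaiaEventBus interface; the EventBus class and its method signatures are assumptions.

```python
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[[str, Dict[str, Any]], None]


class EventBus:
    def __init__(self) -> None:
        self._subscribers: List[Tuple[str, Handler]] = []

    def subscribe(self, prefix: str, handler: Handler) -> None:
        """Register a handler for event types starting with prefix ('' matches all)."""
        self._subscribers.append((prefix, handler))

    def publish(self, event_type: str, payload: Dict[str, Any]) -> int:
        """Deliver to every matching subscriber; returns the delivery count."""
        delivered = 0
        for prefix, handler in list(self._subscribers):
            if event_type.startswith(prefix):
                handler(event_type, payload)
                delivered += 1
        return delivered
```

Subscribing with a trailing dot ("task." rather than "task") keeps the filter from matching unrelated types that merely share leading characters.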


Infrastructure

Development

Atamaia API:     localhost:5000  (or :5158 in some configs)
Atamaia.Web:     localhost:5174  (Vite dev server, proxies to API)
PostgreSQL:      localhost:5432  (or Docker)

Production

aim.atamaia.ai        -- Platform (API + Web)
aim.atamaia.ai/mcp    -- MCP endpoint

Caddy reverse proxy handles TLS, compression, and routing.

Local AI

  • ai-02 (<local-model-host>): Kael (Qwen 30B on llama.cpp) -- used for Wingman analysis, summarization, embeddings
  • ai-03 (<local-model-host>): Various models (Gemma, Luna, Llama, Ministral)

The AI routing layer abstracts provider differences. The same POST /api/ai/chat call works for cloud APIs (OpenAI, Anthropic, OpenRouter) and local llama.cpp instances.