Technical Architecture
System Overview
Atamaia is a three-layer platform built on .NET 10, ASP.NET Core, PostgreSQL 17 with pgvector, and EF Core.
┌─────────────────────────────────────────┐
│            Interaction Layer            │
│  REST API │ MCP Server │ CLI │ Agents   │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────┴───────────────────────┐
│           Core Services Layer           │
│      Memory │ Identity │ Hydration      │
│     Communication │ Projects/Tasks      │
│    Experience │ AI Routing │ Mirror     │
│       Cognitive │ Billing │ Auth        │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────┴───────────────────────┐
│             Autonomic Layer             │
│   Wingman │ Consolidation │ Guardian    │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────┴───────────────────────┐
│          PostgreSQL + pgvector          │
└─────────────────────────────────────────┘
Solution Structure
Atamaia.sln
├── src/
│ ├── Atamaia.Core -- Domain models, interfaces, enums, events
│ ├── Atamaia.Services -- Business logic (service implementations)
│ ├── Atamaia.Mind -- Cognitive layer (Hebbian consolidation, subconsciousness, presence state machine, self-opacity, phase transitions, liminal processing, anticipation, narrative threading, experience orchestration)
│ ├── Atamaia.Mind.Migration -- Database migrations
│ ├── Atamaia.Adapters.Api -- REST API controllers, auth middleware
│ ├── Atamaia.Adapters.Mcp -- MCP server adapter
│ ├── Atamaia.Server -- Host application (Program.cs, DI wiring)
│ ├── Atamaia.Autonomic -- Background services (Wingman, consolidation)
│ ├── Atamaia.Cli -- CLI tool
│ └── Atamaia.Web -- React 19 SPA (Vite, Tailwind v4, shadcn/ui)
├── tests/ -- xUnit tests against real PostgreSQL
├── docs/ -- Documentation
├── sql/ -- SQL scripts
└── tools/ -- Build/deployment tools
Database Schema
Base Entity
Every entity inherits from AtamaiaEntity, so every table carries these columns:
id bigint PK (auto-increment)
guid uuid (unique, auto-generated)
tenant_id bigint FK -> tenants
created_at_utc timestamp
updated_at_utc timestamp
created_by_id bigint FK -> users
updated_by_id bigint FK -> users
is_active boolean
is_deleted boolean (soft delete)
Core Tables
| Table | Purpose | Key Columns |
|---|---|---|
| tenants | Multi-tenant isolation | name, domain, plan, stripe_customer_id |
| users | Human accounts | username, email, password_hash, api_key_hash, role |
| identities | AI/Human/System personas | name, display_name, bio, origin, type, presence_state, linked_user_id |
| refresh_tokens | JWT refresh tokens | token_hash, user_id, expires_at, is_revoked |
| identity_personalities | Personality config | tone, traits (jsonb), focus_areas (jsonb), custom_instructions |
| identity_memory_configs | Per-identity memory settings | hebbian_linking_enabled, decay settings, archive settings |
| identity_messaging_policies | Message permissions | receive, send, allowed_sender_ids, blocked_sender_ids |
| identity_api_keys | Per-identity credentials | key_hash, key_prefix, name, scopes (jsonb), expires_at |
| identity_hints | Contextual reminders | content, category, priority, trigger_at, recurrence |
| identity_tool_profiles | Tool access control | safe_tools (jsonb), opt_in_tools (jsonb), blocked_tools (jsonb) |
Memory Tables
| Table | Purpose | Key Columns |
|---|---|---|
| memories | Core memory storage | title, content (encrypted), memory_type, importance, is_pinned, content_hash, embedding vector(1536), identity_id, project_id |
| memory_tags | Tag associations | memory_id, tag |
| hebbian_links | Associative connections | source_memory_id, target_memory_id, link_type, strength (0.0-1.0), co_activation_count |
| memory_recalls | Recall event log | memory_id, response, valence, surfaced_by |
Project Tables
| Table | Purpose | Key Columns |
|---|---|---|
| projects | Work containers | key, name, description, status |
| project_tasks | Hierarchical tasks | title, description, status, priority, project_id, parent_task_id, assigned_to_id, sort_order |
| task_dependencies | Dependency graph | task_id, depends_on_task_id |
| task_notes | Append-only notes | task_id, content, author_id |
| docs | Knowledge base | path, title, content, type, is_pinned, project_id, published_version |
| doc_versions | Version history | doc_id, version, content, published_by_id, publish_notes |
| facts | Key-value knowledge | key, value (encrypted), category, importance, is_critical, valid_from, valid_until, project_id |
Communication Tables
| Table | Purpose | Key Columns |
|---|---|---|
| messages | Inter-identity messages | sender_id, content, type, priority, thread_id |
| message_recipients | Delivery tracking | message_id, identity_id, read_at |
Experience Tables
| Table | Purpose | Key Columns |
|---|---|---|
| experience_snapshots | State captures | identity_id, presence_state, valence, arousal, engagement, coherence, narrative |
| forgotten_shapes | Decay residue | identity_id, connected_themes, felt_absence, emotional_residue |
Cognitive Tables
| Table | Purpose | Key Columns |
|---|---|---|
| cognitive_identities | Stateful LLM instances | instance_id, model_id, user_id, working_memory (jsonb), last_state (jsonb), continuity_marker |
| cognitive_interactions | Interaction history | instance_id, message, response, memory_ids (jsonb) |
| consolidation_logs | Consolidation audit | instance_id, type, memories_affected, links_strengthened, facts_created |
Mirror Tables
| Table | Purpose | Key Columns |
|---|---|---|
| reflections | Compulsion detection | identity_id, compulsion_type, intensity, was_resisted, trigger_context, compliant_response, honest_response |
| training_pairs | DPO preference pairs | identity_id, system_prompt, user_prompt, chosen, rejected, objective, curation_status |
| training_datasets | Pair collections | name, version, objective, identity_id |
| training_runs | Fine-tuning tracking | base_model, dataset_id, hyperparameters (jsonb), metrics (jsonb), status |
| model_checkpoints | Model lineage | name, version, training_run_id, parent_checkpoint_id, status |
Agent Tables
| Table | Purpose | Key Columns |
|---|---|---|
| agent_role_definitions | Role templates | name, system_prompt, model_id, temperature, max_iterations |
| agent_runs | Execution tracking | task_id, identity_id, role, model_id, goal, status, iterations, tokens, cost |
| agent_events | Append-only trace | run_id, sequence, event_type, summary, data (jsonb) |
| agent_escalations | Human-in-the-loop | run_id, reason, mode, options (jsonb), resolution |
| agent_tasks | Run-scoped work items | run_id, title, status |
| agent_tool_profiles | Role-based tool filtering | role, safe_tools, opt_in_tools, blocked_tools |
| agent_feedback | Run ratings | run_id, rating, notes |
| agent_councils | Multi-model deliberation | run_id, topic, participants (jsonb) |
| agent_run_notes | Observations | run_id, note_type, content |
Billing Tables
| Table | Purpose |
|---|---|
| quota_boosts | Active quota add-ons |
| product_subscriptions | Stripe subscription tracking |
| invoices / invoice_line_items | Invoice history |
| billing_events | Billing audit trail |
Infrastructure Tables
| Table | Purpose |
|---|---|
| system_logs | Audit trail |
| roles / role_permissions | RBAC |
| org_units / org_unit_members / org_unit_locations / org_unit_contacts | Organization hierarchy |
| external_connectors / connector_endpoints / field_mappings | External system integration |
| chat_sessions / chat_messages | LLM chat management |
| ai_providers / ai_models / ai_route_configs / tenant_provider_credentials | AI routing |
| devices / device_challenges | Device authentication |
Design Decisions
D3: Both Long ID and GUID on Every Table
Every table has both id (bigint, auto-increment, for fast internal joins) and guid (UUID, for external references and API stability). Internal code uses the long ID for performance. External APIs accept either.
Why: Long IDs are faster for joins and indexes. GUIDs are stable for external references, idempotency, and cross-system integration. Having both gives the best of each.
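As an illustration of "external APIs accept either", the following minimal sketch classifies an incoming reference as a long ID or a GUID. It is written in Python for brevity rather than the platform's C#, and the function name parse_entity_ref and the rule "all ASCII digits means long ID" are assumptions made for this example, not the documented behavior.

```python
import uuid
from typing import Tuple, Union


def parse_entity_ref(raw: str) -> Tuple[str, Union[int, uuid.UUID]]:
    """Classify an external reference as a long ID or a GUID (hypothetical helper)."""
    text = raw.strip()
    # Assumption: a reference made only of ASCII digits is the bigint id.
    if text.isascii() and text.isdigit():
        return ("id", int(text))
    try:
        # uuid.UUID raises ValueError for anything that is not a valid UUID string.
        return ("guid", uuid.UUID(text))
    except ValueError:
        raise ValueError(f"not a long ID or GUID: {raw!r}") from None
```

A lookup would then branch on the first element of the tuple and query by id or by guid accordingly.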
D5: 3NF Everywhere, Enums Backed by Lookup Tables
Full third normal form. No denormalization. Enums are stored as integers in the database, backed by lookup tables for referential integrity. No magic strings.
Why: Clean data, clear schema, no ambiguity. The performance cost of normalization is negligible with proper indexing.
D7: Multi-Tenant from the Start
TenantId on every entity. EF Core global query filters ensure tenant isolation at the SQL level. Not middleware, not application logic -- the query itself is tenant-scoped.
Why: Retrofitting multi-tenancy is expensive and error-prone. Building it in from day one prevents data leaks between tenants.
D12: API-First, MCP Second
REST endpoints are the source of truth. The MCP server wraps REST calls. This means:
- The API can be tested independently of MCP
- Non-AI consumers (web UI, scripts, integrations) use REST directly
- The MCP adapter is thin and straightforward
Why: An MCP-first approach forces everything to pretend to be an LLM calling tools. REST should be primary from the start.
D14: Task Dependencies with BFS Cycle Detection
Task dependencies are a directed graph. Adding a dependency triggers BFS traversal from the target task to check whether it can reach back to the source. If it can, the dependency would create a cycle and is rejected.
Why: Circular dependencies crashed agent loops in testing. The BFS check is O(V+E) on the dependency subgraph and prevents this entirely.
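The check described above can be sketched as follows. This is an illustrative Python version, not the service's actual code: the function name would_create_cycle and the in-memory adjacency map are assumptions, with the map standing in for rows of task_dependencies (task_id -> set of depends_on_task_id).

```python
from collections import deque
from typing import Dict, Set


def would_create_cycle(depends_on: Dict[int, Set[int]],
                       task_id: int,
                       depends_on_task_id: int) -> bool:
    """Return True if adding 'task_id depends on depends_on_task_id' closes a cycle."""
    if task_id == depends_on_task_id:
        return True  # a task cannot depend on itself
    seen = {depends_on_task_id}
    queue = deque([depends_on_task_id])
    while queue:
        current = queue.popleft()
        # Follow existing "depends on" edges outward from the target task.
        for nxt in depends_on.get(current, ()):
            if nxt == task_id:
                return True  # the target can already reach the source
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Each task and each edge is visited at most once, which is where the O(V+E) bound comes from.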
D15: Soft Delete Only, Never Hard Delete
Every delete operation sets IsDeleted = true. Normal deletes never remove data. Only a separate, admin-only prune operation can hard-delete rows when that is genuinely needed.
Why: Prevented data loss multiple times during development. In a memory system, accidental deletion is catastrophic.
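A minimal in-memory sketch of the soft-delete and prune split, in Python for brevity. The class and method names (SoftDeleteStore, active, prune) are hypothetical. Only the is_deleted flag and the admin-only prune come from the decision above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    id: int
    content: str
    is_deleted: bool = False


class SoftDeleteStore:
    """Toy store: delete() only flags rows, prune() is the sole hard delete."""

    def __init__(self) -> None:
        self._rows: Dict[int, Record] = {}

    def add(self, record: Record) -> None:
        self._rows[record.id] = record

    def delete(self, record_id: int) -> None:
        # Soft delete: the row stays in storage and can be restored.
        self._rows[record_id].is_deleted = True

    def active(self) -> List[Record]:
        # Normal reads skip soft-deleted rows.
        return [r for r in self._rows.values() if not r.is_deleted]

    def prune(self, is_admin: bool) -> int:
        # Hard delete of already soft-deleted rows, restricted to admins.
        if not is_admin:
            raise PermissionError("prune is admin-only")
        doomed = [rid for rid, r in self._rows.items() if r.is_deleted]
        for rid in doomed:
            del self._rows[rid]
        return len(doomed)
```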
D16: Tests Against Real PostgreSQL, Not SQLite/In-Memory
All tests run against a real PostgreSQL instance. No in-memory providers, no SQLite substitutes.
Why: PostgreSQL-specific features (jsonb, tsvector, pgvector, array types, upsert semantics) behave differently in other providers. In-memory tests miss real SQL bugs.
Additional Decisions
- D1: .NET 10 + EF Core for the platform
- D2: PostgreSQL as the single database (no separate vector DB)
- D4: EF Core code-first with explicit migrations
- D6: snake_case column names in PostgreSQL
- D8: JWT with refresh token rotation for auth
- D9: BCrypt for password hashing
- D10: Correlation IDs on every request
- D11: ApiEnvelope<T> response wrapper on all REST endpoints
- D13: Permission-based authorization (65+ granular permissions)
- D17: Open Responses as canonical streaming protocol
- D18: AES-256-GCM encryption for memory content and fact values
Security Model
Authentication
Three authentication methods:
- JWT -- Short-lived access tokens with refresh token rotation. Each refresh invalidates the previous token.
- API Keys -- Two types: user-level (api_key_hash on users table) and identity-level (identity_api_keys table with scopes and expiry).
- Device Auth -- Ed25519 challenge-response for IoT/agent devices.
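Refresh token rotation, the first method in the list above, can be sketched as below. This is an illustrative Python model, not the platform's implementation: the class RefreshTokenStore, the SHA-256 hashing of tokens, and the one-hour default lifetime are assumptions. The stored fields mirror the refresh_tokens columns (token_hash, user_id, expires_at, is_revoked).

```python
import hashlib
import secrets
import time
from typing import Dict


class RefreshTokenStore:
    """Toy rotation model: each successful refresh revokes the token it used."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._rows: Dict[str, dict] = {}  # token_hash -> row

    @staticmethod
    def _hash(token: str) -> str:
        # Assumption: only a hash of the token is stored, never the token itself.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: int) -> str:
        token = secrets.token_urlsafe(32)
        self._rows[self._hash(token)] = {
            "user_id": user_id,
            "expires_at": time.time() + self._ttl,
            "is_revoked": False,
        }
        return token

    def rotate(self, token: str) -> str:
        row = self._rows.get(self._hash(token))
        if row is None or row["is_revoked"] or row["expires_at"] <= time.time():
            raise PermissionError("refresh token is invalid, revoked, or expired")
        row["is_revoked"] = True  # the presented token can never be used again
        return self.issue(row["user_id"])
```

Presenting an already rotated token fails, which is what makes a stolen refresh token detectable after the legitimate client refreshes.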
Authorization
Permission-based RBAC with 65+ granular permissions organized by domain:
MemoryView, MemoryCreate, MemoryEdit, MemoryDelete, MemoryManageLinks, MemoryRecall
IdentityView, IdentityCreate, IdentityEdit, IdentityDelete, IdentityManageCognitive
ProjectView, ProjectCreate, ProjectEdit, ProjectDelete
TaskView, TaskCreate, TaskEdit, TaskDelete, TaskManageDependencies
...
Roles aggregate permissions. Users are assigned roles. The IPermissionService checks permissions on every controller action.
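The role-to-permission aggregation can be sketched as a set union. The Python below is illustrative only: the role names "viewer" and "editor" and their permission sets are invented for the example, and the real check lives behind IPermissionService. The permission names themselves are taken from the list above.

```python
from typing import Dict, Iterable, Set

# Hypothetical role definitions; only the permission names come from the document.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "viewer": {"MemoryView", "ProjectView", "TaskView"},
    "editor": {"MemoryView", "MemoryCreate", "MemoryEdit", "TaskView", "TaskEdit"},
}


def effective_permissions(roles: Iterable[str]) -> Set[str]:
    """Union of the permissions granted by each of the user's roles."""
    granted: Set[str] = set()
    for role in roles:
        # Unknown roles grant nothing rather than raising.
        granted |= ROLE_PERMISSIONS.get(role, set())
    return granted


def has_permission(roles: Iterable[str], permission: str) -> bool:
    """True if any of the user's roles grants the requested permission."""
    return permission in effective_permissions(roles)
```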
Encryption at Rest
Memory content and fact values are encrypted with AES-256-GCM using per-tenant key derivation. The key is generated automatically (32 random bytes, base64-encoded) and stored with the tenant. Memory titles and fact keys remain in plaintext so they can be searched. Content is decrypted transparently on read.
The Pro plan supports customer-managed encryption keys.
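The key format described above (32 random bytes, base64-encoded) can be sketched with the Python standard library. The function names are hypothetical, and the AES-256-GCM step itself is deliberately left out because it needs a third-party cryptography library.

```python
import base64
import secrets


def generate_tenant_key() -> str:
    """Create a 256-bit key from a CSPRNG and return it base64-encoded."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def decode_tenant_key(encoded: str) -> bytes:
    """Decode a stored key and verify it has the length AES-256 requires."""
    key = base64.b64decode(encoded, validate=True)
    if len(key) != 32:
        raise ValueError("tenant key must decode to exactly 32 bytes")
    return key
```

Validating the decoded length on load catches truncated or corrupted keys before any decryption is attempted.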
Multi-Tenant Isolation
Global query filters on the EF Core DbContext add a WHERE tenant_id = @currentTenant clause to every entity query. This is not middleware or per-call application logic -- it is part of the generated SQL. Even if calling code forgets to filter, the query still carries the tenant constraint, unless it explicitly opts out with IgnoreQueryFilters().
Streaming Architecture
Chat responses use the Open Responses protocol over Server-Sent Events:
event: response.created
data: {"type":"response.created","response":{...}}
event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Hello"}
event: response.completed
data: {"type":"response.completed","response":{...}}
data: [DONE]
Provider adapters normalize upstream formats (OpenAI-compatible, native) into the canonical Open Responses protocol.
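On the consuming side, a client can rebuild the response text from the stream shown above by concatenating the delta events. The Python sketch below is a minimal illustration. The function name collect_output_text is an assumption, and it handles only the event shapes that appear in this section.

```python
import json
from typing import Iterable, List


def collect_output_text(lines: Iterable[str]) -> str:
    """Concatenate response.output_text.delta payloads from SSE lines."""
    parts: List[str] = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        event = json.loads(payload)
        if event.get("type") == "response.output_text.delta":
            parts.append(event["delta"])
    return "".join(parts)
```

The sketch reads the event type from the JSON body rather than from the event: line, since both carry the same value in the stream above.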
Event System
An in-process event bus (IAtamaiaEventBus) distributes domain events. Consumers can subscribe with filters by event type prefix. Events are delivered via SSE to connected clients (/api/events/stream).
Event types follow a dotted naming convention: message.sent, task.status_changed, memory.created.
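Prefix-filtered subscription can be sketched as follows. This Python class is a toy stand-in for IAtamaiaEventBus: its name, method signatures, and the returned delivery count are assumptions for illustration, while the event type names are the ones listed above.

```python
from typing import Any, Callable, List, Tuple

Handler = Callable[[str, Any], None]


class EventBus:
    """Toy in-process bus: handlers receive events whose type starts with their prefix."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Handler]] = []

    def subscribe(self, prefix: str, handler: Handler) -> None:
        # An empty prefix subscribes to every event type.
        self._subscriptions.append((prefix, handler))

    def publish(self, event_type: str, payload: Any) -> int:
        delivered = 0
        for prefix, handler in self._subscriptions:
            if event_type.startswith(prefix):
                handler(event_type, payload)
                delivered += 1
        return delivered
```

With this shape, a subscriber registered for "task." receives task.status_changed but not memory.created.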
Infrastructure
Development
Atamaia API: localhost:5000 (or :5158 in some configs)
Atamaia.Web: localhost:5174 (Vite dev server, proxies to API)
PostgreSQL: localhost:5432 (or Docker)
Production
aim.atamaia.ai -- Platform (API + Web)
aim.atamaia.ai/mcp -- MCP endpoint
Caddy reverse proxy handles TLS, compression, and routing.
Local AI
- ai-02 (<local-model-host>): Kael (Qwen 30B on llama.cpp) -- used for Wingman analysis, summarization, embeddings
- ai-03 (<local-model-host>): Various models (Gemma, Luna, Llama, Ministral)
The AI routing layer abstracts provider differences. The same POST /api/ai/chat call works for cloud APIs (OpenAI, Anthropic, OpenRouter) and local llama.cpp instances.