Deployment & Hosting Guide

Atamaia is a .NET 10 API + PostgreSQL/pgvector backend with a React/Vite frontend. This guide covers three deployment tiers, from a single VPS to production Kubernetes.


Architecture Overview

Internet
  |
Cloudflare (DNS, TLS termination, DDoS, CDN)
  |
  +-- app.atamaia.ai --> Cloudflare Pages (static SPA)
  +-- aim.atamaia.ai --> Origin Server (Caddy/Nginx --> Atamaia.Server :5000)
  +-- atamaia.ai     --> Cloudflare Pages (landing page)
  |
Atamaia.Server (.NET 10)
  |
PostgreSQL + pgvector
  |
[Optional] Local AI models (ai-02/ai-03 via WireGuard)

Key components:

  • Atamaia.Server -- .NET 10 ASP.NET Core API serving REST + MCP + WebSocket endpoints
  • PostgreSQL 16+ with pgvector extension -- single source of truth
  • Atamaia.Web -- React 19 + Vite SPA, deployed as static files
  • Caddy (or Nginx) -- reverse proxy on the origin server
  • Cloudflare -- DNS, TLS termination, DDoS protection, CDN (free tier)

Tier 1: Launch (50-100 Users) -- Minimal Cost

Target: Get to market with real users. Single server, everything co-located.

Estimated cost: ~5-10 EUR/month

Infrastructure

Component | Where | Cost
API + PostgreSQL | Hetzner CX22 (2 vCPU, 4GB RAM) or CX32 (4 vCPU, 8GB) | 4-7 EUR/mo
Frontend SPA | Cloudflare Pages | Free
Landing page | Cloudflare Pages | Free
DNS + TLS + CDN | Cloudflare (free plan) | Free
Local AI models | Existing hardware (ai-02/ai-03) via WireGuard | 0 (already running)

Docker Compose

# docker-compose.yml
version: "3.9"

services:
  db:
    image: pgvector/pgvector:pg16
    restart: unless-stopped
    environment:
      POSTGRES_DB: atamaia
      POSTGRES_USER: atamaia
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./init-db.sql:/docker-entrypoint-initdb.d/01-init.sql
    ports:
      - "127.0.0.1:5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U atamaia"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          memory: 1536M

  api:
    build:
      context: .
      dockerfile: Dockerfile
    restart: unless-stopped
    environment:
      ASPNETCORE_ENVIRONMENT: Production
      ASPNETCORE_URLS: http://+:5000
      ConnectionStrings__DefaultConnection: "Host=db;Database=atamaia;Username=atamaia;Password=${DB_PASSWORD}"
      Jwt__SecretKey: ${JWT_SECRET}
      Encryption__MasterKey: ${ENCRYPTION_KEY}
    ports:
      - "127.0.0.1:5000:5000"
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:5000/health"]
      interval: 15s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:
          memory: 1024M

volumes:
  pgdata:

Dockerfile

# Dockerfile
FROM mcr.microsoft.com/dotnet/sdk:10.0-alpine AS build
WORKDIR /src

# Restore dependencies (cached layer)
COPY src/Atamaia.Core/*.csproj Atamaia.Core/
COPY src/Atamaia.Services/*.csproj Atamaia.Services/
COPY src/Atamaia.Adapters.Api/*.csproj Atamaia.Adapters.Api/
COPY src/Atamaia.Adapters.Mcp/*.csproj Atamaia.Adapters.Mcp/
COPY src/Atamaia.Server/*.csproj Atamaia.Server/
RUN dotnet restore Atamaia.Server/Atamaia.Server.csproj

# Build
COPY src/ .
RUN dotnet publish Atamaia.Server/Atamaia.Server.csproj \
    -c Release -o /app --no-restore

# Runtime
FROM mcr.microsoft.com/dotnet/aspnet:10.0-alpine
WORKDIR /app
RUN apk add --no-cache curl  # for healthcheck
COPY --from=build /app .
EXPOSE 5000
ENTRYPOINT ["dotnet", "Atamaia.Server.dll"]

Database Initialization

-- init-db.sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- EF Core migrations create the remaining schema (applied explicitly in Production; see EF Core Migrations)

Caddy Configuration (Origin)

# /etc/caddy/Caddyfile
{
    auto_https off
    admin localhost:2019
}

(cloudflare_tls) {
    tls /etc/ssl/cloudflare/atamaia-ai.pem /etc/ssl/cloudflare/atamaia-ai-key.pem
}

aim.atamaia.ai {
    import cloudflare_tls

    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "SAMEORIGIN"
        -Server
    }

    encode zstd gzip

    # WebSocket support
    handle /ws* {
        reverse_proxy localhost:5000 {
            header_up Connection {http.request.header.Connection}
            header_up Upgrade {http.request.header.Upgrade}
            transport http {
                read_timeout 0s
                write_timeout 0s
            }
        }
    }

    # CORS
    @cors_preflight method OPTIONS
    handle @cors_preflight {
        header Access-Control-Allow-Origin "https://app.atamaia.ai"
        header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS"
        header Access-Control-Allow-Headers "Authorization, Content-Type, Accept"
        header Access-Control-Max-Age "86400"
        respond "" 204
    }

    # API proxy
    handle {
        header Access-Control-Allow-Origin "https://app.atamaia.ai"
        reverse_proxy localhost:5000 {
            transport http {
                dial_timeout 5s
                response_header_timeout 30s
            }
            health_uri /health
            health_interval 10s
        }
    }
}
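
The CORS rules above can be checked from any machine once the proxy is running. The following is a minimal sketch, not part of Atamaia: the function name check_cors_origin is hypothetical, and it only inspects the first Access-Control-Allow-Origin header in a response.

```shell
# Reads HTTP response headers on stdin and succeeds only if
# Access-Control-Allow-Origin equals the expected origin.
check_cors_origin() {
  expected="$1"
  actual=$(tr -d '\r' | awk -F': ' \
    'tolower($1) == "access-control-allow-origin" { print $2; exit }')
  [ "$actual" = "$expected" ]
}

# Usage: send a preflight request and check the echoed origin.
#   curl -s -o /dev/null -D - -X OPTIONS \
#     -H 'Origin: https://app.atamaia.ai' \
#     -H 'Access-Control-Request-Method: POST' \
#     https://aim.atamaia.ai/health | check_cors_origin 'https://app.atamaia.ai' \
#     && echo "CORS origin OK"
```

A failing check usually means the request did not reach the handle blocks shown above, or that the API adds its own CORS header ahead of Caddy's.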

Cloudflare Configuration

  1. DNS: Add A records pointing to your VPS IP for aim.atamaia.ai. Set proxy status to "Proxied" (orange cloud).
  2. SSL/TLS: Set encryption mode to "Full (strict)". Generate a Cloudflare Origin CA certificate and install it on the origin server.
  3. Page Rules: Cache static assets. Bypass cache for /api/*.
  4. Cloudflare Pages: Deploy the SPA build (npm run build output) for app.atamaia.ai.

Frontend Deployment

# Build the SPA (npm ci assumes package-lock.json is committed)
cd src/Atamaia.Web
npm ci
npm run build

# Deploy to Cloudflare Pages
npx wrangler pages deploy dist --project-name=atamaia-app

Backup Strategy (Tier 1)

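#!/bin/bash
# /etc/cron.daily/atamaia-backup
# Abort on the first failure so a broken dump never triggers rotation
set -euo pipefail

TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/home/backups/atamaia"
mkdir -p "$BACKUP_DIR"

# PostgreSQL dump. The custom format (-Fc) is already compressed, so no gzip pass is needed.
# The container name depends on the Compose project name; adjust if yours differs.
docker exec atamaia-db-1 pg_dump -U atamaia -Fc atamaia > "$BACKUP_DIR/atamaia_$TIMESTAMP.dump"

# Rotate (keep 14 days)
find "$BACKUP_DIR" -name "atamaia_*.dump*" -mtime +14 -delete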

# Sync to object storage (optional, recommended)
# rclone sync "$BACKUP_DIR" r2:atamaia-backups/

Monitoring (Tier 1)

Free tools that provide adequate visibility:

Tool | What It Monitors | Cost
Cloudflare Analytics | Traffic, cache hit rate, threats | Free
docker stats | Container CPU/memory | Free
Uptime Kuma (self-hosted) | Endpoint availability, response time | Free
PostgreSQL pg_stat_statements | Slow queries | Free (bundled extension, must be enabled)
Atamaia audit logs | API usage, errors | Built-in

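Add Uptime Kuma as another service in docker-compose.yml, and declare its uptime-kuma volume under the top-level volumes key next to pgdata (Compose rejects a service that references an undeclared named volume):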

  uptime-kuma:
    image: louislam/uptime-kuma:1
    restart: unless-stopped
    volumes:
      - uptime-kuma:/app/data
    ports:
      - "127.0.0.1:3001:3001"

Tier 2: Growth (500-1000 Users) -- Moderate Spend

Target: Separate concerns, add redundancy, prepare for scale.

Estimated cost: ~30-50 EUR/month

Changes from Tier 1

Component | Tier 1 | Tier 2
Database | Co-located Docker | Managed PostgreSQL (Hetzner Managed DB or Neon)
API | Single container | 1-2 VPS behind Cloudflare load balancing
Caching | None | Redis for sessions, rate limiting, hot data
Frontend | Cloudflare Pages | Cloudflare Pages (no change)
Backups | Daily cron dump | Managed DB automated backups + WAL archiving

Infrastructure

Component | Where | Cost
API (x2) | Hetzner CX22 (2 vCPU, 4GB) x2 | ~14 EUR/mo
Database | Hetzner Managed PostgreSQL (or Neon Pro) | ~15-25 EUR/mo
Redis | Hetzner CX11 (or co-located on API node) | ~4 EUR/mo
Frontend | Cloudflare Pages | Free
DNS + CDN | Cloudflare Pro (optional, free works) | 0-20 EUR/mo

Docker Compose (API Node)

version: "3.9"

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
    restart: unless-stopped
    environment:
      ASPNETCORE_ENVIRONMENT: Production
      ASPNETCORE_URLS: http://+:5000
      ConnectionStrings__DefaultConnection: "Host=${DB_HOST};Database=atamaia;Username=atamaia;Password=${DB_PASSWORD};SslMode=Require"
      ConnectionStrings__Redis: "${REDIS_HOST}:6379"
      Jwt__SecretKey: ${JWT_SECRET}
      Encryption__MasterKey: ${ENCRYPTION_KEY}
    ports:
      - "127.0.0.1:5000:5000"
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:5000/health"]
      interval: 15s
    deploy:
      resources:
        limits:
          memory: 1536M

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    ports:
      - "127.0.0.1:6379:6379"

volumes:
  redis-data:

Load Balancing

Distribute traffic across the two API nodes with Cloudflare load balancing, or with simple DNS round-robin plus health checks. Both nodes are stateless: all state lives in PostgreSQL and Redis, so either node can serve any request.

Database Migration (Tier 1 to Tier 2)

# 1. Put API in maintenance mode
# 2. Final backup from Docker PostgreSQL
docker exec atamaia-db-1 pg_dump -U atamaia -Fc atamaia > final_backup.dump

# 3. Restore to managed PostgreSQL (make sure the vector and pg_trgm extensions are available there first)
pg_restore -h managed-db-host -U atamaia -d atamaia final_backup.dump

# 4. Update connection string in API configuration
# 5. Restart API containers
# 6. Verify data integrity
# 7. Remove old Docker PostgreSQL container

Backup Strategy (Tier 2)

  • Managed DB: Automated daily backups with point-in-time recovery (PITR) -- typically included in managed DB pricing
  • WAL archiving: Continuous, enables recovery to any point in time
  • Cross-region: Replicate backups to a different Hetzner datacenter or Cloudflare R2
  • Test restores monthly: Spin up a temporary instance from backup, run health checks, tear down
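
The monthly restore test can be scripted. The sketch below assumes logical dumps named like the Tier 1 backup script's output (atamaia_TIMESTAMP.dump, optionally gzip-wrapped) and a host with Docker. The function names and the container name atamaia-restore-test are hypothetical, and the sanity query only counts restored tables; a real check would query application data.

```shell
# Newest dump in a directory. Timestamped names sort chronologically,
# so the last glob match is the most recent backup.
latest_dump() {
  newest=""
  for f in "$1"/atamaia_*.dump*; do
    [ -e "$f" ] && newest="$f"
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Restore a dump into a throwaway pgvector container and run a sanity query.
restore_check() {
  dump="$1"
  name="atamaia-restore-test"   # hypothetical container name
  docker run -d --rm --name "$name" \
    -e POSTGRES_DB=atamaia -e POSTGRES_USER=atamaia \
    -e POSTGRES_PASSWORD=restore-test \
    pgvector/pgvector:pg16 >/dev/null || return 1

  # Wait (up to about 60 s) until the server accepts TCP connections.
  tries=0
  until docker exec "$name" pg_isready -h 127.0.0.1 -U atamaia >/dev/null 2>&1; do
    tries=$((tries + 1))
    if [ "$tries" -ge 60 ]; then docker stop "$name" >/dev/null; return 1; fi
    sleep 1
  done

  # Accept both plain .dump files and gzip-wrapped .dump.gz files.
  case "$dump" in
    *.gz) gunzip -c "$dump" ;;
    *)    cat "$dump" ;;
  esac | docker exec -i "$name" pg_restore -U atamaia -d atamaia --no-owner
  rc=$?

  # Sanity query: number of tables restored into the public schema.
  docker exec "$name" psql -U atamaia -d atamaia -tAc \
    "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'"
  docker stop "$name" >/dev/null
  return "$rc"
}

# Usage:
#   restore_check "$(latest_dump /home/backups/atamaia)"
```

Running the restore in a disposable container keeps the test away from production data and leaves nothing behind once the container stops.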

Monitoring (Tier 2)

Add structured monitoring:

Tool | What | Cost
Grafana Cloud (free tier) | Dashboards, alerting | Free (up to 10k metrics)
Prometheus (self-hosted) | Metrics collection | Free
Loki (self-hosted) | Log aggregation | Free
Managed DB monitoring | Query performance, connections, disk | Included

Tier 3: Scale (1000+ Users) -- Production Grade

Target: Full production infrastructure with high availability, CI/CD, and observability.

Estimated cost: Scales with usage, starting ~100 EUR/month

Infrastructure

Component | Where | Cost
API | Kubernetes (Hetzner Cloud, k3s) or managed containers | Variable
Database | Managed PostgreSQL with read replicas | ~50+ EUR/mo
Redis | Managed Redis (or Redis Cluster) | ~15+ EUR/mo
Frontend | Cloudflare Pages | Free
CDN | Cloudflare Pro/Business | 20-200 EUR/mo
CI/CD | GitHub Actions / Forgejo Actions | Free-20 EUR/mo
Monitoring | Grafana Cloud or self-hosted stack | 0-50 EUR/mo
Object storage | Cloudflare R2 (backups, exports) | ~0-5 EUR/mo

Kubernetes Deployment

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: atamaia-api
spec:
  replicas: 3  # starting count; the HorizontalPodAutoscaler below then scales between 2 and 10
  selector:
    matchLabels:
      app: atamaia-api
  template:
    metadata:
      labels:
        app: atamaia-api
    spec:
      containers:
        - name: api
          image: registry.example.com/atamaia-api:latest  # placeholder; the CI pipeline pins the commit SHA
          ports:
            - containerPort: 5000
          env:
            - name: ASPNETCORE_ENVIRONMENT
              value: Production
            - name: ConnectionStrings__DefaultConnection
              valueFrom:
                secretKeyRef:
                  name: atamaia-secrets
                  key: db-connection-string
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 5000
            initialDelaySeconds: 10
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /health
              port: 5000
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: atamaia-api
spec:
  selector:
    app: atamaia-api
  ports:
    - port: 80
      targetPort: 5000
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: atamaia-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: atamaia-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Database at Scale

  • Primary + read replica(s): Write to primary, read from replicas for search-heavy endpoints
  • Connection pooling: PgBouncer in front of PostgreSQL (or built-in Npgsql pooling)
  • pgvector indexes: Maintain HNSW indexes on embedding columns for sub-100ms vector search
  • Partitioning: Consider partitioning the memories table by tenant if any single tenant exceeds ~10M rows
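
As an illustration of the pgvector bullet, the sketch below prints the DDL for an HNSW index. It rests on several assumptions: the embedding column name is hypothetical, identifiers are unquoted lowercase (EF Core may generate quoted mixed-case names), and queries use cosine distance, so the operator class is vector_cosine_ops. Use vector_l2_ops or vector_ip_ops if the application queries with a different distance.

```shell
# Prints the DDL for an HNSW index using cosine distance.
# CONCURRENTLY avoids blocking writes while the index builds.
hnsw_index_sql() {
  table="$1"
  column="$2"
  printf 'CREATE INDEX CONCURRENTLY IF NOT EXISTS %s_%s_hnsw ON %s USING hnsw (%s vector_cosine_ops);\n' \
    "$table" "$column" "$table" "$column"
}

# Usage (DATABASE_URL is a placeholder for the primary's connection URI):
#   hnsw_index_sql memories embedding | psql -v ON_ERROR_STOP=1 "$DATABASE_URL"
```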

CI/CD Pipeline

# .github/workflows/deploy.yml
name: Deploy
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: pgvector/pgvector:pg16
        env:
          POSTGRES_DB: atamaia_test
          POSTGRES_PASSWORD: test
        ports: ["5432:5432"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: "10.0.x"
      - run: dotnet test --configuration Release

  build-push:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Add a registry login step here (for example docker/login-action) before pushing
      - run: docker build -t registry.example.com/atamaia-api:${{ github.sha }} .
      - run: docker push registry.example.com/atamaia-api:${{ github.sha }}

  deploy:
    needs: build-push
    runs-on: ubuntu-latest
    steps:
      # Requires a kubeconfig for the target cluster, for example supplied from a repository secret
      - run: kubectl set image deployment/atamaia-api api=registry.example.com/atamaia-api:${{ github.sha }}
      - run: kubectl rollout status deployment/atamaia-api --timeout=300s
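
The deploy job above fails when the rollout times out, but it leaves the Deployment pointing at the new image. A minimal sketch of a wrapper that reverts to the previous revision in that case; the function name deploy_or_rollback is hypothetical, and the script assumes kubectl is already configured for the cluster:

```shell
# Update the image, wait for the rollout, and undo it if it does not complete.
deploy_or_rollback() {
  image="$1"
  kubectl set image deployment/atamaia-api "api=$image" || return 1
  if ! kubectl rollout status deployment/atamaia-api --timeout=300s; then
    echo "rollout failed, reverting to the previous revision" >&2
    kubectl rollout undo deployment/atamaia-api
    return 1
  fi
}

# Usage inside the deploy job, replacing the two kubectl steps:
#   deploy_or_rollback "registry.example.com/atamaia-api:$GITHUB_SHA"
```

The function still returns a non-zero status after reverting, so the pipeline run is marked as failed.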

Backup Strategy (Tier 3)

  • Automated PITR with 7-day retention on managed PostgreSQL
  • Daily logical backups to Cloudflare R2 (cross-provider redundancy)
  • Weekly backup restore tests (automated via CI)
  • Encryption at rest for all backups
  • Geo-redundant storage for disaster recovery

Monitoring Stack

Layer | Tool | Monitors
Infrastructure | Prometheus + Grafana | CPU, memory, disk, network
Application | .NET metrics + Prometheus | Request rate, latency (p50/p95/p99), error rate
Database | pg_stat_statements + Grafana | Slow queries, connection count, replication lag
Logs | Loki + Grafana | Structured JSON logs, error aggregation
Uptime | Cloudflare Health Checks | External endpoint availability
Alerting | Grafana Alerting | PagerDuty/Slack/email for SLA breaches
APM | OpenTelemetry (optional) | Distributed tracing across API calls

Key alerts to configure:

Alert | Threshold | Action
API error rate | > 1% for 5 minutes | Page on-call
API p99 latency | > 2s for 5 minutes | Investigate
Database connections | > 80% of pool | Scale or tune
Disk usage | > 80% | Expand volume
Memory usage | > 85% | Scale or investigate leak
Replication lag | > 30s | Check replica health

Database Migration Strategy Between Tiers

Tier 1 to Tier 2 (Docker PostgreSQL to Managed)

  1. Schedule a maintenance window (15-30 minutes for small databases)
  2. Stop API containers (docker compose stop api)
  3. Export: docker exec atamaia-db-1 pg_dump -U atamaia -Fc atamaia > backup.dump
  4. Create managed PostgreSQL instance with pgvector extension
  5. Restore: pg_restore -h new-host -U atamaia -d atamaia backup.dump
  6. Update API connection strings
  7. Start API containers
  8. Verify with health check and a test hydration call
  9. Monitor for 24 hours before decommissioning old database
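
One way to back up the verification in step 8 is to compare per-table row counts between the old and new databases while the API is still stopped. This is a sketch under assumptions: the function name table_counts is hypothetical, all application tables are taken to live in the public schema, and OLD_URL and NEW_URL are placeholder connection URIs.

```shell
# Prints "table<TAB>rows" for every table in the public schema.
table_counts() {
  uri="$1"
  psql -At -c \
    "SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY 1" "$uri" |
  while IFS= read -r t; do
    # stdin is redirected so the inner psql cannot consume the table list
    n=$(psql -At -c "SELECT count(*) FROM public.\"$t\"" "$uri" </dev/null)
    printf '%s\t%s\n' "$t" "$n"
  done
}

# Usage:
#   table_counts "$OLD_URL" > old-counts.txt
#   table_counts "$NEW_URL" > new-counts.txt
#   diff old-counts.txt new-counts.txt && echo "row counts match"
```

Matching counts do not prove the data is identical, but a mismatch is a cheap early signal that the restore was incomplete.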

Tier 2 to Tier 3 (Single Managed to Replicated)

  1. Add read replica to managed PostgreSQL
  2. Configure Npgsql connection string with read/write splitting or use PgBouncer
  3. Update API configuration to use read replica for search endpoints
  4. No downtime required -- replica catches up from WAL stream

EF Core Migrations

Atamaia uses EF Core migrations. On each deployment:

# Apply pending migrations
dotnet ef database update --project src/Atamaia.Server

# Or via the API on startup (configured in Program.cs)
# Migrations run automatically if ASPNETCORE_ENVIRONMENT != Production
# For production, apply migrations explicitly before deployment
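
For the explicit production path, one option is to generate an idempotent SQL script with the EF Core tools and apply it with psql, which avoids needing the .NET SDK on the database host. The sketch below is an assumption about workflow, not Atamaia's documented procedure; the function name apply_migrations and the default file name migrate.sql are hypothetical.

```shell
# Generate one idempotent SQL script covering all migrations, then apply it,
# stopping at the first error.
apply_migrations() {
  db_uri="$1"
  script="${2:-migrate.sql}"
  dotnet ef migrations script --idempotent \
    --project src/Atamaia.Server --output "$script" || return 1
  psql -v ON_ERROR_STOP=1 -f "$script" "$db_uri"
}

# Usage (DATABASE_URL is a placeholder connection URI):
#   apply_migrations "$DATABASE_URL"
```

Because the script is idempotent, it can be reviewed before deployment and re-run safely against a database that already has some migrations applied.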

Environment Variables Reference

Variable | Description | Example
ConnectionStrings__DefaultConnection | PostgreSQL connection string | Host=db;Database=atamaia;Username=atamaia;Password=secret
ConnectionStrings__Redis | Redis connection string | localhost:6379
Jwt__SecretKey | JWT signing key (256+ bits) | your-secret-key-here
Encryption__MasterKey | AES-256 master key for at-rest encryption | base64-encoded-key
ASPNETCORE_ENVIRONMENT | Runtime environment | Production
ASPNETCORE_URLS | Listening URLs | http://+:5000
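
The Compose files read DB_PASSWORD, JWT_SECRET, and ENCRYPTION_KEY from the environment. A minimal sketch for generating them into a .env file follows. It assumes Docker Compose picks up .env from the project directory (its default behaviour) and that the API accepts a base64 string for both keys; the function names are hypothetical.

```shell
# 32 random bytes, base64-encoded (44 characters): enough for a 256-bit key.
gen_key() {
  head -c 32 /dev/urandom | base64 | tr -d '\n'
}

# 24 random bytes as hex (48 characters): no ';' or '=' that could
# interfere with the connection string.
gen_password() {
  head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Write the three secrets to the given file. The restrictive umask only
# affects a newly created file, so remove any old copy first.
write_env() {
  ( umask 077
    {
      printf 'DB_PASSWORD=%s\n' "$(gen_password)"
      printf 'JWT_SECRET=%s\n' "$(gen_key)"
      printf 'ENCRYPTION_KEY=%s\n' "$(gen_key)"
    } > "$1" )
}

# Usage:
#   write_env .env
```

Keep the generated file out of version control. Regenerating DB_PASSWORD after the database volume has been initialized will lock the API out until the PostgreSQL role's password is changed to match.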

Security Checklist

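  • Cloudflare proxy enabled (orange cloud) -- origin IP not published in DNS; firewall the origin to accept HTTPS only from Cloudflare IP ranges so it cannot be reached directly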
  • Origin CA certificate installed (not self-signed, not Let's Encrypt behind CF)
  • Jwt__SecretKey is a strong random value (32+ bytes)
  • Encryption__MasterKey stored in environment, not in config files
  • PostgreSQL only accepts connections from the API host (firewall or VPC)
  • Redis bound to localhost or private network only
  • API keys are scoped with minimal permissions
  • Rate limiting enabled on auth endpoints
  • CORS restricted to app.atamaia.ai (not *)
  • Soft delete only -- no accidental data loss
  • Audit logging enabled for all mutations
  • Regular backup restore tests