Architecture

┌────────────────────────┐      Tailscale      ┌──────────────────────────────────────────────────┐
│  Mac Client (Tauri)    │ ─────────────────── │  Self-Hosted Host                                │
│                        │        :3000        │  ┌──────────┐   ┌────────────────────────────┐   │
│  React + Vite + TS     │                     │  │ Fastify  │──►│ OpenClaw Gateway           │   │
│  TanStack Query        │                     │  │ API      │   │ (:18789, local only)       │   │
│  Global hotkey + tray  │                     │  │          │   │ Reads MCP server from /sse │   │
│                        │                     │  └────┬─────┘   └────────────────────────────┘   │
└────────────────────────┘                     │       │                                          │
                                               │  ┌────▼─────┐   ┌────────────────────────────┐   │
┌────────────────────────┐                     │  │PostgreSQL│   │ Background workers         │   │
│  Admin SPA (Preact)    │                     │  │+ pgvector│   │ (pg-boss + node-cron)      │   │
│  /admin/ on :3000      │                     │  │+ pg-boss │   │  • Harvest queue           │   │
│                        │                     │  └──────────┘   │  • GitHub sync (hourly)    │   │
│  Setup wizard          │                     │                 │  • Memory distill (2 AM)   │   │
│  Connections           │                     │                 │  • Folio synthesis (3 AM)  │   │
│  Model routing         │                     │                 │  • Dream cycle (3:30 AM)   │   │
│  OAuth apps            │                     │                 │  • Corpus curation (5 AM)  │   │
│  Sync rules            │                     │                 └────────────────────────────┘   │
└────────────────────────┘                     └──────────────────────────────────────────────────┘

The host is a single Node.js process that boots in this order:

  1. Loads .env.<env> via scripts/with-env.sh
  2. Refuses to start if HOST_MASTER_KEY isn’t set
  3. Registers ~40 Fastify route plugins under /api/v1/
  4. Mounts the embedded Admin SPA at /admin/ from admin/dist/
  5. Starts the OpenClaw gateway daemon (and monitors its health every 30s)
  6. Starts the pg-boss queues (harvest, summary, curation, import-commit, agent-task)
  7. If DEFAULT_WORKSPACE_ID is set, registers cron jobs for GitHub / Calendar / Gmail / dreams / curation
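
The boot order above can be sketched as a small planning function. This is an illustration only, not the host's actual bootstrap code: `planBoot` and the step labels are hypothetical names, and only the `HOST_MASTER_KEY` and `DEFAULT_WORKSPACE_ID` variables come from the list above. The environment is passed in as a plain object, assuming `.env.<env>` has already been loaded.

```typescript
// Hypothetical sketch of the documented boot order; not the real host code.
type Env = Record<string, string | undefined>;

function planBoot(env: Env): string[] {
  // Step 2: fail fast when the master key is missing or empty.
  if (!env["HOST_MASTER_KEY"]) {
    throw new Error("HOST_MASTER_KEY is not set; refusing to start");
  }
  const steps = [
    "register /api/v1/ route plugins",
    "mount Admin SPA at /admin/",
    "start OpenClaw gateway and 30s health monitor",
    "start pg-boss queues",
  ];
  // Step 7: cron jobs are registered only when a default workspace is configured.
  if (env["DEFAULT_WORKSPACE_ID"]) {
    steps.push("register cron jobs");
  }
  return steps;
}
```

Under this sketch, a host started without `DEFAULT_WORKSPACE_ID` still serves the API and queues but schedules no cron jobs.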

The bare /health endpoint and /api/v1/health skip workspace-context middleware so the desktop poller never gets a false-negative “host unreachable” reading.
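
One way to express that exemption is a path predicate that the middleware consults before resolving a workspace. This is a minimal sketch under assumptions: `needsWorkspaceContext` and `CONTEXT_FREE_PATHS` are hypothetical names, and the real host may wire this differently inside Fastify.

```typescript
// Hypothetical predicate: which request URLs bypass workspace-context middleware.
const CONTEXT_FREE_PATHS = new Set<string>(["/health", "/api/v1/health"]);

function needsWorkspaceContext(url: string): boolean {
  // Strip any query string so "/health?ts=1" is still exempt.
  const queryStart = url.indexOf("?");
  const path = queryStart === -1 ? url : url.slice(0, queryStart);
  return !CONTEXT_FREE_PATHS.has(path);
}
```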

The OpenClaw gateway is a separate process that the host manages. It listens on localhost:18789 (override with OPENCLAW_GATEWAY_URL) and holds the agent runtime: chat sessions, tool execution, memory injection, and model routing for the chat path.

The host talks to it over an authenticated bearer-token boundary — OPENCLAW_GATEWAY_PASSWORD must match in both .env.<env> and ~/.openclaw/config.toml. See OpenClaw setup.
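
A rough sketch of how the host side might assemble an authenticated request to the gateway. The helper name `gatewayRequest`, the `http://` scheme in the default URL, and the exact `Authorization: Bearer` header shape are assumptions for illustration; only the two environment variable names and the default port come from the text.

```typescript
// Sketch only: the header shape and helper name are assumptions, not OpenClaw's documented API.
type Env = Record<string, string | undefined>;

interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
}

function gatewayRequest(env: Env): GatewayRequest {
  const password = env["OPENCLAW_GATEWAY_PASSWORD"];
  if (!password) {
    throw new Error("OPENCLAW_GATEWAY_PASSWORD is not set");
  }
  return {
    // Default mirrors the documented localhost:18789 listener (scheme assumed).
    url: env["OPENCLAW_GATEWAY_URL"] ?? "http://localhost:18789",
    headers: { Authorization: `Bearer ${password}` },
  };
}
```

If the value in .env.<env> differs from the one in ~/.openclaw/config.toml, the gateway would be expected to reject requests built this way.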

PostgreSQL runs one database per environment (carabase_dev, carabase_staging, carabase_prod). The 24 workspace-scoped tables all carry Row-Level Security policies that enforce tenant isolation at the database layer, even if application code forgets a WHERE workspace_id = ? clause.
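
To make the Row-Level Security idea concrete, here is a hedged sketch that generates the kind of DDL such a policy could use. The policy name `workspace_isolation`, the `app.workspace_id` session setting, and the `uuid` cast are all hypothetical; the actual Carabase migrations may differ.

```typescript
// Generates illustrative RLS DDL for one workspace-scoped table.
// Assumptions: policy name, app.workspace_id setting, and uuid column type are hypothetical.
function rlsStatements(table: string): string[] {
  // Identifiers cannot be bound as parameters, so accept only simple names.
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error(`unsafe table name: ${table}`);
  }
  return [
    `ALTER TABLE ${table} ENABLE ROW LEVEL SECURITY;`,
    // FORCE applies the policy to the table owner as well.
    `ALTER TABLE ${table} FORCE ROW LEVEL SECURITY;`,
    `CREATE POLICY workspace_isolation ON ${table} ` +
      `USING (workspace_id = current_setting('app.workspace_id')::uuid);`,
  ];
}
```

With a policy of this shape, each request would set the session value first (for example `SELECT set_config('app.workspace_id', $1, true)` inside a transaction), and a query that omits the workspace filter returns only rows for that workspace.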

pgvector powers semantic search; pg-boss reuses the same Postgres for the job queues so there’s no separate Redis dependency.
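
As an illustration of the pgvector side, the sketch below builds a parameterized nearest-neighbour query using pgvector's cosine-distance operator `<=>`. The `chunks` table and its `id`, `content`, and `embedding` columns are hypothetical; the real schema is not described here.

```typescript
// Assumed schema (hypothetical): chunks(id, content, embedding vector).
interface SqlQuery {
  text: string;
  values: [string, number];
}

function semanticSearchQuery(embedding: number[], limit: number): SqlQuery {
  // pgvector accepts a text literal such as '[0.1,0.2,0.3]' cast to vector.
  const literal = `[${embedding.join(",")}]`;
  return {
    text:
      "SELECT id, content, embedding <=> $1::vector AS distance " +
      "FROM chunks ORDER BY embedding <=> $1::vector LIMIT $2",
    values: [literal, limit],
  };
}
```

The returned object has the `text` and `values` shape that node-postgres accepts for parameterized queries, so it could be passed to a client's `query` call as-is.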

Background workers are cron-driven and run in-process:

Job              Cron                     What it does
github-sync      0 * * * *                Pulls PRs and issues for configured repos
calendar-sync    0 7,9,11,13,15,17 * * *  Refreshes today’s events
gmail-sync       */30 * * * *             New labeled emails
memory-distill   0 2 * * *                Extracts persistent facts from yesterday’s docs
folio-synthesis  0 3 * * *                LLM-generated folio summaries
dream-cycle      30 3 * * *               OpenClaw DREAMS.md bridge + Latent Synthesis Engine
plugin-sync      0 4 * * *                Plugin connectors (Granola, etc.)
corpus-curation  0 5 * * *                Suggests alias merges, role enrichment, etc.

In addition, pg-boss queues fire on demand (harvest, summary, import-commit, agent-task).
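
To show how the five-field expressions in the schedule read, here is a minimal matcher that checks whether a given local time satisfies an expression. It is a teaching sketch, not node-cron's implementation: it handles only `*`, `*/n`, single numbers, and comma lists, and it requires every field to match (real cron treats day-of-month and day-of-week specially when both are restricted).

```typescript
// Minimal matcher for 5-field cron expressions: minute hour day-of-month month day-of-week.
// Supports "*", "*/n", single numbers, and comma lists; ranges and names are not handled.
const FIELD_MIN = [0, 0, 1, 1, 0];

function fieldMatches(field: string, value: number, min: number): boolean {
  return field.split(",").some((part) => {
    if (part === "*") return true;
    if (part.startsWith("*/")) {
      const step = Number(part.slice(2));
      // Steps count from the field's minimum (0 for minutes, 1 for days).
      return step > 0 && (value - min) % step === 0;
    }
    return Number(part) === value;
  });
}

function cronMatches(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`expected 5 fields, got ${fields.length}`);
  }
  const values = [
    date.getMinutes(),
    date.getHours(),
    date.getDate(),
    date.getMonth() + 1, // cron months are 1-12
    date.getDay(), // 0 = Sunday
  ];
  // Simplification: every field must match.
  return values.every((value, i) =>
    fieldMatches(fields[i] ?? "*", value, FIELD_MIN[i] ?? 0),
  );
}
```

For example, the calendar-sync expression matches 09:00 but not 10:00, and the gmail-sync expression matches minute 0 and minute 30 of every hour.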

Carabase exposes its retrieval layer as a Model Context Protocol server at /mcp/sse. OpenClaw consumes this (and so could Claude Desktop or any other MCP client) to give an agent access to:

  • carabase_search_semantic — pgvector semantic search
  • carabase_search_graph — knowledge graph traversal with provenance + confidence filtering
  • carabase_query_metadata — structured queries (entity by name, folio by id, date ranges)
  • carabase_find_entity_candidates — disambiguation lookup
  • carabase_route_and_execute — cross-strategy router
  • carabase_verify_hypothesis — checks whether the evidence corroborates or contradicts a claim
  • commit_to_folio — write LLM-generated prose into a folio

See MCP tools reference for the full surface.
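
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for one of the tools listed above. The argument names (`query`, `limit`) are hypothetical placeholders; the real input schema for each tool is in the MCP tools reference, and how the message is delivered depends on the transport the client uses.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation ("tools/call").
// Argument names passed below (query, limit) are hypothetical.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const exampleCall = buildToolCall(1, "carabase_search_semantic", {
  query: "Q3 roadmap",
  limit: 5,
});
```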

The Admin SPA is a lightweight Preact + Vite single-page app served from admin/dist/ at http://localhost:3000/admin/. It is about 21 KB gzipped and has 10 pages: Setup (first-run wizard), Connections, OAuth Apps, Sync Rules, AI Engine, Skills, Web Extraction, Owner Profile, Import, and Workspace Settings.

Headless deployments (no Mac client) use this as the entire configuration surface.

What the architecture deliberately leaves out:

  • No SaaS dashboard: there is no app.carabase.dev to log into
  • No multi-user accounts: single-tenant, one workspace per host
  • No public-internet listener: the host binds to all interfaces, so its safety depends on the Tailscale-only network model
  • No external orchestrator: pg-boss + node-cron run in-process, with no Kubernetes and no separate scheduler