Architecture
The five things in the box
┌────────────────────────┐      Tailscale      ┌──────────────────────────────────────────────────┐
│  Mac Client (Tauri)    │ ─────────────────── │                 Self-Hosted Host                 │
│                        │        :3000        │  ┌──────────┐   ┌────────────────────────────┐   │
│  React + Vite + TS     │                     │  │ Fastify  │──►│ OpenClaw Gateway           │   │
│  TanStack Query        │                     │  │ API      │   │ (:18789, local only)       │   │
│  Global hotkey + tray  │                     │  │          │   │ Reads MCP server from /sse │   │
│                        │                     │  └────┬─────┘   └────────────────────────────┘   │
└────────────────────────┘                     │       │                                          │
                                               │  ┌────▼─────┐   ┌────────────────────────────┐   │
┌────────────────────────┐                     │  │PostgreSQL│   │ Background workers         │   │
│  Admin SPA (Preact)    │                     │  │+ pgvector│   │ (pg-boss + node-cron)      │   │
│  /admin/ on :3000      │                     │  │+ pg-boss │   │  • Harvest queue           │   │
│                        │                     │  └──────────┘   │  • GitHub sync (hourly)    │   │
│  Setup wizard          │                     │                 │  • Memory distill (2 AM)   │   │
│  Connections           │                     │                 │  • Folio synthesis (3 AM)  │   │
│  Model routing         │                     │                 │  • Dream cycle (3:30 AM)   │   │
│  OAuth apps            │                     │                 │  • Corpus curation (5 AM)  │   │
│  Sync rules            │                     │                 └────────────────────────────┘   │
└────────────────────────┘                     └──────────────────────────────────────────────────┘

The host process
A single Node.js process. Boots in this order:
- Loads .env.<env> via scripts/with-env.sh
- Refuses to start if HOST_MASTER_KEY isn’t set
- Registers ~40 Fastify route plugins under /api/v1/
- Mounts the embedded Admin SPA at /admin/ from admin/dist/
- Starts the OpenClaw gateway daemon (and monitors its health every 30s)
- Starts the pg-boss queues (harvest, summary, curation, import-commit, agent-task)
- If DEFAULT_WORKSPACE_ID is set, registers cron jobs for GitHub / Calendar / Gmail / dreams / curation
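The two conditional steps in the boot order above (the fail-fast check and the optional cron registration) can be sketched as a pure function. This is a minimal illustration, not the host's actual startup code: `planBoot`, the `Env` type, and the step labels are hypothetical names; only the two environment variables come from the list above.

```typescript
// Hypothetical sketch of the boot order; names are illustrative only.
type Env = Record<string, string | undefined>;

function planBoot(env: Env): string[] {
  // Fail fast: the host refuses to start without a master key.
  if (!env["HOST_MASTER_KEY"]) {
    throw new Error("HOST_MASTER_KEY is not set; refusing to start");
  }

  // Unconditional steps, in the order the list gives them.
  const steps: string[] = [
    "register route plugins under /api/v1/",
    "mount Admin SPA at /admin/",
    "start OpenClaw gateway and its 30s health monitor",
    "start pg-boss queues",
  ];

  // Cron jobs are registered only when a default workspace is configured.
  if (env["DEFAULT_WORKSPACE_ID"]) {
    steps.push("register cron jobs");
  }
  return steps;
}
```

Keeping the decision logic separate from the side effects makes the two conditions easy to test without starting a server.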
The bare /health endpoint and /api/v1/health skip workspace-context middleware so the desktop poller never gets a false-negative “host unreachable” reading.
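One way to picture the health-check bypass is a small predicate that the workspace-context middleware could consult before doing any work. This is a sketch under assumptions: `skipsWorkspaceContext` is a hypothetical helper, and the real host may wire the exemption differently (for example, by registering the health routes outside the scoped plugin).

```typescript
// Hypothetical helper: true when a request URL should bypass
// workspace-context middleware. The two paths come from the text above.
function skipsWorkspaceContext(url: string): boolean {
  // Ignore any query string so "/health?probe=1" still matches.
  const queryStart = url.indexOf("?");
  const path = queryStart === -1 ? url : url.slice(0, queryStart);
  return path === "/health" || path === "/api/v1/health";
}
```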
The OpenClaw gateway
A separate process the host manages. Listens on localhost:18789 (override with OPENCLAW_GATEWAY_URL). Holds the agent runtime: chat sessions, tool execution, memory injection, model routing for the chat path.
The host talks to it over an authenticated bearer-token boundary — OPENCLAW_GATEWAY_PASSWORD must match in both .env.<env> and ~/.openclaw/config.toml. See OpenClaw setup.
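As a rough illustration of that boundary, the sketch below resolves the gateway address and attaches the shared secret as a bearer token. Several details are assumptions rather than documented behavior: the `http://` scheme, sending the password verbatim in an `Authorization: Bearer` header, and the `/status` path used in the usage note. `buildGatewayRequest` is a hypothetical name.

```typescript
type Env = Record<string, string | undefined>;

interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds the URL and auth header for a gateway call.
function buildGatewayRequest(env: Env, path: string): GatewayRequest {
  // Default address from the text; the http:// scheme is an assumption.
  const base = env["OPENCLAW_GATEWAY_URL"] || "http://localhost:18789";
  const password = env["OPENCLAW_GATEWAY_PASSWORD"];
  if (!password) {
    throw new Error("OPENCLAW_GATEWAY_PASSWORD is not set");
  }
  return {
    // Trim trailing slashes so base + path never produces "//".
    url: base.replace(/\/+$/, "") + path,
    // Assumed header shape for the bearer-token boundary.
    headers: { Authorization: "Bearer " + password },
  };
}
```

For example, `buildGatewayRequest({ OPENCLAW_GATEWAY_PASSWORD: "s3cret" }, "/status")` yields the URL `http://localhost:18789/status` with the header `Authorization: Bearer s3cret`.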
Postgres + pgvector
One database per environment (carabase_dev, carabase_staging, carabase_prod). 24 workspace-scoped tables, all with Row-Level Security policies that enforce tenant isolation at the database layer, even if application code forgets a WHERE workspace_id = ? clause.
pgvector powers semantic search; pg-boss reuses the same Postgres for the job queues so there’s no separate Redis dependency.
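To make the Row-Level Security point concrete, here is a sketch that generates the kind of PostgreSQL statements such a policy involves. It is not the project's migration code: the policy name `workspace_isolation`, the session setting `app.workspace_id`, and the example table name `folios` are all assumptions. Only the `workspace_id` column comes from the text.

```typescript
// Hypothetical generator for a workspace-isolation policy on one table.
function rlsPolicySql(table: string): string[] {
  // Identifiers cannot be bound as parameters, so validate before interpolating.
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error("unexpected table name: " + table);
  }
  return [
    "ALTER TABLE " + table + " ENABLE ROW LEVEL SECURITY;",
    // FORCE applies the policy to the table owner as well.
    "ALTER TABLE " + table + " FORCE ROW LEVEL SECURITY;",
    // current_setting(..., true) returns NULL when the setting is absent,
    // so a session without a workspace sees no rows (fails closed).
    "CREATE POLICY workspace_isolation ON " + table +
      " USING (workspace_id::text = current_setting('app.workspace_id', true));",
  ];
}
```

Under this scheme the application would set `app.workspace_id` once per transaction (for instance with `set_config`), and a query that omits its own workspace filter would still return only that workspace's rows.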
Background workers
Cron-driven, in-process:
| Job | Cron | What it does |
|---|---|---|
| github-sync | 0 * * * * | Pulls PRs and issues for configured repos |
| calendar-sync | 0 7,9,11,13,15,17 * * * | Refreshes today’s events |
| gmail-sync | */30 * * * * | New labeled emails |
| memory-distill | 0 2 * * * | Extracts persistent facts from yesterday’s docs |
| folio-synthesis | 0 3 * * * | LLM-generated folio summaries |
| dream-cycle | 30 3 * * * | OpenClaw DREAMS.md bridge + Latent Synthesis Engine |
| plugin-sync | 0 4 * * * | Plugin connectors (Granola, etc.) |
| corpus-curation | 0 5 * * * | Suggests alias merges, role enrichment, etc. |
Plus pg-boss queues that fire on demand (harvest, summary, import-commit, agent-task).
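To see how the schedules in the table overlap, the sketch below encodes them as data and evaluates the minute and hour fields of each cron expression. It is a deliberately small matcher written for this illustration, not the node-cron implementation: it supports only `*`, `*/n`, and comma-separated numbers, and it ignores the day and month fields because every row in the table leaves them as `*`.

```typescript
// Schedule copied from the table above: [job name, cron expression].
const SCHEDULE: Array<[string, string]> = [
  ["github-sync", "0 * * * *"],
  ["calendar-sync", "0 7,9,11,13,15,17 * * *"],
  ["gmail-sync", "*/30 * * * *"],
  ["memory-distill", "0 2 * * *"],
  ["folio-synthesis", "0 3 * * *"],
  ["dream-cycle", "30 3 * * *"],
  ["plugin-sync", "0 4 * * *"],
  ["corpus-curation", "0 5 * * *"],
];

// Matches one cron field against a value ("*", "*/n", or "a,b,c").
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  if (field.slice(0, 2) === "*/") {
    const step = parseInt(field.slice(2), 10);
    return step > 0 && value % step === 0;
  }
  return field.split(",").some((part) => parseInt(part, 10) === value);
}

// Names of the jobs whose minute and hour fields match the given time.
function jobsFiringAt(hour: number, minute: number): string[] {
  return SCHEDULE.filter(([, expr]) => {
    const parts = expr.split(" ");
    return fieldMatches(parts[0] ?? "", minute) && fieldMatches(parts[1] ?? "", hour);
  }).map(([name]) => name);
}
```

`jobsFiringAt(3, 30)` returns `["gmail-sync", "dream-cycle"]`, and `jobsFiringAt(7, 0)` returns `["github-sync", "calendar-sync", "gmail-sync"]`, which shows that the top of the hour is the busiest slot.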
The MCP server
Carabase exposes its retrieval layer as a Model Context Protocol server at /mcp/sse. OpenClaw consumes this (and so could Claude Desktop or any other MCP client) to give an agent access to:
- carabase_search_semantic: pgvector semantic search
- carabase_search_graph: knowledge graph traversal with provenance + confidence filtering
- carabase_query_metadata: structured queries (entity by name, folio by id, date ranges)
- carabase_find_entity_candidates: disambiguation lookup
- carabase_route_and_execute: cross-strategy router
- carabase_verify_hypothesis: corroborate vs contradict a claim
- commit_to_folio: write LLM-generated prose into a folio
See MCP tools reference for the full surface.
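For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for one of the tools listed above. The envelope shape follows the MCP specification, but the `query` and `limit` argument names are assumptions made for illustration, and `buildToolCall` is a hypothetical helper. The real input schema is whatever the server advertises.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

// "query" and "limit" are assumed argument names, not the documented schema.
const request = buildToolCall(1, "carabase_search_semantic", {
  query: "Q3 roadmap decisions",
  limit: 5,
});
const body = JSON.stringify(request);
```

A client would normally discover the actual argument schema first through the protocol's `tools/list` method rather than hard-coding it.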
The Admin SPA
A lightweight Preact + Vite single-page app served from admin/dist/ at http://localhost:3000/admin/. ~21 KB gzipped. 10 pages: Setup (first-run wizard), Connections, OAuth Apps, Sync Rules, AI Engine, Skills, Web Extraction, Owner Profile, Import, Workspace Settings.
Headless deployments (no Mac client) use this as the entire configuration surface.
What’s intentionally NOT in the box
- No SaaS dashboard: there is no app.carabase.dev to log into
- No multi-user accounts: single-tenant, one workspace per host
- No public-internet listener: the host binds to all interfaces, but the Tailscale-only network model is what makes that safe
- No external orchestrator: pg-boss + node-cron run in-process; no Kubernetes, no separate scheduler