MCP tools
Carabase ships a Model Context Protocol server at /mcp/sse that exposes the host’s retrieval layer as callable tools. OpenClaw consumes this; any other MCP client (Claude Desktop, MCP Inspector, etc.) can too.
The canonical tool surface lives in the @carabase/mcp-server workspace package — single source of truth, DB-agnostic, importable by any consumer.
Read tools
carabase_search_semantic(query, k?, min_similarity?)
pgvector semantic search across artifacts. Returns the top k hits ranked by cosine similarity, with provenance.
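To make "top k hits ranked by cosine similarity" concrete, here is a minimal in-memory sketch of the ranking step. It is not the package's implementation (the real search runs in pgvector); `EmbeddedArtifact`, `SemanticHit`, `searchSemantic`, and the default values for `k` and `minSimilarity` are assumptions made for illustration.

```typescript
// Hypothetical in-memory stand-in for the pgvector query.
interface EmbeddedArtifact {
  id: string;
  embedding: number[];
}

interface SemanticHit {
  id: string;
  similarity: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every artifact, drop hits below the threshold, keep the k best.
// The defaults (k = 5, minSimilarity = 0) are assumed, not documented.
function searchSemantic(
  queryEmbedding: number[],
  artifacts: EmbeddedArtifact[],
  k = 5,
  minSimilarity = 0,
): SemanticHit[] {
  return artifacts
    .map((a) => ({ id: a.id, similarity: cosineSimilarity(queryEmbedding, a.embedding) }))
    .filter((hit) => hit.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```

The `k` and `min_similarity` arguments of the tool play the same roles as `k` and `minSimilarity` in this sketch.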
{ "query": "what did Alice say about pricing?", "k": 5, "min_similarity": 0.7 }
carabase_search_graph(start_entity, depth?, min_confidence?, source_kinds?)
Knowledge graph traversal from a named entity. Walks edges up to depth hops, filters by edge confidence + provenance.
{ "start_entity": "Alice Chen", "depth": 2, "min_confidence": 0.6, "source_kinds": ["extracted"] }
source_kinds accepts "extracted" (directly observed), "inferred" (LLM-derived), and/or "ambiguous" (awaiting human review).
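The traversal described here can be pictured as a depth-limited breadth-first walk that skips edges failing the confidence or provenance filter. The sketch below assumes directed edges held in memory; `GraphEdge`, `walkGraph`, and the field names are hypothetical and are not taken from the package.

```typescript
type SourceKind = "extracted" | "inferred" | "ambiguous";

// Hypothetical edge shape for illustration only.
interface GraphEdge {
  from: string;
  to: string;
  confidence: number;
  sourceKind: SourceKind;
}

// Breadth-first walk: each loop iteration is one hop away from the start entity.
function walkGraph(
  edges: GraphEdge[],
  startEntity: string,
  depth: number,
  minConfidence: number,
  sourceKinds: SourceKind[],
): GraphEdge[] {
  const allowed = new Set<SourceKind>(sourceKinds);
  const visited = new Set<string>([startEntity]);
  const walked: GraphEdge[] = [];
  let frontier = new Set<string>([startEntity]);

  for (let hop = 0; hop < depth && frontier.size > 0; hop++) {
    const next = new Set<string>();
    for (const edge of edges) {
      if (!frontier.has(edge.from)) continue;
      // Filter by edge confidence and provenance before following the edge.
      if (edge.confidence < minConfidence || !allowed.has(edge.sourceKind)) continue;
      walked.push(edge);
      if (!visited.has(edge.to)) {
        visited.add(edge.to);
        next.add(edge.to);
      }
    }
    frontier = next;
  }
  return walked;
}
```

With `depth` set to 2, an entity three hops from the start is never reached, and a low-confidence or filtered-out edge also cuts off everything that is only reachable through it.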
carabase_query_metadata(filters)
Structured queries by entity name, folio id, or date range. Use this for questions like “what did I write last Tuesday?”.
carabase_find_entity_candidates(text, hints?)
Disambiguation lookup: given a name fragment (“Alice”), returns candidate canonical entities + their context.
carabase_route_and_execute(query)
Cross-strategy router. Picks between semantic / graph / metadata based on the query shape. Useful as a default entry point when the agent doesn’t know which strategy fits.
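As a rough illustration of routing on "query shape", the sketch below picks a strategy from simple textual cues. The cue lists, `pickStrategy`, and the order of the checks are invented for this example; the rules the real router applies are not documented on this page.

```typescript
type Strategy = "semantic" | "graph" | "metadata";

// Hypothetical cues. Date-like wording suggests a metadata query,
// relationship wording suggests a graph traversal.
const DATE_CUE =
  /\b(yesterday|today|last\s+(week|month|year|monday|tuesday|wednesday|thursday|friday|saturday|sunday)|\d{4}-\d{2}-\d{2})\b/i;
const RELATION_CUE = /\b(related to|connected to|linked to|works with)\b/i;

// Fall back to semantic search when no cue matches.
function pickStrategy(query: string): Strategy {
  if (DATE_CUE.test(query)) return "metadata";
  if (RELATION_CUE.test(query)) return "graph";
  return "semantic";
}
```

A router like this only chooses the strategy; executing it would then mean calling the matching tool with arguments derived from the query.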
carabase_verify_hypothesis(claim, limit?)
Checks whether a natural-language claim is corroborated or contradicted. Returns:
{ "verdict": "corroborated" | "contradicted" | "mixed" | "inconclusive", "corroborated_by": [...], "contradicted_by": [...], "considered": 8 }
Deterministic heuristic (no LLM required for v0.1); see Knowledge graph → Hypothesis verification.
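One plausible reading of the four verdict values is sketched below: the verdict follows from whether each evidence list is empty. This mapping is an assumption, not the documented heuristic, and the element type of the two lists (plain string ids here) is also assumed.

```typescript
type Verdict = "corroborated" | "contradicted" | "mixed" | "inconclusive";

// Field names mirror the response shape; string ids are an assumption.
interface HypothesisResult {
  verdict: Verdict;
  corroborated_by: string[];
  contradicted_by: string[];
  considered: number;
}

// Assumed rule: both lists non-empty is "mixed", one non-empty list decides
// the verdict, and two empty lists are "inconclusive".
function summarizeEvidence(
  corroboratedBy: string[],
  contradictedBy: string[],
  considered: number,
): HypothesisResult {
  let verdict: Verdict;
  if (corroboratedBy.length > 0 && contradictedBy.length > 0) {
    verdict = "mixed";
  } else if (corroboratedBy.length > 0) {
    verdict = "corroborated";
  } else if (contradictedBy.length > 0) {
    verdict = "contradicted";
  } else {
    verdict = "inconclusive";
  }
  return {
    verdict,
    corroborated_by: corroboratedBy,
    contradicted_by: contradictedBy,
    considered,
  };
}
```

An agent consuming the result would typically branch on `verdict` first and only inspect the two lists when it needs to cite the supporting or conflicting evidence.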
Write tools
commit_to_folio(folioName, content, extraMetadata?)
Push LLM-generated prose into a folio. Splits into chunks, embeds each, links via a single commit row. Used by the dream injector and the agent’s research-and-summarize flows.
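The "split into chunks, link via a single commit row" step might look roughly like the sketch below. The chunking rule (packing paragraphs up to a character limit), the `FolioChunk` shape, and `chunkForCommit` are assumptions for illustration; the embedding step is left out entirely.

```typescript
// Hypothetical chunk record: every chunk carries the id of the one commit row.
interface FolioChunk {
  commitId: string;
  index: number;
  text: string;
}

// Pack blank-line-separated paragraphs into chunks of at most maxChars.
// A single paragraph longer than maxChars is kept whole as its own chunk.
function chunkForCommit(content: string, commitId: string, maxChars = 800): FolioChunk[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const texts: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current === "" ? paragraph : `${current}\n\n${paragraph}`;
    if (candidate.length > maxChars && current !== "") {
      texts.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current !== "") texts.push(current);

  return texts.map((text, index) => ({ commitId, index, text }));
}
```

Because every chunk shares one `commitId`, a whole commit can be traced or reverted as a unit even though each chunk is embedded separately.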
carabase_attach_file (Agentic Flows v1)
Available only in the in-process agent-runtime path (not over /mcp/sse). Attaches a file artifact to the originating logCard of an agentic flow run. Used by Claude / Codex providers in v0.2.
Resources
The MCP server also exposes resources (URIs the agent can call resources/read on):
| URI Pattern | What it returns |
|---|---|
| carabase://artifact/{id} | Lazy artifact body fetch. Tool results return URIs of this form so the agent only fetches bodies it actually needs |
| folio://{id}/readme | Folio name + README summary |
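A client that receives these URIs in tool results needs to tell the two patterns apart before deciding what to read. The sketch below parses both forms with regular expressions; `CarabaseResource` and `parseResourceUri` are hypothetical helper names, not part of the package.

```typescript
// Discriminated union over the two documented URI patterns.
type CarabaseResource =
  | { kind: "artifact"; id: string }
  | { kind: "folio-readme"; id: string };

// Returns null for anything that matches neither pattern.
function parseResourceUri(uri: string): CarabaseResource | null {
  const artifactId = /^carabase:\/\/artifact\/([^/]+)$/.exec(uri)?.[1];
  if (artifactId !== undefined) {
    return { kind: "artifact", id: artifactId };
  }
  const folioId = /^folio:\/\/([^/]+)\/readme$/.exec(uri)?.[1];
  if (folioId !== undefined) {
    return { kind: "folio-readme", id: folioId };
  }
  return null;
}
```

Keeping the parse result as a tagged union lets the caller switch on `kind` and issue resources/read only for the bodies it actually needs.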
Hint repair (Doctor-RAG)
When a Carabase tool returns 0 results or isError: true, the package appends a structured [hint: ...] + [trace: ...] trailer to the response so the agent can course-correct without resetting its chain of thought. Hints are deterministic in v1 (no LLM); the LLM-driven generator slots in via the Sampler interface in v0.2.
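The sketch below shows one way such a trailer could be appended and later read back. Only the `[hint: ...]` and `[trace: ...]` markers come from the description above; the line layout, the `ToolOutcome` shape, and both function names are assumptions.

```typescript
// Hypothetical summary of a tool call, used to decide whether a trailer is needed.
interface ToolOutcome {
  text: string;
  resultCount: number;
  isError?: boolean;
}

// Append the trailer only for empty results or errors; leave other responses untouched.
function withHintTrailer(outcome: ToolOutcome, hint: string, trace: string): string {
  const needsHint = outcome.resultCount === 0 || outcome.isError === true;
  if (!needsHint) return outcome.text;
  return `${outcome.text}\n[hint: ${hint}]\n[trace: ${trace}]`;
}

// Read the trailer back out of a response, if one is present.
function parseTrailer(text: string): { hint: string | undefined; trace: string | undefined } {
  const hint = /\[hint: ([^\]]*)\]/.exec(text)?.[1];
  const trace = /\[trace: ([^\]]*)\]/.exec(text)?.[1];
  return { hint, trace };
}
```

Because the trailer is plain text inside the normal response, a client that ignores it still sees a valid tool result.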
Format-aware compaction
Artifact body responses through carabase://artifact/{id} can be auto-compacted to fit a token budget. CSVs preserve the header + predicate-pushdown rows; markdown keeps headings + matching paragraphs; PDFs are split on \f and their pages ranked by query-token frequency. Opt in via CarabaseToolContext.compaction = { maxTokens }; compaction is off by default in the host shim.
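The PDF rule can be sketched as follows: split the extracted text on form feeds, score each page by how many of its tokens appear in the query, then keep the best pages that fit the budget. The token estimate (about four characters per token), the greedy fill, and the function names are assumptions, not the package's compactor.

```typescript
// Rough token estimate: about 4 characters per token (an assumption, not a real tokenizer).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Lowercased alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Keep the highest-scoring pages that fit in maxTokens, then restore page order.
function compactPdfText(body: string, query: string, maxTokens: number): string {
  const queryTokens = new Set(tokenize(query));
  const ranked = body
    .split("\f")
    .map((text, pageIndex) => ({
      pageIndex,
      text,
      // Score: number of page tokens that also occur in the query.
      score: tokenize(text).filter((t) => queryTokens.has(t)).length,
    }))
    .sort((a, b) => b.score - a.score || a.pageIndex - b.pageIndex);

  const kept: { pageIndex: number; text: string }[] = [];
  let used = 0;
  for (const page of ranked) {
    const cost = estimateTokens(page.text);
    if (used + cost > maxTokens) continue; // page does not fit, try the next one
    kept.push({ pageIndex: page.pageIndex, text: page.text });
    used += cost;
  }
  return kept
    .sort((a, b) => a.pageIndex - b.pageIndex)
    .map((p) => p.text)
    .join("\f");
}
```

Restoring the original page order after selection keeps the compacted body readable, at the cost of hiding which page ranked highest.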
Adding a new retrieval tool
Goes in @carabase/mcp-server, not in the host. The host’s src/services/mcp-server.ts is a thin shim that mounts the package + registers legacy aliases for backwards compat (search_semantic → carabase_search_semantic, etc.).
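The alias idea can be illustrated with a small registry that resolves a legacy name to its canonical tool before dispatching. `ToolRegistry` and its methods are hypothetical; the only alias shown is the one named above, and the real shim may register aliases differently.

```typescript
type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical registry: canonical tools plus a legacy-name lookup table.
class ToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();
  private readonly aliases = new Map<string, string>();

  register(toolName: string, handler: ToolHandler): void {
    this.handlers.set(toolName, handler);
  }

  alias(legacyName: string, canonicalName: string): void {
    this.aliases.set(legacyName, canonicalName);
  }

  // Resolve an alias first, then dispatch to the canonical handler.
  call(toolName: string, args: Record<string, unknown>): string {
    const canonical = this.aliases.get(toolName) ?? toolName;
    const handler = this.handlers.get(canonical);
    if (handler === undefined) {
      throw new Error(`unknown tool: ${toolName}`);
    }
    return handler(args);
  }
}

const registry = new ToolRegistry();
registry.register("carabase_search_semantic", (args) => `searched: ${String(args["query"])}`);
registry.alias("search_semantic", "carabase_search_semantic");
```

Keeping aliases in a separate table means the canonical tool list stays clean, and a legacy name can be retired by deleting a single entry.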
Workflow:
- Add the strategy to @carabase/retrieval if it’s a new query pattern
- Add the tool definition + handler to packages/mcp-server/src/tools.ts
- Add a hint generator in packages/mcp-server/src/hints/ for the empty-result + error cases
- Add an assertion in pnpm smoke:mcp so the tool is exercised in the in-memory adapter
- Add a trajectory in packages/mcp-server/eval/baselines/agent-trajectories.snapshot.json so the nightly agent-eval covers it
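As a rough picture of the tool definition, handler, and hint steps of the workflow, the sketch below defines a made-up tool in the general MCP style: a name, a description, a JSON Schema for the arguments, and a handler returning text content. The `ToolDefinition` interface, the tool `carabase_search_titles`, and the in-memory `TITLES` data are all hypothetical; the package's actual types in tools.ts may differ, and a real handler would likely be async.

```typescript
// Text-only result in the MCP style: a content array plus an optional error flag.
interface ToolCallResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical definition shape; not the package's actual interface.
interface ToolDefinition<Args> {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema describing the arguments
  handler: (args: Args) => ToolCallResult;
}

// Stand-in for data served by an in-memory adapter.
const TITLES = ["Pricing notes", "Q3 roadmap"];

const searchTitlesTool: ToolDefinition<{ fragment: string }> = {
  name: "carabase_search_titles",
  description: "Finds artifacts whose title contains a fragment (illustrative only).",
  inputSchema: {
    type: "object",
    properties: { fragment: { type: "string" } },
    required: ["fragment"],
  },
  handler: ({ fragment }) => {
    const needle = fragment.toLowerCase();
    const matches = TITLES.filter((title) => title.toLowerCase().includes(needle));
    // Empty-result case: attach a deterministic hint and trace trailer.
    const text =
      matches.length > 0
        ? matches.join("\n")
        : `No matches.\n[hint: try a shorter fragment]\n[trace: search_titles fragment=${fragment}]`;
    return { content: [{ type: "text", text }] };
  },
};
```

A smoke assertion for a tool like this would call the handler once with a fragment that matches and once with one that does not, checking that the second response carries the hint trailer.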