MCP tools

Carabase ships a Model Context Protocol server at /mcp/sse that exposes the host’s retrieval layer as callable tools. OpenClaw consumes this; any other MCP client (Claude Desktop, MCP Inspector, etc.) can too.

The canonical tool surface lives in the @carabase/mcp-server workspace package — single source of truth, DB-agnostic, importable by any consumer.

carabase_search_semantic(query, k?, min_similarity?)

pgvector semantic search across artifacts. Returns the top k hits ranked by cosine similarity, with provenance.

{ "query": "what did Alice say about pricing?", "k": 5, "min_similarity": 0.7 }
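The ranking described above can be sketched in memory. This is only an illustration under stated assumptions: in Carabase the similarity is computed by pgvector, the names `Hit`, `cosine`, and `topK` are hypothetical, and the defaults for `k` and `minSimilarity` are guesses, not documented values.

```typescript
// Hypothetical in-memory stand-in for the pgvector query; all names are illustrative.
interface Hit {
  id: string;
  similarity: number;
}

// Cosine similarity of two vectors; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every artifact, drop those under the floor, keep the k best.
function topK(
  query: number[],
  artifacts: { id: string; embedding: number[] }[],
  k = 5, // assumed default
  minSimilarity = 0, // assumed default
): Hit[] {
  return artifacts
    .map((a) => ({ id: a.id, similarity: cosine(query, a.embedding) }))
    .filter((h) => h.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```

Here `min_similarity` acts as a score floor applied before the `k` cap, so a call can return fewer than `k` hits.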

carabase_search_graph(start_entity, depth?, min_confidence?, source_kinds?)

Knowledge graph traversal from a named entity. Walks edges up to depth hops and filters them by edge confidence and provenance.

{ "start_entity": "Alice Chen", "depth": 2, "min_confidence": 0.6, "source_kinds": ["extracted"] }

source_kinds accepts "extracted" (directly observed), "inferred" (LLM-derived), and/or "ambiguous" (awaiting human review).
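A depth-limited traversal with these filters might look like the sketch below. It is an assumption-laden illustration, not the package's code: the `Edge` shape, the `traverse` function, the directed-edge model, and the defaults are all hypothetical.

```typescript
type SourceKind = "extracted" | "inferred" | "ambiguous";

// Hypothetical edge shape; the real schema is not documented here.
interface Edge {
  from: string;
  to: string;
  confidence: number;
  sourceKind: SourceKind;
}

// Breadth-first walk from startEntity, following only edges that pass both filters.
function traverse(
  edges: Edge[],
  startEntity: string,
  depth = 1, // assumed default
  minConfidence = 0, // assumed default
  sourceKinds: SourceKind[] = ["extracted", "inferred", "ambiguous"],
): Edge[] {
  const allowed = new Set<SourceKind>(sourceKinds);
  const visited = new Set<string>([startEntity]);
  const walked: Edge[] = [];
  let frontier: string[] = [startEntity];

  for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const edge of edges) {
      if (!frontier.includes(edge.from)) continue;
      if (edge.confidence < minConfidence) continue;
      if (!allowed.has(edge.sourceKind)) continue;
      walked.push(edge);
      if (!visited.has(edge.to)) {
        visited.add(edge.to);
        next.push(edge.to);
      }
    }
    frontier = next;
  }
  return walked;
}
```

Because an edge that fails a filter is never followed, entities reachable only through low-confidence or excluded-provenance edges do not appear in the result.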

Structured queries by entity name / folio id / date range. Use this for “what did I write last Tuesday?” type questions.

carabase_find_entity_candidates(text, hints?)

Disambiguation lookup — given a name fragment (“Alice”), returns candidate canonical entities + their context.

Cross-strategy router. Picks between semantic / graph / metadata based on the query shape. Useful as a default entry point when the agent doesn’t know which strategy fits.

Checks whether the stored evidence corroborates or contradicts a natural-language claim. Returns:

{
  "verdict": "corroborated" | "contradicted" | "mixed" | "inconclusive",
  "corroborated_by": [...],
  "contradicted_by": [...],
  "considered": 8
}

Deterministic heuristic (no LLM required for v0.1) — see Knowledge graph → Hypothesis verification.
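One plausible way to derive the verdict from the two evidence lists is sketched below. The mapping is an assumption made for illustration (both lists non-empty gives "mixed", both empty gives "inconclusive"); the actual heuristic is described in the Knowledge graph section. Evidence entries are modelled as plain string ids because their real shape is not shown here, and `decideVerdict` is a hypothetical name.

```typescript
type Verdict = "corroborated" | "contradicted" | "mixed" | "inconclusive";

interface VerificationResult {
  verdict: Verdict;
  corroborated_by: string[]; // assumed: ids of supporting evidence
  contradicted_by: string[]; // assumed: ids of conflicting evidence
  considered: number;
}

// Assumed mapping from evidence lists to a verdict; no LLM involved.
function decideVerdict(
  corroboratedBy: string[],
  contradictedBy: string[],
  considered: number,
): VerificationResult {
  let verdict: Verdict;
  if (corroboratedBy.length > 0 && contradictedBy.length > 0) {
    verdict = "mixed";
  } else if (corroboratedBy.length > 0) {
    verdict = "corroborated";
  } else if (contradictedBy.length > 0) {
    verdict = "contradicted";
  } else {
    verdict = "inconclusive";
  }
  return {
    verdict,
    corroborated_by: corroboratedBy,
    contradicted_by: contradictedBy,
    considered,
  };
}
```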

commit_to_folio(folioName, content, extraMetadata?)

Push LLM-generated prose into a folio. Splits into chunks, embeds each, links via a single commit row. Used by the dream injector and the agent’s research-and-summarize flows.
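The chunk-then-link step can be pictured with the sketch below. Everything in it is assumed for illustration: the paragraph-based splitting rule, the `maxChars` limit, and the `ChunkRow` shape are hypothetical, and the embedding call is omitted entirely.

```typescript
// Hypothetical row shape: every chunk points back at one commit id.
interface ChunkRow {
  commitId: string;
  index: number;
  text: string;
}

// Split content on blank lines and pack paragraphs into chunks of at most
// maxChars. A single paragraph longer than maxChars is kept as one oversized chunk.
function chunkForCommit(commitId: string, content: string, maxChars = 1000): ChunkRow[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    if (current.length > 0 && current.length + 2 + p.length > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current.length > 0 ? `${current}\n\n${p}` : p;
    }
  }
  if (current.length > 0) chunks.push(current);

  return chunks.map((text, index) => ({ commitId, index, text }));
}
```

Each returned row would then be embedded separately, with the shared `commitId` playing the role of the single commit row that links the chunks.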

Available only in the in-process agent-runtime path (not over /mcp/sse). Attaches a file artifact to the originating logCard of an agentic flow run. Used by Claude / Codex providers in v0.2.

The MCP server also exposes resources (URIs the agent can call resources/read on):

| URI pattern | What it returns |
| --- | --- |
| carabase://artifact/{id} | Lazy artifact body fetch. Tool results return URIs of this form so the agent only fetches the bodies it actually needs |
| folio://{id}/readme | Folio name + README summary |
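A client that receives these URIs in tool results needs to tell the two patterns apart before calling resources/read. The parser below is a hypothetical client-side helper (`parseResourceUri` and `ResourceRef` are not part of the package), and it assumes ids contain no slashes.

```typescript
type ResourceRef =
  | { kind: "artifact"; id: string }
  | { kind: "folio-readme"; id: string };

// Recognise the two documented URI patterns; anything else yields null.
function parseResourceUri(uri: string): ResourceRef | null {
  const artifactId = /^carabase:\/\/artifact\/([^/]+)$/.exec(uri)?.[1];
  if (artifactId !== undefined) return { kind: "artifact", id: artifactId };

  const folioId = /^folio:\/\/([^/]+)\/readme$/.exec(uri)?.[1];
  if (folioId !== undefined) return { kind: "folio-readme", id: folioId };

  return null;
}
```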

When a Carabase tool returns 0 results or isError: true, the package appends a structured [hint: ...] + [trace: ...] trailer to the response so the agent can course-correct without resetting its chain of thought. The v1 hints are deterministic (no LLM); the LLM-driven generator slots in via the Sampler interface in v0.2.
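The trailer logic can be sketched as follows. The exact trailer layout, the `ToolResponse` shape, and the `appendTrailer` name are assumptions for illustration; only the trigger conditions (zero results or isError: true) come from the description above.

```typescript
// Hypothetical, simplified view of a tool response.
interface ToolResponse {
  text: string;
  resultCount: number;
  isError: boolean;
}

// Append the hint and trace only when the call failed or found nothing.
function appendTrailer(response: ToolResponse, hint: string, trace: string): string {
  const needsHint = response.isError || response.resultCount === 0;
  if (!needsHint) return response.text;
  return `${response.text}\n[hint: ${hint}]\n[trace: ${trace}]`;
}
```

Successful, non-empty responses pass through unchanged, so the trailer adds no noise on the common path.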

Artifact body responses through carabase://artifact/{id} can be auto-compacted to fit a token budget. CSVs preserve the header + predicate-pushdown rows; markdown keeps headings + matching paragraphs; PDFs split by \f and rank pages by query-token frequency. Opt-in via CarabaseToolContext.compaction = { maxTokens } — off by default in the host shim.
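The PDF strategy (split on \f, rank pages by query-token frequency) can be sketched like this. It is a rough illustration, not the package's compactor: `compactPdfText` is a hypothetical name, whitespace-delimited words stand in for real tokens, and the greedy fill order is an assumption.

```typescript
// Crude token estimate: whitespace-delimited words (a stand-in for a real tokenizer).
function countTokens(text: string): number {
  return text.split(/\s+/).filter((t) => t.length > 0).length;
}

// Keep the highest-scoring pages that fit in maxTokens, in original page order.
function compactPdfText(body: string, query: string, maxTokens: number): string {
  const queryTokens = new Set(
    query.toLowerCase().split(/\W+/).filter((t) => t.length > 0),
  );

  const pages = body.split("\f").map((text, index) => {
    const words = text.toLowerCase().split(/\W+/);
    const score = words.filter((w) => queryTokens.has(w)).length;
    return { index, text, score };
  });

  // Highest score first; ties fall back to page order.
  const ranked = [...pages].sort((a, b) => b.score - a.score || a.index - b.index);

  const kept: typeof pages = [];
  let used = 0;
  for (const page of ranked) {
    const cost = countTokens(page.text);
    if (used + cost > maxTokens) continue; // skip pages that would overflow the budget
    kept.push(page);
    used += cost;
  }

  return kept
    .sort((a, b) => a.index - b.index)
    .map((p) => p.text)
    .join("\f");
}
```

Restoring page order after selection keeps the compacted body readable even though selection itself is driven by relevance.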

New tools go in @carabase/mcp-server, not in the host. The host’s src/services/mcp-server.ts is a thin shim that mounts the package and registers legacy aliases for backwards compatibility (search_semantic → carabase_search_semantic, etc.).

Workflow:

  1. Add the strategy to @carabase/retrieval if it’s a new query pattern
  2. Add the tool definition + handler to packages/mcp-server/src/tools.ts
  3. Add a hint generator in packages/mcp-server/src/hints/ for the empty-result + error cases
  4. Add an assertion in pnpm smoke:mcp so the tool is exercised in the in-memory adapter
  5. Add a trajectory in packages/mcp-server/eval/baselines/agent-trajectories.snapshot.json so the nightly agent-eval covers it
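Step 2 of the workflow above, in its simplest form, pairs a tool definition with a handler. The sketch below follows the general MCP shape (a name, a description, a JSON Schema for inputs, and a text content result), but the interfaces are declared locally and the tool `carabase_example_ping` is invented for illustration; the real types and registration API in packages/mcp-server/src/tools.ts may differ.

```typescript
// Locally declared, simplified shapes; not imported from any real package.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

interface ToolCallResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical tool used only to show the pattern.
const examplePingTool: ToolDefinition = {
  name: "carabase_example_ping",
  description: "Illustrative tool that echoes a message back.",
  inputSchema: {
    type: "object",
    properties: { message: { type: "string" } },
    required: ["message"],
  },
};

// Synchronous for brevity; a real handler would usually be async.
function handleExamplePing(args: Record<string, unknown>): ToolCallResult {
  const message = args["message"];
  if (typeof message !== "string" || message.length === 0) {
    return {
      content: [{ type: "text", text: "message must be a non-empty string" }],
      isError: true,
    };
  }
  return { content: [{ type: "text", text: `pong: ${message}` }] };
}
```

The error branch is the case a hint generator (step 3) would attach guidance to, and both branches are the kind of behaviour a smoke assertion (step 4) would exercise.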