MCP tools

Carabase ships a Model Context Protocol server at /mcp/sse that exposes the host’s retrieval layer as callable tools. OpenClaw consumes this; any other MCP client (Claude Desktop, MCP Inspector, etc.) can too.

The canonical tool surface lives in the @carabase/mcp-server workspace package. It is the single source of truth: DB-agnostic and importable by any consumer.

Read tools

`carabase_search_semantic(query, k?, min_similarity?)`

pgvector semantic search across artifacts. Returns the top k hits ranked by cosine similarity, with provenance.

{ "query": "what did Alice say about pricing?", "k": 5, "min_similarity": 0.7 }
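To make the roles of k and min_similarity concrete, here is a minimal in-memory sketch of top-k ranking by cosine similarity. It is not the package's implementation (the real search runs in pgvector); `Doc`, `Hit`, `cosine`, `topK`, and the default values are hypothetical names and assumptions chosen for illustration.

```typescript
// Hypothetical in-memory stand-in for pgvector ranking; not the Carabase implementation.
interface Doc {
  id: string;
  embedding: number[];
}

interface Hit {
  id: string;
  similarity: number;
}

// Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep hits at or above minSimilarity, sort best-first, return the first k.
// The defaults (k = 5, minSimilarity = 0) are assumptions for this sketch.
function topK(query: number[], docs: Doc[], k = 5, minSimilarity = 0): Hit[] {
  return docs
    .map((d) => ({ id: d.id, similarity: cosine(query, d.embedding) }))
    .filter((h) => h.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```

Raising min_similarity trims weak matches before the k cutoff applies, so a call can return fewer than k hits.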

`carabase_search_graph(start_entity, depth?, min_confidence?, source_kinds?)`

Knowledge graph traversal from a named entity. Walks edges up to depth hops, filtering by edge confidence and provenance.

{ "start_entity": "Alice Chen", "depth": 2, "min_confidence": 0.6, "source_kinds": ["extracted"] }

source_kinds accepts "extracted" (directly observed), "inferred" (LLM-derived), and/or "ambiguous" (awaiting human review).
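The traversal described above can be pictured as a depth-limited breadth-first walk that skips edges failing either filter. This is a sketch under assumptions: the `Edge` shape, the `traverse` function, directed edges, and the defaults are all hypothetical and may not match the real schema in @carabase/retrieval.

```typescript
// Hypothetical edge shape; the real graph schema may differ.
type SourceKind = "extracted" | "inferred" | "ambiguous";

interface Edge {
  from: string;
  to: string;
  confidence: number;
  sourceKind: SourceKind;
}

// Breadth-first walk from startEntity along directed edges that pass both filters.
// Returns each reachable entity (excluding the start) with its hop distance.
function traverse(
  edges: Edge[],
  startEntity: string,
  depth = 1,
  minConfidence = 0,
  sourceKinds: SourceKind[] = ["extracted", "inferred", "ambiguous"],
): Record<string, number> {
  const hops: Record<string, number> = {};
  const seen = new Set<string>([startEntity]);
  let frontier = [startEntity];

  for (let hop = 1; hop <= depth && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const e of edges) {
        const usable =
          e.from === node &&
          e.confidence >= minConfidence &&
          sourceKinds.includes(e.sourceKind);
        if (usable && !seen.has(e.to)) {
          seen.add(e.to);
          hops[e.to] = hop;
          next.push(e.to);
        }
      }
    }
    frontier = next;
  }
  return hops;
}
```

Restricting source_kinds to "extracted" keeps the walk on directly observed edges, which is the conservative choice when the agent needs citations rather than leads.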

`carabase_query_metadata(filters)`

Structured queries by entity name / folio id / date range. Use this for “what did I write last Tuesday?” type questions.

`carabase_find_entity_candidates(text, hints?)`

Disambiguation lookup — given a name fragment (“Alice”), returns candidate canonical entities + their context.

`carabase_route_and_execute(query)`

Cross-strategy router. Picks between semantic / graph / metadata based on the query shape. Useful as a default entry point when the agent doesn’t know which strategy fits.
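As a rough illustration of routing on query shape, the sketch below picks a strategy from surface cues in the text. The cue lists and the `pickStrategy` name are invented for this example; the real router's rules are not documented here and may be quite different.

```typescript
type Strategy = "semantic" | "graph" | "metadata";

// Illustrative cues only; these patterns are assumptions, not the router's actual rules.
const DATE_CUE =
  /\b(today|yesterday|last (week|month|year|monday|tuesday|wednesday|thursday|friday|saturday|sunday))\b|\b\d{4}-\d{2}-\d{2}\b/;
const RELATION_CUE = /\b(related to|connected to|works with|reports to)\b/;

function pickStrategy(query: string): Strategy {
  const q = query.toLowerCase();
  // Dates and weekdays suggest a structured metadata lookup.
  if (DATE_CUE.test(q)) return "metadata";
  // Relationship phrasing suggests walking the knowledge graph.
  if (RELATION_CUE.test(q)) return "graph";
  // Everything else falls back to semantic search.
  return "semantic";
}
```

Falling back to semantic search when no cue matches mirrors the idea of a safe default entry point: an open-ended question still gets ranked results.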

`carabase_verify_hypothesis(claim, limit?)`

Checks whether stored evidence corroborates or contradicts a natural-language claim. Returns:

{
  "verdict": "corroborated" | "contradicted" | "mixed" | "inconclusive",
  "corroborated_by": [...],
  "contradicted_by": [...],
  "considered": 8
}

Deterministic heuristic (no LLM required for v0.1) — see Knowledge graph → Hypothesis verification.
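One plausible way to derive the verdict from the two evidence lists is sketched below. The mapping rule, the `summarize` helper, and the use of plain strings for evidence entries are assumptions made for illustration; the actual heuristic is described in the Knowledge graph documentation.

```typescript
type Verdict = "corroborated" | "contradicted" | "mixed" | "inconclusive";

interface VerificationResult {
  verdict: Verdict;
  corroborated_by: string[]; // assumed to be evidence identifiers; real entries may be richer
  contradicted_by: string[];
  considered: number;
}

// Assumed rule: evidence on both sides is "mixed", on one side decides the verdict,
// and no evidence at all is "inconclusive".
function summarize(
  corroboratedBy: string[],
  contradictedBy: string[],
  considered: number,
): VerificationResult {
  let verdict: Verdict;
  if (corroboratedBy.length > 0 && contradictedBy.length > 0) verdict = "mixed";
  else if (corroboratedBy.length > 0) verdict = "corroborated";
  else if (contradictedBy.length > 0) verdict = "contradicted";
  else verdict = "inconclusive";
  return {
    verdict,
    corroborated_by: corroboratedBy,
    contradicted_by: contradictedBy,
    considered,
  };
}
```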

Write tools

`commit_to_folio(folioName, content, extraMetadata?)`

Push LLM-generated prose into a folio. Splits into chunks, embeds each, links via a single commit row. Used by the dream injector and the agent’s research-and-summarize flows.
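The split-then-link step can be sketched as paragraph packing, with every chunk carrying the same commit id. This is only an illustration: `Chunk`, `chunkContent`, the character budget, and the paragraph-based splitting are assumptions, and the embedding step is left out.

```typescript
// Hypothetical chunk record; each chunk points back at one shared commit.
interface Chunk {
  commitId: string;
  index: number;
  text: string;
}

// Split on blank lines, then pack paragraphs into chunks of at most maxChars.
// A single paragraph longer than maxChars is kept whole as an oversized chunk.
function chunkContent(content: string, commitId: string, maxChars = 500): Chunk[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const texts: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    const candidate = current === "" ? p : current + "\n\n" + p;
    if (candidate.length > maxChars && current !== "") {
      texts.push(current);
      current = p;
    } else {
      current = candidate;
    }
  }
  if (current !== "") texts.push(current);

  return texts.map((text, index) => ({ commitId, index, text }));
}
```

Keeping one commit id across all chunks is what lets a later reader reassemble or roll back the whole write as a unit.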

`carabase_attach_file` (Agentic Flows v1)

Available only in the in-process agent-runtime path (not over /mcp/sse). Attaches a file artifact to the originating logCard of an agentic flow run. Used by Claude / Codex providers in v0.2.

Resources

The MCP server also exposes resources (URIs the agent can call resources/read on):

| URI pattern | What it returns |
| --- | --- |
| `carabase://artifact/{id}` | Lazy artifact body fetch. Tool results return URIs of this form so the agent only fetches the bodies it actually needs. |
| `folio://{id}/readme` | Folio name + README summary. |
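A client that receives these URIs in tool results needs to tell the two patterns apart before deciding what to read. The sketch below shows one way to do that; `ResourceRef` and `parseResourceUri` are hypothetical helpers, not part of the package.

```typescript
// Hypothetical parsed form of the two documented URI patterns.
type ResourceRef =
  | { kind: "artifact"; id: string }
  | { kind: "folio-readme"; id: string };

// Returns null for anything that matches neither pattern.
function parseResourceUri(uri: string): ResourceRef | null {
  const artifact = /^carabase:\/\/artifact\/([^/]+)$/.exec(uri);
  if (artifact) return { kind: "artifact", id: artifact[1] };

  const readme = /^folio:\/\/([^/]+)\/readme$/.exec(uri);
  if (readme) return { kind: "folio-readme", id: readme[1] };

  return null;
}
```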

Hint repair (Doctor-RAG)

When a Carabase tool returns 0 results or isError: true, the package appends a structured [hint: ...] + [trace: ...] trailer to the response so the agent can course-correct without resetting its chain of thought. Hints are deterministic in v1 (no LLM); the LLM-driven generator slots in via the Sampler interface in v0.2.
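The trailer logic can be sketched as a small wrapper that leaves successful responses untouched. The `ToolResult` shape, the `withHintTrailer` name, and the exact line layout of the trailer are assumptions; only the [hint: ...] and [trace: ...] markers come from the description above.

```typescript
// Hypothetical result shape for this sketch.
interface ToolResult {
  text: string;
  resultCount: number;
  isError?: boolean;
}

// Append the trailer only for empty or failed results; pass everything else through.
function withHintTrailer(result: ToolResult, hint: string, trace: string): string {
  const needsRepair = result.isError === true || result.resultCount === 0;
  if (!needsRepair) return result.text;
  return `${result.text}\n[hint: ${hint}]\n[trace: ${trace}]`;
}
```

For example, an empty semantic search might carry a hint suggesting a lower min_similarity, which the agent can act on in its next call.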

Format-aware compaction

Artifact body responses through carabase://artifact/{id} can be auto-compacted to fit a token budget. CSVs keep the header plus the rows that match a pushed-down predicate; markdown keeps headings plus matching paragraphs; PDFs are split on \f (form feed) and pages are ranked by query-token frequency. Opt in via CarabaseToolContext.compaction = { maxTokens }; it is off by default in the host shim.
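The PDF case is the easiest to sketch: split on form feeds, score each page by how often query tokens appear, then keep the best pages that fit the budget. The function names, the four-characters-per-token estimate, and the greedy packing are assumptions for illustration, not the package's actual compactor.

```typescript
interface RankedPage {
  page: number; // 1-based page number
  score: number; // count of words on the page that are query tokens
  text: string;
}

// Split on form feed and rank pages by query-token frequency (ties keep page order).
function rankPages(pdfText: string, query: string): RankedPage[] {
  const tokens = query.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
  return pdfText
    .split("\f")
    .map((text, i) => {
      const words = text.toLowerCase().split(/\W+/);
      const score = words.filter((w) => tokens.includes(w)).length;
      return { page: i + 1, score, text };
    })
    .sort((a, b) => b.score - a.score || a.page - b.page);
}

// Greedily keep the highest-ranked pages that fit, then restore reading order.
// Token cost is estimated as ceil(chars / 4), a rough assumption.
function compactPdf(pdfText: string, query: string, maxTokens: number): string {
  const kept: RankedPage[] = [];
  let used = 0;
  for (const p of rankPages(pdfText, query)) {
    const cost = Math.ceil(p.text.length / 4);
    if (used + cost > maxTokens) continue;
    kept.push(p);
    used += cost;
  }
  return kept
    .sort((a, b) => a.page - b.page)
    .map((p) => p.text)
    .join("\f");
}
```

Restoring page order after selection keeps the compacted body readable, at the cost of hiding which page scored highest.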

Adding a new retrieval tool

New retrieval tools go in @carabase/mcp-server, not in the host. The host’s src/services/mcp-server.ts is a thin shim that mounts the package and registers legacy aliases for backwards compatibility (search_semantic → carabase_search_semantic, etc.).
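Legacy aliasing of this kind can be sketched as a registry that maps each unprefixed name to the same handler as its carabase_ counterpart. The `Handler` type and `withLegacyAliases` are hypothetical, and deriving every alias by stripping the prefix is an assumption; the shim may register its aliases from an explicit list instead.

```typescript
// Hypothetical handler signature for this sketch.
type Handler = (args: Record<string, unknown>) => string;

const PREFIX = "carabase_";

// Copy the registry and add an unprefixed alias for every carabase_* tool,
// without overwriting a tool that already uses the unprefixed name.
function withLegacyAliases(tools: Record<string, Handler>): Record<string, Handler> {
  const registry: Record<string, Handler> = { ...tools };
  for (const name of Object.keys(tools)) {
    if (!name.startsWith(PREFIX)) continue;
    const alias = name.slice(PREFIX.length);
    if (!(alias in registry)) registry[alias] = tools[name];
  }
  return registry;
}
```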

Workflow:

1. Add the strategy to @carabase/retrieval if it’s a new query pattern.
2. Add the tool definition + handler to packages/mcp-server/src/tools.ts.
3. Add a hint generator in packages/mcp-server/src/hints/ for the empty-result + error cases.
4. Add an assertion in pnpm smoke:mcp so the tool is exercised in the in-memory adapter.
5. Add a trajectory in packages/mcp-server/eval/baselines/agent-trajectories.snapshot.json so the nightly agent-eval covers it.
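To give a feel for the tool definition and handler step in the workflow above, here is a sketch of what such a pair might look like. The tool itself (`carabase_search_by_tag`), the in-memory `tagIndex`, and the local interfaces are invented for illustration; only the field names name, description, inputSchema, content, and isError follow the MCP specification, and the real layout of tools.ts may differ.

```typescript
// Local stand-ins for the MCP tool and result shapes.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

interface ToolCallResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical tool, not part of the documented surface.
const searchByTagTool: ToolDefinition = {
  name: "carabase_search_by_tag",
  description: "Return artifact URIs carrying a given tag.",
  inputSchema: {
    type: "object",
    properties: { tag: { type: "string" } },
    required: ["tag"],
  },
};

// In-memory index standing in for a retrieval strategy.
const tagIndex: Record<string, string[]> = {
  pricing: ["carabase://artifact/a1"],
};

// Validate input, then return matching URIs (one per line) as text content.
function handleSearchByTag(args: { tag?: unknown }): ToolCallResult {
  if (typeof args.tag !== "string" || args.tag === "") {
    return {
      content: [{ type: "text", text: "tag must be a non-empty string" }],
      isError: true,
    };
  }
  const uris = tagIndex[args.tag] ?? [];
  return { content: [{ type: "text", text: uris.join("\n") }] };
}
```

The empty-result and isError branches are the two cases a hint generator would need to cover for a tool like this.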
