How OCC works

This page follows a single query through the OCC pipeline. Each step is described at a high level — the dedicated pages on retrieval and deliberation cover the mechanics in detail.

When you type a question into the Node, six things happen:

  1. Routing. A small classifier decides whether the message is small talk, a tool or skill request, or a knowledge question.
  2. Decomposition. A knowledge question is broken into one or more sub-queries, one per topic.
  3. Search. Each sub-query is sent to the broker's search index, which returns the most relevant pages across all approved packs.
  4. Retrieval. The Node fetches those pages and follows their cross-links one hop further.
  5. Deliberation. Three agents — Expert, Critic, Synthesizer — produce the final answer together.
  6. Display. The answer streams to the screen alongside the list of pages it was grounded in.

The rest of this page walks through each step.

1. Routing

When the Node receives your message, the first job is to decide whether to engage the full pipeline or just have a casual exchange. A short prompt like "thanks" or "hello" doesn't need the broker, the search, or the multi-agent deliberation. A question like "what is the Inca road system?" does.

A small two-stage classifier handles this in milliseconds. Internally it picks one of four categories:

  • chitchat — greetings, "thanks", "who are you". Bypass everything, reply directly.
  • tools — a single concrete action (read a file, run code, fetch a URL, browse the pack catalog). See Tools.
  • knowledge — an informational question that benefits from the curated packs. Runs the full pipeline below.
  • skill — a composite multi-step task (deep research, fact-check, document Q&A, code intents). See Skills.

The default is biased toward retrieval: when in doubt, the Node treats the message as a knowledge question and uses its packs rather than its training data.
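The retrieval-biased default can be sketched as a fallback rule. This is a minimal illustration, not the Node's actual classifier: the `Route` enum mirrors the four categories above, while `pick_route`, the score dictionary, and the `CONFIDENCE_FLOOR` value are hypothetical names and numbers chosen for the example.

```python
from enum import Enum


class Route(Enum):
    CHITCHAT = "chitchat"
    TOOLS = "tools"
    KNOWLEDGE = "knowledge"
    SKILL = "skill"


# Hypothetical threshold; the real classifier's confidence handling is not documented here.
CONFIDENCE_FLOOR = 0.6


def pick_route(scores: dict[Route, float], floor: float = CONFIDENCE_FLOOR) -> Route:
    """Return the top-scoring route, falling back to KNOWLEDGE when unsure."""
    if not scores:
        return Route.KNOWLEDGE
    best = max(scores, key=scores.get)
    # "When in doubt" is modeled as: the best score does not clear the floor.
    if scores[best] < floor:
        return Route.KNOWLEDGE
    return best
```

With this rule, a confident "chitchat" score bypasses the pipeline, while an ambiguous split between categories lands on the knowledge route.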

Chat mode has a small set of agentic tools available regardless of routing: reading a file in your workspace, running a piece of code, listing workspace files, and reading uploaded documents and audio.

web gate

Web tools — searching the open web, fetching a URL — are gated separately. They only become available when you explicitly ask for them, by including a URL in your message or by using words like "web", "online", or "internet". The Node never reaches for the open web on its own initiative.
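A gate like this can be approximated with two pattern checks. The sketch below assumes the trigger words are exactly the three listed above and that matching is case-insensitive on whole words; `web_tools_allowed` and both regular expressions are illustrative names, not the Node's real implementation.

```python
import re

# Assumption: a URL means an explicit http(s) link in the message.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
# Whole-word match, so "cobweb" does not open the gate.
WEB_WORD_RE = re.compile(r"\b(web|online|internet)\b", re.IGNORECASE)


def web_tools_allowed(message: str) -> bool:
    """True only when the user explicitly asked for the open web."""
    return bool(URL_RE.search(message) or WEB_WORD_RE.search(message))
```

The gate looks only at the user's own message, which is what keeps web access opt-in rather than something the model decides for itself.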

2. Decomposition

A question can span more than one topic. "How do I containerize an MCP server in Docker?" really asks two things: something about Docker, something about MCP. Searching with the whole sentence as one query would dilute relevance and would likely return only Docker pages or only MCP pages, rarely both.

Before retrieval, the Node breaks the question into independent sub-queries, one per logical topic. A simple question stays as one sub-query. A multi-domain question becomes two to four focused sub-queries. Each is then searched independently.

The decomposition is a single small language model call. It is invisible in normal use — you just notice that complex questions get answered by drawing from multiple packs at once.

3. Search

Each sub-query is sent to the broker's /search endpoint. The broker maintains a full-text search index over every page of every approved pack — titles, summaries, and metadata — and returns the most relevant pages ranked by score.

This is how OCC scales. With a handful of packs the index is trivially fast; with thousands of packs it remains fast (the underlying engine is SQLite FTS5, which handles indexes of millions of rows in single-digit milliseconds). The Node never has to download a list of packs or scan them locally — it asks the broker a focused question and gets focused answers.
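To make the FTS5 idea concrete, here is a small self-contained sketch using Python's built-in sqlite3 module. It assumes your SQLite build includes FTS5 (most do). The table name `pages`, its columns, and the sample rows are invented for the example and say nothing about the broker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one FTS5 row per page, indexing title and summary text.
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(pack, path, title, summary)")
conn.executemany(
    "INSERT INTO pages (pack, path, title, summary) VALUES (?, ?, ?, ?)",
    [
        ("docker", "images/build", "Building images",
         "How to write a Dockerfile and build a container image"),
        ("mcp", "servers/intro", "MCP servers",
         "What an MCP server is and how clients connect to it"),
        ("history", "inca/roads", "Inca road system",
         "The road network of the Inca empire"),
    ],
)


def search(query: str, limit: int = 8) -> list[tuple[str, str]]:
    """Return (pack, path) references, best match first.

    The query is passed straight to FTS5, so it must be valid FTS5 query
    syntax; a real service would sanitize user input first.
    """
    rows = conn.execute(
        "SELECT pack, path FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [(pack, path) for pack, path in rows]
```

Ordering by `rank` uses FTS5's built-in BM25 scoring, so the caller gets relevance-ranked references without ever loading page bodies.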

The result of the search step is a set of (pack, page) references — typically the top three to eight per sub-query, merged round-robin so no high-scoring pack starves the others.
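One way to read the round-robin merge is as an interleave across packs. The sketch below assumes exactly that: results arrive as a ranked list of (pack, page) pairs, are grouped by pack in rank order, and are then taken one per pack per round. `merge_round_robin` is a hypothetical helper, and the real merge may differ in its details.

```python
from itertools import zip_longest

Ref = tuple[str, str]  # (pack, page)


def merge_round_robin(ranked: list[Ref]) -> list[Ref]:
    """Interleave ranked references so every pack gets a turn in each round."""
    by_pack: dict[str, list[Ref]] = {}
    for ref in ranked:
        bucket = by_pack.setdefault(ref[0], [])
        if ref not in bucket:  # drop duplicates returned by several sub-queries
            bucket.append(ref)
    merged: list[Ref] = []
    # Each "tier" holds the n-th best page of every pack (None once a pack runs out).
    for tier in zip_longest(*by_pack.values()):
        merged.extend(ref for ref in tier if ref is not None)
    return merged
```

Under this scheme a pack with three strong hits still yields its second slot to the top hit of another pack, which is the starvation guard described above.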

4. Retrieval

The Node fetches the identified pages, grouped by pack. Each page is a short, dense markdown file with a title, a one-line summary, structured sections, and a list of cross-links — the See Also section.

After fetching the initial pages, the Node does one extra round: it follows the See Also links one hop further, fetching neighboring pages that are likely to add context. This is the Karpathy LLM-wiki pattern: rather than scanning unbounded documents, the Node walks a small, dense graph of curated pages and reads what is connected to the most relevant ones.

The result is a context — typically four to twelve pages, depending on the question and your hardware tier — that gets passed to deliberation.
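The one-hop expansion can be sketched as a bounded graph walk. The function name `expand_one_hop`, the `see_also` mapping, and the `max_pages` cap of twelve (taken from the upper end of the range above) are assumptions made for illustration; the real Node fetches pages over the network rather than reading a dictionary.

```python
def expand_one_hop(
    seeds: list[str],
    see_also: dict[str, list[str]],
    max_pages: int = 12,
) -> list[str]:
    """Return the seed pages plus their direct See Also neighbors, capped."""
    context: list[str] = []
    seen: set[str] = set()

    def add(page: str) -> None:
        if page not in seen and len(context) < max_pages:
            seen.add(page)
            context.append(page)

    # Search hits come first so they are never crowded out by neighbors.
    for page in seeds:
        add(page)
    # Only links of the seed pages are followed, so the walk stops at one hop.
    for page in seeds:
        for neighbor in see_also.get(page, []):
            add(neighbor)
    return context
```

Because neighbors of neighbors are never visited, the context stays small and predictable even when the pack graph is densely linked.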

5. Deliberation

The retrieved pages are not the answer. They are raw material. Three agents shape them into a response:

  • The Expert writes a first draft of the answer using the retrieved pages. It is the most informed of the three about the user's exact question, but also the most likely to over-claim — to assert a date or a citation that sounds right but isn't actually in the sources.

  • The Critic reviews that draft against the same sources, with one extra power: it can call a fetch_full_page tool to read any retrieved page in full when it suspects an unsupported claim.

  • The Synthesizer reads the Expert draft and the Critic review and produces the final answer. Claims the Critic flagged are removed or softened; gaps the Critic identified are addressed. The Synthesizer never sees the raw sources directly; it works only from the two perspectives that already engaged with them.

critic faithfulness rule

The Critic's job is to flag — not to rewrite. If a date isn't in the sources, the Critic says "unsupported by source"; it does not propose an alternative date from its own training data. The final answer contracts around what the sources actually support.
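The flag-only rule can be illustrated with a toy data flow. In the real pipeline each role is a language model call; here the Critic's support check is a naive substring match and the Synthesizer simply drops flagged claims, both stand-ins chosen so the example runs on its own. `Flag`, `critic_review`, and `synthesize` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Flag:
    claim: str
    # The Critic only labels the claim; it never carries a replacement value.
    verdict: str = "unsupported by source"


def critic_review(draft_claims: list[str], sources: list[str]) -> list[Flag]:
    """Flag every claim that does not appear in the sources (toy substring check)."""
    corpus = " ".join(sources).lower()
    return [Flag(claim) for claim in draft_claims if claim.lower() not in corpus]


def synthesize(draft_claims: list[str], flags: list[Flag]) -> list[str]:
    """Keep only claims the Critic did not flag; the sources are never consulted."""
    flagged = {flag.claim for flag in flags}
    return [claim for claim in draft_claims if claim not in flagged]
```

Note that `Flag` has no field for a corrected value, which is the structural version of the rule: the review can shrink the answer but cannot inject facts from outside the sources.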

When a peer with stronger hardware is available on the network, the Critic step can run on that peer over an end-to-end encrypted exchange. The query and draft pass through the broker but the broker never reads them — the peer is the only one with the keys to decrypt. The local Node remains the Expert and the Synthesizer; the Critic is borrowed.

6. Display

The answer streams into the chat. Below it, the Node shows a Sources panel containing:

  • The list of retrieved pages (pack path, title, snippet) — the material the answer was grounded in.
  • The Expert draft — what the local model produced before review.
  • The Critic review — flags, supported claims, suggestions.

Nothing is hidden. You can see what was retrieved, what the model first wrote, what was challenged, and how the final answer integrates everything. If a claim is removed from the final answer because the Critic flagged it as unsupported, you can inspect that exchange directly.

provenance

Provenance here means more than citations at the bottom of an answer: you can inspect the entire chain of reasoning that produced it.


Each of these steps is described in more depth on its own page.
