OCC Forge

Forge turns trusted sources into structured knowledge packs. It ingests documents and URLs, extracts concepts, builds LLM Wiki pages, and produces a signed manifest ready for review.

llm wiki pattern

Classic RAG

  • Raw sources → chunked at query time
  • No knowledge accumulation between queries
  • LLM re-discovers context from scratch each time

LLM Wiki (OCC)

  • Sources → LLM builds structured wiki → queries hit the wiki
  • Knowledge accumulates and cross-references
  • Conflicts flagged at ingest time, not at query time

Pattern inspired by Andrej Karpathy's LLM Wiki proposal (April 2026).

pipeline

01

Sources

URLs or local files: documentation, papers, and specifications

02

Concept extraction

GPT-5 reads each source and extracts a list of concepts, each with a slug, title, and summary
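As an illustration of the concept record described in this step, here is a minimal Python sketch. It assumes the model returns a JSON array of objects; the `Concept` class, `slugify`, and `parse_concepts` are hypothetical names, not Forge's actual API, and the slug rule is one plausible convention.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One extracted concept (hypothetical shape: slug, title, summary)."""
    slug: str
    title: str
    summary: str


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def parse_concepts(model_output: str) -> list[Concept]:
    """Parse a JSON array of concepts; derive the slug when it is missing."""
    concepts = []
    for item in json.loads(model_output):
        title = item["title"].strip()
        slug = item.get("slug") or slugify(title)
        concepts.append(Concept(slug=slug, title=title, summary=item["summary"].strip()))
    return concepts
```

Deriving the slug from the title when the model omits it keeps page names stable across runs, which matters if later steps use the slug as a file name.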

03

LLM Wiki pages

For each concept, Forge writes or updates a structured markdown wiki page

04

index.md / log.md

The index is regenerated on every run; the log is append-only with timestamps and source hashes
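The difference between a regenerated index and an append-only log can be sketched as follows. This is an assumption-laden example: the entry format, the choice of SHA-256, and the function names are hypothetical, and Forge's real file layout may differ.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def source_hash(data: bytes) -> str:
    """SHA-256 hex digest of the raw source bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def append_log(log_path: Path, source: str, data: bytes) -> str:
    """Append one timestamped entry; 'a' mode never rewrites earlier entries."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    entry = f"- {stamp} ingested {source} sha256:{source_hash(data)}\n"
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry


def regenerate_index(wiki_dir: Path) -> str:
    """Rebuild index.md from scratch using the pages currently on disk."""
    skip = {"index.md", "log.md"}
    pages = sorted(p for p in wiki_dir.glob("*.md") if p.name not in skip)
    lines = ["# Index", ""] + [f"- [{p.stem}]({p.name})" for p in pages]
    text = "\n".join(lines) + "\n"
    (wiki_dir / "index.md").write_text(text, encoding="utf-8")
    return text
```

Because the index is derived entirely from the pages on disk, running `regenerate_index` twice yields the same file, while each `append_log` call adds exactly one new line.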

05

manifest.yaml

Hub-ready manifest with name, version, domains, source list with hashes, and signature field
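To make the manifest fields concrete, here is a small sketch that assembles them and renders YAML text. The key names (`uri`, `sha256`, `sources`), the use of SHA-256, and the empty signature placeholder are assumptions for illustration; the actual manifest schema and signing scheme are not specified here.

```python
import hashlib
import json


def build_manifest(name: str, version: str, domains: list[str],
                   sources: dict[str, bytes]) -> dict:
    """sources maps a URL or path to the raw bytes that were ingested."""
    return {
        "name": name,
        "version": version,
        "domains": list(domains),
        "sources": [
            {"uri": uri, "sha256": hashlib.sha256(data).hexdigest()}
            for uri, data in sorted(sources.items())
        ],
        "signature": "",  # placeholder: signing is out of scope for this sketch
    }


def to_yaml(manifest: dict) -> str:
    """Render the fixed manifest shape above (not a general YAML emitter)."""
    q = json.dumps  # JSON string literals are valid YAML double-quoted scalars
    lines = [f"name: {q(manifest['name'])}",
             f"version: {q(manifest['version'])}",
             "domains:"]
    lines += [f"  - {q(d)}" for d in manifest["domains"]]
    lines.append("sources:")
    for src in manifest["sources"]:
        lines.append(f"  - uri: {q(src['uri'])}")
        lines.append(f"    sha256: {q(src['sha256'])}")
    lines.append(f"signature: {q(manifest['signature'])}")
    return "\n".join(lines) + "\n"
```

Sorting the sources by URI makes the rendered text deterministic, which is useful if a signature is later computed over the manifest bytes.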

06

Review candidate

The pack enters the OCC review queue for community approval before registry inclusion

core operations

INGEST

LLM reads a new document, extracts concepts, writes or updates wiki pages, and appends to the log. Existing pages are enriched without losing content. Conflicts are flagged inline.
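The "enrich without losing content, flag conflicts inline" behavior can be approximated with a deliberately simple sketch. In Forge the LLM performs the merge; here, as a stand-in, a page is assumed to hold `- key: value` fact lines, and `merge_page` and the `> CONFLICT` marker are hypothetical.

```python
def merge_page(existing: str, new_facts: dict[str, str], source: str) -> str:
    """Merge 'key: value' facts into a page while keeping every existing line."""
    lines = existing.splitlines()

    # Collect the facts the page already states.
    known: dict[str, str] = {}
    for line in lines:
        if line.startswith("- ") and ": " in line:
            key, value = line[2:].split(": ", 1)
            known.setdefault(key, value)

    for key, value in new_facts.items():
        if key not in known:
            # New information: enrich the page.
            lines.append(f"- {key}: {value}")
        elif known[key] != value:
            # Disagreement: keep the old claim and flag the new one inline.
            lines.append(
                f"> CONFLICT ({source}): {key} is also stated as {value!r}; "
                f"page says {known[key]!r}"
            )
    return "\n".join(lines) + "\n"
```

Nothing is ever deleted or overwritten: agreeing facts are ignored, new facts are appended, and disagreeing facts are recorded next to the original claim for a reviewer to resolve.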

QUERY

LLM reads index.md to identify relevant pages, retrieves them, and synthesizes an answer with citations. Retrieval is keyword-based with stop-word filtering.
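Keyword-based retrieval with stop-word filtering might look like the sketch below. The stop-word list, the overlap-count scoring, and the `rank_pages` function are assumptions chosen for brevity; the index is modeled as a mapping from page slug to a one-line summary.

```python
import re

# A tiny illustrative stop-word list; a real one would be longer.
STOP_WORDS = frozenset({
    "a", "an", "the", "is", "are", "of", "to", "in", "on", "for",
    "and", "or", "how", "what", "does", "do",
})


def keywords(text: str) -> set[str]:
    """Lowercase, split into alphanumeric tokens, and drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def rank_pages(question: str, index: dict[str, str], top_k: int = 3) -> list[str]:
    """Return up to top_k slugs, ordered by keyword overlap with the question."""
    terms = keywords(question)
    scored = []
    for slug, summary in index.items():
        overlap = len(terms & keywords(f"{slug} {summary}"))
        if overlap:
            scored.append((-overlap, slug))  # negative so higher overlap sorts first
    return [slug for _, slug in sorted(scored)[:top_k]]
```

The selected pages would then be passed to the LLM, which synthesizes the answer and cites them; pages with no keyword overlap are never retrieved.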

LINT

Health check: detects contradictions, orphaned pages, outdated claims, and missing cross-references. Returns a structured report.
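Two of these checks are purely structural and can be sketched without an LLM: orphaned pages (no other page links to them) and missing cross-references (a page mentions another page's name without linking it). The report keys, the link pattern, and the name-matching rule are assumptions; contradictions and outdated claims would need model judgment and are not covered.

```python
import re

# Matches markdown links such as [text](some-page.md) and captures "some-page".
LINK = re.compile(r"\[[^\]]*\]\(([^)#]+)\.md\)")


def lint(pages: dict[str, str]) -> dict[str, list]:
    """pages maps slug -> markdown body. Returns a structured report."""
    outbound = {slug: set(LINK.findall(body)) for slug, body in pages.items()}

    # Orphaned: no *other* page links to this one.
    orphaned = [
        slug for slug in pages
        if not any(slug in targets
                   for other, targets in outbound.items() if other != slug)
    ]

    # Missing cross-reference: the page mentions another page's name in prose
    # (slug with hyphens read as spaces) but does not link to it.
    missing = []
    for slug, body in pages.items():
        text = body.lower()
        for other in pages:
            phrase = other.replace("-", " ")
            if other != slug and phrase in text and other not in outbound[slug]:
                missing.append((slug, other))

    return {
        "orphaned_pages": sorted(orphaned),
        "missing_cross_references": sorted(missing),
    }
```

Returning plain lists under named keys keeps the report easy to serialize or to feed back into a follow-up ingest pass.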