An LLM-maintained wiki for arXiv papers.
Ingests PDFs into a self-maintaining markdown knowledge base — section by section, with full source tracking. Every answer is traced back to the exact passage in the original paper that produced it.
Background
In April 2026, Andrej Karpathy shared a pattern that went viral in the AI community: instead of retrieving raw chunks at query time, let an LLM read documents and maintain a wiki of its own, creating, updating, and linking articles as new material arrives.
episteme implements this for academic papers and adds a source tracing layer: after answering from the wiki, it embeds the response and vector-searches the original source sections — returning the exact passages that produced the answer. Every claim is auditable.
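The trace step can be sketched with a toy cosine-similarity search. The 2-d vectors below stand in for real embedding-model output, and `trace` is a hypothetical helper name, not the project's actual API:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def trace(answer_embedding, source_passages, top_k=1):
    """Rank source passages by similarity to the embedded answer.

    source_passages is a list of (passage_text, embedding) pairs,
    i.e. what a scoped vector search over the original paper returns.
    """
    scored = [
        (cosine(answer_embedding, emb), passage)
        for passage, emb in source_passages
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy 2-d "embeddings" stand in for a real embedding model.
passages = [
    ("The encoder is composed of a stack of N identical layers.", [0.9, 0.1]),
    ("We pre-train BERT using the masked LM objective.", [0.1, 0.9]),
]
best = trace([0.8, 0.2], passages, top_k=1)
```

In the real pipeline, the answer embedding and passage embeddings would come from the same embedding model, and the passage set would already be scoped to the papers the wiki sections were compiled from.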
System Design
Two POST endpoints. The ingest pipeline builds the wiki. The chat pipeline queries it and traces the answer back to the source.
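The two-endpoint shape can be sketched as a plain dispatch table. The paths `/ingest` and `/chat` and the handler return shapes are illustrative assumptions, not the project's documented API:

```python
def ingest(pdf_name: str) -> dict:
    # Placeholder for the real pipeline: split the PDF into sections,
    # then have the LLM merge each section into the wiki.
    return {"status": "ingested", "source": pdf_name}

def chat(question: str) -> dict:
    # Placeholder: answer from the wiki, then trace the answer
    # back to the exact source passages.
    return {"answer": f"(wiki answer to: {question})", "sources": []}

ROUTES = {
    ("POST", "/ingest"): ingest,
    ("POST", "/chat"): chat,
}

def handle(method: str, path: str, payload: str) -> dict:
    """Dispatch a request to the matching pipeline."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return {"error": "not found"}
    return handler(payload)
```

A real deployment would hang these two handlers off a web framework; the point here is only that ingest and chat are separate pipelines behind separate POST routes.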
Wiki Internals
All reads and writes happen at the section level. Two index files let the system track exactly where every piece of knowledge came from:
Human-readable. Maps every section_id to a description of what that wiki section covers and which paper it came from. The LLM consults this on every ingest to decide whether to create a new section or update an existing one.
```markdown
## sec_001
Transformer architecture overview. Encoder-decoder structure, self-attention.
Source: attention_is_all_you_need.pdf

## sec_002
Multi-head attention mechanism. Parallel projection heads, concatenation.
Source: attention_is_all_you_need.pdf

## sec_003
BERT masked language modeling. Pre-training objective, [MASK] token.
Source: bert_pretraining.pdf
```
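The create-or-update decision can be sketched as a prompt builder plus a reply parser. The prompt wording, the `UPDATE <section_id>` / `CREATE` reply convention, and both function names are illustrative assumptions:

```python
def build_merge_prompt(index_md: str, new_section: str) -> str:
    """Ask the LLM whether a new paper section belongs in an existing
    wiki section or needs a new one. (Prompt wording is illustrative.)"""
    return (
        "You maintain a wiki of paper knowledge. Current index:\n\n"
        f"{index_md}\n\n"
        "New source section:\n\n"
        f"{new_section}\n\n"
        "Reply with 'UPDATE <section_id>' or 'CREATE'."
    )

def parse_merge_decision(reply: str):
    """Parse the LLM's reply into an (action, section_id) pair."""
    reply = reply.strip()
    if reply.upper().startswith("UPDATE"):
        parts = reply.split()
        if len(parts) >= 2:
            return ("update", parts[1])
    return ("create", None)
```

The index stays small enough to fit in the prompt because it holds one short description per section, not the section bodies themselves.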
Machine-readable. Maps section_id → source_id (source filename). Used by the chat pipeline's tracer to scope the ChromaDB vector search to only the papers that the retrieved wiki sections were compiled from.
```json
{
  "sec_001": "attention_is_all_you_need.pdf",
  "sec_002": "attention_is_all_you_need.pdf",
  "sec_003": "bert_pretraining.pdf",
  "sec_004": "bert_pretraining.pdf",
  "sec_005": "gpt2_paper.pdf",
  "sec_006": "attention_is_all_you_need.pdf"
}
```
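Scoping the trace to the right papers is then a small set computation over this map. A minimal sketch, assuming a `section_id → filename` JSON file like the sample above (`sources_for` is a hypothetical helper name):

```python
import json

# Abbreviated machine-readable index: section_id -> source filename.
index_json = """{
  "sec_001": "attention_is_all_you_need.pdf",
  "sec_002": "attention_is_all_you_need.pdf",
  "sec_003": "bert_pretraining.pdf"
}"""
section_index = json.loads(index_json)

def sources_for(section_ids, index):
    """Papers the retrieved wiki sections were compiled from.

    The tracer restricts its vector search to these files, so the
    returned passages can only come from the papers that actually
    informed the answer.
    """
    return sorted({index[sid] for sid in section_ids if sid in index})
```

In a real deployment this set could feed a ChromaDB metadata filter such as `where={"source_id": {"$in": [...]}}` (the metadata key name is an assumption).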
Comparison
The difference is when the reasoning happens — at query time, or at ingest time.
| Property | Classic RAG | episteme (LLM Wiki) |
|---|---|---|
| Knowledge accumulation | None — every query starts fresh | Wiki grows richer with each paper |
| Cross-paper synthesis | Chunks retrieved in isolation | LLM links concepts at write time |
| Source attribution | Rough chunk-level | Exact passage via post-answer trace |
| Self-maintenance | Passive — never improves | LLM updates wiki on new ingests |
| Human readability | Raw chunks, not browsable | Structured markdown you can read |
Structure
API Reference
Quick Start
.env

GitHub Metadata
Repository description and topics are chosen so that developers who saw Karpathy's LLM Wiki idea can find this project.