@pjdurka
Created May 26, 2026 14:09
Maintainer-Wiki Kit

An agent-agnostic recipe for LLM-maintained software knowledge bases

Andrej Karpathy's "LLM Wiki" (2026) introduced the idea of using an LLM agent as the bookkeeper of a structured knowledge base. This kit operationalises that idea for a concrete purpose: keeping open-source scientific (read: underfunded) software projects maintainable as contributors come and go. Westner et al. (2025) report bus factors of 1–3 across major FOSS projects in neuroscience. "More docs and comments" is not a sustainable solution; bookkeeping is boring, so why not outsource it to an LLM under a standing rule the agent loads every session?

Usage: Hand this file to a coding agent and say "set it up." The agent does the rest. This file is consumed once; what persists is the small skeleton and the standing rule it installs.

In the wild: This kit was developed alongside IDE4EEG, a FOSS EEG analysis environment. Its docs/maintainer-wiki/ is a production instance — 15+ pages covering architecture, failure history, and known gotchas, with a working linter enforced on every push.


Part A — Bootstrap (run once)

Agent: perform these steps now. Be idempotent: never overwrite a file that already exists, and ask before modifying my agent-instruction file. At the end, report what you created and the linter result.

Ask me two things (and wait for answers):

  1. Where should the wiki live? Default: docs/maintainer-wiki/.
  2. Which agent-instruction file do you load every session? Pick from the table in the Appendix (e.g. CLAUDE.md, AGENTS.md, .cursor/rules/, GEMINI.md, .github/copilot-instructions.md). If unsure, default to AGENTS.md and also create a one-line CLAUDE.md that says See AGENTS.md.

Create the wiki skeleton in the chosen directory (do not clobber existing files):

  • index.md — the catalog. One line per page: - [Title](file.md) — one-line hook.
  • log.md — append-only, newest entries on top. Seed with today's date (YYYY-MM-DD) and the line "Wiki bootstrapped."
  • README.md — copy the template in Part C verbatim, then replace the {{PLACEHOLDERS}} with this project's details.
  • One starter page, concept-architecture-overview.md, with a short "How this project is laid out" stub and a relative link back to the index ([index](index.md)) so the link graph is non-empty. Add it to index.md.
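The skeleton step above can be sketched as follows. This is a minimal illustration, not the kit's own code: the function name `seed_skeleton`, the page titles, and the stub wording are assumptions, and the README body would in practice be the filled-in Part C template. The one property it is meant to show is idempotence, so existing files are skipped rather than overwritten.

```python
from datetime import date
from pathlib import Path

SEP = "\u2014"  # the dash used in the index line format described above


def seed_skeleton(wiki_dir, readme_text="# Maintainer wiki\n"):
    """Create the four skeleton files; return the names actually written."""
    wiki = Path(wiki_dir)
    wiki.mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()  # absolute date, YYYY-MM-DD
    files = {
        "index.md": (
            "# Index\n\n"
            f"- [Architecture overview](concept-architecture-overview.md) {SEP} "
            "how this project is laid out\n"
        ),
        "log.md": f"# Log\n\n## {today}\n\nWiki bootstrapped.\n",
        "README.md": readme_text,  # placeholder; use the Part C template here
        "concept-architecture-overview.md": (
            "# Architecture overview\n\n"
            "## How this project is laid out\n\n"
            "Stub. Back to the [index](index.md).\n"
        ),
    }
    created = []
    for name, text in files.items():
        path = wiki / name
        if path.exists():
            continue  # idempotent: never overwrite an existing file
        path.write_text(text, encoding="utf-8")
        created.append(name)
    return created
```

Calling `seed_skeleton("docs/maintainer-wiki")` a second time returns an empty list, which is a convenient way to report what the bootstrap actually created.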

Implement the linter from the specification in Part D — in this repo's primary language and standard test/CI tooling (do not assume Python), as scripts/lint_wiki.* or wherever lint scripts live here. Keep it dependency-light, and verify it against Part D's acceptance test before moving on.

Wire the pre-push hook so structural drift fails before it leaves the machine:

  • Create .githooks/pre-push (shell), make it executable, and have it invoke the linter you built (--wiki <chosen-dir> --repo-root .), exiting non-zero on failure.
  • Activate it: git config core.hooksPath .githooks.
  • If the repo already uses a hook manager (pre-commit, husky, lefthook), add the lint invocation there instead and tell me which you chose.
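As a rough sketch of the hook step, the helper below writes a shell `pre-push` file and marks it executable. The helper name `write_pre_push_hook` and the `linter_cmd` argument are assumptions for illustration; the generated hook passes the two flags named above. Activation is deliberately left out, so `git config core.hooksPath .githooks` still has to be run separately.

```python
import stat
from pathlib import Path


def write_pre_push_hook(repo_root, wiki_dir, linter_cmd):
    """Write .githooks/pre-push unless it already exists; return its path."""
    hook = Path(repo_root) / ".githooks" / "pre-push"
    hook.parent.mkdir(parents=True, exist_ok=True)
    if not hook.exists():  # do not clobber a hook someone already wrote
        hook.write_text(
            "#!/bin/sh\n"
            "# Run the wiki linter; a non-zero exit aborts the push.\n"
            f'{linter_cmd} --wiki "{wiki_dir}" --repo-root . || exit 1\n',
            encoding="utf-8",
        )
    # Add the executable bits so git will run the hook.
    mode = hook.stat().st_mode
    hook.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```

For example, `write_pre_push_hook(".", "docs/maintainer-wiki", "python3 scripts/lint_wiki.py")` would suit a repo whose linter happens to be the Python one; substitute whatever command runs the linter you built.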

Install the standing maintenance rule. Append the stanza in Part B to my agent-instruction file (the one I named in answer to question 2 above). Ask before editing it. This is the only step that depends on which agent I use; everything else is agent-neutral.

Run the linter once against <chosen-dir> and report the result. It must pass on the seeded skeleton (the first half of Part D's acceptance test).

Tell me this file is now optional — the skeleton + the installed stanza carry the discipline going forward. I may delete this kit, or keep it as a reference.


Part B — The standing maintenance rule

Agent: append this block verbatim to my agent-instruction file. Replace {{WIKI_DIR}} with the chosen path.

## Maintainer wiki (shared knowledge base — read + keep current)

`{{WIKI_DIR}}/` is a committed knowledge base for contributors and future
maintainers: the *why*, the failure history, and the landmines not visible in
the source. It is maintained by you, the agent — humans read it; you do the
bookkeeping. Treat it as living. Three operations:

- **Ingest.** When something notable lands — a design decision, a fix with a
  non-obvious root cause, a release lesson — add or update the relevant page,
  update `index.md`, and append to `log.md` (newest first, absolute dates).
  **The fixer updates the docs in the same change.** If a commit alters
  behaviour a page describes — *especially when it closes a documented
  `Open:` / gotcha / deferred item* — flip that page in the same commit.
  A fix that lands without its doc update is how the wiki goes stale at
  the source.
- **Query.** Answer questions against the wiki; file good answers back as
  new pages.
- **Lint.** Run the wiki linter (also enforced in the pre-push hook). It
  guards structure — broken links, orphan/unlisted pages, unresolved repo
  paths. Contradictions and stale prose still need a human + agent read.

**Two non-negotiable conventions:**
1. *The fixer updates the docs in the same change* (above).
2. *Status claims must cite falsifying code.* Every `Open:` / deferred /
   not-wired / known-bug claim MUST cite the `path/file.ext:symbol` that
   makes it true, and be re-verified against current code before being
   written **or ported** — `grep` the symbol, don't trust a remembered
   status. A claim anchored to live code self-falsifies on a one-line
   check; one anchored to a stale memory does not.

**Guardrail.** Tie every quantitative claim (counts, defaults, thresholds)
to a test or a `path:line`. When a page and the code disagree, the code
wins — fix the page. A failing test beats a confident wiki.

Part C — README.md template

Agent: copy this into {{WIKI_DIR}}/README.md and fill the placeholders.

# {{PROJECT_NAME}} maintainer wiki

A knowledge base for **contributors and future maintainers** — the *why*, the
failure history, and the landmines that are not visible in the source.

## How to read it (humans)

Every page is plain markdown, read top to bottom. Start with `index.md` for the
catalog. There is nothing to run — it is documentation.

- **concept-*** — how a subsystem works and *why it is shaped that way*.
- **gotcha-*** — bugs that bit us, the root cause, and the guard that now
  prevents recurrence. The highest-value pages for a successor.
- **build-*** / **convention-*** — packaging landmines and house rules.
- **reference-*** — durable catalogs, glossaries, external pointers.
- `log.md` — append-only record of notable decisions and events.

## How it is maintained (the part that keeps it from rotting)

Patterned on Karpathy's ["LLM Wiki"](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) (2026): the bottleneck of a knowledge base
is bookkeeping, not thinking, so an LLM agent does the bookkeeping under three
operations — **Ingest**, **Query**, **Lint** — defined in this project's
agent-instruction file. The mechanical subset (broken links, orphan/unlisted
pages, unresolved repo paths) is automated as a small linter
(`scripts/lint_wiki.*`) and runs in the pre-push hook, so structural drift
fails a check, not a reader's memory.

**Guardrail.** An LLM does not get bored, but it also does not get suspicious.
Tie every quantitative claim to a test or a `path:line` so drift surfaces
against the code. When a page and the code disagree, the code wins.

**Status claims must cite falsifying code.** `Open:` / deferred / not-wired /
known-bug claims flip silently when someone fixes the thing, so each MUST cite
the `path/file.ext:symbol` that makes it true and be re-verified before being
written or ported.

## Schema / conventions

- The project's agent-instruction file (e.g. `CLAUDE.md` / `AGENTS.md`) is the
  authoritative operating manual and schema; this wiki is the narrative and
  history layer around it.
- One topic per file. Cross-link liberally with relative markdown links;
  create a stub before linking to a new page (the linter fails broken links).
- De-personalised by policy: no individual names, machine specs, credentials,
  or token-handling procedures. Assume the repo may be public.
- Dates are absolute (`YYYY-MM-DD`), never relative ("last week").

Part D — Linter specification

The linter is given as a spec, not a script: implement it in this repo's primary language and CI tooling. The behaviour, not the source, must match — the acceptance test below pins it down.

Purpose. A fast, dependency-light check that guards the structure of the wiki, not the truth of its prose. Runnable by hand and from the pre-push hook.

Invocation. Two parameters, with defaults:

  • --wiki — the wiki directory (default docs/maintainer-wiki).
  • --repo-root — root for resolving cited paths (default: repo root / cwd).
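If the linter ends up in Python, the two parameters could be parsed as below. This is only a sketch: the flag names and the wiki default come from the list above, while the function name `parse_args` and the choice of `.` (the current directory) as the repo-root default are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the linter's two options, applying the documented defaults."""
    parser = argparse.ArgumentParser(
        description="Lint the structure of the maintainer wiki."
    )
    parser.add_argument(
        "--wiki",
        default="docs/maintainer-wiki",
        help="wiki directory to check",
    )
    parser.add_argument(
        "--repo-root",
        default=".",  # assumed: the hook runs the linter from the repo root
        help="root used to resolve cited repo paths",
    )
    return parser.parse_args(argv)
```

`parse_args([])` yields the defaults, and `argparse` exposes `--repo-root` as the attribute `repo_root`.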

Checks. All hard-fail. Print one [ok]/[fail] line per check, a detail line per failure, then a one-line summary; exit non-zero if any check fails.

  1. Broken internal links. For every markdown link [text](target): if target is relative and ends in .md (after stripping any #anchor), it must resolve to a file that exists in the wiki directory. Ignore http(s):, mailto:, and bare #anchor links.

  2. Index coverage (orphan guard). Every wiki page except the entry points (index.md, README.md, log.md — a configurable exempt set) must be linked from index.md.

  3. Repo-path resolution. Every backtick-quoted token that looks like a repo-relative path — contains a /, has a file extension, optionally a trailing :symbol or #anchor — must resolve to an existing file under --repo-root. Skip .md tokens (covered by check 1) and bare filenames without a / (avoids false positives on prose like config.toml).

  4. Scope discipline. This item is a boundary, not a check: truth-level drift (contradictions, stale numbers, defaults that changed) is not mechanically checkable here and is left to the human + agent read. Do not try to validate prose meaning.
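Checks 1 and 2, together with the [ok]/[fail] reporting, might look roughly like this in Python. It is a sketch under stated assumptions: the wiki is a flat directory of `.md` files, `index.md` exists, links are found with a simple regular expression (so links carrying a title string are missed), and all function names are hypothetical.

```python
import re
from pathlib import Path

LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")  # [text](target)
EXEMPT = {"index.md", "README.md", "log.md"}   # entry points for check 2


def md_targets(page):
    """Relative .md link targets in a page, with any #anchor stripped."""
    targets = []
    for target in LINK.findall(page.read_text(encoding="utf-8")):
        if target.startswith(("http:", "https:", "mailto:", "#")):
            continue  # external and bare-anchor links are ignored
        target = target.split("#", 1)[0]
        if target.endswith(".md"):
            targets.append(target)
    return targets


def broken_links(wiki):
    """Check 1: (page, target) pairs whose target file does not exist."""
    return [
        (page.name, target)
        for page in sorted(Path(wiki).glob("*.md"))
        for target in md_targets(page)
        if not (page.parent / target).is_file()
    ]


def unlisted_pages(wiki):
    """Check 2: non-exempt pages that index.md does not link to."""
    wiki = Path(wiki)
    listed = {Path(t).name for t in md_targets(wiki / "index.md")}
    return [
        page.name
        for page in sorted(wiki.glob("*.md"))
        if page.name not in EXEMPT and page.name not in listed
    ]


def report(results):
    """Print one [ok]/[fail] line per check plus details; return exit code."""
    failed = 0
    for name, problems in results.items():
        print(f"[{'fail' if problems else 'ok'}] {name}")
        for problem in problems:
            print(f"    {problem}")
        failed += bool(problems)
    print(f"{failed} of {len(results)} checks failed")
    return 1 if failed else 0
```

A caller would pass something like `report({"broken links": broken_links(wiki), "index coverage": unlisted_pages(wiki)})` to `sys.exit`, so any failing check produces a non-zero exit.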

Wire-up. Invoke from the pre-push hook and, if the repo has a lint/CI config, register it there too.

Tuning knobs. There are two: the exempt-page set (check 2) and the repo-path heuristic (check 3, which requires a /). Document both, so adopters know to write repo references as full repo-relative paths (pkg/mod.ext:symbol).
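One way to express the repo-path heuristic is a single regular expression, sketched below. The pattern and the function name `unresolved_paths` are assumptions, not the reference implementation: a token qualifies only if it sits in backticks, contains at least one `/`, and ends in a file extension, with an optional `:symbol` or `#anchor` suffix that is dropped before resolving. A flat wiki directory is assumed.

```python
import re
from pathlib import Path

# `dir/.../name.ext`, optionally followed by :symbol or #anchor.
REPO_PATH = re.compile(r"`((?:[\w.\-]+/)+[\w.\-]+\.\w+)(?:[:#][^`]*)?`")


def unresolved_paths(wiki, repo_root):
    """Check 3: (page, path) pairs that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for page in sorted(Path(wiki).glob("*.md")):
        text = page.read_text(encoding="utf-8")
        for token in REPO_PATH.findall(text):
            if token.endswith(".md"):
                continue  # markdown targets belong to check 1
            if not (root / token).is_file():
                missing.append((page.name, token))
    return missing
```

Bare filenames such as `config.toml` never match because the pattern demands a directory segment, and glob-like tokens such as `scripts/lint_wiki.*` are skipped because `*` is not a valid extension character here. Note that a literal placeholder like `path/file.ext:symbol` would be flagged by this pattern, so an implementation needs some policy for template text; which policy to use is left open here.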

Acceptance test. On a freshly-seeded skeleton the linter must print all [ok] and exit 0. It must then exit non-zero if you introduce any one of:

  • (a) a link to a .md page that does not exist;
  • (b) a page not listed in index.md;
  • (c) a backtick path like pkg/missing.ext that does not resolve under the repo root.

Implement until all four behaviours hold: the clean pass plus the three failures (a) to (c). A reference implementation (Python, stdlib-only) is available at scripts/lint_wiki.py in the IDE4EEG repository, but the spec above is the canonical artifact.
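The acceptance test can be automated with a small harness along these lines. Everything here is illustrative: `lint` stands for any callable that takes the wiki path and the repo root and returns the linter's exit code (for instance a thin wrapper around a subprocess call), and the seeded file contents are a reduced stand-in for the real skeleton.

```python
import tempfile
from pathlib import Path


def seed(root):
    """Write a minimal four-file skeleton under root; return the wiki path."""
    wiki = root / "docs" / "maintainer-wiki"
    wiki.mkdir(parents=True)
    pages = {
        "index.md": "- [Overview](concept-architecture-overview.md)\n",
        "README.md": "# Wiki\n",
        "log.md": "Wiki bootstrapped.\n",
        "concept-architecture-overview.md": "Back to [index](index.md).\n",
    }
    for name, text in pages.items():
        (wiki / name).write_text(text, encoding="utf-8")
    return wiki


def _append(path, text):
    with path.open("a", encoding="utf-8") as handle:
        handle.write(text)


# One mutation per failure case (a), (b), (c) from the list above.
MUTATIONS = {
    "(a) broken link": lambda wiki: _append(
        wiki / "concept-architecture-overview.md", "\n[gone](missing-page.md)\n"
    ),
    "(b) unlisted page": lambda wiki: (wiki / "gotcha-unlisted.md").write_text(
        "Back to [index](index.md).\n", encoding="utf-8"
    ),
    "(c) unresolved path": lambda wiki: _append(
        wiki / "concept-architecture-overview.md", "\nSee `pkg/missing.ext`.\n"
    ),
}


def acceptance(lint):
    """Run all four behaviours; return {behaviour: True if it held}."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        clean = Path(tmp) / "clean"
        clean.mkdir()
        results["clean skeleton"] = lint(seed(clean), clean) == 0
        for number, (label, mutate) in enumerate(MUTATIONS.items()):
            root = Path(tmp) / f"case{number}"
            root.mkdir()
            wiki = seed(root)
            mutate(wiki)  # each case starts from a fresh skeleton
            results[label] = lint(wiki, root) != 0
    return results
```

The linter passes when `all(acceptance(lint).values())` is true. Each mutation is applied to its own fresh skeleton, so a failure in one case cannot mask another.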


Appendix — Per-tool agent-instruction file

The standing rule (Part B) goes into whichever file your agent auto-loads each session.

| Agent | Instruction file |
| --- | --- |
| Claude Code | CLAUDE.md |
| Codex / cross-tool standard | AGENTS.md |
| Cursor | .cursor/rules/*.mdc |
| Gemini CLI | GEMINI.md |
| GitHub Copilot | .github/copilot-instructions.md |
| Aider | CONVENTIONS.md (named in config) |

Note: agent-instruction filenames change as tools evolve — verify against each tool's current documentation.

If you want neutrality, put the content in AGENTS.md and reduce the tool-specific file to a one-line pointer, for example a CLAUDE.md that contains only: See AGENTS.md.
