Memex: Advanced LLM Wiki with Critical Database Limitations


Evaluation: Memex — A Thoughtful Implementation Still Built on Sand

Executive Summary

Memex is the most sophisticated LLM Wiki implementation I have seen. Unlike the simple tutorials and agency hacks, Memex demonstrates genuine architectural awareness: it has git versioning, citation tracking, provenance analysis, contradiction policies, revertible ingests, and a zero-dependency dashboard. The creator clearly understands many of the problems with naive LLM Wiki approaches — and has built thoughtful mitigations.

Warning: The “Memex” referenced in this document as an “LLM Wiki implementation” is not the original Memex concept envisioned by Vannevar Bush in his 1945 essay “As We May Think.” Bush’s Memex was a hypothetical microfilm-based mechanical desk for following associative trails, with no digital computation, no LLMs, no git, no markdown, and no automated ingestion. The name has been appropriated for a modern software project that bears only metaphorical resemblance to the original vision; readers should not confuse the two.

Yet Memex still fails the Four Pillars of a proper knowledge base. It is a beautifully engineered prototype, not a production knowledge base. The fundamental substrate remains markdown files in a folder, not a database with foreign keys, schemas, permissions, and SQL.

This evaluation is not a dismissal. Memex is the best attempt in this space. But “best attempt” does not equal “fitness for purpose.”

https://gnu.support/images/2026/04/2026-04-23/800/the-tide-is-coming.webp


What Memex Gets Right (Substantially)

| Feature | Why It Matters | Verdict |
| --- | --- | --- |
| Git-backed history | Every ingest is a commit. Revertible. Audit trail exists (at file level). | ✅ Excellent |
| Inline citations [^src-*] | Tries to solve provenance. Each claim links to a source. | ✅ Good |
| Provenance dashboard | Shows per-page citation coverage, missing citations. | ✅ Good |
| Contradiction policies | Historical/Disputed/Superseded — acknowledges that LLMs create contradictions. | ✅ Insightful |
| WHY reports | Every ingest explains its decisions. Transparency. | ✅ Excellent |
| Wiki Ratio gauge | Measures how often Claude reads wiki vs raw files. Self-awareness. | ✅ Clever |
| 4-layer raw immutability | Protects source files from corruption. | ✅ Good |
| Lint + auto-fix | 16-point health check. Attempts to maintain integrity. | ✅ Good |
| Zero-dependency dashboard | Python stdlib only. No npm, no Docker. | ✅ Pragmatic |
| CLI + GUI parity | Works from terminal or dashboard. | ✅ Good |
| Adaptive indexing | Flat → hierarchical → indexed. Scales better than a static index.md. | ✅ Smart |

These are not trivial. The creator has thought deeply about the failure modes of LLM Wiki and built real mitigations. Most LLM Wiki videos show 4 articles and a graph view. Memex shows 600+ sources, revertible ingests, and a provenance dashboard. Respect.


The Four Pillars Evaluation

1. Store with Integrity → ❌ FAIL

| Requirement | Memex Implementation | Status |
| --- | --- | --- |
| Start with raw, immutable source | ✅ Yes — raw/ folder with 4-layer protection | PASS |
| Use a real database (schemas, FKs, indexes) | ❌ No — markdown files + git | FAIL |
| Store any knowledge type | ⚠️ Text/markdown only — no PDFs, images, spreadsheets, emails, code, voice, video | PARTIAL |
| Record files or locations | ✅ Yes — file paths | PASS |
| Preserve integrity / immutability | ✅ Git provides versioning, but no cryptographic verification, no field-level immutability | PASS (partial) |

The Problem: Git tracks files, not fields. It cannot prevent two conflicting edits to the same paragraph. It cannot enforce that a citation link remains valid after a page rename. It is a version control system, not a database with referential integrity.

The Creator Seems Aware: The README mentions raw/ immutability — 4 layers of protection. But wiki/ pages are not immutable. They are edited by Claude on every ingest that touches a related entity. No foreign keys means no automatic updates when a source changes.
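For contrast, here is a minimal sketch of the field-level integrity a database enforces at write time and a folder of markdown files cannot. It uses sqlite3 purely as a stand-in for a real database; the claims table, its columns, and its CHECK constraint are my illustration, not anything from Memex:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE claims (
        id     INTEGER PRIMARY KEY,
        page   TEXT NOT NULL,
        body   TEXT NOT NULL,
        source TEXT NOT NULL CHECK (source != '')  -- every claim must cite a source
    )
""")
con.execute("INSERT INTO claims (page, body, source) VALUES (?, ?, ?)",
            ("gpt-1", "GPT-1 has 117M parameters", "raw/gpt-1.md"))

# An uncited claim is rejected at write time, not flagged after the fact
# by a provenance dashboard.
try:
    con.execute("INSERT INTO claims (page, body, source) VALUES (?, ?, ?)",
                ("gpt-1", "An unsourced claim", ""))
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

This is the difference between prevention and detection: the constraint lives in the substrate, not in a lint pass that runs after the damage is done.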


2. Relate with Precision → ❌ FAIL

| Requirement | Memex Implementation | Status |
| --- | --- | --- |
| Typed relationships | ⚠️ Entities, concepts, sources — but types are implicit in filenames/frontmatter, not enforced | PARTIAL |
| Foreign keys | ❌ None — citations are text patterns [^src-*], not database FKs. Rename a source, citations break. | FAIL |
| Bidirectional links | ⚠️ Obsidian graph view shows backlinks, but not enforced at the storage level | PARTIAL |
| Explicit structure | ✅ Folders + CLAUDE.md schema | PASS |
| Hyperlinking | ✅ Markdown links + citation badges | PASS |

The Problem: Memex uses text pattern citations [^src-*] instead of foreign keys. This is clever — but it is still text. If you rename raw/gpt-1.md to raw/gpt1.md, every citation to it breaks silently. There is no cascade update. No database means no referential integrity.

The Creator’s Mitigation: Git history allows revert. The provenance dashboard shows missing citations. But detection is not prevention. Links still break.
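A hedged sketch of what foreign keys buy here, again with sqlite3 standing in for PostgreSQL; the sources/citations schema is my illustration, not Memex’s:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
con.execute("CREATE TABLE sources (path TEXT PRIMARY KEY)")
con.execute("""
    CREATE TABLE citations (
        claim       TEXT NOT NULL,
        source_path TEXT NOT NULL REFERENCES sources(path)
            ON UPDATE CASCADE   -- renames propagate to every citation
            ON DELETE RESTRICT  -- a cited source cannot be deleted
    )
""")
con.execute("INSERT INTO sources VALUES ('raw/gpt-1.md')")
con.execute("INSERT INTO citations VALUES "
            "('GPT-1 has 117M parameters', 'raw/gpt-1.md')")

# Rename the source: the citation follows automatically. No silent breakage.
con.execute("UPDATE sources SET path = 'raw/gpt1.md' "
            "WHERE path = 'raw/gpt-1.md'")
print(con.execute("SELECT source_path FROM citations").fetchone()[0])

# And deleting a source that is still cited is refused outright.
try:
    con.execute("DELETE FROM sources WHERE path = 'raw/gpt1.md'")
except sqlite3.IntegrityError:
    print("delete blocked: source is still cited")
```

That is referential integrity: the rename that silently breaks every [^src-*] citation in a markdown folder is a non-event in a database.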


3. Trust with Provenance → ⚠️ PARTIAL (Best of all LLM Wikis)

| Requirement | Memex Implementation | Status |
| --- | --- | --- |
| Provenance — every fact knows its source | ✅ Inline citations [^src-*] + provenance dashboard | PASS |
| Permissions / access control | ❌ None — anyone with file access sees everything. No user roles. No object-level ACL. | FAIL |
| Audit trails | ✅ Git commits + WHY reports + log.md | PASS |
| Human curation | ✅ Dashboard provides approval/revert. Human in the loop. | PASS |
| Cryptographic verification | ❌ None — no signatures | FAIL |

Milestone: Memex is the first LLM Wiki implementation that actually attempts provenance. The citation system is real. The provenance dashboard shows coverage. This is genuine progress.

But the fatal flaw: No permissions. Memex is single-user by design. The instant you add a second person, or want to share with a client, or host it on a server — zero access control. Every user sees every page. No read/write separation. No role-based access.

The README does not mention this limitation. It presents Memex as a “personal knowledge base” — that is accurate. But “personal” is not “production.”
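A minimal sketch of the object-level access control this pillar asks for and Memex lacks; every page path, user name, and function here is hypothetical:

```python
# Default-deny ACL: each page lists exactly who may read it.
# A real system would back this with database roles, not a dict.
ACL = {
    "wiki/clients/acme.md": {"alice"},         # client page: owner only
    "wiki/gpt-1.md":        {"alice", "bob"},  # shared research page
}

def can_read(user: str, page: str) -> bool:
    # A page absent from the ACL is visible to no one.
    return user in ACL.get(page, set())

assert can_read("alice", "wiki/clients/acme.md")
assert not can_read("bob", "wiki/clients/acme.md")
```

Even this toy version is more access control than a folder of markdown files provides, where every reader with file access sees every page.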


4. Retrieve with Speed → ❌ FAIL

| Requirement | Memex Implementation | Status |
| --- | --- | --- |
| Search by any dimension | ⚠️ TF-IDF full-text search. No facet search, no metadata filters. | PARTIAL |
| SQL queries | ❌ No — cannot SELECT * FROM entities WHERE type = 'company' | FAIL |
| Queryable structure | ❌ No — queries go through Claude (an LLM), not a deterministic engine | FAIL |
| Access file properties | ⚠️ Git provides dates, but not custom properties | PARTIAL |

The Problem: Memex’s query is not deterministic. When you ask a question, Claude reads relevant wiki pages and synthesizes an answer. That is RAG, just with a different retrieval source. The answer is probabilistic, not guaranteed.

The Wiki Ratio gauge is honest about this: it measures how often Claude reads wiki vs raw files. But it does not change the fundamental fact that retrieval is LLM-mediated, not database-query deterministic.

What a real knowledge base does: SELECT answer FROM facts WHERE question = 'Who is John''s sister?' — deterministic, sub-second, free, and guaranteed correct if the data is correct.

What Memex does: Claude reads files, infers, guesses, maybe correct, maybe hallucinates, costs tokens, takes seconds.
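The contrast above can be sketched with sqlite3 standing in for a real database; the facts table and its contents are illustrative, not taken from Memex:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE facts "
            "(question TEXT PRIMARY KEY, answer TEXT NOT NULL)")
con.execute("INSERT INTO facts VALUES (?, ?)",
            ("Who is John's sister?", "Mary"))

# Deterministic lookup: same answer every time, sub-second, no tokens,
# no inference, no chance of hallucination.
row = con.execute("SELECT answer FROM facts WHERE question = ?",
                  ("Who is John's sister?",)).fetchone()
print(row[0])
```

If the stored data is correct, the answer is correct, every time. No LLM-mediated retrieval can make that guarantee.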


One Comparison Table: Memex vs. Four Pillars

| Pillar | Requirement | Memex | Status |
| --- | --- | --- | --- |
| STORE | Real database (FKs, indexes) | Markdown + git | ❌ FAIL |
| STORE | Any knowledge type | Text only | ⚠️ PARTIAL |
| RELATE | Foreign keys | Text citations [^src-*] | ❌ FAIL |
| RELATE | Typed relationships | Implicit (filenames/frontmatter) | ⚠️ PARTIAL |
| TRUST | Provenance | Citations + dashboard | ✅ PASS |
| TRUST | Permissions / ACL | None | ❌ FAIL |
| TRUST | Audit trails | Git + WHY reports | ✅ PASS |
| RETRIEVE | SQL / deterministic queries | No — LLM-mediated | ❌ FAIL |
| RETRIEVE | Search by any dimension | TF-IDF only | ⚠️ PARTIAL |

Tally: ❌ 4 failures | ⚠️ 3 partial | ✅ 2 passes

Memex passes provenance and audit trails — which is more than any other LLM Wiki. But it still fails on database, foreign keys, permissions, and deterministic retrieval.


What Memex Reveals About LLM Wiki Architecture

| Observation | Implication |
| --- | --- |
| The creator had to build git versioning, citation parsing, provenance dashboards, linting, and contradiction policies just to make markdown files barely trustworthy | Markdown is the wrong substrate. All this engineering is compensating for missing database features. |
| Permissions are completely absent | Memex is explicitly “personal.” The moment you need multi-user or client access, it collapses. |
| Query is still LLM-mediated | The Wiki Ratio gauge is honest, but the fundamental retrieval is still probabilistic, expensive, and non-deterministic. |
| Citations are text patterns, not foreign keys | [^src-*] is clever, but it breaks on rename. No cascade. No referential integrity. |
| No SQL, no structured queries | Complex questions require Claude to infer, not query. Fragile. |

The Uncomfortable Truth

Memex is impressive. The creator has built more thoughtful infrastructure than 99% of LLM Wiki promoters. The provenance system, git backing, contradiction policies, and dashboard are genuinely good work.

But it is still markdown files in a folder.

Every engineering hour spent building citation parsers, provenance dashboards, and lint checkers is an hour not spent migrating to a real database. PostgreSQL would give you:

- Foreign keys with cascade updates: rename a source and every citation follows
- Schemas and typed relationships enforced at write time, not implied by filenames
- Row-level permissions and roles for multi-user and client access
- Deterministic, sub-second SQL queries instead of LLM-mediated synthesis
- Referential integrity that prevents broken links instead of merely detecting them

Memex is polishing a turd. A very shiny, well-engineered turd. But still a turd.
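A rough sketch of that relational substrate, using sqlite3 as a stand-in for PostgreSQL; every table and column name here is my own illustration, not Memex’s actual design:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE sources (
        id   INTEGER PRIMARY KEY,
        path TEXT UNIQUE NOT NULL            -- the raw/ layer, still immutable
    );
    CREATE TABLE entities (
        id   INTEGER PRIMARY KEY,
        name TEXT UNIQUE NOT NULL,
        type TEXT NOT NULL                   -- typed, not implied by a filename
            CHECK (type IN ('person', 'company', 'concept'))
    );
    CREATE TABLE claims (
        id        INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL REFERENCES entities(id),
        body      TEXT NOT NULL,
        source_id INTEGER NOT NULL REFERENCES sources(id)  -- provenance as a FK
    );
""")
con.execute("INSERT INTO sources (path) VALUES ('raw/gpt-1.md')")
con.execute("INSERT INTO entities (name, type) VALUES ('OpenAI', 'company')")
con.execute("INSERT INTO claims (entity_id, body, source_id) "
            "VALUES (1, 'Released GPT-1 in 2018', 1)")

# The query the evaluation asks for, now deterministic:
names = [r[0] for r in con.execute(
    "SELECT name FROM entities WHERE type = 'company'")]
print(names)
```

The same concepts — raw immutability, typed entities, per-claim provenance — but enforced by the substrate instead of reconstructed on top of it.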


Recommendation for the Creator (Nicholas Yoo)

You are clearly talented. Your dashboard is clean, your architecture is thoughtful, your provenance system is the best in class for LLM Wiki.

Now build it on PostgreSQL.

Keep the same concepts:

- The provenance model: every claim cites its source
- WHY reports and audit trails
- Contradiction policies (Historical/Disputed/Superseded)
- Revertible ingests
- The dashboard and the Wiki Ratio gauge

But replace the fragile foundation with a real one. You will eliminate:

- Citations that break silently on rename
- The parsing, linting, and integrity machinery that compensates for the missing database
- The single-user ceiling: permissions and roles come with the database
- Probabilistic retrieval for questions a deterministic query can answer

You are so close to something actually good. Don’t stop at “best LLM Wiki.” Build a real Dynamic Knowledge Repository.


Final Verdict

| Aspect | Evaluation |
| --- | --- |
| Technical execution | ✅ Excellent — best LLM Wiki by far |
| Provenance | ✅ First implementation to actually solve it (mostly) |
| Git integration | ✅ Smart — revertible ingests |
| Honest metrics | ✅ Wiki Ratio gauge admits LLM mediation |
| Fails Four Pillars | ❌ Yes — no database, no FKs, no permissions, no SQL |
| Production-ready | ❌ No — single-user, no permissions, markdown substrate |
| Personal knowledge base | ⚠️ Maybe — for a technical user who accepts the limits |
| Business / team / client | ❌ Absolutely not — no permissions, no multi-user |

Overall: 🐑 Not a sheep. A thoughtful engineer building on sand. 🏗️

Memex is the best argument for why LLM Wiki needs to die — because even the best implementation still fails the Four Pillars. The problem is not the engineering. The problem is the substrate.

Move to PostgreSQL. Then we talk. 🐘


A Note to the Reader

If you are considering Memex for personal use: it is the best LLM Wiki option available. The creator has done real work. You will have a better experience than with raw markdown files.

But understand what you are getting:

- A single-user system with no permissions or access control
- Markdown files, not a database: no foreign keys, no referential integrity
- Text-only knowledge: no PDFs, images, spreadsheets, or other types
- LLM-mediated retrieval: probabilistic answers that cost tokens

Use it as a personal research notebook. Do not use it for anything that requires security, collaboration, or deterministic answers.

🐑💀🧙

⚠️ THE WORD “WIKI” HAS BEEN PERVERTED ⚠️
