Memex: Advanced LLM Wiki with Critical Database Limitations


Evaluation: Memex — A Thoughtful Implementation Still Built on Sand

Executive Summary

Memex is the most sophisticated LLM Wiki implementation I have seen. Unlike the simple tutorials and agency hacks, Memex demonstrates genuine architectural awareness: it has git versioning, citation tracking, provenance analysis, contradiction policies, revertible ingests, and a zero-dependency dashboard. The creator clearly understands many of the problems with naive LLM Wiki approaches — and has built thoughtful mitigations.

Yet Memex still fails the Four Pillars of a proper knowledge base. It is a beautifully engineered prototype, not a production knowledge base. The fundamental substrate remains markdown files in a folder, not a database with foreign keys, schemas, permissions, and SQL.

This evaluation is not a dismissal. Memex is the best attempt in this space. But “best attempt” does not equal “fitness for purpose.”

https://gnu.support/images/2026/04/2026-04-23/800/the-tide-is-coming.webp


What Memex Gets Right (Substantially)

| Feature | Why It Matters | Verdict |
|---|---|---|
| Git-backed history | Every ingest is a commit. Revertible. Audit trail exists (at file level). | ✅ Excellent |
| Inline citations [^src-*] | Tries to solve provenance. Each claim links to a source. | ✅ Good |
| Provenance dashboard | Shows per-page citation coverage, missing citations. | ✅ Good |
| Contradiction policies | Historical/Disputed/Superseded — acknowledges that LLMs create contradictions. | ✅ Insightful |
| WHY reports | Every ingest explains its decisions. Transparency. | ✅ Excellent |
| Wiki Ratio gauge | Measures how often Claude reads wiki vs raw files. Self-awareness. | ✅ Clever |
| 4-layer raw immutability | Protects source files from corruption. | ✅ Good |
| Lint + auto-fix | 16-point health check. Attempts to maintain integrity. | ✅ Good |
| Zero-dependency dashboard | Python stdlib only. No npm, no Docker. | ✅ Pragmatic |
| CLI + GUI parity | Works from terminal or dashboard. | ✅ Good |
| Adaptive indexing | Flat → hierarchical → indexed. Scales better than a static index.md. | ✅ Smart |

These are not trivial. The creator has thought deeply about the failure modes of LLM Wiki and built real mitigations. Most LLM Wiki videos show 4 articles and a graph view. Memex shows 600+ sources, revertible ingests, and a provenance dashboard. Respect.


The Four Pillars Evaluation

1. Store with Integrity → ❌ FAIL

| Requirement | Memex Implementation | Status |
|---|---|---|
| Start with raw, immutable source | ✅ Yes — raw/ folder with 4-layer protection | PASS |
| Use a real database (schemas, FKs, indexes) | ❌ No — markdown files + git | FAIL |
| Store any knowledge type | ⚠️ Text/markdown only — no PDFs, images, spreadsheets, emails, code, voice, video | PARTIAL |
| Record files or locations | ✅ Yes — file paths | PASS |
| Preserve integrity / immutability | ✅ Git provides versioning, but no cryptographic verification, no field-level immutability | PASS (partial) |

The Problem: Git tracks files, not fields. It cannot prevent two conflicting edits to the same paragraph. It cannot enforce that a citation link remains valid after a page rename. It is a version control system, not a database with referential integrity.

The Creator Seems Aware: The README mentions raw/ immutability — 4 layers of protection. But wiki/ pages are not immutable. They are edited by Claude on every ingest that touches a related entity. No foreign keys means no automatic updates when a source changes.
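The missing piece named above — cryptographic verification — is cheap to add at file level. A minimal sketch of what ingest-time integrity checking could look like (hypothetical, not part of Memex), using only the Python stdlib the project already favors:

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 content hash; store it alongside the raw source at ingest."""
    return hashlib.sha256(data).hexdigest()

# At ingest time: record the hash of the raw source's bytes.
original = b"GPT-1 was released in June 2018."
recorded = fingerprint(original)

# At read time: any silent edit becomes detectable,
# even if the git history has been rewritten.
assert fingerprint(original) == recorded
assert fingerprint(b"GPT-1 was released in June 2017.") != recorded
```

This detects tampering; actual signatures (for example, signed commits) would additionally prove who wrote what.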


2. Relate with Precision → ❌ FAIL

| Requirement | Memex Implementation | Status |
|---|---|---|
| Typed relationships | ⚠️ Entities, concepts, sources — but types are implicit in filenames/frontmatter, not enforced | PARTIAL |
| Foreign keys | ❌ None — citations are text patterns [^src-*], not database FKs. Rename a source, citations break. | FAIL |
| Bidirectional links | ⚠️ Obsidian graph view shows backlinks, but not enforced at storage level | PARTIAL |
| Explicit structure | ✅ Folders + CLAUDE.md schema | PASS |
| Hyperlinking | ✅ Markdown links + citation badges | PASS |

The Problem: Memex uses text pattern citations [^src-*] instead of foreign keys. This is clever — but it is still text. If you rename raw/gpt-1.md to raw/gpt1.md, every citation to it breaks silently. There is no cascade update. No database means no referential integrity.

The Creator’s Mitigation: Git history allows revert. The provenance dashboard shows missing citations. But detection is not prevention. Links still break.
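For contrast, here is what referential integrity buys. A minimal sqlite3 sketch (an illustrative schema, not Memex's actual data model): citations reference a source's id rather than its path, so a rename breaks nothing, and an orphan citation is rejected at write time instead of rotting silently.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
con.execute("CREATE TABLE sources (id INTEGER PRIMARY KEY, path TEXT UNIQUE)")
con.execute("CREATE TABLE citations (page TEXT, source_id INTEGER REFERENCES sources(id))")

con.execute("INSERT INTO sources VALUES (1, 'raw/gpt-1.md')")
con.execute("INSERT INTO citations VALUES ('wiki/gpt.md', 1)")

# Rename the source: the citation points at the id, not the path, so nothing breaks.
con.execute("UPDATE sources SET path = 'raw/gpt1.md' WHERE id = 1")
cited_path = con.execute(
    "SELECT s.path FROM citations c JOIN sources s ON s.id = c.source_id"
).fetchone()[0]
print(cited_path)  # raw/gpt1.md

# And the database refuses an orphan citation outright — prevention, not detection.
try:
    con.execute("INSERT INTO citations VALUES ('wiki/bert.md', 99)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Detection-after-the-fact (a provenance dashboard) becomes unnecessary when the constraint makes the broken state unrepresentable.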


3. Trust with Provenance → ⚠️ PARTIAL (Best of all LLM Wikis)

| Requirement | Memex Implementation | Status |
|---|---|---|
| Provenance — every fact knows its source | ✅ Inline citations [^src-*] + provenance dashboard | PASS |
| Permissions / access control | ❌ None — anyone with file access sees everything. No user roles. No object-level ACL. | FAIL |
| Audit trails | ✅ Git commits + WHY reports + log.md | PASS |
| Human curation | ✅ Dashboard provides approval/revert. Human in the loop. | PASS |
| Cryptographic verification | ❌ None — no signatures | FAIL |

Milestone: Memex is the first LLM Wiki implementation that actually attempts provenance. The citation system is real. The provenance dashboard shows coverage. This is genuine progress.

But the fatal flaw: No permissions. Memex is single-user by design. The instant you add a second person, or want to share with a client, or host it on a server — zero access control. Every user sees every page. No read/write separation. No role-based access.

The README does not mention this limitation. It presents Memex as a “personal knowledge base” — that is accurate. But “personal” is not “production.”


4. Retrieve with Speed → ❌ FAIL

| Requirement | Memex Implementation | Status |
|---|---|---|
| Search by any dimension | ⚠️ TF-IDF full-text search. No facet search, no metadata filters. | PARTIAL |
| SQL queries | ❌ No — cannot SELECT * FROM entities WHERE type = 'company' | FAIL |
| Queryable structure | ❌ No — queries go through Claude (an LLM), not deterministic | FAIL |
| Access file properties | ⚠️ Git provides dates, but not custom properties | PARTIAL |

The Problem: Memex’s query is not deterministic. When you ask a question, Claude reads relevant wiki pages and synthesizes an answer. That is RAG, just with a different retrieval source. The answer is probabilistic, not guaranteed.

The Wiki Ratio gauge is honest about this: it measures how often Claude reads wiki vs raw files. But it does not change the fundamental fact that retrieval is LLM-mediated, not database-query deterministic.

What a real knowledge base does: SELECT answer FROM facts WHERE question = 'Who is John's sister?' — deterministic, sub-second, free, guaranteed correct if data is correct.

What Memex does: Claude reads files, infers, guesses, maybe correct, maybe hallucinates, costs tokens, takes seconds.
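The difference is easy to demonstrate end to end. A toy sketch with sqlite3 (a hypothetical facts table, purely to illustrate the deterministic path the document describes):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE facts (subject TEXT, relation TEXT, object TEXT)")
con.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("John", "sister", "Mary"),
    ("John", "employer", "Acme"),
])

# Deterministic retrieval: same query, same answer, every time, for free.
answer = con.execute(
    "SELECT object FROM facts WHERE subject = ? AND relation = ?",
    ("John", "sister"),
).fetchone()[0]
print(answer)  # Mary
```

No tokens, no synthesis step, no chance of a hallucinated sibling — if the data is correct, the answer is correct.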


One Comparison Table: Memex vs. Four Pillars

| Pillar | Requirement | Memex | Status |
|---|---|---|---|
| STORE | Real database (FKs, indexes) | Markdown + git | ❌ FAIL |
| STORE | Any knowledge type | Text only | ⚠️ PARTIAL |
| RELATE | Foreign keys | Text citations [^src-*] | ❌ FAIL |
| RELATE | Typed relationships | Implicit (filenames/frontmatter) | ⚠️ PARTIAL |
| TRUST | Provenance | Citations + dashboard | ✅ PASS |
| TRUST | Permissions / ACL | None | ❌ FAIL |
| TRUST | Audit trails | Git + WHY reports | ✅ PASS |
| RETRIEVE | SQL / deterministic | No — LLM-mediated | ❌ FAIL |
| RETRIEVE | Search by any dimension | TF-IDF only | ⚠️ PARTIAL |

Tally: ❌ 4 failures | ⚠️ 3 partial | ✅ 2 passes

Memex passes provenance and audit trails — which is more than any other LLM Wiki. But it still fails on database, foreign keys, permissions, and deterministic retrieval.


What Memex Reveals About LLM Wiki Architecture

| Observation | Implication |
|---|---|
| The creator had to build git versioning, citation parsing, provenance dashboards, linting, and contradiction policies just to make markdown files barely trustworthy | Markdown is the wrong substrate. All this engineering is compensating for missing database features. |
| Permissions are completely absent | Memex is explicitly “personal.” The moment you need multi-user or client access, it collapses. |
| Query is still LLM-mediated | The Wiki Ratio gauge is honest, but the fundamental retrieval is still probabilistic, expensive, and non-deterministic. |
| Citations are text patterns, not foreign keys | [^src-*] is clever, but it breaks on rename. No cascade. No referential integrity. |
| No SQL, no structured queries | Complex questions require Claude to infer, not query. Fragile. |

The Uncomfortable Truth

Memex is impressive. The creator has built more thoughtful infrastructure than 99% of LLM Wiki promoters. The provenance system, git backing, contradiction policies, and dashboard are genuinely good work.

But it is still markdown files in a folder.

Every engineering hour spent building citation parsers, provenance dashboards, and lint checkers is an hour not spent migrating to a real database. PostgreSQL would give you:

- Foreign keys with cascade updates — rename a source and every citation follows
- Schemas and typed relationships enforced by the database, not by filename convention
- Row-level permissions and roles — the access control Memex entirely lacks
- Deterministic SQL queries — precise answers without burning tokens

Memex is polishing a turd. A very shiny, well-engineered turd. But still a turd.


Recommendation for the Creator (Nicholas Yoo)

You are clearly talented. Your dashboard is clean, your architecture is thoughtful, your provenance system is the best in class for LLM Wiki.

Now build it on PostgreSQL.

Keep the same concepts:

- Provenance and inline citations
- WHY reports and contradiction policies
- Revertible ingests and the Wiki Ratio gauge

But replace the fragile foundation with a real one. You will eliminate:

- Citations that break silently on rename — foreign keys cascade instead
- The lint-and-auto-fix repair cycle — constraints prevent rather than detect
- The single-user ceiling — roles and permissions come built in

You are so close to something actually good. Don’t stop at “best LLM Wiki.” Build a real Dynamic Knowledge Repository.


Final Verdict

| Aspect | Evaluation |
|---|---|
| Technical execution | ✅ Excellent — best LLM Wiki by far |
| Provenance | ✅ First implementation to actually solve it (mostly) |
| Git integration | ✅ Smart — revertible ingests |
| Honest metrics | ✅ Wiki Ratio gauge admits LLM mediation |
| Fails Four Pillars | ❌ Yes — no database, no FKs, no permissions, no SQL |
| Production-ready | ❌ No — single-user, no permissions, markdown substrate |
| Personal knowledge base | ⚠️ Maybe — for a technical user who accepts the limits |
| Business / team / client | ❌ Absolutely not — no permissions, no multi-user |

Overall: 🐑 Not a sheep. A thoughtful engineer building on sand. 🏗️

Memex is the best argument for why LLM Wiki needs to die — because even the best implementation still fails the Four Pillars. The problem is not the engineering. The problem is the substrate.

Move to PostgreSQL. Then we talk. 🐘


A Note to the Reader

If you are considering Memex for personal use: it is the best LLM Wiki option available. The creator has done real work. You will have a better experience than with raw markdown files.

But understand what you are getting:

- No permissions — anyone with file access sees everything
- Citations that are text patterns, not foreign keys — renames break them silently
- LLM-mediated retrieval — probabilistic answers that cost tokens
- A markdown substrate, not a database

Use it as a personal research notebook. Do not use it for anything that requires security, collaboration, or deterministic answers.

🐑💀🧙

⚠️ THE WORD “WIKI” HAS BEEN PERVERTED ⚠️

⚠️ ARCHITECTURAL CRIME SCENE ⚠️


By Andrej Karpathy and the Northern Karpathian School of Doublespeak

| ✅ A REAL WIKI — honoring Ward Cunningham, Wikipedia, and every human curator worldwide | ❌ KARPATHY'S "LLM WIKI" — an insult to the very concept |
|---|---|
| Human-curated: real people write, edit, debate, verify, and take responsibility. | LLM-generated: hallucinations are permanent. No human took ownership of any "fact." |
| Versioned history: every edit has author, timestamp, reason. Rollback is trivial. | No audit trail: who changed what? When? Why? Nobody knows. Git is an afterthought. |
| Source provenance: every claim links back to its original source. You can verify. | "Trust me, I'm the LLM": no traceability from summary back to source sentence. Errors become permanent. |
| Foreign keys / referential integrity: links are database-backed. Rename a page, links update automatically. | Links break when you rename a file: no database. No foreign keys. Silent link rot guaranteed. |
| Permissions / access control: fine-grained control over who can see, edit, delete, approve. | Anyone with file access sees everything: zero access control. NDAs, medical records, client secrets — all exposed. |
| Queryable (SQL, structured): ask complex questions. Get precise answers. Join tables. | Browse-only markdown: full-text search at best. No SQL. No structured queries. |

🕯️ This is an insult to every Wikipedia editor, every MediaWiki contributor, every human being who spent hours citing sources, resolving disputes, and building the largest collaborative knowledge repository in human history. 🕯️

KARPATHY'S "WIKI" has:
❌ No consensus-building
❌ No talk pages
❌ No dispute resolution
❌ No citation requirements
❌ No editorial oversight
❌ No way to say "this fact is disputed"
❌ No way to privilege verified information over hallucinations
❌ No way to trace any claim back to its source

In the doublespeak of Northern Karpathia:

"Wiki" means "folder of markdown files written by a machine that cannot remember what it wrote yesterday, linked by strings that snap when you breathe on them, viewed through proprietary software that reports telemetry to people you do not know, containing 'facts' that came from nowhere and go nowhere, protected by no permissions, audited by no one, and trusted by no one with a functioning prefrontal cortex."

🙏 Respect to Ward Cunningham who invented the wiki in 1995 — a tool for humans to collaborate.
🙏 Respect to Wikipedia editors worldwide who defend verifiability, neutrality, and consensus.
🙏 Respect to every real wiki participant who knows that knowledge is built through human effort, not machine hallucination.

⚠️ THIS IS NOT A WIKI. THIS IS A FOLDER OF LLM-GENERATED FILES. ⚠️

Calling it a "wiki" is linguistic fraud. Do not be fooled.

🐑💀🧙

— The Elephant, The Wizard, and every human wiki editor who ever lived
