Full Evaluation: "I Can't Use Karpathy's LLM Wiki" — An Educator's Honest Reality Check

Executive Summary

This video is different from all the others. The creator, James from Trainingsites.io, does something rare: he builds the LLM Wiki, acknowledges it works beautifully for personal use, and then explains exactly why he cannot use it for his actual business — because his customers don’t have access to it.

He then makes a pragmatic choice: RAG for customer-facing queries, LLM Wiki for personal organization. This is the most balanced, honest take in the entire thread.

But the architecture still fails the four pillars (Store with Integrity, Relate with Precision, Trust with Provenance, Retrieve with Speed). James’s honesty about his use case does not fix the structural problems. And he is still building on proprietary software (Obsidian), with everything that entails.

What This Video Gets Right (Credit Where Due)

Aspect	Evaluation
Honesty about limitations	✅ Explicit: “I can’t use it for the biggest part of what I wanted to do”
Customer awareness	✅ Recognizes that internal tools don’t serve external users
Balanced conclusion	✅ “There’s a place for a wiki and a place for a RAG”
No hype	✅ Shows the graph view, acknowledges it’s cool, but doesn’t overclaim
Pragmatic solution	✅ Uses RAG (Pinecone + Supabase) for customers, wiki for himself
Real scale	✅ 607 source folders, 600+ videos — real-world numbers
Clear problem statement	✅ “My clients and customers can’t access this”

James is not a sheep. He is a practitioner who built, tested, and made an informed decision based on his actual use case. Respect.

The Four Pillars Evaluation (Applied to James’s Implementation)

1. Store with Integrity → ❌ FAIL

Requirement	Implementation	Status
Start with raw, immutable source	✅ Yes — raw folders with original files	PASS
Use a real database (schemas, FK, indexes)	❌ No — Obsidian + markdown files	FAIL
Store any knowledge type	⚠️ Partly. The raw folders hold videos, PDFs, Google Docs, spreadsheets, and images, but the wiki itself stores only markdown summaries	PARTIAL
Record files or locations	✅ Yes — file paths referenced	PASS
Preserve integrity / immutability	❌ No — no versioning, no cryptographic verification	FAIL

Verdict: The same fatal flaw as the other LLM Wiki implementations reviewed. No database. No foreign keys.
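
To make the gap concrete, here is a minimal sketch of what "store with integrity" could mean. It is an illustration under stated assumptions, not anything James built: it uses Python's built-in sqlite3 module, and the table names (source, page), the columns, and the helper functions are all hypothetical. Each raw file gets a SHA-256 hash recorded at ingest, and the database refuses a page that points at a source that does not exist.

```python
import hashlib
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with raw sources and the pages derived from them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript("""
        CREATE TABLE source (
            id     INTEGER PRIMARY KEY,
            path   TEXT NOT NULL UNIQUE,
            sha256 TEXT NOT NULL          -- hash of the raw file at ingest time
        );
        CREATE TABLE page (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            source_id INTEGER NOT NULL REFERENCES source(id)
        );
    """)
    return conn


def add_source(conn: sqlite3.Connection, path: str, content: bytes) -> int:
    """Register a raw file and remember the hash of its content."""
    digest = hashlib.sha256(content).hexdigest()
    cur = conn.execute(
        "INSERT INTO source (path, sha256) VALUES (?, ?)", (path, digest)
    )
    return cur.lastrowid


def is_unmodified(conn: sqlite3.Connection, source_id: int, content: bytes) -> bool:
    """True if `content` still matches the hash recorded at ingest."""
    row = conn.execute(
        "SELECT sha256 FROM source WHERE id = ?", (source_id,)
    ).fetchone()
    return row is not None and row[0] == hashlib.sha256(content).hexdigest()
```

In a folder of markdown files, an orphaned summary is just another file. In this sketch, inserting a page with an unknown source_id raises sqlite3.IntegrityError, and a changed raw file no longer matches its recorded hash.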

2. Relate with Precision → ❌ FAIL

Requirement	Implementation	Status
Typed relationships	⚠️ Partial: pages are sorted into categories (concepts, entities, techniques, tools, comparisons, synthesis), but the links between pages carry no semantic relation type such as “supports” or “contradicts”	PARTIAL
Foreign keys	❌ None — markdown links break silently	FAIL
Bidirectional links	⚠️ Obsidian graph view shows backlinks visually, but not enforced at storage level	PARTIAL
Explicit hierarchy	✅ Folders provide hierarchy	PASS
Hyperlinking	✅ Yes — markdown links between pages	PASS

Verdict: Better than most LLM Wiki implementations (has actual typed categories), but still no foreign keys. Rename a file and links die.
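
For contrast, this is roughly what typed, enforced relationships look like. Treat it as a sketch under assumptions: the schema, the function names, and the relation vocabulary ('supports', 'contradicts', 'extends') are hypothetical, and the example uses Python's built-in sqlite3 module. Links point at page ids rather than titles, so a rename cannot break them.

```python
import sqlite3


def open_graph() -> sqlite3.Connection:
    """Create pages plus typed links between them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE page (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL UNIQUE
        );
        CREATE TABLE link (
            from_id  INTEGER NOT NULL REFERENCES page(id),
            to_id    INTEGER NOT NULL REFERENCES page(id),
            relation TEXT NOT NULL
                CHECK (relation IN ('supports', 'contradicts', 'extends')),
            PRIMARY KEY (from_id, to_id, relation)
        );
    """)
    return conn


def add_page(conn: sqlite3.Connection, title: str) -> int:
    return conn.execute("INSERT INTO page (title) VALUES (?)", (title,)).lastrowid


def add_link(conn: sqlite3.Connection, from_id: int, to_id: int, relation: str) -> None:
    conn.execute("INSERT INTO link VALUES (?, ?, ?)", (from_id, to_id, relation))


def backlinks(conn: sqlite3.Connection, title: str) -> list:
    """Return (linking page title, relation) pairs that point at `title`."""
    return conn.execute(
        """SELECT src.title, link.relation
           FROM link
           JOIN page AS src ON src.id = link.from_id
           JOIN page AS dst ON dst.id = link.to_id
           WHERE dst.title = ?
           ORDER BY src.title""",
        (title,),
    ).fetchall()
```

Renaming a page is a single UPDATE on page.title and every link survives. Deleting a page that is still linked, or inserting a link with an unknown relation, raises sqlite3.IntegrityError instead of failing silently.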

3. Trust with Provenance → ❌ FAIL

Requirement	Implementation	Status
Provenance: every fact knows its source	❌ No. The wiki contains LLM-generated summaries, with no traceability back to the original transcript sentence.	FAIL
Permissions / access control	❌ None in Obsidian. James correctly identifies this as the problem — his customers can’t access it. But even for personal use, no access control means anyone with file access sees everything.	FAIL
Audit trails	❌ None	FAIL
Human curation	✅ Yes — Obsidian allows manual editing	PASS
Cryptographic verification	❌ None	FAIL
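
Provenance, in the sense used in this table, means a stored claim can be traced to the exact passage that supports it. A minimal sketch, assuming a hypothetical Claim record that keeps the source path and character offsets next to the summarized text (none of these names come from James's setup):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    text: str         # the summarized statement stored in the wiki
    source_path: str  # which raw file it came from
    start: int        # character offsets of the supporting passage
    end: int


def supporting_passage(claim: Claim, sources: dict) -> Optional[str]:
    """Return the raw passage behind a claim, or None if it cannot be traced.

    `sources` maps a source path to the full raw text of that file.
    """
    raw = sources.get(claim.source_path)
    if raw is None:
        return None  # the source file is unknown
    if not (0 <= claim.start < claim.end <= len(raw)):
        return None  # offsets do not fit the source text
    return raw[claim.start:claim.end]
```

A claim that returns None here is exactly the kind of statement a reviewer should treat as unverified. A wiki of LLM-generated summaries has no equivalent check.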

Critical Issue: James correctly identifies that his customers can’t access the wiki. But he misses the reverse problem: if his laptop is compromised or if he ever collaborates with someone, there are no permissions at all. The wiki has an all-or-nothing access model.
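
The alternative to all-or-nothing is per-page access rules. The following is only a sketch of the idea, with hypothetical roles and a hypothetical Page type. A real system would enforce this in the database or the serving layer, not in application code alone.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    OWNER = "owner"
    COLLABORATOR = "collaborator"
    CUSTOMER = "customer"


@dataclass(frozen=True)
class Page:
    title: str
    allowed: frozenset  # roles permitted to read this page


def readable_titles(pages: list, role: Role) -> list:
    """Return the titles a given role may read, sorted for stable output."""
    return sorted(p.title for p in pages if role in p.allowed)
```

With plain files on disk, every reader is effectively the owner. With a rule like this, a customer-facing lesson and a private pricing note can live in the same store without being visible to the same people.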

4. Retrieve with Speed → ❌ FAIL

Requirement	Implementation	Status
Search by any dimension	⚠️ Obsidian full-text search only — cannot query by entity type, topic category, or custom fields programmatically	PARTIAL
SQL queries	❌ No — markdown files require parsing	FAIL
Queryable structure	❌ No — the wiki is human-readable markdown, not machine-queryable without scripts	FAIL
Access file properties	⚠️ Filesystem provides basic metadata	PARTIAL

Verdict: James admits this indirectly. He built a separate RAG system (Pinecone + Supabase) for his customers because the wiki is not queryable. The wiki is for browsing. RAG is for answering questions. That is a correct assessment.
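
"Search by any dimension" is easy to show once the metadata sits in a table. A minimal sketch, assuming hypothetical kind and topic columns (the kinds mirror the page categories in James's wiki, the topics are invented for illustration) and Python's built-in sqlite3 module:

```python
import sqlite3


def build_index(rows: list) -> sqlite3.Connection:
    """Load (title, kind, topic) rows into an indexed in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            title TEXT PRIMARY KEY,
            kind  TEXT NOT NULL,   -- e.g. concept, entity, technique, tool
            topic TEXT NOT NULL
        );
        CREATE INDEX page_kind_topic ON page (kind, topic);
    """)
    conn.executemany(
        "INSERT INTO page (title, kind, topic) VALUES (?, ?, ?)", rows
    )
    return conn


def titles_by(conn: sqlite3.Connection, kind: str, topic: str) -> list:
    """Return titles matching both dimensions, in alphabetical order."""
    cur = conn.execute(
        "SELECT title FROM page WHERE kind = ? AND topic = ? ORDER BY title",
        (kind, topic),
    )
    return [title for (title,) in cur]
```

The same question asked of a markdown folder means parsing every file first, which is the "not machine-queryable without scripts" problem in the table above.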

James’s Key Insight (Worth Highlighting)

“There’s not a way that I saw that allows the customer to be able to interface with this particular piece. It’s just not there yet. So, you’re going to have to have a RAG on the customer side, but you can have a wiki on the personal side.”

This is the most honest statement in any video about LLM Wiki.

He correctly identifies:

LLM Wiki is a personal organization tool — not customer-facing
LLM Wiki is for browsing — not querying
RAG serves a different purpose (answer questions, serve customers)
Both have their place

This is not hype. This is not “RAG is dead.” This is a working professional making tool choices based on actual requirements.

The Problem James Doesn’t Address (No Fault of His Own)

James is honest about what he built. But he doesn’t evaluate LLM Wiki against the four pillars because that’s not his frame. Here is what he leaves unsaid, and what viewers should hear:

Unaddressed Issue	Why It Matters
Obsidian is proprietary	Closed source, telemetry unknown, vendor lock-in. James is building his “personal wiki” on rented land.
No foreign keys	Links break when files are renamed. At 600+ videos, this is a ticking time bomb.
No provenance	LLM hallucinations become permanent markdown. No way to trace claims back to source.
No audit trail	Who changed what? When? Why? No answers.
No permissions	Fine for solo personal use. Fails immediately for teams or collaboration.
No queryability	James admits this by building RAG for customers. The wiki is browse-only.
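
Without foreign keys, the best a markdown wiki can do is detect broken links after the fact. A small sketch of such a check, assuming Obsidian-style [[wikilinks]] and a hypothetical in-memory mapping of page title to page text (in a real vault the keys would come from the filenames):

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; group 1 is the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def broken_links(pages: dict) -> dict:
    """Map each page title to the sorted link targets that do not exist."""
    titles = set(pages)
    report = {}
    for title, body in pages.items():
        targets = {m.group(1).strip() for m in WIKILINK.finditer(body)}
        missing = sorted(targets - titles)
        if missing:
            report[title] = missing
    return report
```

This only reports the damage. It cannot prevent it, and it has to be rerun after every ingest or rename, which is the maintenance work a database constraint does for free.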

Comparison Table: James’s Implementation vs. Requirements

Requirement	James’s Implementation	Status
STORE: Real database	Obsidian + markdown files	❌ FAIL
STORE: Immutability	Editable files, no versioning	❌ FAIL
RELATE: Typed relationships	Page categories only (concepts, entities, techniques, tools, comparisons, synthesis); the links themselves are untyped	⚠️ PARTIAL
RELATE: Foreign keys	None	❌ FAIL
RELATE: Bidirectional links	Obsidian graph view (visual only, not enforced)	⚠️ PARTIAL
TRUST: Provenance	Lost after LLM summarization	❌ FAIL
TRUST: Permissions	None; customers can’t access it, and there is no access control either	❌ FAIL
TRUST: Audit trails	None	❌ FAIL
TRUST: Open source / auditable	Obsidian is proprietary	❌ FAIL
RETRIEVE: SQL / structured queries	No; James built a separate RAG system for this	❌ FAIL
RETRIEVE: Search by any dimension	Obsidian full-text only	⚠️ PARTIAL
RETRIEVE: Works at scale	Yes; 607 folders, works for personal browsing	✅ PASS
Human in the loop	Yes	✅ PASS
Incremental updates	Yes; “updated on every ingest”	✅ PASS
Customer-facing	Explicitly cannot be used for customers (RAG instead)	N/A
Proprietary dependency	Obsidian	❌ FAIL

Tally: ❌ 9 failures | ⚠️ 3 partial | ✅ 3 passes | 1 not applicable

James vs. Other Creators

Aspect	Adam (Agency)	Local Tutorial Guy	James (Trainingsites)
Claims production readiness	❌ Yes (falsely)	✅ No (honest prototype)	✅ No (knows its limits)
Acknowledges limitations	❌ Minimal	✅ Explicit about statelessness	✅ Explicit about customer access
Customer awareness	❌ None	❌ None	✅ Central to his argument
RAG vs. Wiki stance	“RAG is overkill”	Not discussed	“Both have their place”
Uses Obsidian	❌ No (folders only)	✅ Yes	✅ Yes
Mentions Obsidian is proprietary	❌ N/A	❌ No	❌ No
Overall honesty	❌ Misleading	✅ Refreshing	✅ Most balanced

Verdict: James is the most honest and balanced of all creators reviewed. He is not a sheep. He is a practitioner making pragmatic choices.

⚠️ OBSIDIAN WARNING (Same as Previous Evaluation) ⚠️

James uses Obsidian. He does not mention it is proprietary. Viewers deserve to know:

Warning	Explanation
Proprietary	Obsidian is closed source. You cannot audit it. You cannot modify it. You do not own it.
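Telemetry	Obsidian’s privacy policy says the app collects no telemetry, but with closed source you cannot verify that yourself. For personal use? Maybe fine. For client NDAs? An unverifiable promise is unacceptable.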
No permissions	Obsidian has zero access control. Anyone with file system access sees everything.
Vendor lock-in	Your workflow depends on Obsidian-specific features (graph view, backlinks pane, Dataview). Migrating away is painful.
Sync is paid	Obsidian Sync costs money and sends data through their servers.
Plugin risk	Community plugins run arbitrary, unvetted code. A malicious plugin can exfiltrate your entire wiki.

If you are building a personal wiki for yourself: Obsidian is workable, though proprietary.

If you are building anything for clients, collaboration, or sensitive information: Do not use Obsidian. Use open source (Logseq, Joplin, Trilium) or a real database with proper access control.

What James Gets Right That Others Miss

Customer use and personal use are different. Most creators assume everyone is like them. James knows his customers have different needs.
Browsing ≠ querying. His wiki is for browsing. His RAG is for querying. He does not claim the wiki replaces RAG.
Pragmatism over ideology. He built both. He uses both. He recommends both depending on use case.
Real scale. 607 source folders. 600+ videos. He is not demoing with 4 articles. He is showing real-world usage.
Honest about what he doesn’t know. “I haven’t figured out a way” — rare humility in this space.

What James Misses (For Viewers to Consider)

Missed Issue	Why It Matters
Obsidian is proprietary	James presents Obsidian as “free software” (it is free as in beer, not free as in freedom). He does not mention closed source or telemetry.
No provenance	His wiki contains LLM-generated summaries. If the LLM hallucinates, that hallucination is now permanent. No way to trace back.
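No foreign keys	At 600+ videos, renaming a folder or file can break links. Obsidian can update links when the rename happens inside the app, but renames made by an LLM agent or directly on the filesystem bypass that and break links silently.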
No audit trail	If he edits a summary manually, there is no record of what changed or why.

These are not attacks on James. They are structural limitations of LLM Wiki + Obsidian that he does not address because his video is about use case, not architecture.
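
As one illustration of what closing these gaps involves, here is a sketch of an audit trail for edited summaries. It assumes Python's built-in sqlite3 module and hypothetical table names. It records what changed and when. Recording who and why would need extra columns filled in by the application.

```python
import sqlite3


def open_audited() -> sqlite3.Connection:
    """Create a summary table whose edits are logged by a trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE summary (
            id   INTEGER PRIMARY KEY,
            body TEXT NOT NULL
        );
        CREATE TABLE audit (
            summary_id INTEGER NOT NULL,
            old_body   TEXT NOT NULL,
            new_body   TEXT NOT NULL,
            changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- Every change to summary.body leaves a row behind.
        CREATE TRIGGER log_summary_edit AFTER UPDATE OF body ON summary
        BEGIN
            INSERT INTO audit (summary_id, old_body, new_body)
            VALUES (OLD.id, OLD.body, NEW.body);
        END;
    """)
    return conn


def history(conn: sqlite3.Connection, summary_id: int) -> list:
    """Return (old_body, new_body) pairs for one summary, oldest first."""
    return conn.execute(
        "SELECT old_body, new_body FROM audit WHERE summary_id = ? ORDER BY rowid",
        (summary_id,),
    ).fetchall()
```

Editing a markdown file in place leaves nothing comparable behind unless the vault is also under version control.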

Final Verdict

Pillar	Result
Store with Integrity	❌ FAIL
Relate with Precision	❌ FAIL
Trust with Provenance	❌ FAIL
Retrieve with Speed	❌ FAIL

Overall: ❌ Fails all four pillars — same as every other LLM Wiki implementation.

But…

James gets something more important than architecture: use case fit.

He is not claiming the wiki is a database. He is not claiming it replaces RAG. He is not selling a course on “AI-first business systems.” He is showing what he built, why it works for him personally, why it doesn’t work for his customers, and what he uses instead.

That is intellectual honesty. It is rare. It deserves recognition.

Recommendation for Viewers

If you watch this video:

Learn from James’s pragmatism. He built, tested, and made an informed choice. Do the same.
Understand the trade-offs. James’s wiki works for him at 600+ videos. But it still has no foreign keys, no provenance, no permissions, and no SQL. Those may or may not matter for your use case.
Do not assume Obsidian is “free” in the freedom sense. It is proprietary. If that matters to you, use open source alternatives.
Ask yourself: Who is the user? If it’s just you, LLM Wiki + Obsidian might be fine. If it’s customers, clients, or collaborators, you need RAG or a real database with permissions.
James’s conclusion is correct: “There’s a place for a wiki and a place for a RAG.” Use the right tool for the job.

Closing Thought

James is not a sheep. He is not a hype merchant. He is an educator who built something, found its limits, and told his audience honestly.

That said, the architecture he built still violates the four pillars. The difference is that James knows its limits and works around them (by using RAG for customers). Most LLM Wiki promoters don’t even acknowledge the limits.

Respect for James. Contempt for the architecture. 🐑💀🧙

The actual video

One More Thing

James, if you read this: You are doing good work. Honest educators are rare. But please consider adding a disclaimer that Obsidian is proprietary and that your wiki has no provenance tracking. Your viewers deserve to know. And consider trying an open source alternative: Logseq works on the same markdown files and has a graph view, while Trilium keeps notes in its own database rather than as markdown files. Either way, you actually own the software.

Respectfully, Someone who evaluates architectures, not people.
