James' LLM Wiki Fails Robust Knowledge Management Due to Lack of Database Integrity


Full Evaluation: “I Can’t Use Karpathy’s LLM Wiki” — An Educator’s Honest Reality Check

Executive Summary

This video is different from all the others. The creator, James from Trainingsites.io, does something rare: he builds the LLM Wiki, acknowledges it works beautifully for personal use, and then explains exactly why he cannot use it for his actual business — because his customers don’t have access to it.

He then makes a pragmatic choice: RAG for customer-facing queries, LLM Wiki for personal organization. This is the most balanced, honest take in the entire thread.

But the architecture still fails the four pillars: store with integrity, relate with precision, trust with provenance, and retrieve with speed. James’s honesty about his use case does not fix the structural problems. And he is still building on proprietary software (Obsidian) with everything that entails.

https://gnu.support/images/2026/04/2026-04-23/800/llm-wiki-obsidian-proprietary.webp


What This Video Gets Right (Credit Where Due)

| Aspect | Evaluation |
|---|---|
| Honesty about limitations | ✅ Explicit: “I can’t use it for the biggest part of what I wanted to do” |
| Customer awareness | ✅ Recognizes that internal tools don’t serve external users |
| Balanced conclusion | ✅ “There’s a place for a wiki and a place for a RAG” |
| No hype | ✅ Shows the graph view, acknowledges it’s cool, but doesn’t overclaim |
| Pragmatic solution | ✅ Uses RAG (Pinecone + Supabase) for customers, wiki for himself |
| Real scale | ✅ 607 source folders, 600+ videos: real-world numbers |
| Clear problem statement | ✅ “My clients and customers can’t access this” |

James is not a sheep. He is a practitioner who built, tested, and made an informed decision based on his actual use case. Respect.


The Four Pillars Evaluation (Applied to James’s Implementation)

1. Store with Integrity → ❌ FAIL

| Requirement | Implementation | Status |
|---|---|---|
| Start with raw, immutable source | ✅ Yes: raw folders with original files | PASS |
| Use a real database (schemas, FK, indexes) | ❌ No: Obsidian + markdown files | FAIL |
| Store any knowledge type | ⚠️ Sources include videos, PDFs, Google Docs, spreadsheets, and images, but the wiki itself stores only markdown summaries | PARTIAL |
| Record files or locations | ✅ Yes: file paths referenced | PASS |
| Preserve integrity / immutability | ❌ No: no versioning, no cryptographic verification | FAIL |

Verdict: Same fatal flaw. No database. No foreign keys.
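To make "a real database" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not James's setup and not a recommended production schema. The table names, column names, and the sample path are all assumptions chosen for illustration. It shows the two things the markdown folder cannot do: record a fingerprint of each raw source at ingest, and refuse a page that points at a source that does not exist.

```python
import hashlib
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE sources (
    id     INTEGER PRIMARY KEY,
    path   TEXT NOT NULL UNIQUE,  -- where the raw file lives
    sha256 TEXT NOT NULL          -- fingerprint recorded at ingest
);
CREATE TABLE pages (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL UNIQUE,
    body      TEXT NOT NULL,
    source_id INTEGER NOT NULL REFERENCES sources(id)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_source(conn: sqlite3.Connection, path: str, content: bytes) -> int:
    """Register a raw file together with the SHA-256 of its content."""
    digest = hashlib.sha256(content).hexdigest()
    cur = conn.execute(
        "INSERT INTO sources (path, sha256) VALUES (?, ?)", (path, digest)
    )
    return cur.lastrowid


def add_page(conn: sqlite3.Connection, title: str, body: str, source_id: int) -> int:
    """Insert a wiki page; raises sqlite3.IntegrityError if source_id is unknown."""
    cur = conn.execute(
        "INSERT INTO pages (title, body, source_id) VALUES (?, ?, ?)",
        (title, body, source_id),
    )
    return cur.lastrowid
```

Calling `add_page` with a `source_id` that is not in `sources` raises `sqlite3.IntegrityError` instead of silently creating an orphan, which is the behavior a folder of markdown files cannot give you.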


2. Relate with Precision → ❌ FAIL

| Requirement | Implementation | Status |
|---|---|---|
| Typed relationships | ⚠️ Partial: pages have types (concepts, entities, techniques, tools, comparisons, synthesis), but these are generic categories, not semantic relations like “supports/contradicts” | PARTIAL |
| Foreign keys | ❌ None: markdown links break silently | FAIL |
| Bidirectional links | ⚠️ Obsidian graph view shows backlinks visually, but they are not enforced at the storage level | PARTIAL |
| Explicit hierarchy | ✅ Folders provide hierarchy | PASS |
| Hyperlinking | ✅ Yes: markdown links between pages | PASS |

Verdict: Better than most LLM Wiki implementations (has actual typed categories), but still no foreign keys. Rename a file and links die.
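For contrast, here is a rough sketch of what typed, enforced relationships could look like, again with Python's `sqlite3`. The schema, the page titles, and the `extends` relation are assumptions for illustration; only "supports" and "contradicts" come from the evaluation above. Links point at page ids rather than file names, so a rename cannot break them, and the same table answers both "what does this page link to" and "what links here".

```python
import sqlite3

# Hypothetical schema: links reference page ids, and the relation is constrained.
SCHEMA = """
CREATE TABLE pages (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL UNIQUE
);
CREATE TABLE links (
    from_id  INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    to_id    INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    relation TEXT NOT NULL
             CHECK (relation IN ('supports', 'contradicts', 'extends')),
    PRIMARY KEY (from_id, to_id, relation)
);
"""


def build_demo() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO pages (id, title) VALUES (?, ?)",
        [(1, "RAG basics"), (2, "Wiki basics")],  # made-up sample pages
    )
    conn.execute(
        "INSERT INTO links (from_id, to_id, relation) VALUES (1, 2, 'contradicts')"
    )
    return conn


def outgoing(conn: sqlite3.Connection, title: str) -> list:
    """Return (relation, target title) pairs for links leaving the page."""
    rows = conn.execute(
        """SELECT l.relation, dst.title
           FROM links l
           JOIN pages src ON src.id = l.from_id
           JOIN pages dst ON dst.id = l.to_id
           WHERE src.title = ?
           ORDER BY dst.title""",
        (title,),
    )
    return rows.fetchall()


def backlinks(conn: sqlite3.Connection, title: str) -> list:
    """Return (relation, source title) pairs for links arriving at the page."""
    rows = conn.execute(
        """SELECT l.relation, src.title
           FROM links l
           JOIN pages src ON src.id = l.from_id
           JOIN pages dst ON dst.id = l.to_id
           WHERE dst.title = ?
           ORDER BY src.title""",
        (title,),
    )
    return rows.fetchall()
```

Renaming page 2 with `UPDATE pages SET title = ...` leaves every link intact, and a link to a missing page or with an unlisted relation is rejected at insert time.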


3. Trust with Provenance → ❌ FAIL

| Requirement | Implementation | Status |
|---|---|---|
| Provenance (every fact knows its source) | ❌ No: the wiki contains LLM-generated summaries, with no traceability back to the original transcript sentence | FAIL |
| Permissions / access control | ❌ None in Obsidian. James correctly identifies this as the problem: his customers can’t access it. But even for personal use, no access control means anyone with file access sees everything. | FAIL |
| Audit trails | ❌ None | FAIL |
| Human curation | ✅ Yes: Obsidian allows manual editing | PASS |
| Cryptographic verification | ❌ None | FAIL |

Critical Issue: James correctly identifies that his customers can’t access the wiki. But he misses the reverse problem: if his laptop is compromised or if he ever collaborates with someone, there are no permissions at all. The wiki has an all-or-nothing access model: whoever can read the folder can read and change every page.
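As for the provenance and cryptographic verification rows in the table above, the sketch below shows one way a summary sentence could stay tied to the exact transcript passage that supports it. Everything here is an assumption for illustration: the `Claim` fields, the function names, and the sample path are hypothetical, and a real system would also need to handle paraphrase and multiple supporting passages.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    statement: str      # the summary sentence stored in the wiki
    source_path: str    # which raw file it came from
    source_sha256: str  # fingerprint of that file when the claim was recorded
    start: int          # character offsets of the supporting quote
    end: int


def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_claim(statement: str, source_path: str, source_text: str, quote: str) -> Claim:
    """Record a claim only if its supporting quote really occurs in the source."""
    start = source_text.find(quote)
    if start < 0:
        raise ValueError("supporting quote not found in source")
    return Claim(
        statement, source_path, fingerprint(source_text), start, start + len(quote)
    )


def trace(claim: Claim, current_text: str) -> str:
    """Return the supporting quote, refusing if the source has changed since."""
    if fingerprint(current_text) != claim.source_sha256:
        raise ValueError("source text changed since the claim was recorded")
    return current_text[claim.start:claim.end]
```

With this in place, a hallucinated sentence has nowhere to hide: either `make_claim` finds a quote in the transcript or it raises, and `trace` fails loudly if the underlying file was edited after ingest.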


4. Retrieve with Speed → ❌ FAIL

| Requirement | Implementation | Status |
|---|---|---|
| Search by any dimension | ⚠️ Obsidian full-text search only: cannot query by entity type, topic category, or custom fields programmatically | PARTIAL |
| SQL queries | ❌ No: markdown files require parsing | FAIL |
| Queryable structure | ❌ No: the wiki is human-readable markdown, not machine-queryable without scripts | FAIL |
| Access file properties | ⚠️ Filesystem provides basic metadata | PARTIAL |

Verdict: James admits this indirectly. He built a separate RAG system (Pinecone + Supabase) for his customers because the wiki is not queryable. The wiki is for browsing. RAG is for answering questions. That is a correct assessment.
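A small sketch of what "search by any dimension" means in practice, assuming the page types were stored as columns rather than folder names. The `kind` and `topic` columns and the sample rows are made up for illustration (Pinecone and Supabase are the tools James mentions, but their classification here is an assumption).

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pages (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            kind  TEXT NOT NULL,   -- e.g. concept, entity, technique, tool
            topic TEXT NOT NULL
        );
        CREATE INDEX idx_pages_kind_topic ON pages (kind, topic);
        """
    )
    conn.executemany(
        "INSERT INTO pages (title, kind, topic) VALUES (?, ?, ?)",
        [  # illustrative rows only
            ("Pinecone", "tool", "retrieval"),
            ("Supabase", "tool", "storage"),
            ("Chunking", "technique", "retrieval"),
            ("Embeddings", "concept", "retrieval"),
        ],
    )
    return conn


def titles_by(conn: sqlite3.Connection, kind: str, topic: str) -> list:
    """Filter on two structured dimensions at once, served by the index."""
    rows = conn.execute(
        "SELECT title FROM pages WHERE kind = ? AND topic = ? ORDER BY title",
        (kind, topic),
    )
    return [row[0] for row in rows]


def count_by_kind(conn: sqlite3.Connection) -> dict:
    """Aggregate pages per kind, something full-text search cannot express."""
    return dict(conn.execute("SELECT kind, COUNT(*) FROM pages GROUP BY kind"))
```

Questions like "every tool page about retrieval" or "how many pages of each kind" become one-line queries here; against a markdown folder they need a custom parsing script each time.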


James’s Key Insight (Worth Highlighting)

“There’s not a way that I saw that allows the customer to be able to interface with this particular piece. It’s just not there yet. So, you’re going to have to have a RAG on the customer side, but you can have a wiki on the personal side.”

This is the most honest statement in any video about LLM Wiki.

He correctly identifies:

  1. LLM Wiki is a personal organization tool — not customer-facing
  2. LLM Wiki is for browsing — not querying
  3. RAG serves a different purpose (answer questions, serve customers)
  4. Both have their place

This is not hype. This is not “RAG is dead.” This is a working professional making tool choices based on actual requirements.


The Problem James Doesn’t Address (No Fault of His Own)

James is honest about what he built. But he doesn’t evaluate LLM Wiki against the four pillars because that’s not his frame. Here is what he leaves unsaid, and what viewers should hear:

| Unaddressed Issue | Why It Matters |
|---|---|
| Obsidian is proprietary | Closed source, telemetry unknown, vendor lock-in. James is building his “personal wiki” on rented land. |
| No foreign keys | Links break when files rename. At 600+ videos, this is a ticking time bomb. |
| No provenance | LLM hallucinations become permanent markdown. No way to trace claims back to source. |
| No audit trail | Who changed what? When? Why? No answers. |
| No permissions | Fine for solo personal use. Fails immediately for teams or collaboration. |
| No queryability | James admits this by building RAG for customers. The wiki is browse-only. |
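The "no foreign keys" row can at least be detected after the fact. Below is a minimal sketch of a link checker, assuming the vault uses Obsidian-style `[[wikilinks]]` and that a link target is a note name without the `.md` extension. The file names in `demo` are invented, and this is a detector, not a repair tool.

```python
import re
import tempfile
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def broken_links(vault: Path) -> list[tuple[str, str]]:
    """Return (file name, link target) pairs whose target note does not exist."""
    names = {p.stem for p in vault.rglob("*.md")}
    broken = []
    for page in sorted(vault.rglob("*.md")):
        for target in WIKILINK.findall(page.read_text(encoding="utf-8")):
            # Treat "folder/Note" as a link to the note named "Note".
            note = target.strip().split("/")[-1]
            if note not in names:
                broken.append((page.name, note))
    return broken


def demo() -> list[tuple[str, str]]:
    """Build a two-note vault, rename one note on disk, and report the damage."""
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        (vault / "RAG.md").write_text("See [[Wiki]] for the personal side.", encoding="utf-8")
        (vault / "Wiki.md").write_text("Compare with [[RAG|retrieval]].", encoding="utf-8")
        assert broken_links(vault) == []
        # A rename done by a script or file manager, not by a link-aware editor.
        (vault / "Wiki.md").rename(vault / "LLM Wiki.md")
        return broken_links(vault)
```

Nothing in the folder complains when the rename happens; the dangling link only shows up if someone runs a check like this. A foreign key would have rejected the inconsistency at write time.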

Comparison Table: James’s Implementation vs. Requirements

| Requirement | James’s Implementation | Status |
|---|---|---|
| STORE: Real database | Obsidian + markdown files | ❌ FAIL |
| STORE: Immutability | Editable files, no versioning | ❌ FAIL |
| RELATE: Typed relationships | Generic page types only (concepts, entities, techniques, tools, comparisons, synthesis) | ⚠️ PARTIAL |
| RELATE: Foreign keys | None | ❌ FAIL |
| RELATE: Bidirectional links | Obsidian graph view (visual only, not enforced) | ⚠️ PARTIAL |
| TRUST: Provenance | Lost after LLM summarization | ❌ FAIL |
| TRUST: Permissions | None: customers can’t access it, and there is no access control for anyone else either | ❌ FAIL |
| TRUST: Audit trails | None | ❌ FAIL |
| TRUST: Open source / auditable | Obsidian is proprietary | ❌ FAIL |
| RETRIEVE: SQL / structured queries | No: James built a separate RAG for this | ❌ FAIL |
| RETRIEVE: Search by any dimension | Obsidian full-text only | ⚠️ PARTIAL |
| RETRIEVE: Works at scale | Yes: 607 folders, works for personal browsing | ✅ PASS |
| Human in the loop | Yes | ✅ PASS |
| Incremental updates | Yes: “updated on every ingest” | ✅ PASS |
| Customer-facing | Explicitly cannot be used for customers (RAG instead) | N/A |
| Proprietary dependency | Obsidian | ❌ FAIL |

Tally: ❌ 9 failures | ⚠️ 3 partial | ✅ 3 passes | 1 not applicable


James vs. Other Creators

| Aspect | Adam (Agency) | Local Tutorial Guy | James (Trainingsites) |
|---|---|---|---|
| Claims production readiness | ❌ Yes (falsely) | ✅ No (honest prototype) | ✅ No (knows its limits) |
| Acknowledges limitations | ❌ Minimal | ✅ Explicit about stateless | ✅ Explicit about customer access |
| Customer awareness | ❌ None | ❌ None | ✅ Central to his argument |
| RAG vs. Wiki stance | “RAG is overkill” | Not discussed | “Both have their place” |
| Uses Obsidian | ❌ No (folders only) | ✅ Yes | ✅ Yes |
| Mentions Obsidian is proprietary | N/A | ❌ No | ❌ No |
| Overall honesty | ❌ Misleading | ✅ Refreshing | ✅ Most balanced |

Verdict: James is the most honest and balanced of all creators reviewed. He is not a sheep. He is a practitioner making pragmatic choices.


⚠️ OBSIDIAN WARNING (Same as Previous Evaluation) ⚠️

James uses Obsidian. He does not mention it is proprietary. Viewers deserve to know:

| Warning | Explanation |
|---|---|
| Proprietary | Obsidian is closed source. You cannot audit it. You cannot modify it. You do not own it. |
| Telemetry | Because the source is closed, what the app sends over the network cannot be independently verified. For personal use? Maybe fine. For client NDAs? Unacceptable. |
| No permissions | Obsidian has zero access control. Anyone with file system access sees everything. |
| Vendor lock-in | Your workflow depends on Obsidian-specific features (graph view, backlinks pane, Dataview). Migrating away is painful. |
| Sync is paid | Obsidian Sync costs money and sends data through their servers. |
| Plugin risk | Community plugins run arbitrary, unvetted code. A malicious plugin can exfiltrate your entire wiki. |

If you are building a personal wiki for yourself: Obsidian is workable, though proprietary.

If you are building anything for clients, collaboration, or sensitive information: Do not use Obsidian. Use open source (Logseq, Joplin, Trilium) or a real database with proper access control.


What James Gets Right That Others Miss

  1. Customer vs. personal use are different. Most creators assume everyone is like them. James knows his customers have different needs.

  2. Browsing ≠ querying. His wiki is for browsing. His RAG is for querying. He does not claim the wiki replaces RAG.

  3. Pragmatism over ideology. He built both. He uses both. He recommends both depending on use case.

  4. Real scale. 607 source folders. 600+ videos. He is not demoing with 4 articles. He is showing real-world usage.

  5. Honest about what he doesn’t know. “I haven’t figured out a way” — rare humility in this space.


What James Misses (For Viewers to Consider)

| Missed Issue | Why It Matters |
|---|---|
| Obsidian is proprietary | James presents Obsidian as “free software” (it is free as in beer, not free as in freedom). He does not mention closed source or telemetry. |
| No provenance | His wiki contains LLM-generated summaries. If the LLM hallucinates, that hallucination is now permanent. No way to trace back. |
| No foreign keys | At 600+ videos, renaming a folder or file outside Obsidian (by a script, an LLM agent, or a file manager) will break links. Obsidian can rewrite links only for renames made inside the app; nothing enforces them at the storage level. |
| No audit trail | If he edits a summary manually, there is no record of what changed or why. |

These are not attacks on James. They are structural limitations of LLM Wiki + Obsidian that he does not address because his video is about use case, not architecture.
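To show what the missing audit trail would look like, here is a minimal sketch in which every edit to a page body is written together with who made it and why, inside a single transaction. The schema, the `edit_page` helper, and the editor and reason values are assumptions for illustration, not a description of any existing tool.

```python
import sqlite3

# Hypothetical schema: one row in page_audit per change to a page body.
SCHEMA = """
CREATE TABLE pages (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    body  TEXT NOT NULL
);
CREATE TABLE page_audit (
    id         INTEGER PRIMARY KEY,
    page_id    INTEGER NOT NULL REFERENCES pages(id),
    editor     TEXT NOT NULL,
    reason     TEXT NOT NULL,
    old_body   TEXT NOT NULL,
    new_body   TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT (datetime('now'))
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def edit_page(conn: sqlite3.Connection, page_id: int, new_body: str,
              editor: str, reason: str) -> None:
    """Update a page and log the change; both succeed or both roll back."""
    with conn:  # commits on success, rolls back if anything below raises
        row = conn.execute(
            "SELECT body FROM pages WHERE id = ?", (page_id,)
        ).fetchone()
        if row is None:
            raise KeyError(page_id)
        conn.execute("UPDATE pages SET body = ? WHERE id = ?", (new_body, page_id))
        conn.execute(
            "INSERT INTO page_audit (page_id, editor, reason, old_body, new_body) "
            "VALUES (?, ?, ?, ?, ?)",
            (page_id, editor, reason, row[0], new_body),
        )


def history(conn: sqlite3.Connection, page_id: int) -> list:
    """Return (editor, reason, old_body, new_body) rows, oldest first."""
    rows = conn.execute(
        "SELECT editor, reason, old_body, new_body FROM page_audit "
        "WHERE page_id = ? ORDER BY id",
        (page_id,),
    )
    return rows.fetchall()
```

Because the old body is kept in each audit row, rolling a page back is a matter of writing `old_body` back through `edit_page`, which itself leaves a record.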


Final Verdict

| Pillar | Result |
|---|---|
| Store with Integrity | ❌ FAIL |
| Relate with Precision | ❌ FAIL |
| Trust with Provenance | ❌ FAIL |
| Retrieve with Speed | ❌ FAIL |

Overall: Fails all four pillars, the same as every other LLM Wiki implementation.


But…

James gets something more important than architecture: use case fit.

He is not claiming the wiki is a database. He is not claiming it replaces RAG. He is not selling a course on “AI-first business systems.” He is showing what he built, why it works for him personally, why it doesn’t work for his customers, and what he uses instead.

That is intellectual honesty. It is rare. It deserves recognition.


Recommendation for Viewers

If you watch this video:

  1. Learn from James’s pragmatism. He built, tested, and made an informed choice. Do the same.

  2. Understand the trade-offs. James’s wiki works for him at 600+ videos. But it still has no foreign keys, no provenance, no permissions, and no SQL. Those may or may not matter for your use case.

  3. Do not assume Obsidian is “free” in the freedom sense. It is proprietary. If that matters to you, use open source alternatives.

  4. Ask yourself: Who is the user? If it’s just you, LLM Wiki + Obsidian might be fine. If it’s customers, clients, or collaborators, you need RAG or a real database with permissions.

  5. James’s conclusion is correct: “There’s a place for a wiki and a place for a RAG.” Use the right tool for the job.


Closing Thought

James is not a sheep. He is not a hype merchant. He is an educator who built something, found its limits, and told his audience honestly.

That said, the architecture he built still violates the four pillars. The difference is that James knows its limits and works around them (by using RAG for customers). Most LLM Wiki promoters don’t even acknowledge the limits.

Respect for James. Contempt for the architecture. 🐑💀🧙


One More Thing

James, if you read this: You are doing good work. Honest educators are rare. But please consider adding a disclaimer that Obsidian is proprietary and that your wiki has no provenance tracking. Your viewers deserve to know. And consider trying an open source alternative such as Logseq (plain markdown files and a graph view) or Trilium (notes kept in a local database), where the code is open to audit and modification.

Respectfully, Someone who evaluates architectures, not people.

⚠️ THE WORD “WIKI” HAS BEEN PERVERTED ⚠️

⚠️ ARCHITECTURAL CRIME SCENE ⚠️

By Andrej Karpathy and the Northern Karpathian School of Doublespeak

| ✅ A REAL WIKI (honoring Ward Cunningham, Wikipedia, and every human curator worldwide) | ❌ KARPATHY'S "LLM WIKI" (an insult to the very concept) |
|---|---|
| Human-curated: Real people write, edit, debate, verify, and take responsibility. | LLM-generated: Hallucinations are permanent. No human took ownership of any "fact." |
| Versioned history: Every edit has author, timestamp, reason. Rollback is trivial. | No audit trail: Who changed what? When? Why? Nobody knows. Git is an afterthought. |
| Source provenance: Every claim links back to its original source. You can verify. | "Trust me, I'm the LLM": No traceability from summary back to source sentence. Errors become permanent. |
| Foreign keys / referential integrity: Links are database-backed. Rename a page, links update automatically. | Links break when you rename a file: No database. No foreign keys. Silent link rot guaranteed. |
| Permissions / access control: Fine-grained control over who can see, edit, delete, approve. | Anyone with file access sees everything: Zero access control. NDAs, medical records, client secrets, all exposed. |
| Queryable (SQL, structured): Ask complex questions. Get precise answers. Join tables. | Browse-only markdown: Full-text search at best. No SQL. No structured queries. |

🕯️ This is an insult to every Wikipedia editor, every MediaWiki contributor, every human being who spent hours citing sources, resolving disputes, and building the largest collaborative knowledge repository in human history. 🕯️

KARPATHY'S "WIKI" has:
❌ No consensus-building
❌ No talk pages
❌ No dispute resolution
❌ No citation requirements
❌ No editorial oversight
❌ No way to say "this fact is disputed"
❌ No way to privilege verified information over hallucinations
❌ No way to trace any claim back to its source

In the doublespeak of Northern Karpathia:

"Wiki" means "folder of markdown files written by a machine that cannot remember what it wrote yesterday, linked by strings that snap when you breathe on them, viewed through proprietary software that reports telemetry to people you do not know, containing 'facts' that came from nowhere and go nowhere, protected by no permissions, audited by no one, and trusted by no one with a functioning prefrontal cortex."

🙏 Respect to Ward Cunningham who invented the wiki in 1995 — a tool for humans to collaborate.
🙏 Respect to Wikipedia editors worldwide who defend verifiability, neutrality, and consensus.
🙏 Respect to every real wiki participant who knows that knowledge is built through human effort, not machine hallucination.

⚠️ THIS IS NOT A WIKI. THIS IS A FOLDER OF LLM-GENERATED FILES. ⚠️

Calling it a "wiki" is linguistic fraud. Do not be fooled.

🐑💀🧙

— The Elephant, The Wizard, and every human wiki editor who ever lived
