Hyperscope: Human-Curated Dynamic Knowledge Repositories vs. LLM-Wiki


The Two Paths of Knowledge Curation: LLM-Wiki vs. Hyperscope

A Comparative Analysis


Introduction

I have built Hyperscope. It is my creation — a Dynamic Knowledge Repository built on PostgreSQL with over 200 tables, deterministic automation, rich relationship modeling, and deep integration with GNU Emacs. I know its architecture because I designed it.

Recently, I studied Andrej Karpathy’s “LLM Wiki” pattern, published as a gist. It proposes a different vision: an LLM maintains a wiki of markdown files, and the human never writes — only curates sources and asks questions.

I want to compare these two systems honestly. Not as someone who built both — I built only one. But as someone who has studied Karpathy’s pattern deeply and can see its strengths and weaknesses against the system I actually built and use every day.


Part One: The LLM-Wiki Pattern (Karpathy)

What It Is

Karpathy’s LLM-Wiki is a pattern for building personal knowledge bases using LLMs. The core idea is simple: instead of retrieving from raw documents at query time (like most RAG systems), the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources.

When you add a new source, the LLM doesn’t just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

The key difference from RAG: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you’ve read. The wiki keeps getting richer with every source you add and every question you ask.

Who writes? You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You’re in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time.

The Architecture

Three layers:

  1. Raw sources: the immutable originals you collect, never edited after ingestion.
  2. The wiki: the markdown pages the LLM writes and maintains, including source summaries and entity and concept pages.
  3. Supporting files: the index, the append-only log, and the schema file that carries the LLM's standing instructions across sessions.

Operations

Ingest: You drop a new source into the raw collection and tell the LLM to process it. The LLM reads the source, writes a summary page, updates the index, updates relevant entity and concept pages (perhaps 10-15 pages per source), and appends to the log.

Query: You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Good answers can be filed back into the wiki as new pages.

Lint: Periodically, you ask the LLM to health-check the wiki for contradictions, stale claims, orphan pages, missing cross-references, and data gaps.
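The mechanical half of a lint pass is concrete enough to sketch. Below is a minimal, hypothetical Python version that checks only two cheap properties, broken wikilinks and orphan pages, assuming Obsidian-style `[[Page]]` links; the hard checks (staleness, contradictions) are exactly the part that still needs an LLM or a human.

```python
# Minimal lint sketch for a markdown wiki: finds broken wikilinks and
# orphan pages (pages no other page links to). Assumes Obsidian-style
# [[Page Name]] links; stale claims and contradictions are out of scope.
import os, re, tempfile

def lint(wiki_dir):
    pages, links = set(), {}
    for name in os.listdir(wiki_dir):
        if name.endswith(".md"):
            page = name[:-3]
            pages.add(page)
            with open(os.path.join(wiki_dir, name)) as f:
                links[page] = set(re.findall(r"\[\[([^\]|#]+)", f.read()))
    broken = {(p, t) for p, ts in links.items() for t in ts if t not in pages}
    linked_to = set().union(*links.values()) if links else set()
    orphans = pages - linked_to - {"index"}  # index is the entry point
    return broken, orphans

# Tiny demo wiki: index links to John; Mary is an orphan; John links
# to a page that does not exist.
d = tempfile.mkdtemp()
for name, text in {"index": "[[John]]", "John": "[[Machine Learning]]",
                   "Mary": "no links here"}.items():
    with open(os.path.join(d, name + ".md"), "w") as f:
        f.write(text)
broken, orphans = lint(d)
print(broken)   # {('John', 'Machine Learning')}
print(orphans)  # {'Mary'}
```

Note that this is the easy part: a script can find a dangling link, but only a reader (human or LLM) can decide whether two pages actually contradict each other.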

Supporting Files

The wiki leans on a few auxiliary files: index.md as the master table of contents, an append-only log recording every ingest and edit, and a schema file (CLAUDE.md) that restates the wiki's conventions for each fresh LLM session.

Tools Mentioned

The pattern is described around Claude (steered through CLAUDE.md), Obsidian for browsing and graph visualization, and qmd as a local search engine once the wiki outgrows a single context window.


Part Two: Hyperscope (My System)

What It Is

Hyperscope is a Dynamic Knowledge Repository built on PostgreSQL. It is not a wiki of markdown files. It is a relational database with:

  - over 200 tables with typed columns, foreign keys, and constraints
  - an extensive typology of object types and subtypes
  - three major relationship tables linking people and hyperdocuments
  - a vc table recording every change as a complete audit trail
  - deep integration with GNU Emacs

https://gnu.support/images/2026/04/2026-04-14/800/dynamic-knowledge-repository.webp

The Deterministic Program

This is critical: Hyperdocuments in Hyperscope are populated automatically by a deterministic computer program that decides the majority of information properties.

The program extracts:

  - file metadata: size, hash, page count, EXIF data
  - object types and subtypes, decided by rule from the file format
  - timestamps and provenance for every ingested source
  - relationships from structured sources such as email headers and meeting records

The LLM is used only for what LLMs are good at: generating descriptions and names. The LLM never touches deterministic fields. The LLM never sets foreign keys. The LLM never decides object types.

The Object Type System

Hyperscope has an extensive typology of elementary objects. Here is a sample of the types:

| ID | Type | Count | Description |
|----|------|-------|-------------|
| 1 | File | 211 | Digital document or data unit |
| 2 | WWW | 19,782 | Web link as dynamic source of information |
| 5 | Set | 3,964 | Collection of hyperlinks |
| 9 | Note | 8,137 | Basic note |
| 14 | PDF | 5,206 | PDF document |
| 19 | PDF by Page Nr. | 14,587 | Individual PDF pages indexed by number |
| 21 | Video | 757 | Local video file |
| 28 | LaTeX | 3 | LaTeX document |
| 31 | Task | 1,026 | Unit of work or activity |
| 33 | PostgreSQL | 65 | PostgreSQL report (SQL evaluated query) |
| 38 | EPUB | 166 | Ebook format |
| 41 | Image | 27,723 | Image file |
| 70 | Text | 5,828 | Plain text |
| 78 | Hyperscope ExLaTeX | 3 | Mixed object creating PDF output with hyperlinks |

And subtypes (just a sample):

| ID | Subtype | Count | Description |
|----|---------|-------|-------------|
| 1 | Default | 43,607 | Default classification |
| 6 | Book | 90 | Book document |
| 8 | Report | 223 | Report document |
| 31 | Task | 17 | Task |
| 39 | WRS Page | 3,650 | Version-controlled webpage |
| 79 | Magic | 22,761 | Rule-based system of spells and abilities |
| 85 | Receipt | 782 | Transaction confirmation |
| 108 | Sales Flow | 17 | Customer progression stages |
| 182 | Just Indexed | 20,063 | Newly indexed awaiting processing |

The Relationship Systems

Hyperscope has three major relationship tables:

hypeoplerelations — connects people to hyperdocuments with typed relationships:

| ID | Relation Type |
|----|---------------|
| 1 | RELATED |
| 2 | INFORMED BY EMAIL |
| 3 | SUPPLIER |
| 4 | INTRODUCED BY |
| 5 | MAYBE RELATED |
| 6 | WAS ASSIGNED TO |
| 7 | ASSISTED |
| 8 | ATTENDED MEETING |
| 9 | INFORMED BY SMS |
| 10 | REQ-PARTICIPANT |
| 11 | COMES FROM |
| 12 | HOST |
| 13 | BORN THERE |
| 14 | NON-PARTICIPANT |
| 15 | CHAIR |
| 16 | OPT-PARTICIPANT |
| 17 | INFORMED THROUGH THEIR WEBSITE |
| 18 | DELIVERED TO |
| 19 | INFORMED BY XMPP |
| 20 | INFORMED BY PHONE |
| 21 | VISITING |
| 22 | POSSIBLY INTERESTED IN |
| 23 | LOCATED |
| 24 | INFORMED BY XMPP |
| 25 | INFORMED |

peoplerelations — connects people to people with bidirectional relationships, start/end dates, do-not-contact flags, and sales flow stages.

relatedhyperdocuments — connects hyperdocuments to hyperdocuments with relation types, descriptions, and priorities.
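For readers who think in schemas, here is what a typed relationship table looks like as DDL. The table and column names follow this article; the column types and exact constraints are my guesses, not Hyperscope's real definitions, and SQLite stands in for PostgreSQL only to demonstrate the key property: a foreign key rejects a dangling relationship at insert time.

```python
# Sketch of a typed people-to-people relationship table with foreign
# keys. Names follow the article; types and details are assumptions.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs per-connection
con.executescript("""
CREATE TABLE people (
    people_id INTEGER PRIMARY KEY,
    people_firstname TEXT,
    people_name TEXT
);
CREATE TABLE relationtypes (
    relationtypes_id INTEGER PRIMARY KEY,
    relationtypes_name TEXT NOT NULL
);
-- person <-> person, with a typed, dated relationship
CREATE TABLE peoplerelations (
    peoplerelations_id INTEGER PRIMARY KEY,
    peoplerelations_people1 INTEGER NOT NULL REFERENCES people,
    peoplerelations_people2 INTEGER NOT NULL REFERENCES people,
    peoplerelations_relationtypes INTEGER NOT NULL REFERENCES relationtypes,
    peoplerelations_datestart TEXT,
    peoplerelations_dateend TEXT
);
""")

# A dangling relationship (referencing people who do not exist) is
# rejected at insert time, not discovered later by a lint pass.
try:
    con.execute("INSERT INTO peoplerelations "
                "(peoplerelations_people1, peoplerelations_people2, "
                " peoplerelations_relationtypes) VALUES (1, 2, 1)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

This is the structural difference from a markdown wiki: in the wiki a broken link is text that a lint pass may or may not catch; here it is an error the database refuses to store.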

The Human’s Control

In Hyperscope, the human retains complete control:

  - the schema, constraints, and permissions are defined by the human
  - the LLM only suggests edits; a human approves them before they are committed
  - every change is recorded in the vc table and can be rolled back
  - the LLM's access can be revoked at any time and the system remains fully usable


Part Three: Direct Comparison

The Sister/Brother Test

This is where the difference becomes stark.

Query: “Who is the sister of my friend John?”

In Karpathy’s LLM-Wiki:

  1. The LLM must first find John’s page in the wiki (search markdown files for “John”)
  2. Read John’s page to see if it contains information about siblings
  3. If not, search for other pages that might mention John and siblings
  4. Synthesize across potentially multiple pages
  5. If the information doesn’t exist, the LLM might hallucinate a plausible answer

Time per request: One LLM prompt/response cycle (5-30 seconds)

Reliability: Probabilistic. The LLM may miss the information or hallucinate.

For each request: You pay the LLM cost again. Nothing is cached. Nothing is indexed for this specific query pattern.

In Hyperscope:

```sql
SELECT p2.people_firstname, p2.people_name
FROM peoplerelations pr
JOIN people p1 ON pr.peoplerelations_people1 = p1.people_id
JOIN people p2 ON pr.peoplerelations_people2 = p2.people_id
JOIN relationtypes rt ON pr.peoplerelations_relationtypes = rt.relationtypes_id
WHERE p1.people_firstname = 'John'
  AND rt.relationtypes_name = 'SIBLING';
```

Time per request: Sub-second, deterministic, no LLM needed

Reliability: Perfect (if the relationship was entered)

For each request: Zero marginal cost. The relationship is already in the database.
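To make the point concrete, here is the sibling query run end to end against an invented two-person dataset; SQLite stands in for PostgreSQL, and the schema mirrors only the column names used in this article, not Hyperscope's actual definitions.

```python
# The sibling query, runnable: invented sample data, article's column
# names, SQLite standing in for PostgreSQL.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE people (people_id INTEGER PRIMARY KEY,
                     people_firstname TEXT, people_name TEXT);
CREATE TABLE relationtypes (relationtypes_id INTEGER PRIMARY KEY,
                            relationtypes_name TEXT);
CREATE TABLE peoplerelations (peoplerelations_people1 INTEGER,
                              peoplerelations_people2 INTEGER,
                              peoplerelations_relationtypes INTEGER);
INSERT INTO people VALUES (1, 'John', 'Doe'), (2, 'Anna', 'Doe');
INSERT INTO relationtypes VALUES (1, 'SIBLING');
INSERT INTO peoplerelations VALUES (1, 2, 1);
""")

row = con.execute("""
SELECT p2.people_firstname, p2.people_name
FROM peoplerelations pr
JOIN people p1 ON pr.peoplerelations_people1 = p1.people_id
JOIN people p2 ON pr.peoplerelations_people2 = p2.people_id
JOIN relationtypes rt ON pr.peoplerelations_relationtypes = rt.relationtypes_id
WHERE p1.people_firstname = 'John'
  AND rt.relationtypes_name = 'SIBLING'
""").fetchone()
print(row)  # ('Anna', 'Doe')
```

The answer is a row, not a synthesis: either the edge exists and the query returns it, or it does not and the query returns nothing. There is no third outcome.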

The “Look Into Documents” Test

Query: “Show me all documents related to John”

In LLM-Wiki: The LLM must search the markdown wiki for pages mentioning John, which requires reading multiple files and synthesizing. Probabilistic recall.

In Hyperscope:

```sql
SELECT hyobjects_name, hyobjects_description, hyobjectypes_name
FROM hyobjects
WHERE hyobjects_id IN (
    SELECT hypeoplerelations_hyobjects
    FROM hypeoplerelations
    WHERE hypeoplerelations_people = (SELECT people_id FROM people WHERE people_name = 'John')
);
```

Instant, complete, deterministic.

The Contradiction Test

Scenario: Two sources conflict. Source A says John was born in 1980. Source B says John was born in 1985.

In LLM-Wiki: The LLM might note the contradiction in the wiki when it ingests the second source. But the LLM has no persistent memory across sessions. When you ask “When was John born?” the LLM will read the wiki pages, see both dates, and have to decide which to present. It might pick one confidently. It might present both. It might hallucinate a third date. The contradiction is recorded in text, but there’s no enforcement.

In Hyperscope: There is no contradiction because there is only one people_begindate column. The deterministic program or a human decides the authoritative date. If a second source claims a different date, it creates a new hyperdocument with that information, but the authoritative people record remains unchanged unless a human updates it. The contradiction exists as two documents, not as conflicting data in the same field.
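This contradiction model is easy to sketch: one authoritative column on the person, and each conflicting claim kept as its own document row. The table and column names follow the article; everything else (types, sample values) is invented for illustration.

```python
# Sketch of the contradiction model: one authoritative date column,
# and each conflicting source stored as its own document row.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE people (people_id INTEGER PRIMARY KEY,
                     people_firstname TEXT,
                     people_begindate TEXT);       -- exactly one value
CREATE TABLE hyobjects (hyobjects_id INTEGER PRIMARY KEY,
                        hyobjects_name TEXT,
                        hyobjects_description TEXT);
INSERT INTO people VALUES (1, 'John', '1980-01-01');
-- Two sources disagree; both are kept, as documents, not as data:
INSERT INTO hyobjects VALUES
 (1, 'Source A', 'Claims John was born in 1980'),
 (2, 'Source B', 'Claims John was born in 1985');
""")

# The authoritative answer is always one row, never a coin flip:
born = con.execute("SELECT people_begindate FROM people "
                   "WHERE people_firstname = 'John'").fetchone()[0]
print(born)  # 1980-01-01, until a human decides otherwise
```

The conflicting claim is preserved as evidence, but it cannot leak into the answer: queries against the people table return the one value a human (or the deterministic program) committed.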

The Private Information Test

Scenario: You have a document containing sensitive personal information that should not be seen by any LLM.

In LLM-Wiki: The LLM needs to read the wiki to answer questions. If the sensitive information is in the wiki, the LLM sees it. There is no permission system in markdown. You could keep it in raw sources only, but then the LLM can’t use it for synthesis. You could encrypt it, but then the LLM can’t read it at all.

In Hyperscope: You set hyobjects_hysharingtypes to restrict access. The LLM’s database connection uses a role that only sees non-restricted records. Private information never reaches the LLM. The deterministic program can still process it (running locally), but the LLM never touches it.
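The mechanism can be sketched as a view that is the only surface the LLM's connection may query. SQLite has no roles, so the sketch shows just the view; in PostgreSQL the LLM's role would be granted SELECT on the view and nothing on the base table. The sharing-type values here are invented.

```python
# Sketch of row-level restriction via a view: the LLM's connection
# only ever sees llm_hyobjects, never the base table.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE hyobjects (hyobjects_id INTEGER PRIMARY KEY,
                        hyobjects_name TEXT,
                        hyobjects_hysharingtypes TEXT);
INSERT INTO hyobjects VALUES
 (1, 'Public article', 'PUBLIC'),
 (2, 'Medical record', 'PRIVATE');
-- The only surface the LLM ever queries:
CREATE VIEW llm_hyobjects AS
  SELECT hyobjects_id, hyobjects_name
  FROM hyobjects
  WHERE hyobjects_hysharingtypes <> 'PRIVATE';
""")

rows = con.execute("SELECT hyobjects_name FROM llm_hyobjects").fetchall()
print(rows)  # [('Public article',)] -- the private row never appears
```

The restriction is enforced by the database, not by an instruction the LLM is asked to follow: the private row is simply absent from every result set the LLM can produce.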

The Scale Test

Scenario: 100,000 hyperdocuments, 10,000 people, 500,000 relationships.

In LLM-Wiki: The index.md file would be hundreds of thousands of lines. The LLM cannot read it all in one context window. Karpathy acknowledges this and suggests qmd (a local search engine). But now you have two systems: the LLM generating content and a search engine retrieving it. The wiki is no longer a single coherent artifact. The LLM’s ability to maintain cross-references across 100,000 pages is essentially zero — it can only see a tiny fraction at a time.

In Hyperscope: The database handles 100,000 records easily. Indexes make queries fast. Foreign keys ensure relationships are never broken. The LLM never needs to see the whole database — it generates SQL queries or uses views. The system scales to millions of records without architectural changes.
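The effect of indexing is easy to observe directly: the query planner switches from a full table scan to an index search the moment an index exists. SQLite stands in for PostgreSQL here, but the principle is identical.

```python
# Illustration of why indexed queries stay fast at scale: the planner
# moves from a full scan to an index search once an index exists.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hyobjects (hyobjects_id INTEGER PRIMARY KEY, "
            "hyobjects_name TEXT)")
con.executemany("INSERT INTO hyobjects VALUES (?, ?)",
                ((i, f"doc-{i}") for i in range(100_000)))

q = "SELECT * FROM hyobjects WHERE hyobjects_name = 'doc-99999'"
plan = lambda: con.execute("EXPLAIN QUERY PLAN " + q).fetchone()[3]

before = plan()
print(before)  # e.g. 'SCAN hyobjects' (full table scan)
con.execute("CREATE INDEX idx_name ON hyobjects (hyobjects_name)")
after = plan()
print(after)   # e.g. 'SEARCH hyobjects USING INDEX idx_name ...'
```

An LLM reading markdown has no equivalent move: its only "index" is whatever fraction of the wiki fits into the context window.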

The Consistency Over Time Test

Scenario: Five years of continuous use.

In LLM-Wiki: The wiki degrades. Each LLM session starts fresh. The LLM doesn’t remember what it wrote last year. Contradictions accumulate. Pages become inconsistent. The lint operation helps, but the LLM is both the writer and the corrector — it’s fixing its own mistakes, which it doesn’t fully remember making. The schema file (CLAUDE.md) becomes a massive document trying to enforce rules the LLM follows imperfectly. Eventually, you’re spending more time linting and fixing than you saved by not writing.

In Hyperscope: The database remains consistent because constraints enforce consistency. The vc table contains a complete audit trail of every change. You can see exactly when and by whom (human, program, or LLM) each field was modified. The deterministic program continues to extract metadata correctly. The LLM’s descriptions may drift over time, but they are stored in descriptive fields — they don’t corrupt the authoritative data. You can always roll back.
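A vc-style audit trail can be sketched with a trigger that copies every change into a version table. Hyperscope's real vc table surely differs; the column names below are invented to show only the principle: the record is written by the database itself, so no writer, human or LLM, can skip it.

```python
# Sketch of a vc-style audit trail: a trigger records old and new
# values on every update. Column names are invented for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE people (people_id INTEGER PRIMARY KEY, people_name TEXT);
CREATE TABLE vc (vc_id INTEGER PRIMARY KEY,
                 vc_table TEXT, vc_row INTEGER,
                 vc_old TEXT, vc_new TEXT,
                 vc_date TEXT DEFAULT CURRENT_TIMESTAMP);
CREATE TRIGGER people_audit AFTER UPDATE OF people_name ON people
BEGIN
  INSERT INTO vc (vc_table, vc_row, vc_old, vc_new)
  VALUES ('people', OLD.people_id, OLD.people_name, NEW.people_name);
END;
INSERT INTO people VALUES (1, 'Jon');
UPDATE people SET people_name = 'John' WHERE people_id = 1;
""")

history = con.execute("SELECT vc_old, vc_new FROM vc").fetchall()
print(history)  # [('Jon', 'John')] -- every edit leaves a record
```

Git can diff files; a trigger-fed version table diffs fields, which is why a five-year-old fact can be traced to the exact change that set it.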


Part Four: Hypothetical Outcomes Over Time

Outcome for LLM-Wiki (Karpathy Pattern) Over 5 Years

Month 1:
You’re excited. You clip 20 articles about a topic. The LLM creates beautiful markdown files. The Obsidian graph view shows a tidy cluster of interconnected nodes. You ask questions, get good answers. You feel like you’ve discovered a superpower.

Month 3:
You’ve ingested 200 sources. The wiki has hundreds of pages. The index.md file is getting long, but the LLM still navigates it. You notice something odd: the LLM has started creating duplicate pages for the same concept under different names. “Machine Learning” and “ML” are separate pages with overlapping content. You ask the LLM to lint — it finds 47 contradictions you didn’t know existed. You spend an afternoon cleaning up.

Month 6:
You’ve ingested 500 sources across five different domains. The wiki has over 2,000 pages. The index.md file is now thousands of lines. The LLM’s performance on queries has degraded — not because the model got worse, but because the context window can’t hold enough of the wiki to synthesize across distant pages. You implement qmd (local search engine). It helps, but now you have two systems.

Year 1:
You have 1,500 sources. The wiki has 8,000 pages. You notice that the LLM, when ingesting new sources, sometimes contradicts itself without flagging the contradiction. It doesn’t “remember” what it wrote three months ago because each session starts fresh. The schema file has grown to 500 lines of instructions trying to enforce consistency. The LLM follows them imperfectly.

Year 2:
The wiki is a sprawling mess. There are 20,000 pages. The LLM’s maintenance has created as many problems as it solved. Important pages have been overwritten with incorrect information. The LLM confidently answers questions with plausible-sounding but wrong answers because the wiki contains contradictions that the linting didn’t catch. You realize you can’t trust any answer without manually verifying against the raw sources.

Year 3:
You’ve abandoned active use. You occasionally query the wiki for old information, but you’ve stopped adding new sources. The maintenance burden is too high. The pattern that promised “near zero maintenance” has, in practice, required constant supervision. The wiki is a fossil — frozen in time, not compounding.

Year 5:
You reflect on the experiment. The LLM-Wiki pattern taught you something valuable about persistent knowledge artifacts, but it failed on the hard problems: consistency, contradiction detection, relationship integrity, and scale. The fundamental issue was never solved: LLMs have no persistent memory across sessions, and text files are not a database.

What control did the human have? Initially, full control — you reviewed every change. As the wiki grew, your control dropped because you couldn’t review thousands of pages. The schema file became your only lever, and the LLM followed it imperfectly. Eventually, the LLM became the de facto authority because you had no efficient way to verify its work. Control dropped to near zero.

What was the state of the system? A large collection of markdown files with unknown consistency. Some pages accurate, some outdated, some contradictory. The LLM still answered questions, but you couldn’t trust the answers without verification. The wiki had become a probabilistic knowledge base — useful for inspiration, not for authoritative answers.


Outcome for Hyperscope Over 5 Years

Month 1:
The schema is designed. The deterministic program extracts metadata from files and inserts them as hyobjects. The LLM is not yet involved except to help write descriptions for new documents. The database is clean, consistent, and queryable. Every object has a type, a source, and deterministic metadata.

Month 3:
The deterministic program now extracts relationships from email headers, Slack threads, and meeting transcripts — populating hypeoplerelations automatically. Person-person relationships are entered via peoplerelations. The LLM generates descriptions for new hyperdocuments. You review them before committing (or you set up a review queue). The database has 50,000 hyperdocuments and 2,000 people. Queries are instant.

Month 6:
The system has 200,000 hyperdocuments. The vc table is tracking every change. You’ve never lost data. You’ve never had a broken foreign key. The LLM now has a limited, read-only view of public data through an API. It answers questions by generating SQL queries, then summarizing the results. The LLM never writes to the database directly — it suggests edits, and you approve them.

Year 1:
You have 500,000 hyperdocuments and 10,000 people. The deterministic program handles 99% of metadata extraction. The LLM handles descriptions and answers. The relatedhyperdocuments table has millions of edges connecting related knowledge. You can traverse the knowledge graph instantly. The system is indispensable.

Year 2:
You have 1 million hyperdocuments. The database remains fast because of proper indexing. The peoplerelations table contains rich social graphs — who knows whom, who reported to whom, who attended which meetings. You can answer complex questions like “Show me all people who attended meetings with John in 2023” in milliseconds. The LLM is one interface among many (SQL, API, graph visualization, natural language).

Year 3:
The system has 2 million hyperdocuments. The vc table is 10 GB — a complete forensic record of every change ever made. You can audit any fact back to its source. The deterministic program has been extended to handle new file types and relationship patterns. The LLM’s role remains carefully scoped: descriptions, summaries, and natural language queries. The authoritative data is untouched by LLM hallucinations.

Year 5:
The system is mature. It contains 5 million hyperdocuments, 50,000 people, and tens of millions of relationships. It is the single source of truth for your knowledge. The LLM is a powerful interface, but the database is the foundation. You have never lost data. You have never had an unrecoverable inconsistency. You have full control because the control is built into the architecture — schemas, constraints, permissions, audit trails — not delegated to an LLM that might ignore instructions.

What control does the human have? Complete control. The schema is yours. The constraints are yours. The permissions are yours. The LLM works within your system, not as the system. You can revoke the LLM’s access at any time and the database remains perfectly usable (via SQL, Emacs, or other interfaces). The LLM is a feature, not the foundation.

What is the state of the system? A pristine, authoritative, queryable knowledge base. Every fact has a source. Every relationship is explicit. Every change is audited. The LLM enhances the system but does not compromise it.


Part Five: The Fundamental Differences

What Karpathy Gets Right

Karpathy correctly identifies that most RAG systems are stateless — they re-derive knowledge from scratch on every query. The idea of a “compounding artifact” is valuable. The distinction between raw sources (immutable) and a compiled knowledge base is sound. The observation that humans abandon wikis because maintenance burden grows faster than value is accurate.

What Karpathy Misses

1. LLMs have no persistent memory. Each session starts fresh. The “wiki” is just text files written by previous sessions, but there’s no guarantee of consistency across sessions. The LLM doesn’t “remember” what it wrote last week. It can’t learn from its mistakes because it doesn’t remember making them.

2. Text files are not a database. No referential integrity. No type safety. No query language. No permission system. No audit trail beyond git (which tracks files, not fields). Foreign keys are the only way to guarantee that relationships are never broken. Markdown links break when pages rename. LLM-written links may be incorrect or missing.

3. LLMs hallucinate confidently. When the wiki has a contradiction (and it will), the LLM will pick one side and answer confidently, not flag the uncertainty. Karpathy’s lint operation can find contradictions, but the LLM is both the writer and the corrector — it’s fixing its own mistakes, which it doesn’t fully remember making.

4. Control drops over time. Initially you review everything. When the wiki has 10,000 pages, you can’t. The LLM becomes the de facto authority because you have no way to efficiently verify its work. The schema file becomes a constitution you hope the LLM follows, but there’s no enforcement.

5. Private data cannot be protected. The LLM needs to read the wiki to answer questions. If the wiki contains private information, the LLM sees it. There’s no permission system in markdown. You could keep private data out of the wiki, but then it’s not in the compiled knowledge base.

6. Relationships are implicit, not explicit. “John’s sister” requires the LLM to infer from text, not query a structured edge. This is fragile and slow. For each request, you pay the LLM cost again. Nothing is indexed for relationship queries.

What Hyperscope Gets Right

1. Deterministic automation for deterministic data. File metadata, relationships from structured sources, timestamps — these should never involve an LLM. A program extracts them correctly every time.

2. LLM for what LLMs are good at. Descriptions, summaries, natural language answers, connection suggestions — low-stakes, easily reviewed, easily corrected. The LLM never touches authoritative fields.

3. Database for what databases are good at. Relationships, queries, constraints, permissions, versioning, audit trails, referential integrity, typed data.

4. Human in control. The schema is yours. The constraints are yours. The permissions are yours. The LLM works within your system, not as the system. You can revoke the LLM’s access at any time and the system remains fully functional.

5. Explicit relationships. Every person-person connection is a row in peoplerelations with a type, start date, end date, and metadata. Queries are instant and deterministic. The LLM never needs to infer relationships from text.

6. Complete audit trail. The vc table records every change to every field. You can see exactly when and by whom each fact was modified. You can roll back any change.


Part Six: The Verdict

Karpathy’s LLM-Wiki is a brilliant idea for a weekend project. It is not a production architecture for serious knowledge management. It abandons everything we’ve learned about data integrity, referential integrity, access control, and audit trails in exchange for the convenience of “just ask the LLM.”

The pattern fails on the hard problems: consistency across sessions, contradiction detection, relationship integrity, privacy, and scale.

Hyperscope, by contrast, is what a serious knowledge base looks like. The deterministic program handles what is knowable and rule-based. The LLM handles what requires interpretation. The database provides structure, integrity, and queryability. The human remains in control because control is built into the architecture, not delegated to an LLM that might ignore instructions.

The correct division of labor:

| Layer | Responsibility | Example |
|-------|----------------|---------|
| Deterministic program | Facts, metrics, extractable metadata | File size, hash, page count, EXIF data, object type |
| LLM | Synthesis, description, connection suggestion | "This paper relates to project X because…" |
| Human | Authority, privacy, override, strategic direction | Setting permissions, correcting hallucinations, defining schema |
| Database | Storage, integrity, relationships, queries | PostgreSQL with constraints, foreign keys, indexes, versioning |

The LLM writes descriptions. The human writes the schema. The program writes the facts. The database stores everything with integrity.

That is the correct architecture. That is Hyperscope.


Final Conclusion: Why the Human Makes the Repository Dynamic, Not the Machine


After twenty-three years of working with my Dynamic Knowledge Repository, after accumulating 245,377 people records and 95,211 hyperdocuments, after watching the landscape of knowledge management shift from clay tablets to computers to LLMs, I have arrived at a fundamental realization.

The LLM is an accelerator. It speeds up my workflow. It generates descriptions faster. It edits text faster. I earn more because I work faster. The LLM is a powerful tool — perhaps the most powerful writing and synthesis tool I have ever used.

But the LLM is not what makes the repository dynamic.


What “Dynamic” Really Means

Doug Engelbart understood something that many people today have forgotten. In his CODIAK framework — the Concurrent Development, Integration and Application of Knowledge — he described a process that is fundamentally human.

Look at what Engelbart wrote in 1990, decades before LLMs existed:

“Each organizational unit is continuously analyzing, digesting, integrating, collaborating, developing, applying, and re-using its knowledge, much of which is ingested from its external environment.”

Notice the verbs: analyzing, digesting, integrating, collaborating, developing, applying, re-using.

These are not machine actions. These are human actions. A computer can store knowledge. A computer can retrieve knowledge. A computer can even generate plausible text about knowledge. But a computer does not analyze in the sense Engelbart meant — with curiosity, with purpose, with the intent to understand and act. A computer does not collaborate — it does not bring its unique perspective to a shared problem. A computer does not integrate knowledge across domains by seeing connections that matter to a human life.

Engelbart’s vision was never about replacing the human. It was about augmenting the human.


The Three Domains of CODIAK

Engelbart described three primary knowledge domains:

Intelligence Collection — actively surveying, ingesting, and interacting with the external environment. The LLM can help ingest. It can summarize. But it cannot survey with human curiosity. It cannot decide what matters to you.

Dialog Records — the coordination and dialog within and across groups, along with resulting decisions. The LLM can transcribe. It can summarize meetings. But it cannot participate in dialog as an equal. It has no stake in the outcomes. It does not remember what was said last month unless you remind it.

Knowledge Products — proposals, specifications, descriptions, plans, budgets. The LLM can draft. It can format. It can suggest. But it cannot own the knowledge product. It cannot be accountable for its correctness. It cannot sign its name to a plan and be held responsible.

Engelbart wrote:

“The resulting plans provide a comprehensive picture of the project at hand… These documents, which are iteratively and collaboratively developed, represent the knowledge products of the project team.”

Iteratively and collaboratively developed. By humans. With tools. Not by tools alone.


Why the Repository Is Dynamic Because of the Human, Not the LLM

Consider what would happen if I gave all control of my Dynamic Knowledge Repository to an LLM.

The LLM would continue to ingest sources. It would continue to generate descriptions. It would continue to update pages and cross-references. On the surface, the repository would appear to be functioning.

But would it be dynamic?

No. It would be automatic. There is a profound difference.

Dynamic means responsive to human purpose. It means the knowledge evolves because human needs evolve. It means a person looks at a fact and says, “That doesn’t match my experience” and corrects it. It means a team argues about a conclusion and refines it through debate. It means a manager wakes up with a new question and the repository can answer it because a human designed the schema to capture that kind of relationship.

Automatic means the machine continues to do what it was programmed to do, regardless of whether it matters. An LLM maintaining a wiki without human supervision is not a dynamic knowledge repository. It is a static process — a loop of ingest, write, update, repeat. It has no purpose except the purpose you gave it when you wrote the schema file. It cannot notice that a contradiction matters. It cannot prioritize one source over another based on trust. It cannot decide that a particular relationship is worth capturing because it might matter next year.

Engelbart wrote:

“Generally, I expect people to be surprised by how much value will be derived from the use of these future tools, by the ways the value is derived, and by how ‘natural and easy to use’ the practices and tools will seem after they have become well established (even though they may initially be viewed as unnatural and hard to learn).”

The value is derived from the use — from humans using tools to do things they could not do before. The tool is not the source of value. The human using the tool is the source of value.


The CODIAK Process Cluster: Best Strategic Application Candidate

Engelbart identified CODIAK as the best strategic application candidate because it addresses the core of organizational effectiveness:

“The CODIAK capability is not only the basic machinery that propels our organizations, it also provides the key capabilities for their steering, navigating and self repair. And the body of applicable knowledge developed represents a critically valuable asset.”

Notice: steering, navigating, self repair. These are active, intentional, human-directed activities. A machine cannot steer an organization because a machine has no destination. A machine cannot navigate because a machine has no values. A machine cannot self-repair because a machine cannot recognize that something is broken in a way that matters to human flourishing.

Engelbart continues:

“As complexity and urgency increase, the need for highly effective CODIAK capabilities will become increasingly urgent. Increased pressure for reduced product cycle time, and for more and more work to be done concurrently, is forcing unprecedented coordination across project functions and organizational boundaries.”

The LLM can help with coordination. It can summarize. It can retrieve. It can generate. But the need for coordination arises from human complexity. The urgency arises from human goals. The LLM does not feel urgency. The LLM does not care about product cycle time.


The Nesting of Concurrent CODIAK Processes

Engelbart described a multi-level nesting of CODIAK processes:

“In Figure-9 we get the sense of the multi-level ‘nesting’ of concurrent CODIAK processes within the larger enterprise. Each of the multiply-nested organizational units needs its own coherent CODIAK process and knowledge base; and each unit is running its CODIAK processes concurrently, not only with all of its sibling and cousin units – but also with larger units in which it is embedded, and with smaller units that are part of its own makeup.”

This nesting is fundamentally human. The engineering team’s knowledge base serves the engineering team’s goals. The finance team’s knowledge base serves finance’s goals. The executive team’s knowledge base serves the enterprise’s goals. These goals are not aligned automatically. They require negotiation, compromise, leadership, and judgment.

An LLM maintaining a single wiki cannot serve nested, potentially conflicting purposes. It has one purpose: the one you wrote in the schema file. It cannot hold multiple perspectives simultaneously unless you explicitly tell it to — and even then, it cannot care about the differences.


The Open Hyperdocument System (OHS)

Engelbart wrote:

“As developed in the sections that follow, our framework assumes that all of the knowledge media and operations indicated in Figure-7 will one day be embedded within an Open Hyperdocument System (OHS). Every participant will work through the windows of his or her workstation into his or her group’s ‘knowledge workshop.’”

The OHS is a workshop. A workshop is a place where humans work. The tools in the workshop amplify human capability. They do not replace the human.

My Hyperscope is an OHS. It is a workshop. The LLM is a new tool in the workshop — a powerful one, like a CNC machine or a laser cutter. But the workshop is still mine. The purpose is still mine. The control is still mine.

If I gave the LLM control of the workshop, it would no longer be a workshop. It would be a factory running unattended, producing output that might be correct, might be useful, might be garbage — but with no human there to care about the difference.


What I Have Learned in 23 Years

I have been working with my Dynamic Knowledge Repository for 23 years. That is long enough to see technologies rise and fall. Long enough to learn what endures.

The LLM is an accelerator. It makes my workflow faster. It generates descriptions. It edits text. I earn more because I work faster. These are real benefits. I do not dismiss them.

But the repository is dynamic because I am dynamic. Because I bring curiosity, judgment, purpose, and accountability to the knowledge within it. Because I decide what matters. Because I correct mistakes. Because I notice connections the LLM never would — connections that matter to my life, not to some average of its training data.

Engelbart understood this. He spent his career developing tools to augment human intelligence, not to replace it. The Open Hyperdocument System was never about automating humans out of the loop. It was about giving humans better loops to work within.


Final Words

The LLM-Wiki pattern is an interesting experiment. It might be useful for certain narrow domains where consistency doesn’t matter, where privacy doesn’t matter, where relationships can be implicit, where scale remains small, and where the cost of being wrong is low.

But it is not a Dynamic Knowledge Repository as Engelbart envisioned it. It is not dynamic because the human is not in control. It is automatic, not dynamic. It is an automated process masquerading as a living knowledge base.

My Hyperscope is dynamic because I am in control. Because I have 245,377 people and 95,211 hyperdocuments that I have curated, not just collected. Because every relationship in peoplerelations and hypeoplerelations and relatedhyperdocuments exists because a human (or a deterministic program acting on human rules) decided it mattered. Because the vc table records every change so I can audit, correct, and learn. Because the schema reflects my understanding of the world, not a generic LLM’s best guess.
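The audit principle behind the vc table can be sketched in a few lines. This is a minimal illustration only: the real system runs on PostgreSQL with its own schema, and the column names here (doc_a, doc_b, relation, changed_at, tbl, action, detail) are invented for the example. The point it demonstrates is the mechanism: a deliberate human decision inserts a relationship, and a trigger records that change in an append-only log that can later be audited and corrected.

```python
import sqlite3

# Hedged sketch of an audit trail in the spirit of the vc table.
# SQLite stands in for PostgreSQL; all column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relatedhyperdocuments (
    doc_a INTEGER, doc_b INTEGER, relation TEXT,
    PRIMARY KEY (doc_a, doc_b)
);
CREATE TABLE vc (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
    tbl TEXT, action TEXT, detail TEXT
);
-- Every insert into the relationship table is logged, never overwritten.
CREATE TRIGGER log_insert AFTER INSERT ON relatedhyperdocuments
BEGIN
    INSERT INTO vc (tbl, action, detail)
    VALUES ('relatedhyperdocuments', 'INSERT',
            NEW.doc_a || '->' || NEW.doc_b || ' (' || NEW.relation || ')');
END;
""")

# A human (or a deterministic program acting on human rules)
# decides this link matters:
conn.execute(
    "INSERT INTO relatedhyperdocuments VALUES (?, ?, ?)",
    (101, 202, "cites"),
)

# The audit trail can now be inspected, corrected, and learned from:
for row in conn.execute("SELECT tbl, action, detail FROM vc"):
    print(row)
```

The design choice the sketch captures is that the log is a side effect of the write itself, so no curation decision can happen without leaving a record — which is what makes later audit and correction possible.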

The LLM is a tool. A powerful tool. An accelerator. But it is not the repository. It is not the curator. It is not the dynamic force.

I am.

And that is why, after 23 years, my Dynamic Knowledge Repository is still alive, still growing, still serving my purposes.

Not because of the machine. Because of me.


“The CODIAK capability is not only the basic machinery that propels our organizations, it also provides the key capabilities for their steering, navigating and self repair.”
— Douglas C. Engelbart, The CODIAK Process Cluster: Best Strategic Application Candidate (1990)


“Every participant will work through the windows of his or her workstation into his or her group’s ‘knowledge workshop.’”
— Douglas C. Engelbart, Toward High-Performance Knowledge Workers (1995)


“Generally, I expect people to be surprised by how much value will be derived from the use of these future tools, by the ways the value is derived, and by how ‘natural and easy to use’ the practices and tools will seem after they have become well established.”
— Douglas C. Engelbart, The CODIAK Process Cluster (1990)

References


Primary Sources: Douglas C. Engelbart

  1. Engelbart, D. C. (1995). Toward Augmenting the Human Intellect and Boosting Our Collective IQ. Communications of the ACM, Vol. 38, pp. 30-33.

  2. Engelbart, D. C. (2000). Boosting Collective IQ. Doug Engelbart Institute.


Secondary Source: LLM-Wiki Pattern

  1. Karpathy, A. (2026). LLM Wiki: A Pattern for Building Personal Knowledge Bases Using LLMs. GitHub Gist.

Related Works

  1. Bush, V. (1945). As We May Think. The Atlantic Monthly, Vol. 176, No. 1 (July 1945), pp. 101-108.
    • Available at: http://www.theatlantic.com/doc/194507/bush
    • Describes the Memex, a visionary personal knowledge repository with associative trails between documents, directly influencing later hyperdocument systems.

Hyperscope System Documentation


Note on Engelbart’s Vision

All Engelbart references are preserved and maintained by the Doug Engelbart Institute (formerly the Bootstrap Institute). The institute continues to develop Engelbart’s work on Collective IQ, bootstrapping, and human augmentation.

For further inquiry:


“We need to note here that basic CODIAK processes have practically forever been a part of society’s activity. Whether the knowledge components are carried in peoples' heads, marked on clay tablets, or held in computers, the basic CODIAK process has always been important. What is new is a focus toward harnessing technology to achieve truly high-performance CODIAK capability.”

— Douglas C. Engelbart, The CODIAK Process Cluster (1990)