Recently, a Gist by Andrej Karpathy has been spreading widely in the tech community. After reading it, my first thought was: this idea has a deeper connection with Obsidian than most people realize. This article is about that.
Who is Karpathy
If you follow the AI world, this name should be familiar. But if you don’t know much about him, I think it’s worth explaining first.
Karpathy is not the kind of “AI leader” who manages products at a big company; he is a hands-on practitioner, truly one of the deep learning field’s own.
He did his PhD at Stanford under Fei-Fei Li—the person who led ImageNet and essentially launched modern computer vision. After leaving Fei-Fei Li’s group, Karpathy went to OpenAI as one of its co-founders. In 2017, Tesla poached him to lead the vision perception system for Autopilot.
Many people now know how his years at Tesla turned out. Tesla was almost the only autonomous driving company at the time that insisted on a pure vision approach (no lidar, just cameras plus neural networks). This approach was heavily criticized at the time as too radical. The results later became clear to everyone.
He left Tesla in 2022, returned to OpenAI in 2023, then left again in 2024 to start his own AI education company, Eureka Labs.
What I find interesting about him is not just his resume, but that he maintains a rare state: able to do world-class engineering, yet willing to spend time writing articles and recording courses to explain the underlying logic of technology to ordinary people.
His nanoGPT and micrograd on GitHub are minimal reimplementations of GPT and backpropagation, specifically designed so that ordinary people can truly understand them. His Stanford CS231n lectures, available on YouTube, have taught countless people deep learning.
So when he wrote a Gist on GitHub about “using LLMs to manage knowledge bases,” it quickly spread through the tech community. I think it’s worth a careful read.
What He Said
The Gist is titled LLM Wiki: A pattern for building personal knowledge bases with LLMs. The original link:
https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
His starting point is a feeling many people have had: Your knowledge base keeps growing, but what you can actually use keeps shrinking.
You bookmark articles, take reading notes in Notion, build a pile of notes in Obsidian, but the next time you need a certain piece of knowledge, the odds of actually putting it to use are low. It’s not that the material isn’t there; it’s too scattered, with no connections between the pieces. Even when you find them, they are fragments that you have to piece together yourself.
He attributes this problem to the shortcomings of two existing solutions:
First: The bookmark approach.
You dump the original text in, do nothing, and rely entirely on search. The problem is that search finds documents, not answers. You still have to read, understand, and synthesize yourself.
Second: The RAG approach (Retrieval-Augmented Generation).
You provide a bunch of documents to the AI, which retrieves and generates answers on the fly. This is much better than bookmarks, but it’s always temporary, starting from scratch each time, with no accumulation.
His proposed LLM Wiki is a different idea: don’t have the AI organize things on the fly at query time; instead, have it continuously maintain an ever-updating Wiki.
How LLM Wiki Works
The entire architecture has three layers:
Raw Sources
These are the things you usually read: articles, books, video subtitles, meeting notes. They are stored here as-is, as raw material.
The Wiki
A set of Markdown files, each corresponding to a topic, concept, or entity. For example, you might have a “Machine Learning - Overfitting” page, a “Reading Notes - Being in the Game” page, and a “People - Feynman” page.
These files are not written by you; they are written and continuously maintained by the AI. Each time new material comes in, the AI updates relevant pages; cross-references are established between pages; if contradictions arise, they are flagged.
The Schema
A configuration that tells the AI “what this Wiki should look like.” For example, what fields each note should contain, how pages are organized, what counts as an orphan note, and which concepts need their own page.
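To make the three layers concrete, here is a minimal sketch of how they might map onto a folder and a small schema object. The directory names (`raw`, `wiki`), the field names, and the page types are my own assumptions for illustration; the Gist does not prescribe any of them.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class WikiSchema:
    # Hypothetical field and type names; the Gist does not prescribe them.
    required_fields: tuple[str, ...] = ("title", "type", "sources")
    page_types: tuple[str, ...] = ("concept", "person", "reading-note")


@dataclass
class WikiLayout:
    root: Path
    schema: WikiSchema = field(default_factory=WikiSchema)

    @property
    def raw_dir(self) -> Path:
        # Layer 1: raw sources, stored as-is.
        return self.root / "raw"

    @property
    def wiki_dir(self) -> Path:
        # Layer 2: the AI-maintained Markdown pages.
        return self.root / "wiki"


def schema_problems(frontmatter: dict[str, object], schema: WikiSchema) -> list[str]:
    """Layer 3 in action: report where a page's metadata violates the schema."""
    problems = [
        f"missing field: {name}"
        for name in schema.required_fields
        if name not in frontmatter
    ]
    page_type = frontmatter.get("type")
    if page_type is not None and page_type not in schema.page_types:
        problems.append(f"unknown type: {page_type}")
    return problems
```

In practice the schema would more likely live in a plain instruction file that the AI reads, but writing it as data makes it clear that it is something a page can be checked against.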
Then three core operations:
Ingest
Each time new material comes in, the AI reads it and updates 10 to 15 Wiki pages. Not just creating new ones, but also updating existing content, adding cross-references, and flagging areas that need further confirmation.
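The shape of an Ingest step can be sketched roughly as follows. This is only an illustration under my own assumptions: the model is represented by a pluggable `propose` function with a hypothetical signature, and `keyword_propose` is a toy stand-in that does no real language understanding. The point is the flow: the source goes in, proposed page updates come out, and the list of changed pages is returned so a human can review it.

```python
from typing import Callable

# Hypothetical signature: (source text, current pages) -> {page name: full new text}.
ProposeUpdates = Callable[[str, dict[str, str]], dict[str, str]]


def ingest(source: str, pages: dict[str, str], propose: ProposeUpdates) -> list[str]:
    """Apply proposed updates to `pages` in place; return the names that changed."""
    changed = []
    for name, new_text in propose(source, pages).items():
        if pages.get(name) != new_text:
            pages[name] = new_text
            changed.append(name)
    return sorted(changed)


def keyword_propose(source: str, pages: dict[str, str]) -> dict[str, str]:
    """Toy stand-in for the model: annotate pages whose title appears in the source."""
    note = f"\n- Mentioned in: {source[:40]}"
    return {
        name: text + note
        for name, text in pages.items()
        if name.lower() in source.lower() and note not in text
    }
```

Swapping `keyword_propose` for a real model call would leave `ingest` unchanged, which is the design choice worth noticing: the bookkeeping loop and the review list stay the same regardless of which model does the writing.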
Query
You ask a question, and the AI synthesizes an answer from the Wiki. The key is: if the query itself produces valuable new integrations, the AI also writes them back into the Wiki. In other words, the more you use it, the richer the Wiki becomes.
Lint
This is an operation he specifically mentions, and I think it’s the smartest part of the scheme. The AI periodically performs a health check on the entire Wiki:
- Are there two pages with contradictory content?
- Are there statements that are outdated?
- Are there orphan pages with no other pages linking to them?
- Are there obviously missing cross-references?
Doing these things manually would be tedious and almost impossible to sustain. But for the AI, it’s pure grunt work.
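Some of these checks need no AI at all. As a minimal sketch, assuming pages are linked with Obsidian-style `[[wikilinks]]` and that each page's name is its link target, the orphan check and a related "link to a page that does not exist yet" check can be written in a few lines. The function names are my own; contradiction and staleness checks are the parts that would actually need a model.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]]; group 1 is the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def link_targets(text: str) -> set[str]:
    return {m.group(1).strip() for m in WIKILINK.finditer(text)}


def find_orphans(pages: dict[str, str]) -> list[str]:
    """Pages that no *other* page links to."""
    linked: set[str] = set()
    for name, text in pages.items():
        linked |= link_targets(text) - {name}
    return sorted(name for name in pages if name not in linked)


def find_dangling(pages: dict[str, str]) -> list[str]:
    """Link targets that have no page yet."""
    targets: set[str] = set()
    for text in pages.values():
        targets |= link_targets(text)
    return sorted(targets - set(pages))
```

Here `pages` maps a page name to its Markdown text. For a vault on disk it could be built with something like `{p.stem: p.read_text(encoding="utf-8") for p in Path(vault).rglob("*.md")}`, assuming note names are unique across folders.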
The division of labor is as follows:
Humans are responsible for: curation (choosing what is worth including), critical judgment (is this conclusion correct?), supervision (periodically reviewing the AI’s updates)
AI is responsible for: bookkeeping—cross-references, consistency maintenance, orphan node cleanup, formatting
He uses the term: bookkeeping. This word is well chosen. It’s not about letting the AI think for you, but about handing over the maintenance tasks you know you should do but keep putting off, to the AI.
Why Obsidian Users Should Pay Special Attention
I’ve noticed something recently: some of my friends in the programming community, who were not interested in Obsidian before, have started using it one after another.
When asked why, the answers are mostly the same: because it works so well with AI. Local files, plain Markdown, no lock-in: these used to be niche preferences, but now they’ve become advantages. Tools like Claude Code can directly read and write an Obsidian Vault without any extra configuration; whatever the AI is able to do, it can do directly on the files.
Karpathy’s Gist, in a way, makes this even clearer.
I’ve been using Obsidian for a while myself, and after reading this Gist, I had a strong feeling:
The Wiki he describes is essentially an Obsidian Vault actively maintained by an AI.
Think about it: what is the core of Obsidian? A bunch of local Markdown files, connected by bidirectional links.
What is the core of LLM Wiki? A bunch of Markdown files, plus an AI that helps you create and maintain links, integrate content, and perform health checks.
The underlying medium is exactly the same. An Obsidian Vault is almost the most natural implementation of an LLM Wiki.
The things you do manually now—creating bidirectional links for notes, writing Maps of Content (MOCs), periodically organizing and archiving—a significant portion of these can be done by the AI as “bookkeeping work” in the LLM Wiki design.
Let me give you my own example: after I finish writing an article, the step of creating and organizing bidirectional links is now handled by a Skill. The AI scans my note vault, finds related articles, and automatically adds bidirectional links. I used to procrastinate on this step every time, but now I hardly worry about it.
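The mechanical core of that kind of linking step can be sketched as below. This is not my actual Skill, just a simplified illustration under stated assumptions: matching is exact and case-sensitive, only the first plain mention of each title is linked, text already inside a `[[...]]` link is left alone, and code blocks are not treated specially. The function names are hypothetical. Because Obsidian shows backlinks automatically, adding the forward link is enough to make the connection visible from both sides.

```python
import re

# Splitting on this pattern yields plain text at even indices, existing links at odd ones.
EXISTING_LINK = re.compile(r"(\[\[.*?\]\])")


def link_target(link: str) -> str:
    """Extract 'Page' from '[[Page]]', '[[Page|alias]]' or '[[Page#Heading]]'."""
    return link[2:-2].split("|")[0].split("#")[0].strip()


def link_mentions(text: str, titles: list[str], self_title: str) -> str:
    """Wrap the first unlinked mention of each other note's title in [[...]]."""
    # Longest titles first, so "Gradient Descent" wins over "Gradient".
    for title in sorted(titles, key=len, reverse=True):
        if title == self_title:
            continue
        parts = EXISTING_LINK.split(text)
        if any(link_target(p) == title for p in parts[1::2]):
            continue  # this note already links to that page
        word = re.compile(rf"\b{re.escape(title)}\b")
        for i in range(0, len(parts), 2):  # only touch plain-text segments
            new_part, count = word.subn(f"[[{title}]]", parts[i], count=1)
            if count:
                parts[i] = new_part
                break
        text = "".join(parts)
    return text
```

Exact matching is the weak point: it misses plurals and synonyms, and `\b` behaves poorly for titles that start or end with punctuation. That gap is exactly where an AI finding "related articles" adds value over a script.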
Karpathy’s LLM Wiki just takes this further: not just running it once after writing an article, but keeping the entire knowledge base in a continuously maintained state, with Ingest, Query, and Lint all automated.
Of course, there are also voices that think this approach has problems.
Some in the tech community have drawn a comparison to Zettelkasten: traditional Zettelkasten holds that actively writing notes is itself the process of understanding. The point is not collecting, but building connections through writing. If the AI summarizes and creates associations for you, doesn’t that understanding process disappear? You end up with a tidy knowledge base but nothing in your own head.
This is a real question, and I think there is no standard answer.
But for Obsidian users, my own judgment is: these two things are not contradictory, provided you are clear about which tasks “truly need thinking” and which are “annoying bookkeeping.”
For example:
- Reading an article, extracting core ideas, writing your own feelings and reflections → this is thinking, should be done by yourself
- Checking which notes haven’t been linked in three months → this is bookkeeping, perfectly reasonable to delegate to AI
- Synthesizing multiple sources under a concept → can have AI draft, you review
- Maintaining frontmatter fields for a bunch of notes → this is pure grunt work, AI does it
The real risk is not that you stop thinking because you use AI, but that you equate “having the AI summarize this article” with “I have read this article.”
As long as you can distinguish this, the LLM Wiki approach is actually a quite valuable extension for Obsidian users.
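To show how mechanical the “pure grunt work” category really is, here is a small sketch of frontmatter maintenance: making sure every note has a given set of fields. It assumes simple `key: value` frontmatter between `---` lines and deliberately avoids a full YAML parser; the function name and default values are hypothetical.

```python
def ensure_frontmatter(text: str, defaults: dict[str, str]) -> str:
    """Add any missing top-level frontmatter keys, creating the block if needed."""
    lines = text.split("\n")
    end = None
    if lines[0].strip() == "---":
        # Find the line that closes the frontmatter block, if there is one.
        end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        header = ["---", *(f"{k}: {v}" for k, v in defaults.items()), "---"]
        return "\n".join(header + lines)
    # Only unindented "key: value" lines count as top-level keys.
    present = {
        line.split(":", 1)[0].strip()
        for line in lines[1:end]
        if ":" in line and not line[:1].isspace()
    }
    missing = [f"{k}: {v}" for k, v in defaults.items() if k not in present]
    return "\n".join(lines[:end] + missing + lines[end:])
```

Existing values are never overwritten, so running it repeatedly over a vault is safe. Deciding what the value of a field like `type` should be for a given note is the part worth handing to the AI.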
Next Steps
Karpathy’s Gist is currently at the stage of “proposing a good pattern, but not providing an out-of-the-box tool.”
A few people in the community have started implementing this idea in different directions, but it’s still very early.
I plan to seriously upgrade my own setup: first reorganize my Obsidian note vault according to the LLM Wiki approach, then push my existing bidirectional link Skill further, trying to add Ingest and Lint logic to make it a more complete Skill.
Using Claude Code + Obsidian Vault, I’ll run through the entire process from start to finish—see what works, what the pitfalls are, and what needs redesign. If it works out, I’ll package the whole thing and share it, so others can use it directly without building from scratch.
The next chapter will cover this hands-on process.
Summary
What we learned today:
- Karpathy is a deep learning researcher from Stanford, the leader of Tesla’s pure vision approach for Autopilot, and a co-founder of OpenAI, currently focused on AI education.
- LLM Wiki is a pattern where “the AI actively maintains the knowledge base,” as opposed to passive retrieval in RAG.
- The core architecture has three layers: Raw Sources → Wiki (collection of Markdown files) → Schema (structure definition).
- Three operations: Ingest (ingest and update) / Query (query and write back) / Lint (health check).
- The core division of labor: humans do curation and judgment, AI does “bookkeeping”—cross-references, consistency maintenance, orphan node cleanup.
Key takeaways:
- An Obsidian Vault is itself a collection of Markdown files, highly consistent with the medium of LLM Wiki, making it almost the most natural implementation.
- The bidirectional links and MOCs you manually create now are exactly the cross-references automatically maintained by AI in LLM Wiki.
- Worrying about “AI thinking for you” is reasonable, but that’s different from having AI do “bookkeeping”—don’t conflate them.
- This pattern currently has no out-of-the-box tool; you need to build it yourself.
- The next chapter will walk through a hands-on implementation; if it works, it will be packaged as a Skill and shared.