2026-06-18

Understand Anything: Codebase Knowledge Graph (2026)

Understand Anything turns any codebase into an interactive knowledge graph. Open-source, 62k★, works with Claude Code, OpenClaw, Cursor, Codex.

2026-06-17

A practical walkthrough of Egonex-AI/Understand-Anything — the open-source Claude Code plugin with 62k+ stars that turns any codebase into an interactive knowledge graph you can search, explore, and ask questions about. Works with Claude Code, OpenClaw, Cursor, Codex, GitHub Copilot, Gemini CLI, and 10+ more AI coding platforms.

You just joined a new team. The codebase is 200,000 lines of code spread across three services, a shared library, and a worker. There is no documentation that is up to date. The previous senior engineer left six months ago. Your manager says "ask the AI assistant" — but which question do you even ask first? Where does auth live? Which service owns the payment state machine? What happens if the webhook handler throws after the database commit?

Reading the code blind takes weeks. Asking an LLM without context gets you confident-sounding guesses. Understand Anything takes a different approach: it runs a six-agent pipeline over your codebase, builds a deterministic knowledge graph of every file, function, class, and dependency, and gives you an interactive dashboard you can pan, zoom, search, and ask questions about — locally, in under five minutes for a typical mid-size repo. 62k+ stars on GitHub, MIT license, originally created by Lum1104, now maintained under the Egonex-AI org. This article covers what it actually does, how to install it on OpenClaw (and Claude Code, Cursor, Codex, and Copilot), the six-agent pipeline under the hood, and a real workflow for using it to onboard a junior dev in an afternoon instead of a month.

What Understand Anything actually does

The project tagline is "Graphs that teach > graphs that impress." The implementation matches: the output is a navigable JSON knowledge graph plus a local web dashboard, not a fancy 3D visualization that wows you and tells you nothing.

The core output is a single JSON file at .understand-anything/knowledge-graph.json plus an interactive dashboard served locally. Nodes in the graph represent files, functions, classes, modules, business domains, and guided-tour steps; dependency edges connect them. Every node carries a plain-English summary, its architectural layer (API, Service, Data, UI, Utility), tags, the source location, and its relationships to other nodes. Click any node in the dashboard and you see the source on the left and, on the right, a summary of what it does and how it connects.
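To make the shape of that file concrete, here is a minimal Python sketch that tallies nodes by type and by layer. The field names ("nodes", "edges", "type", "layer", "kind") are assumptions made for illustration; the article does not show the real schema, so check an actual knowledge-graph.json before relying on them.

```python
import json
from collections import Counter
from pathlib import Path

# NOTE: "nodes", "type", "layer" and "kind" are hypothetical field names.
# The real knowledge-graph.json schema may differ.


def load_graph(path: str = ".understand-anything/knowledge-graph.json") -> dict:
    """Read the graph file from the location the article describes."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_graph(graph: dict) -> dict:
    """Count nodes per type and per architectural layer."""
    nodes = graph.get("nodes", [])
    by_type = Counter(node.get("type", "unknown") for node in nodes)
    by_layer = Counter(node.get("layer", "unassigned") for node in nodes)
    return {"by_type": dict(by_type), "by_layer": dict(by_layer)}


# Tiny in-memory sample so the sketch runs without a real graph file.
SAMPLE_GRAPH = {
    "nodes": [
        {"id": "src/api/routes/checkout.ts", "type": "file", "layer": "API",
         "summary": "HTTP route for checkout.", "tags": ["billing"]},
        {"id": "src/billing/pricing.ts", "type": "file", "layer": "Service",
         "summary": "Price calculation.", "tags": ["billing"]},
        {"id": "src/billing/pricing.ts#computePrice", "type": "function",
         "layer": "Service", "summary": "Computes a price.", "tags": []},
    ],
    "edges": [
        {"source": "src/api/routes/checkout.ts",
         "target": "src/billing/pricing.ts", "kind": "imports"},
    ],
}

if __name__ == "__main__":
    print(summarize_graph(SAMPLE_GRAPH))
```

On a real project you would pass `load_graph()` to `summarize_graph` instead of the sample, after adjusting the field names to match the file.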

Two design choices make this tool different from "just run an LLM on the codebase":

Deterministic structure from tree-sitter, semantic layer from LLMs. Structural edges (which file imports which, which function calls which) come from tree-sitter — a parser that produces the same output for the same input every time. The semantic layer (what a file is for, which architectural layer it belongs to, what business domain it represents) comes from LLM agents running on top of the parsed structure. The split means the structural graph is reproducible across runs — re-analyzing the same code gives you the same edges — while the semantic layer still captures intent. If you regenerate the graph tomorrow, the import edges stay identical; only the summaries might shift if you switched LLMs.

Six specialized agents, not one generalist. The pipeline orchestrates project-scanner (discover files, detect languages and frameworks), file-analyzer (extract functions, classes, imports), architecture-analyzer (assign layers), tour-builder (generate guided walkthroughs), graph-reviewer (validate referential integrity), and optionally domain-analyzer (extract business processes). File analyzers run in parallel — up to 5 concurrent, 20-30 files per batch — so the wall-clock time on a 200K-line repo is 5-10 minutes, not hours.
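The batching and concurrency limits described above can be sketched in a few lines of Python. This is not the plugin's code. It is an illustration under stated assumptions: `analyze_batch` is a hypothetical placeholder for the real per-batch analysis, and the defaults (25 files per batch, 5 workers) are picked from the ranges quoted in the article.

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(files: list[str], batch_size: int = 25) -> list[list[str]]:
    """Split the file list into consecutive batches of at most batch_size."""
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


def analyze_batch(batch: list[str]) -> list[dict]:
    """Hypothetical stand-in for the real analysis (parse plus LLM summary)."""
    return [{"file": path, "summary": None} for path in batch]


def run_file_analysis(files: list[str], batch_size: int = 25,
                      max_workers: int = 5) -> list[dict]:
    """Analyze batches with at most max_workers running at the same time."""
    batches = make_batches(files, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in submission order, so output order
        # matches input order even though batches run concurrently.
        results = pool.map(analyze_batch, batches)
        return [record for batch_result in results for record in batch_result]
```

The worker cap is what keeps wall-clock time bounded: with five workers, the run takes roughly one fifth of the sequential time as long as the per-batch work is dominated by waiting on the LLM.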

The output you actually use day-to-day is the dashboard, not the JSON. The dashboard gives you:

| Feature | What it does | When you use it |
| --- | --- | --- |
| Structural graph | Pan/zoom/click the file → function → class graph | Learning the shape of a new codebase |
| Domain view | Horizontal graph of business processes (domains → flows → steps) | Understanding what the system is for, not just how |
| Guided Tours | Auto-generated walkthroughs ordered by dependency | First 30 minutes on a new repo |
| Fuzzy + semantic search | "Which parts handle auth?" returns relevant nodes | When you know the concept but not the code path |
| Diff impact | Visualize which nodes your current branch changes affect | Before opening a PR |
| Persona UI | Junior / PM / power-user detail levels | Different audiences in the same repo |
| Layer visualization | Color-code by architectural layer (API / Service / Data / UI / Utility) | Spotting layer violations in code review |
| Language concepts | 12 patterns (generics, closures, decorators, etc.) annotated in context | Learning a new language from a real codebase |
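The "layer violations" mentioned in the Layer visualization row can be pictured as a simple rule check over import edges. The sketch below is illustrative only: the allowed-dependency rules (UI calls API, API calls Service, Service calls Data, anything may use Utility) are an assumption chosen for the example, not rules documented for the tool.

```python
# ASSUMPTION: this dependency policy is invented for illustration.
# Each layer maps to the set of layers it is allowed to import from.
ALLOWED = {
    "UI": {"UI", "API", "Utility"},
    "API": {"API", "Service", "Utility"},
    "Service": {"Service", "Data", "Utility"},
    "Data": {"Data", "Utility"},
    "Utility": {"Utility"},
}


def find_layer_violations(layers: dict[str, str],
                          imports: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (importer, imported) pairs that break the ALLOWED policy.

    layers maps a file path to its layer; imports lists (importer, imported).
    Edges touching a file with no known layer are skipped, not flagged.
    """
    violations = []
    for importer, imported in imports:
        source_layer = layers.get(importer)
        target_layer = layers.get(imported)
        if source_layer is None or target_layer is None:
            continue
        if target_layer not in ALLOWED.get(source_layer, set()):
            violations.append((importer, imported))
    return violations


# Hypothetical sample data.
SAMPLE_LAYERS = {
    "ui/Cart.tsx": "UI",
    "api/checkout.ts": "API",
    "db/orders.ts": "Data",
}
SAMPLE_IMPORTS = [
    ("ui/Cart.tsx", "api/checkout.ts"),   # allowed: UI -> API
    ("ui/Cart.tsx", "db/orders.ts"),      # flagged: UI reaches into Data
    ("db/orders.ts", "api/checkout.ts"),  # flagged: Data depends on API
]
```

A team with different conventions would only need to edit the `ALLOWED` table; the check itself stays the same.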

The 12 programming patterns and the persona-adaptive UI are the two features that make this tool unusually good for onboarding — they shift the same graph from "I am a junior learning the system" to "I am a PM who needs to know what this service does for the roadmap."

Install on Claude Code (the canonical path)

The most direct install is into Claude Code, Anthropic's AI coding CLI, because Understand Anything is built as a native Claude Code plugin.

# 1. Add the marketplace source
/plugin marketplace add Egonex-AI/Understand-Anything

# 2. Install the plugin
/plugin install understand-anything

# 3. Verify: slash commands cannot be piped to grep, so scan the list by eye
/help
# Expected: /understand, /understand-dashboard, /understand-chat, ...

Restart Claude Code, then run the analysis on any project:

cd ~/projects/your-monorepo
/understand

First run on a 200K-line repo takes 5-10 minutes. Subsequent runs are incremental: only files changed since the last graph are re-analyzed, which drops the run time to seconds for typical commits.
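One common way to implement this kind of incremental run is to store a content hash per file and re-analyze only the files whose hash changed. The Python sketch below shows that idea. It is an assumption about the general technique, since the article does not say how the plugin detects changes (it could equally rely on git), and all names in it are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stable content hash used to detect modified files."""
    return hashlib.sha256(content).hexdigest()


def plan_incremental_run(previous: dict[str, str],
                         current: dict[str, bytes]) -> dict[str, list[str]]:
    """Decide what to re-analyze.

    previous maps path -> hash recorded on the last run.
    current maps path -> file content as it is now.
    """
    reanalyze = sorted(
        path for path, content in current.items()
        if previous.get(path) != fingerprint(content)  # new or modified
    )
    drop = sorted(path for path in previous if path not in current)  # deleted
    return {"reanalyze": reanalyze, "drop": drop}
```

Unchanged files never reach the LLM pass, which is why a typical commit touching a handful of files finishes in seconds rather than minutes.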

Open the dashboard:

/understand-dashboard
# Default: http://localhost:8765

The dashboard opens in your browser. Pan and zoom the structural graph; click any node to see source on the left and summary on the right; switch to the Domain view to see business processes; hit the Tours tab for guided walkthroughs.

Install on OpenClaw (and the other platforms)

The same tool ships an installer for 13+ AI coding platforms. For OpenClaw on GolemWorkers, the installer is the install.sh script:

curl -fsSL https://raw.githubusercontent.com/Egonex-AI/Understand-Anything/main/install.sh | bash -s openclaw
          

What the installer does:

  1. Clones the repo to ~/.understand-anything/repo
  2. Creates the right symlinks for the target platform — for OpenClaw that means ~/.openclaw/skills/understand-anything/ and the agent definition at ~/.openclaw/agents/understand.json
  3. Registers the understand agent in your OpenClaw config

Restart the agent:

openclaw restart
          

Verify by listing the agent:

openclaw agents list | grep understand
# Expected output: understand (registered)

You can now invoke the analysis from any OpenClaw session:

openclaw run --agent understand "analyze ~/projects/your-monorepo"
# First run: ~5-10 min for 200K-line repo
# Subsequent: incremental, ~5-30 sec for typical commits

The graph output is identical regardless of which platform installed the plugin — .understand-anything/knowledge-graph.json is the source of truth, and the dashboard is a static web app that reads from it. So a junior dev on Claude Code and a senior on OpenClaw and a PM in Cursor all see the same graph of the same repo.

The full platform list:

| Platform | Install method |
| --- | --- |
| Claude Code | Native plugin marketplace (above) |
| OpenClaw | install.sh openclaw |
| Cursor | Auto-discovery via .cursor-plugin/plugin.json (clone + open) |
| VS Code + GitHub Copilot | Auto-discovery via .copilot-plugin/plugin.json (v1.108+) |
| Copilot CLI | copilot plugin install Egonex-AI/Understand-Anything:understand-anything-plugin |
| Codex | install.sh codex |
| OpenCode | install.sh opencode |
| Gemini CLI | install.sh gemini |
| Pi Agent | install.sh pi |
| Antigravity | install.sh antigravity |
| Vibe CLI | install.sh vibe |
| Hermes | install.sh hermes |
| Cline | install.sh cline |
| KIMI CLI | install.sh kimi |
| Trae | install.sh trae |
| Nanobot | install.sh nanobot |
| Kiro CLI / IDE | install.sh kiro |

Step-by-step: the seven slash commands

Once installed, you have seven slash commands. The order matters — run them in this sequence the first time you point the tool at a new repo.

1. /understand — build the graph. First run scans the project, detects languages and frameworks, extracts every file/function/class/import, assigns architectural layers, generates summaries, builds guided tours, and validates the graph for referential integrity. Output: .understand-anything/knowledge-graph.json. First run: 5-10 min. Subsequent runs are incremental.

2. /understand-dashboard — open the visual explorer. Local web app at http://localhost:8765. Use this to verify the graph looks right before committing it.

3. /understand-chat "How does X work?" — ask the graph. The chat agent has the knowledge graph in context and answers questions grounded in your actual code. "How does the payment flow work?" returns a multi-hop walk through the relevant nodes with file:line citations. This is the feature that turns onboarding from "read the README" into "ask the graph."

4. /understand-diff — visualize your change impact. Before opening a PR, run this and see which nodes your branch touches, what depends on them, and what the ripple effect is. Useful for catching "I changed a function signature in the shared library and forgot the three call sites" before code review.

5. /understand-explain <file> — deep-dive a specific file or function. Returns a plain-English explanation with the surrounding context, callers, callees, and related concepts.

6. /understand-onboard — generate an onboarding guide. Auto-writes an ONBOARDING.md aimed at new team members: the architectural layers, the entry points, the most important files, the most-called functions, the things to read first. Cuts onboarding prep from "the senior eng spends a day writing this" to "the agent writes it in two minutes."

7. /understand-domain — extract business processes. Adds the sixth agent (domain-analyzer) to identify business domains (e.g. payments, auth, inventory), flows (e.g. checkout, refund, signup), and steps. Switch to the Domain view in the dashboard to see the system as a horizontal flow graph instead of a vertical dependency graph.

There is also /understand-knowledge <path> for non-code knowledge bases — point it at a Karpathy-pattern LLM wiki (markdown files with [[wikilinks]] and an index.md) and you get a force-directed knowledge graph with community clustering instead of a code dependency graph. Useful for product wikis, internal documentation, research notes.

The 6-agent pipeline under the hood

If you are curious what the tool is actually doing for those 5-10 minutes on first run:

| Agent | What it does | Failure mode |
| --- | --- | --- |
| project-scanner | Walk the directory, detect languages, identify framework markers (package.json, go.mod, Cargo.toml, etc.) | Misses files if .gitignore excludes them; that is fine, but it means tests in excluded dirs are not analyzed |
| file-analyzer (×5 concurrent) | Tree-sitter parse → extract functions, classes, imports. LLM pass → summarize, tag, assign layer | Each file gets ~10-30s; long files (>2000 lines) get summarized in chunks |
| architecture-analyzer | Aggregate file-level layers into module/service layers; identify layer violations | Struggles on monorepos with mixed conventions; review the output |
| tour-builder | Find the entry points (main, server.listen, etc.), trace dependency depth, generate ordered walkthroughs | Tours are static: they do not update with code changes until you re-run /understand |
| graph-reviewer | Validate edges: every import resolves, every called function exists, no orphan nodes | Runs inline by default (fast, regex-based); use --review for full LLM audit (slow, deeper) |
| domain-analyzer (optional, via /understand-domain) | Identify business processes from file naming, summaries, and call patterns | Best on well-named codebases; struggles when function names are opaque |
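The referential-integrity checks listed for graph-reviewer (every edge resolves, no orphan nodes) amount to a couple of set operations. The following Python sketch shows that idea on a toy graph. The "id", "source" and "target" field names are hypothetical, and this is not the plugin's own validator.

```python
def review_graph(nodes: list[dict], edges: list[dict]) -> dict[str, list]:
    """Report edges that point at missing nodes and nodes with no edges.

    Assumes hypothetical fields: node["id"], edge["source"], edge["target"].
    """
    node_ids = {node["id"] for node in nodes}

    # An edge is dangling if either endpoint is not a known node.
    dangling_edges = [
        edge for edge in edges
        if edge["source"] not in node_ids or edge["target"] not in node_ids
    ]

    # A node is an orphan if no edge mentions it at all.
    connected = {edge["source"] for edge in edges} | {edge["target"] for edge in edges}
    orphan_nodes = sorted(node_ids - connected)

    return {"dangling_edges": dangling_edges, "orphan_nodes": orphan_nodes}


# Hypothetical sample: one valid edge, one dangling edge, one orphan node.
SAMPLE_NODES = [{"id": "a.ts"}, {"id": "b.ts"}, {"id": "c.ts"}]
SAMPLE_EDGES = [
    {"source": "a.ts", "target": "b.ts"},
    {"source": "a.ts", "target": "missing.ts"},
]
```

A check like this is cheap enough to run on every build of the graph, which fits the article's description of a fast inline pass with a deeper audit as an opt-in.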

The split between tree-sitter (structural, deterministic) and LLM (semantic) is the reason the tool is fast on first run, and also the reason summaries can vary between LLM versions. If you switch the underlying model from Claude Sonnet to GLM-5.2, the import edges stay identical but the summaries shift. Both are fine: the structural graph is what you actually trust for "which function calls which," and the summaries are scaffolding for humans.
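If you want to confirm this claim on your own repo, one approach is to fingerprint only the structural edges of two graph files and compare the summaries separately. The Python sketch below does that on in-memory samples. The field names ("edges", "source", "target", "kind", "nodes", "summary") are assumptions about the JSON layout, not the documented schema.

```python
import hashlib
import json


def structural_fingerprint(graph: dict) -> str:
    """Hash the edge set only, independent of edge order and of summaries."""
    edges = sorted((e["source"], e["target"], e["kind"]) for e in graph["edges"])
    payload = json.dumps(edges, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_summaries(old: dict, new: dict) -> list[str]:
    """List node ids present in both graphs whose summary text differs."""
    old_summaries = {n["id"]: n.get("summary") for n in old["nodes"]}
    return sorted(
        n["id"] for n in new["nodes"]
        if n["id"] in old_summaries and old_summaries[n["id"]] != n.get("summary")
    )


# Hypothetical sample: same edges in a different order, one reworded summary.
RUN_A = {
    "nodes": [
        {"id": "src/billing/pricing.ts", "summary": "Computes prices."},
        {"id": "src/billing/checkout.ts", "summary": "Runs checkout."},
    ],
    "edges": [
        {"source": "src/billing/checkout.ts", "target": "src/billing/pricing.ts", "kind": "imports"},
        {"source": "src/api/routes/checkout.ts", "target": "src/billing/checkout.ts", "kind": "imports"},
    ],
}
RUN_B = {
    "nodes": [
        {"id": "src/billing/pricing.ts", "summary": "Calculates tiered prices."},
        {"id": "src/billing/checkout.ts", "summary": "Runs checkout."},
    ],
    "edges": list(reversed(RUN_A["edges"])),
}
```

Equal fingerprints with a non-empty `changed_summaries` result is the expected outcome after a model switch: same structure, different wording.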

Diff impact analysis: the underused killer feature

Most code review tools tell you what changed. /understand-diff tells you what your change affects — visualized on the graph.

Run it before opening a PR:

git checkout -b feature/new-pricing-tier
# ... make your changes ...
/understand-diff

Output:

Direct changes (files you touched):
- src/billing/pricing.ts      (+12, -3)
- src/billing/checkout.ts     (+5, -2)

Transitive impact (callers/callees of changed code):
- 3 services depend on src/billing/pricing.ts
- 11 tests reference src/billing/pricing.ts
- src/api/routes/checkout.ts imports the changed export

Risk hotspots:
- src/legacy/payments-v1.ts (calls changed function, no tests covering new code path)

This catches the most common PR failure mode: "I changed the function signature and forgot to update three call sites." Because the dependency graph is already built, the tool can list the full set of impacted callers in under 30 seconds.
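The transitive-impact part of that report is, at its core, a breadth-first walk over reversed import edges. The Python sketch below shows the idea. It is not the plugin's algorithm, and the edges in the sample are hypothetical, loosely modeled on the file names in the output above.

```python
from collections import defaultdict, deque


def impacted_by(changed: set[str], imports: list[tuple[str, str]]) -> set[str]:
    """Return every file that directly or transitively imports a changed file.

    imports is a list of (importer, imported) pairs. The changed files
    themselves are excluded from the result.
    """
    # Reverse index: imported file -> files that import it.
    dependents = defaultdict(set)
    for importer, imported in imports:
        dependents[imported].add(importer)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:  # guards against import cycles
                seen.add(dependent)
                queue.append(dependent)
    return seen - set(changed)


# Hypothetical import edges for illustration.
SAMPLE_IMPORTS = [
    ("src/billing/checkout.ts", "src/billing/pricing.ts"),
    ("src/api/routes/checkout.ts", "src/billing/checkout.ts"),
    ("src/legacy/payments-v1.ts", "src/billing/pricing.ts"),
    ("src/api/routes/health.ts", "src/lib/log.ts"),
]
```

Calling `impacted_by({"src/billing/pricing.ts"}, SAMPLE_IMPORTS)` returns the two direct importers plus the route that reaches pricing through checkout, and leaves the unrelated health route out.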

On GolemWorkers, this composes nicely with the long-horizon refactor agent pattern from the GLM-5.2 article: let /understand-diff define the blast radius, then let the refactor agent make the changes with confidence that it caught every caller.

Share the graph with your team

The graph is just JSON. Commit it once, and the whole team gets the graph on git pull without re-running the pipeline — which is the difference between "useful" and "actually adopted."

What to commit:

# .gitignore: exclude scratch, keep the graph
.understand-anything/intermediate/
.understand-anything/diff-overlay.json

# Then stage and commit the graph
git add .understand-anything/
git commit -m "chore: add Understand Anything graph for onboarding"

For larger repos where the graph exceeds 10 MB, track with git-lfs:

git lfs install
git lfs track ".understand-anything/*.json"
git add .gitattributes .understand-anything/

Auto-update on every commit with a post-commit hook:

/understand --auto-update
# Adds a post-commit hook that re-runs /understand incrementally on every commit

A reference example: GoogleCloudPlatform/microservices-demo — a Go / Java / Python / Node reference monorepo with a committed graph. Clone it, run /understand-dashboard, and you have an interactive map of a real distributed system in under a minute.

Multi-language and the knowledge-base mode

The --language flag generates node summaries, dashboard labels, and guided tours in your preferred language. On first run, /understand detects the language you are conversing in; if it is not English, it asks you to confirm before generating. Your choice is persisted in .understand-anything/config.json. Supported languages: en (default), zh, zh-TW, ja, ko, ru. The README ships translated versions for zh-CN, zh-TW, ja, ko, es, tr, ru.

The knowledge-base mode (/understand-knowledge <path>) is the non-code sibling: point it at any wiki that follows the Karpathy pattern — markdown files with [[wikilinks]], an index.md, and category headers — and you get a force-directed graph of entities, claims, and implicit relationships. Useful for product wikis, research notes, internal RFC archives.
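For a sense of how [[wikilinks]] become graph edges, here is a small Python sketch that extracts page-to-page links from markdown text. It is a simplified illustration, not the plugin's parser: the handling of `[[Page|alias]]` and `[[Page#section]]` forms is an assumption based on common wiki syntax, and the sample pages are invented.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#section]].
# Group 1 captures only the target page name.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_links(pages: dict[str, str]) -> list[tuple[str, str]]:
    """Return sorted, de-duplicated (page, linked_page) edges.

    pages maps a page name to its markdown text. Self-links are ignored.
    """
    edges = set()
    for name, text in pages.items():
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            if target and target != name:
                edges.add((name, target))
    return sorted(edges)


# Hypothetical wiki content.
SAMPLE_PAGES = {
    "index": "Start with [[Payments]] and [[Auth#Tokens]].",
    "Payments": "See [[Refund flow|refunds]] and [[Auth]]. Also [[Payments]].",
    "Auth": "No links here.",
}
```

The resulting edge list is what a force-directed layout or a community-clustering step would take as input.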

End-to-end: onboarding a junior dev in an afternoon

Here is a workflow that has been used to onboard new engineers on multi-service Go and Python monorepos with this tool:

| Time | What happens |
| --- | --- |
| 0:00 | Junior dev clones the repo, runs /understand |
| 0:10 | Graph builds. Junior opens /understand-dashboard |
| 0:15 | Hits the Guided Tours tab for an auto-generated walkthrough of entry points |
| 0:45 | Switches to Domain view to see business processes (payments, auth, inventory) as horizontal flow |
| 1:30 | Asks /understand-chat "How does the refund flow work?" and gets a multi-hop walk through payments/refund.ts → billing/adjustment.ts → db/postgres-transaction.ts with file:line citations |
| 2:30 | Reads /understand-onboard output as ONBOARDING.md for a written summary |
| 3:00 | Picks a first ticket. Before opening a PR, runs /understand-diff to see blast radius |
| 4:00 | Opens PR with confidence that all callers are accounted for |
| afternoon | Picks up next ticket. The graph is the documentation |

Compared to the baseline of "shadow a senior for two weeks then read code blind for two more," this is roughly a 4-6× compression of the ramp-up curve.

Common pitfalls

  • Committing .understand-anything/intermediate/ to git. It contains scratch files from the LLM pass and grows fast. Add it to .gitignore before the first commit. The tool's own .gitignore template is the source of truth.
  • Running /understand on the whole monorepo without scoping. On a 2M-line monorepo the first run can take 30-60 minutes and produce a 50+ MB graph. Scope with /understand src/<service> for the first pass; expand later.
  • Trusting the semantic summaries without checking. The summaries are LLM-generated. They are scaffolding, not specification. For anything security-sensitive or business-critical, read the code yourself.
  • Skipping --review on the first run. The default graph-reviewer runs regex-based validation, which is fast but shallow. The first time you run on a new repo, do /understand --review to get the full LLM audit — it catches orphaned edges and missing summaries that the regex pass misses.
  • Forgetting to re-run on a long-lived branch. The graph is incremental, but if you branch from main and the base moves by 500 commits, the diff between your branch and the base graph can mislead /understand-diff. Re-run /understand on main periodically.

What to combine it with

Three pairings that compound the value:

  • GolemWorkers long-horizon refactor agents (e.g. on GLM-5.2): use /understand-diff to define blast radius, then run the refactor agent with the graph as context. The agent knows what to change and what not to break.
  • OpenClaw incident-response skill: when an alert fires, point the incident agent at .understand-anything/knowledge-graph.json so it knows which service owns which domain, instead of guessing from file paths.
  • Cursor / Claude Code for day-to-day coding: keep /understand-dashboard open in a tab while you code, so you can quickly jump from "I'm editing this function" to "what depends on it" without leaving the IDE.

FAQ

What is Understand Anything?

Understand Anything is an open-source plugin (originally by Lum1104, now maintained by Egonex-AI) that turns any codebase, knowledge base, or documentation set into an interactive knowledge graph you can explore, search, and ask questions about. It works as a Claude Code plugin, OpenClaw agent, Cursor extension, VS Code Copilot plugin, and 10+ more AI coding platforms. MIT-licensed, 62k+ stars on GitHub, written in TypeScript.

How does Understand Anything work?

It runs a six-agent pipeline over your codebase: project-scanner discovers files and detects languages, file-analyzer (5 concurrent) uses tree-sitter for deterministic structural parsing plus an LLM pass for semantic summaries, architecture-analyzer assigns layers, tour-builder generates guided walkthroughs, graph-reviewer validates referential integrity, and (optionally) domain-analyzer extracts business processes. The structural edges (imports, calls) come from tree-sitter and are reproducible; the semantic layer (summaries, tags, layer assignments) comes from LLMs and can vary between models.
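To see why parser-derived edges are reproducible, consider the sketch below. It uses Python's built-in `ast` module as a stand-in for tree-sitter (a deliberate swap to keep the example dependency-free) and extracts import targets from source text. The same input always yields the same output because no model is involved; the sample source is invented.

```python
import ast


def extract_imports(source: str) -> list[str]:
    """Return the sorted module names imported by a Python source string.

    Uses the stdlib ast parser as a stand-in for tree-sitter. Relative
    imports without a module name (from . import x) are skipped.
    """
    tree = ast.parse(source)
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)


# Hypothetical source file content.
SAMPLE_SOURCE = '''
import os
import json as j
from billing.pricing import compute_price
from . import helpers

def total(cart):
    return compute_price(cart)
'''
```

Running `extract_imports(SAMPLE_SOURCE)` twice gives identical lists, which is the property the structural layer depends on; the LLM-written summaries offer no such guarantee.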

Is Understand Anything free?

Yes, under the MIT license. You can use it commercially, modify it, redistribute it. There is no paid tier and no telemetry. The cost is the LLM API calls for the semantic layer — on Claude Sonnet 4.5, a 200K-line repo costs roughly $2-5 to analyze on first run and a few cents per incremental update.

Does Understand Anything work with OpenClaw?

Yes. Install with curl -fsSL https://raw.githubusercontent.com/Egonex-AI/Understand-Anything/main/install.sh | bash -s openclaw. The installer clones the repo to ~/.understand-anything/repo, creates the right symlinks for OpenClaw at ~/.openclaw/skills/understand-anything/ and ~/.openclaw/agents/understand.json, and registers the understand agent in your OpenClaw config. Restart with openclaw restart and invoke with openclaw run --agent understand "analyze ~/projects/your-repo".

How long does the first analysis take?

Roughly 5-10 minutes for a 200K-line repo on Claude Sonnet 4.5. The file-analyzers run 5-concurrent in batches of 20-30 files, so wall-clock time scales with the slower of (a) LLM throughput and (b) tree-sitter parse time. Incremental runs after that drop to seconds for typical commits. For monorepos over 1M lines, scope the first run to a subdirectory with /understand src/<service>.

Does it work on private repos?

Yes. Analysis, graph building, and the dashboard (served on localhost:8765) all run on your machine. The only network calls go to the LLM API you configure (Claude, OpenAI, Z.ai, or whatever your OpenClaw agent already uses), so the code sent for summarization does leave your machine and is subject to that provider's data policy.

Can I use it on a non-code knowledge base?

Yes, via /understand-knowledge <path>. Point it at any wiki that follows the Karpathy pattern (markdown files with [[wikilinks]], an index.md, and category headers) and you get a force-directed knowledge graph of entities, claims, and implicit relationships. Useful for product wikis, research notes, internal documentation.

Does it replace code review?

No. It augments code review by visualizing the impact of a change before you open a PR. Use /understand-diff to see which callers and tests are affected, but the human review and the test suite still do the actual validation.

How is this different from Cursor's codebase indexing or Copilot's workspace awareness?

Cursor and Copilot index the codebase for retrieval-augmented prompts: they help the AI assistant find relevant context when answering a question. Understand Anything produces a persistent, shareable graph artifact with a deterministic structural layer, which you can commit to git and use across multiple AI tools, IDEs, and team members. The two approaches compose well: use Cursor/Copilot for in-flow AI coding, and Understand Anything for cross-tool codebase understanding.
