GRAPH-STRUCTURED DEPENDENCY NAVIGATION FIXES MISSED-FILE FAILURES IN REPO-SCALE CODING AGENTS
New results show that wiring coding agents to traverse a code dependency graph outperforms expanding context or keyword/vector retrieval on architecture-heavy tasks where critical files are semantically distant.
An arXiv study introduces the Navigation Paradox: as context windows grow, failures shift from retrieval capacity to navigational salience. It presents CodeCompass, an MCP-based graph tool that exposes IMPORTS/INHERITS/INSTANTIATES edges during agent runs with Claude Code. On a FastAPI RealWorld benchmark, BM25 hits 100% on semantic (G1) tasks but gives essentially no lift on hidden-dependency (G3) tasks (78.2% vs. a 76.2% baseline), while CodeCompass reaches 99.4% ACS on G3: +23.2 points over the vanilla baseline and +21.2 points over BM25 (paper, code/benchmark).
Crucially, the benefit depends on tool invocation: trials that actually used the graph (42%) averaged 99.5% ACS, while those that skipped it despite instructions scored 80.2%, indistinguishable from vanilla. Prompt design and agent policies must therefore reliably trigger graph consultation.
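The split described above (trials that used the graph vs. trials that skipped it) is straightforward to track in your own pilot. The following is a minimal sketch, assuming each trial is logged with a flag for whether the graph tool was called and a score on a 0-100 scale; the `Trial` record, its field names, and the sample values are hypothetical and are not data from the study.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    task_id: str
    used_graph_tool: bool  # did the agent call the graph tool at least once?
    acs: float             # score for the trial, on a 0-100 scale


def summarize_by_invocation(trials):
    """Return the invocation rate and the mean score for each group of trials."""
    used = [t.acs for t in trials if t.used_graph_tool]
    skipped = [t.acs for t in trials if not t.used_graph_tool]
    return {
        "invocation_rate": len(used) / len(trials) if trials else 0.0,
        "mean_acs_used": mean(used) if used else None,
        "mean_acs_skipped": mean(skipped) if skipped else None,
    }


# Placeholder values for illustration only; not results from the paper.
trials = [
    Trial("t1", True, 100.0),
    Trial("t2", False, 80.0),
    Trial("t3", False, 70.0),
    Trial("t4", True, 90.0),
]
summary = summarize_by_invocation(trials)
```

Reporting the invocation rate next to the two group means keeps a low overall score from being misread as a weak tool when the real issue is that the agent rarely called it.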
For teams piloting repo-level agents, treat structural navigation as a first-class capability: generate a per-repo AST-derived dependency graph, expose it via MCP, and enforce early graph lookups when touching modules with broad non-local impact. The author also shares a practitioner-friendly narrative on Medium about why assistants miss critical files.
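To make "AST-derived dependency graph" concrete, here is a minimal sketch that uses Python's standard `ast` module to pull the three edge types named above out of a single module. It is not CodeCompass's implementation: the function name, the triple format, and the capitalized-name heuristic for spotting instantiations are all assumptions made for illustration, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


def extract_edges(module_name, source):
    """Return sorted (source, edge_type, target) triples for one module."""
    nodes = list(ast.walk(ast.parse(source)))
    edges = set()
    class_like = set()  # names we treat as classes (heuristic, see below)

    for node in nodes:
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.add((module_name, "IMPORTS", alias.name))
        elif isinstance(node, ast.ImportFrom):
            # Relative imports keep their leading dots; resolving them is out of scope.
            target = "." * node.level + (node.module or "")
            edges.add((module_name, "IMPORTS", target))
            # Assumption: imported names in CapWords style are classes.
            for alias in node.names:
                name = alias.asname or alias.name
                if name[:1].isupper():
                    class_like.add(name)
        elif isinstance(node, ast.ClassDef):
            class_like.add(node.name)
            for base in node.bases:
                edges.add((module_name, "INHERITS", ast.unparse(base)))

    # Second pass: direct calls to a class-like name count as instantiations.
    for node in nodes:
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in class_like
        ):
            edges.add((module_name, "INSTANTIATES", node.func.id))
    return sorted(edges)


# Hypothetical module used only to exercise the function.
example = '''
from app.db import Session
import logging

class UserRepo(BaseRepo):
    def open(self):
        return Session()
'''
edges = extract_edges("app.repo", example)
```

A static pass like this will miss dynamic imports, factories, and attribute-style calls such as `db.Session()`, so treat the resulting graph as a navigation aid rather than a complete picture.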
Retrieval alone misses the non-semantic architectural dependencies that cause changes to break real services.
A lightweight code graph can unlock large accuracy gains without bigger models or longer contexts.
A/B test agents using keyword/vector retrieval against agents using an MCP-exposed dependency graph, tracking task success and token spend.
Instrument policies to require a graph hop before edits to widely imported modules and measure reduction in regressions.
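One way to instrument such a policy is a small gate that the agent harness consults before applying an edit. The sketch below assumes you already have import edges as (importer, imported) pairs and that the harness can report when a graph lookup has happened; the class name, method names, and threshold are hypothetical and not tied to any particular agent framework.

```python
from collections import Counter


class GraphHopPolicy:
    """Allow edits to widely imported modules only after a graph lookup on them."""

    def __init__(self, import_edges, fan_in_threshold=3):
        # import_edges: iterable of (importer, imported) module-name pairs.
        # Duplicates are dropped so each importer counts once per target.
        self.fan_in = Counter(imported for _, imported in set(import_edges))
        self.threshold = fan_in_threshold
        self.consulted = set()

    def record_graph_lookup(self, module):
        """Call this whenever the agent queries the graph for a module."""
        self.consulted.add(module)

    def may_edit(self, module):
        """Return True if the edit can proceed under the policy."""
        widely_imported = self.fan_in[module] >= self.threshold
        return (not widely_imported) or module in self.consulted


# Hypothetical edges for illustration.
import_edges = [("a", "core"), ("b", "core"), ("c", "core"), ("a", "util")]
policy = GraphHopPolicy(import_edges, fan_in_threshold=3)

blocked_first = not policy.may_edit("core")   # no lookup yet
policy.record_graph_lookup("core")
allowed_after = policy.may_edit("core")       # lookup recorded
util_allowed = policy.may_edit("util")        # below the threshold
```

Logging every denied edit alongside later regression counts gives you the before/after comparison this test calls for.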
Legacy codebase integration strategies...
01. Add a CI step to build/update AST-based graphs per service and surface them to agents via MCP with language-aware parsers.
02. Start with high-fanout modules and shared libraries to minimize risk while validating tool-invocation rates and outcomes.
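The two steps above can be sketched as a single script that a CI job might run: walk a service's Python files, collect absolute imports, and rank modules by how many others import them. This reads "high-fanout" as "imported by many modules", which is an assumption; the function names are hypothetical, relative imports are ignored for brevity, and other languages would need their own parsers.

```python
import ast
from collections import Counter
from pathlib import Path


def module_name(root, path):
    """Turn a file path under root into a dotted module name."""
    parts = list(path.relative_to(root).with_suffix("").parts)
    if parts and parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


def build_import_graph(root):
    """Map each module under root to the set of absolute module names it imports."""
    root = Path(root)
    graph = {}
    for path in sorted(root.rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip unparsable files rather than fail the whole CI step
        imports = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                imports.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                imports.add(node.module)
        graph[module_name(root, path)] = imports
    return graph


def rank_by_fan_in(graph, top_n=10):
    """Return (module, importer_count) pairs, most widely imported first."""
    counts = Counter(target for targets in graph.values() for target in targets)
    return counts.most_common(top_n)
```

Calling `rank_by_fan_in(build_import_graph("path/to/service"))` yields a shortlist of candidate modules for the first pilot; the path here is a placeholder.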
Fresh architecture paradigms...
01. Adopt module boundaries and import hygiene that make static graph extraction reliable, and persist the graph as a project artifact.
02. Bake graph lookups into agent workflows from day one, with guardrails that block edits if dependency traversal hasn’t run.
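Persisting the graph as a project artifact works best when the file is deterministic, so that diffs in review reflect real dependency changes. Below is a minimal sketch, assuming edges are (source, edge_type, target) triples; the JSON layout and the `schema_version` field are hypothetical choices, not a format defined by the paper or by MCP.

```python
import json


def serialize_graph(edges):
    """Render edge triples as stable, diff-friendly JSON text."""
    payload = {
        "schema_version": 1,  # hypothetical field so readers can detect format changes
        "edges": [
            {"source": src, "type": kind, "target": dst}
            for src, kind, dst in sorted(set(edges))  # dedupe and fix the order
        ],
    }
    return json.dumps(payload, indent=2, sort_keys=True) + "\n"


def parse_graph(text):
    """Load edge triples back from text produced by serialize_graph."""
    payload = json.loads(text)
    return [(e["source"], e["type"], e["target"]) for e in payload["edges"]]


# Hypothetical edges for illustration; input order does not affect the output.
sample = [
    ("app.repo", "INSTANTIATES", "Session"),
    ("app.repo", "IMPORTS", "app.db"),
    ("app.repo", "IMPORTS", "app.db"),
]
text = serialize_graph(sample)
restored = parse_graph(text)
```

Writing `text` to a tracked file in the repository lets the same artifact feed both code review and whatever server exposes the graph to agents.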