How Filename Lookups Flood an AI Coding Agent's Context Window

The Coder’s context included twenty barrel re-exports when it needed one.

The problem

The pipeline enriches a task manifest with full symbol details from a symbol registry. For each file in the manifest’s files_to_modify list, the enrichment step queries the registry with the path fragment from the manifest to get line ranges, types, and metadata. The Coder uses this to know exactly where to make changes.
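A minimal sketch of that enrichment flow, under stated assumptions. Only files_to_modify and path are named in the manifest; the SymbolDetail shape, the RegistryLookup type, the fragmentOf parameter, the runPipeline symbol, and the stub registry are hypothetical stand-ins, not the pipeline's actual code.

```typescript
// Hypothetical shapes: the post names only `files_to_modify` and `path`.
interface ManifestFile {
  path: string;
}

interface TaskManifest {
  files_to_modify: ManifestFile[];
}

interface SymbolDetail {
  file: string;
  name: string;
  kind: string; // e.g. "function" or "class"
  startLine: number;
  endLine: number;
}

// The registry is modelled as any function from a path fragment to symbols.
type RegistryLookup = (pathFragment: string) => SymbolDetail[];

// For each manifest file, derive a fragment from its path and query the registry.
function enrichManifest(
  manifest: TaskManifest,
  lookup: RegistryLookup,
  fragmentOf: (path: string) => string,
): Record<string, SymbolDetail[]> {
  const enriched: Record<string, SymbolDetail[]> = {};
  for (const file of manifest.files_to_modify) {
    enriched[file.path] = lookup(fragmentOf(file.path));
  }
  return enriched;
}

// Stub registry with one illustrative symbol (name and line range are made up).
const stubRegistry: SymbolDetail[] = [
  {
    file: "packages/api/src/server/pipeline/index.ts",
    name: "runPipeline",
    kind: "function",
    startLine: 10,
    endLine: 42,
  },
];

const stubLookup: RegistryLookup = (fragment) =>
  stubRegistry.filter((symbol) => symbol.file.includes(fragment));

const enrichedManifest = enrichManifest(
  { files_to_modify: [{ path: "packages/api/src/server/pipeline/index.ts" }] },
  stubLookup,
  (path) => path, // use the manifest path unchanged as the fragment
);

console.log(enrichedManifest["packages/api/src/server/pipeline/index.ts"]);
```

The fragmentOf step is kept as an explicit parameter because how the fragment is derived from the path decides how many files the registry returns.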

The manifest said "path": "packages/api/src/server/pipeline/index.ts". The enrichment extracted the filename: index.ts. The registry returned every symbol in every file named index.ts.

The target codebase is a seven-package TypeScript monorepo with package-root barrels everywhere, so the lookup came back with twenty index.ts files. Nineteen of them are noise.

A larger enterprise TypeScript estate can easily have fifty or a hundred of these barrels, one per package root and one per directory a team decided was package-shaped.

Why it matters

Each enrichment result adds tokens to the Coder’s context. A barrel is small on its own, but twenty of them together add up to several thousand tokens of export * from "./..." lines the Coder should never see. On a hundred-barrel codebase that becomes over ten thousand tokens of re-exports dragged in for every ticket, every time (a different cause from the redundant tool-call inflation, but the same symptom).

Worse, the twenty files look structurally identical to the Coder: same filename, same re-export pattern, different contents. If the manifest’s path scope is ambiguous in any way, the Coder can read the wrong barrel and produce code that targets the wrong package. It has not happened yet (the path field has been clear enough), but the risk is built into the lookup itself rather than being a one-off fluke.

The fix

One line changed in the enrichment code. The enrichment was extracting just the filename from the full path before passing it to the registry lookup, discarding the specificity already in the manifest. Passing the full path directly resolved it.

The registry matches on path substrings, so packages/api/src/server/pipeline/index.ts matches exactly one file while index.ts matches all of them. Twenty results down to one: same call, same registry, same lookup, just a longer search string.
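The before and after can be sketched as follows, assuming the registry behaves like a plain substring filter over stored paths. The matchFiles helper and every path except the manifest path are illustrative, and the four-entry list stands in for the real twenty.

```typescript
// Illustrative file index; only the first path comes from the manifest in this post.
const indexedFiles: string[] = [
  "packages/api/src/server/pipeline/index.ts",
  "packages/api/src/server/index.ts",
  "packages/shared/src/index.ts",
  "packages/shared/src/parsers/index.ts",
];

// Assumed registry behaviour: substring match on the stored path.
function matchFiles(fragment: string): string[] {
  return indexedFiles.filter((file) => file.includes(fragment));
}

const manifestPath = "packages/api/src/server/pipeline/index.ts";

// Before: the path is reduced to its filename, which every barrel shares.
const filenameOnly = manifestPath.split("/").pop() ?? manifestPath;
const before = matchFiles(filenameOnly);

// After: the manifest path is passed through unchanged.
const after = matchFiles(manifestPath);

console.log(before.length, after.length); // 4 1
```

With this sample data the filename matches all four barrels and the full path matches only the target, which mirrors the twenty-to-one drop described above.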

The bug is not that filenames are a bad key. It is that filenames were ever used when full paths were sitting right there in the manifest.

The principle

This is a context-pollution pattern that appears anywhere filenames are reused:

If your RAG pipeline or context-injection system looks up symbols by filename instead of full path, you are pulling in every symbol from every file with that name. The noise grows with the number of files sharing the name, which in practice grows with project size.

The fix is always the same. Use the most specific identifier available. If you have a full path, use the full path. Do not discard specificity for convenience.

Figure: two-panel diagram contrasting filename lookup with full-path lookup. The left panel shows a query for index.ts branching to five result files (shared/src/index.ts, shared/src/parsers, api/src/server, pipeline, services/backup) plus a label indicating fifteen more, totalling 20 matches. The right panel shows a query for packages/api/src/server/pipeline/index.ts resolving to exactly one file.

Polyglot by construction

The fix does not check the file extension, the language, or the framework. It uses whatever path string the manifest put in. Works for packages/api/src/server/pipeline/index.ts, src/auth/views.py, internal/handlers/search.go, or src/routes/mod.rs. No language-specific logic needed: the specificity comes from the manifest, not from the enrichment code.
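A small sketch of why no language-specific logic is needed, again assuming a substring-matching registry. The resolveFiles helper is hypothetical, and the extra paths (src/billing/views.py, src/models/mod.rs, packages/shared/src/index.ts) are made up to give each reused filename a collision.

```typescript
// Illustrative multi-language index; the colliding paths are invented for this example.
const polyglotFiles: string[] = [
  "packages/api/src/server/pipeline/index.ts",
  "packages/shared/src/index.ts",
  "src/auth/views.py",
  "src/billing/views.py",
  "internal/handlers/search.go",
  "src/routes/mod.rs",
  "src/models/mod.rs",
];

// Language-agnostic lookup: a plain substring match, with no extension checks.
function resolveFiles(fragment: string): string[] {
  return polyglotFiles.filter((file) => file.includes(fragment));
}

// Paths as a manifest might supply them, one per language named above.
const manifestPaths: string[] = [
  "packages/api/src/server/pipeline/index.ts",
  "src/auth/views.py",
  "internal/handlers/search.go",
  "src/routes/mod.rs",
];

for (const path of manifestPaths) {
  // Each full path resolves to exactly one file in this sample index.
  console.log(path, "->", resolveFiles(path).length);
}

// A bare filename is ambiguous in any language with a conventional file name.
console.log("mod.rs ->", resolveFiles("mod.rs").length);
console.log("views.py ->", resolveFiles("views.py").length);
```

The lookup never inspects the extension, so the same code path handles every language as long as the manifest supplies a full path.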

Anywhere the pipeline has a choice between a more-specific and a less-specific identifier, the more-specific one is almost always free. It is usually already in the data.


The pipeline runs on real tickets against a ~100k-line TypeScript monorepo. Still R&D.