The Registry Is Not a Fixed Input: Extend It, Don't Ask the Model

The pipeline I work on removes the language model from as many decisions as it can. The principle it runs on is that every fact a model copies is a fact it can hallucinate, so the model decides what to build and the pipeline supplies what exists, machine-extracted from a registry and the source. This is the Data Path Principle, and the registry is what makes it real: derived entirely from the codebase and kept in step with it as the code changes, it holds the facts about the code that the rest of the pipeline reasons over instead of asking the model.

The interesting failures happen at the edge of that registry, when a deterministic derivation needs a fact the registry does not store yet. There are two tempting wrong answers in that moment. One is to ask the model for the fact. The other is to scan the source for it, live, with a regex, at the point the decision is made. Both reintroduce exactly what the registry exists to remove. The model answer is a hallucination surface one layer up; the live-scan answer is a parser that drifts from the real one and breaks on the first case it did not anticipate. The right answer is the one that does not feel like progress in the moment: stop, and extend the registry to capture the fact.

That move is cheap, and the reason it is cheap is the whole point. The registry is not an input you receive and work around. It is an owned, fully derived artifact: every fact in it is machine-extracted from the code, nothing is hand-authored, and all of it can be regenerated from source at any time. Syncs are incremental, touching only the entries the changed code affects rather than rebuilding the whole thing, so keeping it current is cheap. But what makes extension cheap is not the sync mechanics; it is that nothing in the registry is precious. Teaching it to record a new kind of fact needs no data migration, because there is no hand-curated history to preserve, only a new fact that starts being derived alongside all the others. Declaring a new fact is nearly free, and extending the registry is the completion of “derive, don’t guess”: when the fact you need to derive a decision is missing, you do not fall back to guessing, you make the fact derivable. Three times this past month, a new ticket shape needed a fact the registry did not have, and each time it was a new extraction, not a new prompt.
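To make "no migration, only a new derivation" concrete, here is a minimal sketch of a registry built purely from extractors. Everything in it is an assumption for illustration: `ParsedFunction`, `Extractor`, and `deriveRegistry` are hypothetical names, not the pipeline's real schema, and the sketch rebuilds everything on each call rather than syncing incrementally.

```typescript
// Hypothetical parsed form of one function; a stand-in for what a real parser would yield.
interface ParsedFunction {
  name: string;
  returnType: string;
  callees: string[];
}

// Each fact is a pure function of the parsed source.
type Extractor = (fn: ParsedFunction) => unknown;
type Entry = Record<string, unknown>;

// Because every fact is derived, the whole registry can be regenerated at will.
function deriveRegistry(
  fns: ParsedFunction[],
  extractors: Record<string, Extractor>,
): Record<string, Entry> {
  const registry: Record<string, Entry> = {};
  for (const fn of fns) {
    const entry: Entry = {};
    for (const factName of Object.keys(extractors)) {
      const extract = extractors[factName];
      if (extract !== undefined) entry[factName] = extract(fn);
    }
    registry[fn.name] = entry;
  }
  return registry;
}

const baseExtractors: Record<string, Extractor> = {
  returnType: (fn) => fn.returnType,
};

// Extending the registry is one more extractor; no stored data has to be migrated.
const extendedExtractors: Record<string, Extractor> = {
  ...baseExtractors,
  callees: (fn) => fn.callees,
};
```

Re-deriving with `extendedExtractors` yields the old facts plus the new one for every symbol, which is the property that makes an extension safe to add at any time.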

Figure: a decision diagram. A deterministic derivation needs a missing fact. Two red branches, asking the model and regexing the source live, are marked as anti-patterns; the green branch, extending the registry, is the correct move.

Three facts a new shape forced into the registry

The first was the guard conditions on a call. The pipeline learned to handle a new flag that lands on an options object already carrying another flag, where both flags gate the same downstream call in opposite directions. Deciding which flag wins is a real decision, and it is the operator’s to make, but the pipeline first has to detect that the conflict exists at all. The naive version asks the model “do these two flags interact?” and the model is bad at it; in one probe it got the precedence right zero times out of five. The conflict is not a judgment call, though. It is mechanically present in the code: both flags appear as guard terms on the edge to the same callee. So the registry grew a place to store the guard conditions under which each call runs, extracted at sync time with Tree-Sitter, and a deterministic detector reads them and surfaces the conflict for the operator to resolve. The model no longer infers the interaction. It is read off the call edge, and the operator decides precedence against a fact rather than a guess.
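A deterministic detector over stored guard conditions can be quite small. The sketch below assumes a hypothetical edge shape (`CallEdge`, `GuardTerm`) in which each guard records the flag it reads and the value that flag must have for the call to run; the real registry layout may differ. It only detects the conflict and leaves precedence to the operator.

```typescript
// Hypothetical shapes; the real registry schema is not shown in this post.
interface GuardTerm {
  flag: string;      // option name read by the guard
  requires: boolean; // value the flag must have for the call to run
}

interface CallEdge {
  caller: string;
  callee: string;
  guards: GuardTerm[]; // extracted at sync time, never inferred
}

interface GuardConflict {
  caller: string;
  callee: string;
  flags: [string, string];
}

// Two distinct flags conflict when they guard the same call with opposite polarity.
function findGuardConflicts(edges: CallEdge[]): GuardConflict[] {
  const conflicts: GuardConflict[] = [];
  for (const edge of edges) {
    edge.guards.forEach((a, i) => {
      for (const b of edge.guards.slice(i + 1)) {
        if (a.flag !== b.flag && a.requires !== b.requires) {
          conflicts.push({
            caller: edge.caller,
            callee: edge.callee,
            flags: [a.flag, b.flag],
          });
        }
      }
    });
  }
  return conflicts;
}
```

The output is a list of facts for the operator to rule on. Nothing in it guesses which flag should win.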

The second was the member types behind a named return type. When the pipeline learned to thread two independent flags through one call chain, the behavioral oracle for the second flag had to make an upstream call return a success value so the code under test would reach the gated call. That upstream call returned a named type, Result, rather than an inline shape, and building a valid success value for it means knowing its fields. The registry stored that a function returned Result; it did not store what Result contained. So the Synthesizer built an empty object where the real success value needed a discriminator and a payload, the oracle ran against a malformed mock, and it failed two of its six cases.

Extending the registry to store the field types behind a named type, populated from the language server at sync time, was the correction. With the type’s members available as a machine fact, the success value is built correctly: the oracle went from 4/6 passing to 6/6, and the route test from 32/37 to 37/37, on the run that proved the multi-flag shape end to end.
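As an illustration of what the member fact buys, here is a sketch of a success-value builder that reads a named type's members and refuses to build anything when they are absent. The `FieldType` encoding, the `NamedTypeFact` shape, and the field names used in the example (`ok`, `value`) are assumptions made for the sketch, not the registry's actual representation of Result.

```typescript
// Hypothetical encoding of a member's type as the registry might record it.
type FieldType =
  | { kind: "literal"; value: string | number | boolean }
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "boolean" }
  | { kind: "object"; fields: Record<string, FieldType> };

interface NamedTypeFact {
  name: string;
  // Members of the success variant; undefined models a registry that does not record them yet.
  successMembers?: Record<string, FieldType>;
}

function buildObject(fields: Record<string, FieldType>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const name of Object.keys(fields)) {
    const fieldType = fields[name];
    if (fieldType !== undefined) out[name] = buildValue(fieldType);
  }
  return out;
}

// Literals (such as a discriminator) are copied; other kinds get a neutral placeholder value.
function buildValue(t: FieldType): unknown {
  switch (t.kind) {
    case "literal":
      return t.value;
    case "string":
      return "";
    case "number":
      return 0;
    case "boolean":
      return false;
    case "object":
      return buildObject(t.fields);
  }
}

// Fail loudly instead of returning {} when the fact is missing.
function buildSuccessValue(fact: NamedTypeFact): Record<string, unknown> {
  if (fact.successMembers === undefined) {
    throw new Error(`registry has no members for ${fact.name}; extend extraction instead of guessing`);
  }
  return buildObject(fact.successMembers);
}
```

Throwing on a missing fact is a design choice in this sketch: an empty object would let the oracle run against a malformed mock, which is the failure described above.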

The third was a server-assigned default. When the pipeline learned to build a brand-new endpoint from scratch, the new route returned a count of items in each status, and the response oracle had to assert every status, including the ones the test could never put an item into because the status is assigned by the server, not the request. To seed a record into one of those states, the test needs the default the schema assigns, and to assert the full set of statuses it needs the complete member list of the status enum. Neither was stored. So the registry grew the enum’s full member set and a per-language extractor for the schema column’s default value. With both as machine facts, the response oracle became exhaustive, seeding one record into every status through the application’s own create-then-update path, and the mutation gate killed all three contract mutants against real per-status values, full suite green at 956.
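The two facts combine into an exhaustive oracle roughly as follows. This is a sketch under assumptions: `StatusFacts`, `planSeeds`, and `expectedCounts` are hypothetical names, and the status values used with it are invented for illustration. It plans one record per status, created at the schema default and then updated when the target differs.

```typescript
// Hypothetical view of the two facts the registry grew.
interface StatusFacts {
  enumMembers: string[]; // full member list of the status enum
  columnDefault: string; // default the schema assigns on create
}

interface SeedPlan {
  status: string;
  updateTo: string | null; // null when the created record already has the target status
}

// One record per status: create it (it lands on the default), then update if needed.
function planSeeds(facts: StatusFacts): SeedPlan[] {
  if (facts.enumMembers.indexOf(facts.columnDefault) === -1) {
    throw new Error(`default "${facts.columnDefault}" is not a member of the enum`);
  }
  return facts.enumMembers.map((status) => ({
    status,
    updateTo: status === facts.columnDefault ? null : status,
  }));
}

// With one seeded record per status, the response must report a count of 1 for every member.
function expectedCounts(facts: StatusFacts): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const status of facts.enumMembers) counts[status] = 1;
  return counts;
}
```

Because both functions iterate the stored member list, a status added to the enum later shows up in the plan and in the assertion after the next sync, with no test edit.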

In each of the three cases what was missing was a fact the registry should have been recording all along, not a new prompt, a new model call, or a regex run at decision time. Each was added once, populated at sync, and read deterministically forever after.

Table: three new ticket shapes mapped to the registry fact each one forced to be added and the deterministic decision it enabled. Colliding flags needed guard conditions, multi-flag needed named-return-type members, and greenfield needed enum members and column defaults.

The anti-pattern: green against a shape that never happens

The second of those three is also the one that taught me the failure mode to watch for, because before the registry grew the field-types fact, the logic that consumed it was already covered by unit tests, and those tests were green.

They were green because they ran against a hand-built type with its fields filled in. The test set up a named return type the way the code wished the registry would provide it, ran the success-value builder against that, and asserted the right object came out. It did, every time. The logic was correct. What the tests never checked was whether the registry actually produced that shape at sync time, and it did not, because the registry was not yet recording that fact. The live registry handed the same code an empty type, and the code did exactly what the tests proved it would do with a full one, which was the wrong thing for the input it actually got. The unit tests proved the logic correct against a data shape that production never produced.

This is the trap with deterministic pipelines that lean on a derived store. A unit test can fix the input shape to whatever the author imagined, and a populator or a Synthesizer keyed on a field that only the test fixture fills will pass its unit test forever while being dead on live data. The green is real, and it is meaningless, because the input it validates against does not occur. The remedy is never a better assertion. It is to make the live registry produce the shape the test assumes, which usually means extending extraction, and then to verify against a frozen sample of real output rather than a hand-built fixture. A fixture you wrote is a record of what you expected; a frozen sample of real sync output is a record of what the registry actually emits, and only the second one can catch this.
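The difference between the two kinds of input is easy to show. In this sketch the entry shape, `unusableSymbols`, and both data sets are hypothetical; in practice the frozen sample would be captured from a real sync and loaded from disk, and it is inlined here only to keep the example self-contained.

```typescript
// Hypothetical registry entry for a function's return type.
interface ReturnTypeEntry {
  symbol: string;
  returnType: string;
  members?: Record<string, string>; // the fact the consumer depends on
}

// The consumer can only build a success value when members are present and non-empty.
function hasUsableMembers(entry: ReturnTypeEntry): boolean {
  return entry.members !== undefined && Object.keys(entry.members).length > 0;
}

// Symbols whose entries the consumer could not use.
function unusableSymbols(sample: ReturnTypeEntry[]): string[] {
  return sample.filter((e) => !hasUsableMembers(e)).map((e) => e.symbol);
}

// What the author expected the registry to emit.
const handBuiltFixture: ReturnTypeEntry[] = [
  { symbol: "loadUser", returnType: "Result", members: { ok: "true", value: "User" } },
];

// Stand-in for a frozen sample of real sync output taken before the extension:
// the same symbol, with the members fact simply absent.
const frozenSyncSample: ReturnTypeEntry[] = [
  { symbol: "loadUser", returnType: "Result" },
];
```

`unusableSymbols(handBuiltFixture)` is empty and `unusableSymbols(frozenSyncSample)` names `loadUser`. The same check passes or fails depending only on whose idea of the input it runs against.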

Extending is the easy half; populating live is the discipline

Adding the fact is cheap. Proving it populates on a real run is the expensive half, and that is where the cost of an extension really lives, not in declaring the new fact.

A single registry fact can have more than one population path: a sync fills the registry from several places, not one, and a new fact usually has to be wired into each of them. Wire it into one and miss the others, and it populates on some symbols and not others, which looks just like a subtle logic bug downstream. The only way to know every path is covered is to run a real sync and read the value back off the live registry, not to watch a unit test go green.
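One way to read the value back is a per-path coverage report over live entries. The sketch assumes each entry records which population path wrote it (a hypothetical `populatedBy` field) and that facts live in a plain map; neither is confirmed by anything above, and the path names in the usage are invented.

```typescript
// Hypothetical live registry entry, tagged with the population path that wrote it.
interface LiveEntry {
  symbol: string;
  populatedBy: string;
  facts: Record<string, unknown>;
}

interface PathCoverage {
  populated: number;
  missing: string[]; // symbols on this path where the fact is absent
}

// Group entries by population path and report where the fact did not land.
function coverageByPath(entries: LiveEntry[], factName: string): Record<string, PathCoverage> {
  const report: Record<string, PathCoverage> = {};
  for (const entry of entries) {
    const bucket: PathCoverage = report[entry.populatedBy] ?? { populated: 0, missing: [] };
    if (entry.facts[factName] !== undefined) {
      bucket.populated += 1;
    } else {
      bucket.missing.push(entry.symbol);
    }
    report[entry.populatedBy] = bucket;
  }
  return report;
}
```

A path with zero populated entries and a non-empty `missing` list points at a path that was never wired, rather than at a logic bug in a downstream consumer.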

This is also where the registry’s nature as a long-lived service bites. The extraction code runs inside a supervisor process that caches its modules at startup, so even a correctly wired population path can be fully right in the source and completely absent from the live registry, because the resident extractor is still running the code it booted with, not the code you just wrote. That failure mode is disorienting enough to deserve its own post; here it just adds one clause to the rule above. The real sync you read the value back from has to be running on a freshly restarted stack, or the value you read is yesterday’s, and you have shipped a fact that is right in the code and missing from the registry.

Why this is the right cost

It would be faster, in the moment, to ask the model for the guard interaction, or to regex the source for the enum members, or to let the oracle guess a success value. Each of those is one fact, and each looks like a small, local shortcut. The cost is that you have re-imported non-determinism one fact at a time, into the exact layer you built the registry to keep deterministic, and you find out at the worst time, when a model guesses wrong on a real ticket and a green run ships a broken feature.

Treating the registry as extensible inverts that. When a derivation needs a fact, the fact becomes part of the owned, machine-extracted substrate, derived the same way every other fact is, and the decision built on it is deterministic for every future ticket of that shape and every shape that comes after. The registry stopped being a fixed input I worked around and became the thing I grow whenever a new capability needs a new fact. That is why the recent run of new ticket shapes came as cheaply as it did: most of the per-shape work was an extraction, not a prompt.

This is one TypeScript codebase, and still R&D, and each extension is a real population path to wire and verify rather than a free lunch. But the direction is settled.

When the pipeline does not know a fact, the answer is to extend the thing that knows facts, never to ask the thing that invents them.