Agent Pipeline Grounding Chat: From Free-Form Q&A to Typed Fields
Two open quality items are closed. Operator design decisions flow through typed handoff fields. The grounding chat has a structural defense against LLM drift.
The pipeline completed a multi-layer TypeScript feature with zero LLM code generation: 6/6 ticket tests and 959/959 suite tests passing, at $0.308 versus $0.681.
A new pre-autonomous chat stage lets the operator ground design decisions in the registry before the pipeline runs, adding a new top level to the trust hierarchy.
Seven pipeline runs, one ticket, four architectural eras. Per-test cost dropped from $0.385 to $0.074 by replacing LLM guesswork with structural derivation.
What shipped in the three weeks after the 248-run hallucination ceiling: removing the LLM from computable decisions and validating everything else.
The data model behind the symbol registry: per-symbol records, file-level hashes, call-graph edges, and the invalidation strategy that keeps it current.
Tree-Sitter tells you where a symbol is defined. It cannot tell you where it is called. Discovering that gap cost one pipeline run 33,000 tokens.
A Haiku optimization made the L2 quality gate silently pass on every run. The fix was removing the LLM call entirely.
Not a model capability problem. An agent with the wrong codebase version produces output that is plausible but wrong in ways that are hard to catch.