LLM Hallucination

JUNE 10, 2026

The Decision an AI Coding Agent Can't Make Alone: Operator-Grounded Intent Capture

A deterministic resolver passed every unit test, then failed 40% of runs. Operator confirmation worked where a smarter algorithm did not.

MAY 14, 2026

Grounding an Autonomous Engineering Pipeline in Operator Design: The Pre-Autonomous Chat Stage

A pre-autonomous chat stage lets the operator ground design decisions in the registry before the pipeline runs, adding a new top level to the trust hierarchy.

MAY 9, 2026

From LLM Luck to Structurally Guaranteed: One Ticket Across Four Architectural Eras

Seven pipeline runs, one ticket, four architectural eras. Per-test cost dropped from $0.385 to $0.074 by replacing LLM guesswork with structural derivation.

MAY 6, 2026

Engineering Around LLM Non-Determinism: The Architectural Follow-Up to 248 Runs

What shipped in the three weeks after the 248-run hallucination ceiling: removing the LLM from computable decisions and validating everything else.

APRIL 17, 2026

Per-Field Hallucination Fixes Hit a Ceiling: 248 Runs on an AI Coding Agent

A Bernoulli model predicted 36% first-pass success across 248 pipeline runs. Measured: 21%. The gap explains why per-field hallucination fixes have a ceiling.