Why Architecture Gaps Need a Close Condition, Not a Backlog

Most software projects track technical debt in a backlog. Few keep a record of why each problem exists, what risk it carries, and the specific condition under which it must be closed. An architecture-gaps document is that second kind of record.

I have been maintaining one for an autonomous engineering pipeline, an AI coding agent that plans, writes tests, writes code, and reviews its own output. A previous post on this blog (Fixture-First Development as an Early Warning System for AI Pipelines) introduced the catalogue-vs-backlog distinction in the context of a specific run. This post is the deeper look at what makes the document itself work.

The document currently has over 240 entries. About half are closed, some are deferred, and the rest are open. Every entry follows the same structure, and that structure is the reason the document works.

What a gap entry requires

A useful gap entry has three fields that cannot be omitted. Without all three, the entry degrades into a backlog item that gets periodically reviewed and periodically deferred.

What the gap is. Not a title. A description of the specific deficiency, including what was observed when the gap was discovered. “The Coder appended a duplicate symbol on retry instead of replacing the existing one. File grew past 1,600 lines. Next attempt timed out.” That is not a summary. It is the incident that made the gap visible, preserved so the next reader understands the severity without having to reproduce it.

Why it is not closed yet. This is the field most backlogs omit. “Fix this later” is not a reason. The reason has to be specific. “The fixture files are 50 to 200 lines, small enough that the workaround holds. The risk is deferred to the first production-sized file.” That tells the reader exactly what is being bet on and what would break the bet. The reader can disagree, but the disagreement is informed.

The close condition. This is what turns a gap into a tripwire rather than a wish. A close condition is not “eventually.” It is a concrete trigger. “Must resolve before the pipeline runs against any file over 500 lines.” When the trigger fires, the gap is no longer deferrable. The decision to defer was correct at the time it was made, and the close condition is how the document knows when “at the time” has expired.
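
The three required fields can be sketched as a small record type. This is a minimal illustration, not the pipeline's actual schema: the class name GapEntry, its field names, and the rule that a blank field is rejected are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapEntry:
    """Hypothetical gap record; the field names are illustrative."""

    what: str             # the deficiency, including the incident that exposed it
    why_deferred: str     # the specific bet being made by not fixing it now
    close_condition: str  # the concrete trigger that ends the deferral

    def __post_init__(self) -> None:
        # Reject any entry where one of the three fields is missing or blank.
        for name in ("what", "why_deferred", "close_condition"):
            if not getattr(self, name).strip():
                raise ValueError(f"gap entry is missing required field: {name}")


entry = GapEntry(
    what="Coder appended a duplicate symbol on retry instead of replacing it.",
    why_deferred="Fixture files are 50 to 200 lines; the workaround holds.",
    close_condition="Before the pipeline runs against any file over 500 lines.",
)
```

Refusing to construct an incomplete entry is one way to keep it from quietly degrading into a backlog item. A document maintained by hand gets the same effect from a review habit.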

Three shapes a gap takes after it is filed

Not every gap gets fixed in the way the entry originally proposed. The document tracks three outcomes, and the reasoning for each is preserved, not just the status.

Closed by fix. The trigger fired, the gap was fixed. The entry records what changed and when. This is the expected path and the least interesting one.

Invalidated. A later architectural decision made the gap irrelevant. One early gap proposed building import-path resolution into the registry via Tree-Sitter walkers, so the Planner could tell the Coder which import paths to use. When the language server (LSP) was integrated, it provided the same resolution semantically and more accurately. The gap was marked invalidated with a one-line note: “LSP provides this. Do not build.” Three other gaps in the same area were invalidated the same way, by the same architectural decision, on the same day. The entry stays in the closed archive so a reader who has the same idea later finds the reasoning rather than proposing the same work.

Closed by construction. The code path the gap described was deleted entirely. When the pipeline moved from a file-reconstruction approach to an AST-based operation model, six gaps related to reconstruction corruption became unreachable: the code they described no longer existed. The entries were moved to the closed archive with a note linking to the design document that retired them. A reader examining “what happened to the reconstruction bugs” finds the answer without having to trace the git history.
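
The three outcomes, and the rule that the reasoning travels with the status, can be sketched as follows. The names Outcome, Closure, and close_gap are hypothetical, chosen for this example and not taken from the project.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    CLOSED_BY_FIX = "closed by fix"
    INVALIDATED = "invalidated"
    CLOSED_BY_CONSTRUCTION = "closed by construction"


@dataclass(frozen=True)
class Closure:
    outcome: Outcome
    reasoning: str  # preserved in the closed archive alongside the status


def close_gap(outcome: Outcome, reasoning: str) -> Closure:
    """Close a gap only if the reasoning is recorded with the status."""
    if not reasoning.strip():
        raise ValueError("a closure needs its reasoning, not just a status")
    return Closure(outcome, reasoning.strip())


closure = close_gap(Outcome.INVALIDATED, "LSP provides this. Do not build.")
print(f"{closure.outcome.value}: {closure.reasoning}")
# prints: invalidated: LSP provides this. Do not build.
```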

A small sample from the live document:

Gap	Why deferred	Close condition	Outcome
Coder appends duplicate symbols on retry instead of replacing. File grows past capacity, next attempt times out.	Fixture files are 50–200 lines; workaround holds at that scale.	First production-sized file (500+ lines).	Closed by construction. Code path deleted when the pipeline moved to AST-based operations.
No structural gate preventing Coder from writing to test files. Prompt rule only.	Cost per occurrence is low on the fixture.	First real-project run where a test-file write burns an expensive retry.	Closed by fix. Sandbox enforcement added.
Registry-based import-path resolution proposed via Tree-Sitter walkers.	Not yet needed; Planner infers paths from context.	When import resolution accuracy blocks a ticket.	Invalidated. Language server (LSP) provides the same resolution semantically. Do not build.
Mock-target field is LLM-emitted; Planner can hallucinate the module path.	Backstop validator catches the common case.	When the validator’s reject-and-retry loop fails to converge on a real ticket.	Open. Feasibility gate designed but not yet built. Close condition has not fired.

The fourth entry is included because it represents the state most gap entries spend the longest in: open, deferred, with a close condition that has not yet fired. The document’s value is not in the entries that are closed. It is in the entries that are waiting.

What the document is not

The architecture-gaps document is not a backlog, not a sprint board, not a prioritisation tool. It does not answer “what to work on next.” It answers “what is currently being bet will not break, and what would change that bet.”

The document is also not a replacement for a failure-modes document. (Earlier posts on this blog use the two names loosely; this post uses them precisely.) Gaps and failure modes are different kinds of things. A gap is a missing architectural feature. A failure mode is an observed failure. A failure mode can be caused by a gap, but it can also be caused by a model behavioural pattern, a prompt phrasing, or an environmental issue. The two documents cross-reference each other, but they track different dimensions.

Where the document earns its keep

The document earns its keep at two specific moments.

The first is when a new capability arrives and changes the status of several existing gaps at once. When the language server was integrated, three gaps about import resolution were invalidated in one pass, and two gaps about blast-radius detection moved from “deferred” to “in progress” because the technology they had been waiting for was now available. Without the document, those two gaps would have required someone to remember they existed. With the document, the close condition was already written, and grepping for “LSP” or “language server” surfaced them immediately.
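
The keyword search described above amounts to a case-insensitive substring match over every field of every entry. A minimal sketch, assuming entries are held as plain dictionaries: the find_gaps helper and the dictionary keys are hypothetical, and the blast-radius entry uses invented wording because the real entry is not quoted in this post.

```python
def find_gaps(entries, *keywords):
    """Return entries whose text mentions any keyword, case-insensitively."""
    needles = [k.lower() for k in keywords]
    matches = []
    for entry in entries:
        # Search all fields, so a close condition can surface an entry too.
        text = " ".join(entry.values()).lower()
        if any(n in text for n in needles):
            matches.append(entry)
    return matches


entries = [
    {
        "gap": "Registry-based import-path resolution proposed via Tree-Sitter walkers.",
        "close_condition": "When import resolution accuracy blocks a ticket.",
        "note": "Invalidated. LSP provides this. Do not build.",
    },
    {
        # Illustrative wording only; not the real entry text.
        "gap": "Blast-radius detection (illustrative wording).",
        "close_condition": "When a language server is integrated (illustrative wording).",
        "note": "",
    },
    {
        "gap": "Mock-target field is LLM-emitted; Planner can hallucinate the module path.",
        "close_condition": "When the validator's reject-and-retry loop fails to converge on a real ticket.",
        "note": "",
    },
]

hits = find_gaps(entries, "LSP", "language server")
for hit in hits:
    print(hit["gap"])
```

The search finds the deferred entry only because its close condition names the capability it is waiting for. That is a practical reason to write close conditions in concrete terms.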

The second is when a decision about scope has to be made under pressure. The pipeline hit a problem where the Planner’s output contained a field that was structurally unsatisfiable, a mock target that could never intercept the call it was supposed to mock. The obvious fix was to mechanise the mock-target field from the registry. The gaps document had an entry that said the underlying problem was not the mock target but the criterion shape, with a close condition tied to the feasibility gate being designed. Without that entry, the quick fix would have shipped and the real fix would have been deferred indefinitely, because the quick fix would have looked like it worked. With the entry, the scope decision was informed by a written record of the actual root cause.

The honest limitation

The document works only if it is read. It is discipline, not enforcement. Writing a gap down does not prevent it from being ignored. It does not guarantee the close condition is checked when the trigger fires. It does not prevent someone from shipping the quick fix anyway.

What it does is make the risk visible.

A gap that is documented and deferred is a conscious bet. A gap that is undocumented and deferred is an invisible one.

The first can be evaluated, challenged, and re-evaluated when conditions change. The second is discovered in production, under pressure, with no prior analysis to accelerate the diagnosis.

The difference is not between perfect and broken. It is between “this was known, and here is why it was deferred” and “this was not known.” That gap in visibility is, in my experience, where most of the expensive engineering surprises come from: not from problems that are hard, but from problems that were known and lost.

What I would tell someone starting one

Start it the first time a decision is correct now but will be wrong later. Do not wait until there is a formal process. Do not wait until there are enough entries to justify a document. One entry is enough if it has the three fields.

The overhead is small. An entry takes a few minutes to write if you write it while the decision is fresh. Writing it later, when the context has faded and the reasoning has to be reconstructed, takes much longer and produces a worse entry.

The three fields are non-negotiable. Without the close condition, the entry is a note. Without the reasoning for deferral, the entry is a task. With all three, the entry is a decision record that remains useful long after the person who wrote it has moved on to something else.

The pipeline runs inside Docker on real tickets. Gap entries referenced in this post are from the project’s live architecture-gaps document, which has over 240 entries spanning the full lifecycle of the system. Still R&D.