Termination Is the First Thing to Design in an Agent, Not the Last

An agent is a loop, and a loop that only knows how to continue is a loop with no way out. Its strength is exactly this: it keeps choosing a next action in light of what the last one returned, so it can pursue goals no fixed script anticipates. Its characteristic failure is the same trait with nothing holding it back. An agent that has never been told what “done” looks like, how much work is too much, or when a problem has left its competence will keep perceiving, reasoning, acting, and observing until something outside it intervenes: a quota, a bill, or a person noticing the damage. How the loop ends is not the epilogue of the design. It is one of the parts the whole system leans on, and it is usually the part left implicit until an unattended agent makes the omission expensive.

Treating this as a performance question, a matter of not wasting compute, undersells it badly. When to stop, when to retry, and when to hand off are reliability decisions, and getting them wrong produces consequences no amount of throughput makes up for. That is the case for designing termination first. An agent whose exits are specified in advance is one you can operate; an agent whose only exit is exhaustion is one you can only demo.

The supervisor you designed out without noticing

The reason termination gets deferred is that it costs nothing during development. Someone runs the agent by hand, watches it work, sees an answer appear, and closes the session. In that setting the exit condition is a human being paying attention, and it works so smoothly that it never registers as a component. What that developer built and tested was not an agent. It was an agent plus an attentive operator, and only one of the two was ever written down. The moment the same code runs in a pipeline or against a stranger’s request, the operator is gone and nothing has taken over the job they were quietly doing. Building that second half on purpose, before the first unattended run rather than after the first incident, is the whole difference.

A loop should end for four different reasons, and success is only one

Most first implementations check for exactly one exit: the work got done. That is the outcome the agent exists to produce, so it is the one the author thinks to detect. It is also the rarest of the reasons a running agent actually needs to quit, and building for it alone leaves the loop helpless in every case that is not the happy path.

Completion is the first reason and the cleanest. The agent has produced the output it was asked for, and no further step remains. The other three are the ones that get skipped. A run can accumulate failures, one after another, to the point where pressing on almost never recovers and usually deepens the hole; past that count the loop should give up rather than keep paying to fail. A run can hit the ceiling its runtime imposed on how much work it is permitted, at which point it has to halt on the spot and return whatever it managed, finished or not. And a run can be told to stop, directly, by an operator or the user, and it has to drop what it is doing and comply rather than insisting on completing its current thought.

The bug that hides in a completion-only loop is quiet precisely because completion behaves well. A repeatedly failing agent never notices, because failing is not finishing. An agent that has vastly overshot a sensible amount of effort does not object, because it was never keeping count. An agent told to stop has nowhere to route the instruction. Each of those three exits has to be a first-class test evaluated on every pass of the loop, sitting alongside “am I done” as an equal. Left out, they go missing in exactly the situations where stopping is the only correct move.
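One way to picture the four exits as equal tests is the minimal sketch below. It is an illustration under stated assumptions, not a reference implementation: the names run_agent, StopReason, and RunResult, the step and is_done callables, and the default limits are all hypothetical, and a raised exception stands in for "a failure."

```python
import threading
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class StopReason(Enum):
    COMPLETED = "completed"
    TOO_MANY_FAILURES = "too_many_failures"
    BUDGET_EXHAUSTED = "budget_exhausted"
    CANCELLED = "cancelled"


@dataclass
class RunResult:
    reason: StopReason          # why the loop ended
    turns: int                  # how many passes were actually run
    outputs: List[str] = field(default_factory=list)  # partial or full work


def run_agent(
    step: Callable[[int], str],             # hypothetical: one pass of the agent
    is_done: Callable[[List[str]], bool],   # hypothetical completion criterion
    max_turns: int = 10,                    # assumed default
    max_consecutive_failures: int = 3,      # assumed default
    cancel: Optional[threading.Event] = None,
) -> RunResult:
    cancel = cancel or threading.Event()
    outputs: List[str] = []
    failures = 0
    turns = 0
    while True:
        # All four exits are tested on every pass, before any new work starts.
        if cancel.is_set():
            return RunResult(StopReason.CANCELLED, turns, outputs)
        if is_done(outputs):
            return RunResult(StopReason.COMPLETED, turns, outputs)
        if failures >= max_consecutive_failures:
            return RunResult(StopReason.TOO_MANY_FAILURES, turns, outputs)
        if turns >= max_turns:
            return RunResult(StopReason.BUDGET_EXHAUSTED, turns, outputs)

        turns += 1
        try:
            outputs.append(step(turns))
            failures = 0            # a success resets the failure streak
        except Exception:
            failures += 1           # failures "one after another" accumulate
```

Every return carries a reason and whatever work exists, so a run that ends on the budget or on a stop signal still hands back its partial output.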

Three separate mechanisms answer three separate questions

The four exits describe when a loop should end. Making them real requires three mechanisms that people routinely blur together under “when the agent stops,” even though they live at different layers and answer different questions.

The first, a termination condition, is a criterion written into the task’s own logic that lets the agent conclude it has succeeded: the answer exists, the file is written, the question is settled. Absent one, the agent has no affirmative reason to stop and every local reason to take one more step. The second, a turn budget, is a ceiling on iterations enforced not by the agent but by the layer running it, and it fires when the count is reached regardless of what the agent believes about its progress. It is an enforced limit, not advice, and it never consults the agent. The third, escalation, is a handoff: when the agent can go no further under its own power, it passes the problem to a person or a fallback path rather than continuing to churn.

These are not three labels for one idea. The completion criterion answers whether the work is done. The iteration ceiling answers whether the agent has run long enough that its own judgment can no longer be trusted to say so. The handoff answers what to do when the agent is neither finished nor safe to keep looping. A serious agent carries all three, because each rescues a case the others cannot see. A completion criterion by itself assumes every run reaches a tidy end. A ceiling by itself stops a runaway but discards any chance of an orderly exit. A handoff by itself has no trigger unless something is watching for the moment to fire it. Specified together, they close every branch by which the loop can end.

An unbounded loop is a failure you design against

When an agent runs forever, the cause is almost always one of a small, familiar set, and none of them requires a strange input to trigger. The plainest is simply the absence of a completion criterion: with no definition of success, the agent finds a reason to continue at every step and never a reason to stop. Close behind is an operation that fails identically every time, retried on a loop with no cap on retries, so the agent spends turn after turn on a call that was never going to return anything else. The hardest to spot is a reasoning cycle, where the model arrives back at the same fork on iteration after iteration, with nothing new in hand to decide it differently, and chooses plausibly and identically each time while the work goes nowhere. Nothing raises an error. The agent just circles.

These are not corner cases waiting on an adversary. They surface on ordinary tasks, during ordinary development, the first time an agent is handed enough latitude to reach them. What makes them tractable is that the countermeasure is fixed and known ahead of time: a real completion criterion together with a firm cap on iterations, present in the loop before its first run. That is worth saying plainly, because it overturns the usual reflex. An unbounded loop is not a defect you locate after it shows itself. It is a category you rule out by construction, by refusing to ship a loop that has no bound. The cap costs almost nothing and forecloses a class of failure that, left open, arrives eventually with certainty.
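The uncapped-retry case in particular can be closed with very little code. The sketch below is one possible shape, assuming a failure surfaces as a raised exception; call_with_cap, RetriesExhausted, and the default of three attempts are illustrative names and values, not part of any particular library.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RetriesExhausted(Exception):
    """Raised when the cap is reached; keeps the errors for later inspection."""

    def __init__(self, errors: List[Exception]):
        super().__init__(f"gave up after {len(errors)} attempts: {errors[-1]!r}")
        self.errors = errors
        # True when every attempt failed in the same way, which suggests the
        # call was never going to return anything else.
        self.identical = len({(type(e).__name__, str(e)) for e in errors}) == 1


def call_with_cap(op: Callable[[], T], max_attempts: int = 3) -> T:
    """Run op, retrying on failure, but never more than max_attempts times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    errors: List[Exception] = []
    for _ in range(max_attempts):
        try:
            return op()
        except Exception as exc:
            errors.append(exc)
    raise RetriesExhausted(errors)
```

The identical flag is a small extra: it lets the caller tell a call that failed the same way every time from one that failed differently on each attempt.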

The turn budget only works if the runtime enforces it

One choice inside the turn budget determines whether it protects anything at all: who does the counting. The count and the cutoff have to sit in the code that drives the agent, never in the agent’s own reasoning. The tempting alternative is to tell the model to keep a tally of its passes and quit once it has spent enough, and as a safety mechanism this fails on inspection. A model asked to police its own iteration count is exactly as dependable as the model, subject to the same drift, context loss, and misreading that made a hard limit necessary to begin with. Handing the confused party responsibility for detecting its own confusion is circular by design.

Enforcement in the runtime has the property the model lacks: it is deterministic. The runtime holds the count as an ordinary variable, tests it against the configured maximum on each pass, and when the maximum is reached it stops the agent and returns whatever partial work exists. It does not deliberate about whether stopping is warranted. It stops. That indifference is the entire value. A limit that lives in code still holds when the agent’s judgment has come apart, which is the exact circumstance in which the limit has to hold.
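A sketch of what that looks like when the count lives in the driver is below. The function name drive and the shape of agent_step (it receives the history so far and returns an output plus its own claim about being done) are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple


def drive(
    agent_step: Callable[[Sequence[str]], Tuple[str, bool]],  # hypothetical agent pass
    max_turns: int,
) -> Tuple[List[str], str]:
    """Run agent_step until it reports done or the driver's ceiling is hit."""
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    partial: List[str] = []
    # The counter is an ordinary loop variable owned by the driver. The agent
    # never sees it and cannot extend it, whatever it believes about progress.
    for _turn in range(max_turns):
        output, claims_done = agent_step(tuple(partial))  # read-only view
        partial.append(output)
        if claims_done:
            return partial, "completed"
    # Ceiling reached: stop without deliberating and return the partial work.
    return partial, "budget_exhausted"
```

An agent that never reports completion is cut off at exactly max_turns passes, and the caller still receives everything produced up to that point.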

What the maximum should be is a property of the task, and it deserves a deliberate choice rather than a default. A narrowly scoped lookup may honestly warrant only a few passes, and setting the bound near that turns a stuck run into a fast, cheap failure instead of a slow, costly one. A genuinely open-ended investigation may warrant many times as many. The figure encodes a judgment about how much work the task ought to require, made in advance by someone who understands it. What matters is less the exact number than the fact that a number exists. A loop with a bound has a worst case you can state and reason about. A loop without one has a worst case set by whatever its surroundings will absorb before they step in.

Spinning looks like work and moves nothing

A turn budget will catch a runaway, but only after however many wasted passes it takes to reach the ceiling. The more useful capability is noticing early that an agent has quit making headway, and that is hard because a stalled agent almost never looks stalled. It looks occupied. It is issuing calls, emitting tokens, accumulating context. Telling genuine advance apart from busy stagnation is the skill worth developing.

Advance leaves a trace on every pass. Some fact the agent did not have arrives. A call comes back carrying something other than what the last one carried. The unresolved part of the problem gets smaller, and the distance to the goal shrinks pass over pass. Stagnation produces the motion without the trace. The same sorts of calls go out and the same sorts of results come back. The approach holds still. Context piles up while the separation between where the agent is and where it needs to be refuses to narrow. A stuck agent spends resources at the same rate as a productive one and delivers nothing, and the one dependable way to distinguish them is to ask whether the state is actually changing.

Catching this takes intentional instrumentation, though the instrumentation can be crude and still earn its keep. Diffing one result against the one before it covers a lot of ground: when consecutive results come back all but identical, nothing is being learned. Noticing when the agent reissues the same call with the same arguments covers more; a query fired three times unchanged is not investigation, it is a stall. Checking whether the distance to the goal has moved across the last handful of passes catches the slower version, where each step looks a little different but the whole makes no progress. Even a bare counter that trips once recent results stop changing will head off most spinning before the cost mounts. Sophistication is optional. Having any check at all is not.
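The crude checks described above might be sketched as follows. Everything here is an assumption chosen for illustration: the StallDetector and goal_distance_stalled names, the thresholds, the use of difflib for similarity, and the idea that a numeric "distance to goal" is available at all.

```python
import difflib
import json
from collections import Counter
from typing import Any, Dict, Optional, Sequence


class StallDetector:
    """Flags repeated identical calls and runs of near-identical results."""

    def __init__(self, max_repeat_calls: int = 3,
                 similarity: float = 0.95, max_unchanged: int = 2):
        self.max_repeat_calls = max_repeat_calls   # assumed threshold
        self.similarity = similarity               # assumed threshold
        self.max_unchanged = max_unchanged         # assumed threshold
        self._call_counts: Counter = Counter()
        self._last_result: Optional[str] = None
        self._unchanged = 0

    def observe(self, tool: str, args: Dict[str, Any], result: str) -> Optional[str]:
        """Record one pass; return a stall reason, or None if none is seen."""
        # Same call with the same arguments, counted across the whole run.
        key = (tool, json.dumps(args, sort_keys=True, default=str))
        self._call_counts[key] += 1

        # Consecutive results that are all but identical.
        if self._last_result is not None:
            ratio = difflib.SequenceMatcher(None, self._last_result, result).ratio()
            self._unchanged = self._unchanged + 1 if ratio >= self.similarity else 0
        self._last_result = result

        if self._call_counts[key] >= self.max_repeat_calls:
            return "repeated_call"
        if self._unchanged >= self.max_unchanged:
            return "results_unchanged"
        return None


def goal_distance_stalled(distances: Sequence[float], window: int = 4) -> bool:
    """True if the distance to the goal has not shrunk over the last `window` passes."""
    recent = distances[-window:]
    return len(recent) == window and recent[-1] >= recent[0]
```

The driving loop would call observe after each pass and treat any non-None return as a reason to stop or hand off, well before the turn budget is spent.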

The choice between another pass and a handoff is a design decision

An agent that runs into a wall it cannot immediately clear is standing at a real fork, and both directions are legitimate depending on the situation. One direction is to try again, with a new input or a changed approach. The other is to hand off, to a person or a fallback. Which direction is correct is not a matter of the agent’s disposition; it turns entirely on whether another attempt could plausibly accomplish anything.

Another attempt is warranted when the agent is holding something it was not holding before. The previous result hinted at a better move. A failure looks transient and might clear on a second try. An overly broad first pass suggests a narrower one worth making. In each of these the next iteration is not more spinning, because the agent has gained something that would change what it does. A handoff is warranted when that something has run out. The agent has reached the limit of what it is permitted to do. It needs information it has no means of getting. The call belongs to a human by its nature. In cases like these, further iteration cannot help as a matter of structure, and looping on is just a slower, more expensive way of quitting.
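Written down as code, the fork could look something like the sketch below. The Obstacle fields are hypothetical flags that some earlier classification step would have to set, and the max_attempts cap is an added assumption rather than something the argument above requires.

```python
from dataclasses import dataclass
from enum import Enum


class Move(Enum):
    RETRY = "retry"
    HAND_OFF = "hand_off"


@dataclass(frozen=True)
class Obstacle:
    # Reasons another attempt could plausibly accomplish something.
    gained_new_lead: bool = False       # the last result hinted at a better move
    looks_transient: bool = False       # the failure might clear on a retry
    # Reasons further iteration cannot help as a matter of structure.
    outside_permissions: bool = False   # the agent is not permitted to proceed
    info_unobtainable: bool = False     # it needs information it cannot get
    human_call: bool = False            # the decision belongs to a person
    attempts: int = 1                   # attempts already made at this obstacle


def decide(obstacle: Obstacle, max_attempts: int = 3) -> Move:
    # Structural blockers win outright: no new lead changes them.
    if (obstacle.outside_permissions
            or obstacle.info_unobtainable
            or obstacle.human_call):
        return Move.HAND_OFF
    # Assumed cap: even promising retries stop after max_attempts.
    if obstacle.attempts >= max_attempts:
        return Move.HAND_OFF
    # Retry only when the agent holds something it was not holding before.
    if obstacle.gained_new_lead or obstacle.looks_transient:
        return Move.RETRY
    # Nothing new in hand: another pass would just be spinning.
    return Move.HAND_OFF
```

The default branch is the handoff, so an obstacle that nobody classified ends in an escalation rather than in another pass.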

The point that matters is that this fork has to be drawn deliberately into the agent, not left to whatever the model happens to do. And a handoff has to be treated as a built feature, not an admission of defeat. An agent that recognizes it has reached its limit and steps aside cleanly earns more trust than one that keeps circling on the chance that things sort themselves out. The clean handoff is a valve you install on purpose. Leaving it out is what converts a solvable situation into an agent that either loops until its budget kills it or, worse still, commits to an action no one authorized.

A designed exit is what separates an agent from a runaway process

The consequences of getting this wrong are not measured in wasted cycles. An agent that cannot stop can drain the quotas and budgets it draws on, turning a logic error into an open-ended cost with no ceiling of its own. An agent that cannot hand off can commit to actions no one had the chance to refuse, doing things in the world a person would have blocked, for the sole reason that the design never created the moment to ask. And an agent with no stated exits is opaque in a way that worsens everything else: with no principled boundary on its behavior, there is nothing firm to reason against when you try to understand it, audit it, or track down what it did.

The opposite is what makes termination worth putting first. An agent with defined exits is predictable. When it stops, the reason is plain, and its record shows what it was attempting, how far it got, and where it came to rest. That transparency is not a courtesy. It is the precondition for ever improving the thing, because behavior you cannot account for is behavior you cannot fix. Reliability is usually pictured as the work of making the right outcomes happen. Fully as much of it is making sure the wrong ones end, on terms chosen beforehand, cleanly enough that the reason is visible afterward. A design is not finished while the question of how it stops is still open, because a loop with no exit anyone designed is not an autonomous system. It is an unbounded process that has not yet run into the edge of what its environment will bear.