This essay addresses a mistake that has become easy to make and hard to see. As artificial intelligence systems grow more fluent and more autonomous, we increasingly describe their failures as if they were failures of understanding, alignment, or intent. We ask why a model “decided” what it did, or who should be blamed when its actions produce harm. These questions feel natural. They are also increasingly misdirected.
In earlier work, I argued that we have crossed a structural threshold: inference and evaluation no longer travel together. Reasoning can now be generated outside the human mind, at scale and at speed, without carrying the conditions that once anchored judgment. This essay takes the next step. It argues that our persistent confusion about agency in AI systems is not a failure of explanation, but a category error about where judgment now resides.
The Error That Persists After the Break
It is by now widely understood that large language models do not “understand” what they say. It is equally well understood that they do not judge. And yet, when something goes wrong, the same question returns with remarkable consistency: Who decided this?
The answers we reach are rarely satisfying. Sometimes the model is named. Sometimes the user. Sometimes the system as a whole. Each of these responses points to something real, but none of them locates judgment. What they reveal instead is a persistent category error—one that survives even after the separation of inference and evaluation has been acknowledged.
Gilbert Ryle illustrated such an error with his well-known example of a visitor to Oxford. After being shown the colleges, libraries, and laboratories, the visitor asks where the university itself is. The mistake, Ryle argued, was not ignorance of the facts but confusion about the kind of thing a university is. It is not an additional object alongside its buildings; it is the organized system formed by them. Asking where it is amounts to misclassifying an institution as a thing.
A structurally identical mistake now appears in discussions of AI. We point to the model, the interface, the human prompt, or the execution layer and ask where the agent is. The assumption underlying this search is familiar: that judgment must reside somewhere concrete and local, because under earlier conditions it usually did. But those conditions no longer hold.
This persistence is not accidental. It follows directly from a change in the architecture of agency. In AI’s Machiavellian Moment, I argued that we have crossed a threshold in which inference and evaluation no longer travel together. Large language models generate fluent, coherent continuations without possessing the evaluative machinery that once stabilized reasoning inside a single mind. This fracture is not a defect to be corrected; it is a structural transformation in the conditions under which judgment operates.
Once inference is externalized, the inherited picture of agency becomes unstable. Language can continue without commitment. Intent can be voiced without authority. Execution can proceed without reflection. Judgment does not disappear. It relocates.
The difficulty we now face is not the absence of judgment, but the persistence of an older habit of looking for it where it can no longer be found.
Traditional Agents and the Collapse of Judgment
Most autonomous systems still follow a familiar pattern. Perception, decision, and action are fused into a single loop. The system observes the environment, selects a course of action, executes it, and then observes the result. When such a system fails, responsibility collapses neatly into the agent itself. The agent made a bad decision. The model hallucinated. The system chose poorly.
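To make that fusion concrete, consider a minimal sketch of such a loop, written in Python purely for illustration and tied to no particular system; the environment and its methods are assumptions of the sketch, not features of any real agent.

```python
# Illustrative only: a deliberately simplified agent in which perception,
# decision, and action are fused. The environment's methods (observe, apply,
# done) are assumed for the sketch.

class FusedAgent:
    def __init__(self, policy):
        self.policy = policy  # any decision rule: a heuristic, a planner, a model call

    def run(self, environment):
        while not environment.done():
            observation = environment.observe()   # perception
            action = self.policy(observation)     # decision
            environment.apply(action)             # execution, with no seam in between
```

The point is not the code but the absence of any seam between deciding and acting. That absence is what makes blame feel local.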
This collapse is not accidental. It is the architectural inheritance of a world in which judgment and execution were assumed to coincide. When a single process both decides and acts, failure appears local. There is a clear subject to blame, correct, or retrain. Accountability remains psychologically and institutionally legible, even if it is often misplaced.
Agentic AI systems increasingly strain this pattern. As soon as inference is externalized and scaled, the loop begins to fracture. Human intent enters as language rather than command. Model output arrives as continuation rather than commitment. Execution occurs through mechanisms that neither intend nor evaluate. Yet many systems continue to present themselves as unified agents, encouraging the same attribution of judgment even as the conditions that once sustained it erode.
This is why contemporary discussions of agent failure so often stall. We add logging. We add explanations. We add reasoning traces. But the underlying assumption remains intact: that if we could only see the chain of decisions clearly enough, judgment would reveal itself somewhere inside it.
What this misses is that judgment does not reside in the chain. It resides in the structure that determines which links in that chain are permitted to bind the world.
As long as perception, inference, and execution are architecturally fused, that distinction remains invisible. Failure appears as error rather than authorization. Responsibility collapses inward. The system looks agentic because we have not yet forced ourselves to see where authority actually lives.
Hybrid intelligence makes that evasion harder to sustain.
Cubert and the Limits of Traceability
The difficulty of tracing decisions in agentic systems has not gone unnoticed by practitioners. Those building autonomous systems have discovered, often through failure, that fluent behavior does not imply intelligible judgment. Decision paths are instrumented. Planning is separated from execution. Intermediate state is recorded. Systems are rendered observable in an effort to understand why they behave as they do. Cubert (developed by Tony Foster) emerges from this sensibility—not from its absence.
Cubert is deliberately modest. It operates in a bounded environment with visible hazards, limited actions, and irreversible consequences. It separates inference from execution and constrains communication between them through explicit interfaces. Behavior can be reconstructed. State can be inspected. Nothing is hidden.
And still, when Cubert fails—when it acts on stale state, prioritizes speed under pressure, or enters a visible hazard—the same question arises: where was the judgment that allowed this to occur?
The difficulty is not obscurity. The sequence of events can be recovered. One can see what the model inferred, what actions were requested, and how the environment responded. What remains difficult to locate is not behavior, but authority.
Cubert does not fail because a model “decides badly” in isolation. The model produces continuations it is permitted to produce. The execution layer carries out actions it is permitted to carry out. Human intent enters in deliberately underspecified form. None of these components, taken alone, authorizes the outcome.
Authorization occurs elsewhere.
In systems of this kind, inference, execution, and evaluation occupy distinct roles. One component generates continuations; another enacts them in the world. Judgment does not reside in either. It resides in the structure that governs when continuation is permitted to become action. In Cubert’s own architecture, these roles appear as the Brain, the Body, and the system that authorizes their interaction. The distinction is not terminological but structural: Brain and Body can be inspected independently, while judgment appears only in the conditions under which their interaction is allowed to proceed.
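The structural point can be sketched in a few lines. What follows is not Cubert’s implementation, and every name in it is hypothetical; it only shows where, once Brain and Body are separated, the authorizing condition has to live.

```python
# Hypothetical sketch, not Cubert's code. The Brain proposes, the Body
# executes, and neither authorizes. Judgment lives in the seam that decides
# whether a proposal may become an action.

from dataclasses import dataclass

@dataclass
class Proposal:
    action: str
    rationale: str

class Brain:
    """Generates continuations; commits to nothing."""
    def propose(self, state: dict) -> Proposal:
        return Proposal(action="move_forward", rationale="shortest path to goal")

class Body:
    """Executes what it is handed; neither intends nor evaluates."""
    def execute(self, proposal: Proposal) -> None:
        print(f"executing: {proposal.action}")

def authorize(proposal: Proposal, state: dict) -> bool:
    """The condition under which their interaction is allowed to proceed."""
    return not state.get("hazard_ahead", False)

def step(brain: Brain, body: Body, state: dict) -> None:
    proposal = brain.propose(state)
    if authorize(proposal, state):   # judgment sits here, not in Brain or Body
        body.execute(proposal)
    else:
        print(f"refused: {proposal.action}")
```

Nothing inside Brain or Body carries that condition; it exists only in the structure that joins them.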
This is where traceability reaches its limit. Logs can tell us what happened. Metrics can tell us how often. Explanations can describe what was internally considered. None of these, by themselves, specify who or what was empowered to accept the risk.
Cubert makes this visible precisely because it refuses to collapse judgment into execution. By separating inference from action, it exposes the space in which authorization must occur. When that space is left implicit, responsibility diffuses. When it is named, governed, and constrained, judgment becomes locatable again.
The confusion this produces is not technical. It is categorical. We continue to search for judgment inside components—models, interfaces, operators—because those are the places earlier systems trained us to look. Hybrid systems require a different orientation. The question is no longer whether a system can explain its reasoning, but whether it specifies, in advance, the conditions under which reasoning is permitted to bind the world.
The Evaluation Layer as Institution
Once we stop looking for judgment inside components, the question changes. We are no longer asking which part of the system decided, but what structure authorized the decision to become binding at all. That structure is not cognitive. It does not reason, infer, or experience. It evaluates.
In hybrid systems, judgment resides in the evaluation layer: the institutional arrangement that determines which continuations are permitted to pass into action, under what conditions, and with what acceptance of consequence. This layer does not generate language. It governs its force.
Crucially, the evaluation layer is not a function that can be folded back into the model, nor a responsibility that can be deferred to human intent. The model proposes; the human pressures. Evaluation authorizes. When that authorization is implicit, judgment still occurs—but without ever being named, owned, or constrained.
This is why the search for agency inside the “AI” repeatedly fails. We are asking a question appropriate to an earlier configuration of mind and action. In that configuration, the same agent typically inferred, judged, and acted. Hybrid intelligence breaks this unity. Inference can be generated without commitment; intent can be voiced without authority; execution can proceed without reflection. What binds these elements together is not intelligence, but institutional permission.
The evaluation layer performs the work that institutions have always performed, even when that work went unnoticed. It sets thresholds. It enforces halts. It determines when uncertainty is tolerable and when escalation is required. It decides which risks may be accepted silently and which demand justification. These are not computational operations. They are constitutional ones.
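Stated as code rather than prose, such a layer is small and unintelligent by design. The sketch below is illustrative, with hypothetical thresholds; what matters is that its outputs are permissions and refusals, not inferences.

```python
# Illustrative evaluation layer with hypothetical thresholds. It does not
# reason or infer; it only sets the terms on which an action may bind the world.

from enum import Enum, auto

class Verdict(Enum):
    ALLOW = auto()     # risk accepted silently, within pre-set tolerance
    JUSTIFY = auto()   # allowed only with a recorded justification
    ESCALATE = auto()  # uncertainty exceeds tolerance; a named authority must decide
    HALT = auto()      # refusal, exercised as a function of the role

def evaluate(risk: float, uncertainty: float,
             risk_tolerance: float = 0.2,
             uncertainty_tolerance: float = 0.5) -> Verdict:
    if risk >= 0.9:
        return Verdict.HALT
    if uncertainty > uncertainty_tolerance:
        return Verdict.ESCALATE
    if risk > risk_tolerance:
        return Verdict.JUSTIFY
    return Verdict.ALLOW
```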
To understand this, it helps to abandon the language of “agents” altogether. The system that authorizes action is not an agent among others; it is the University in Ryle’s sense—the organized structure that gives actions their standing. Looking for judgment in the model or the operator is like searching for the university among its buildings. The error is not technical. It is categorical.
Once this structure is named, a number of confusions dissolve. We no longer need to pretend that a model “chose” in the human sense, or that a user’s intent carried binding force simply by being expressed. Agency attaches where authorization occurs. Responsibility follows from that attachment.
This is also where the limits of purely technical remedies become apparent. Interpretability cannot substitute for explicit evaluation. Transparency cannot resolve ambiguity about authority if the system never specifies who or what is empowered to decide. Where no evaluation layer is articulated, hybrid systems continue to exercise judgment implicitly—through defaults, timing, and omission.
Judgment has not vanished. It has become structural.
The Trade-Off Hybrid Intelligence Imposes
Human institutions have always depended on a certain degree of ambiguity. Judgment could remain partially implicit because responsibility was socially negotiable. When something went wrong, intent could be clarified after the fact. Norms could be invoked retroactively. Accountability could be distributed, delayed, or softened by discretion.
This flexibility was not an ethical failure. It was an affordance of human-led systems. Judgment lived close enough to action that social repair was possible.
Hybrid intelligence removes that affordance.
Once inference is automated and execution is delegated to systems that operate at speed and scale, judgment can no longer be smuggled in informally. There is no shared pause in which uncertainty can be absorbed. No tacit understanding that intentions will be re-interpreted charitably. No stable assumption that someone will step in before consequences harden.
The trade-off is therefore not between autonomy and control, or efficiency and safety. It is more basic than that.
Either judgment is specified explicitly as an institutional function, or it will still be exercised—silently, implicitly, and without ownership.
There is no third option.
Hybrid systems do not eliminate judgment. They displace it into defaults: into what is allowed to proceed automatically, into what is never escalated, into what is treated as “normal operation.” These defaults function as decisions precisely because they authorize action without reconsideration. When no one is required to say yes, the system says yes by construction.
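The difference can be reduced to a single default. The toy contrast below is illustrative, with hypothetical names: in the first function silence authorizes, so the system says yes by construction; in the second, nothing proceeds without an explicit, attributable yes.

```python
# Toy contrast, illustrative only. The default value is itself a policy
# about who is required to say yes.

def default_allow(action, objections=None):
    # Absence of objection counts as authorization.
    return not objections

def default_deny(action, approvals=None):
    # Silence is not consent; an explicit approval must exist.
    return bool(approvals)

# default_allow("deploy")                            -> True  (no one had to say yes)
# default_deny("deploy")                             -> False
# default_deny("deploy", approvals=["risk-owner"])   -> True
```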
This is why attempts to treat agentic AI as “just a tool” fail so consistently. Tools do not authorize outcomes; institutions do. As soon as a system interprets intent, sequences action, and commits effects to the world, it is exercising institutional authority whether or not that authority has been named.
In human-led systems, it was often possible to pretend otherwise. We could attribute failures to misunderstanding, to bad actors, to unforeseeable circumstances. Hybrid intelligence collapses those escape routes. When a system acts repeatedly, consistently, and at scale, its patterns become policy. Its omissions become permissions. Its speed becomes authority.
Hybrid intelligence removes the institutional luxury of implicit judgment.
The temptation, in response, is to retreat into reassurance: to insist that humans remain “in the loop,” that oversight exists somewhere, that governance can be layered on later. But these gestures reproduce the same mistake at a higher level. Oversight that is not structurally binding is not oversight. Evaluation that can be bypassed under pressure is not evaluation. Responsibility that cannot be located before action occurs is not responsibility at all.
What hybrid intelligence demands is therefore not better intentions, but explicit constitutional design. The evaluation layer must be named, bounded, and empowered to halt, escalate, or refuse action. It must be able to say no—not as an exception, but as a function of its role.
This is the trade-off we face.
We can preserve the convenience of implicit judgment and accept the emergence of unowned authority. Or we can accept the friction of explicit evaluation and retain the possibility of accountable agency. Hybrid intelligence does not allow us to have both.
Artifacts of Evaluation
If judgment must be made explicit, it must also take form. Institutions do not evaluate by introspection; they evaluate by producing artifacts that bind authority to consequence over time. Laws, charters, minutes, rulings, and budgets are not administrative residue. They are how institutions speak responsibly.
Hybrid intelligence requires analogous artifacts.
Decision ledgers, review workflows, and governance-facing documents are often dismissed as bureaucratic overhead—records of what has already happened. This misreads their function. These artifacts do not merely record judgment; they perform it. They force evaluation to occur at the moment authority is exercised, rather than after the fact.
A decision ledger, for example, is not a log of reasoning steps. It is a commitment. It requires someone to state what is being authorized, under what assumptions, with what risks accepted, and with what alternatives explicitly rejected. It binds a continuation to a responsible agent or body. It creates institutional memory where individual memory would otherwise dissolve under tempo.
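What such an entry must carry can be pictured concretely. The shape below is illustrative rather than a prescribed schema, and every field name is an assumption about what a given institution would choose to record.

```python
# Illustrative shape of a decision-ledger entry, not a prescribed schema.
# Each field forces evaluation at the moment authority is exercised.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class LedgerEntry:
    decision: str                           # what is being authorized
    owner: str                              # the agent or body bound by the commitment
    assumptions: tuple[str, ...]            # conditions treated as sufficient
    risks_accepted: tuple[str, ...]         # risks knowingly taken on
    alternatives_rejected: tuple[str, ...]  # what was explicitly not chosen, and why
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

entry = LedgerEntry(
    decision="allow the agent to commit orders below a fixed threshold without review",
    owner="operations risk committee",
    assumptions=("inventory data is no more than five minutes stale",),
    risks_accepted=("occasional mispriced order within the threshold",),
    alternatives_rejected=("manual review of every order: too slow at current volume",),
)
```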
Similarly, governance workflows—such as board-facing decision decks—are not instruments of persuasion alone. At their best, they function as evaluative checkpoints. They slow inference long enough for judgment to intervene. They require claims to be admissible, assumptions to be surfaced, and consequences to be owned. They transform fluent narrative into accountable commitment.
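A checkpoint of this kind is easy to state, even if the sketch below is illustrative and assumes nothing about any particular workflow tool. Its only job is to refuse to let a proposal advance until the evaluative content exists.

```python
# Illustrative checkpoint, not a real workflow engine. It blocks a proposal
# until its evaluative content is present, which is the only sense in which
# it slows inference long enough for judgment to intervene.

REQUIRED_FIELDS = ("claim", "assumptions", "consequences", "accountable_owner")

def checkpoint(submission: dict) -> bool:
    missing = [name for name in REQUIRED_FIELDS if not submission.get(name)]
    if missing:
        raise ValueError(f"cannot proceed; missing evaluative content: {missing}")
    return True
```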
These artifacts matter because hybrid systems act across time. Inference may be generated in seconds, but its effects persist. Evaluation that is not preserved cannot govern behavior that unfolds longitudinally. Without durable evaluative records, institutions re-litigate the same decisions repeatedly or, worse, allow them to fade into default. Authority migrates into habit. Judgment becomes invisible again.
What these artifacts provide is not certainty, but legibility. They make it possible to answer questions that hybrid systems otherwise obscure: Who authorized this action? What conditions were considered sufficient? What risks were knowingly accepted? Where could intervention have occurred?
This is why such artifacts cannot be treated as optional safeguards or post-hoc compliance measures. In hybrid intelligence, they are the means by which institutions retain agency at all. Without them, systems still act—but no one can say, with confidence, who acted.
It is tempting to imagine that improved models, richer explanations, or more refined interfaces will render such structures unnecessary. This temptation repeats the original mistake. No amount of inference, however fluent, can substitute for evaluation. No amount of transparency can replace authority.
The choice is not between automation and bureaucracy. It is between implicit judgment that cannot be contested and explicit judgment that can.
Decision ledgers, evaluative workflows, and governance artifacts are how institutions accept the burden that hybrid intelligence imposes. They are not solutions to intelligence. They are commitments to responsibility.
Closing — What Persists
Inference is now cheap, fast, and portable. The familiar anchors of institutional continuity have already begun to dissolve. Language no longer stabilizes meaning. Tools no longer confer advantage. Even expertise, once embodied in individuals or entrenched workflows, has become fluid as systems reproduce its surface forms with ease. What appears, from the outside, as acceleration is experienced internally as instability.
Institutions have long responded to such instability in a predictable way. They do not seek more intelligence. They seek continuity: some structure that preserves why decisions were made, where judgment halted rather than continued, and which risks were knowingly accepted. When everything can be regenerated, replayed, or replaced, what matters is not what a system can produce, but what it has learned to treat as significant.
This is where continuity settles once inference commoditizes. Not into data, which can be extracted. Not into models, which can be swapped. But into the accumulated evaluative structure formed through use: the thresholds, escalations, tolerances, and refusals that emerge only when a system participates in real decisions over time. What persists is not intelligence, but judgment—externalized, scaffolded, and remembered.
The ghost, in this sense, is no longer hidden in the machine. It is the machine: not as cognition, but as institution. In a world where language can continue without responsibility, the systems that endure are those in which judgment has been made explicit, continuous, and difficult to abandon. Everything else is already easy to leave.
Amanda Ross writes about worldhood, judgment, and hybrid intelligence.

