Putting an LLM in the Loop Without Losing Control

Autonomous agents are powerful and, left unchecked, dangerous in financial systems. The pattern that makes them safe is simple: the model proposes, a deterministic gate disposes, and everything is explainable.

Adrian Vance

Founder & Managing Partner

There is enormous appetite to put AI agents in charge of consequential decisions — routing trades, allocating capital, responding to on-chain events. The technology is genuinely capable. But a model that occasionally does something inexplicable is acceptable in a chatbot and catastrophic in an execution system. The question is not whether to use agents, but how to bound them so that their failure modes are contained and their behaviour is auditable.

The model proposes, the system disposes

The single most important pattern is to separate proposal from authority. The agent, whether a reinforcement-learning policy, an LLM, or a hybrid, only ever proposes an action. A deterministic, well-tested gate decides whether that action is actually permitted. In an execution agent, the policy might suggest sending a child order to a particular venue at a particular size; the gate checks it against hard limits on notional, participation rate, and venue concentration, and rejects or clamps anything outside the envelope. No single model decision can ever exceed the limits, regardless of what the model "wanted" to do.
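
A minimal sketch of such a gate, assuming a child order is described by venue, quantity, and a reference price. The names `Proposal`, `Limits`, and `gate`, and the exact way each limit is computed, are illustrative assumptions rather than a description of any production system.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    venue: str
    quantity: float  # size of the proposed child order
    price: float     # reference price used to compute notional


@dataclass(frozen=True)
class Limits:
    max_notional: float       # hard cap on quantity * price
    max_participation: float  # max fraction of interval market volume
    max_venue_share: float    # max fraction of the parent order on one venue


def gate(proposal: Proposal, limits: Limits, market_volume: float,
         venue_filled: float, parent_quantity: float) -> Optional[Proposal]:
    """Return a permitted order (possibly clamped), or None to reject."""
    if proposal.quantity <= 0 or proposal.price <= 0:
        return None  # malformed proposals are rejected outright
    caps = [
        limits.max_notional / proposal.price,
        limits.max_participation * market_volume,
        limits.max_venue_share * parent_quantity - venue_filled,
    ]
    allowed = min(proposal.quantity, *caps)
    if allowed <= 0:
        return None  # no headroom left under at least one limit
    return replace(proposal, quantity=allowed)


limits = Limits(max_notional=100_000, max_participation=0.25, max_venue_share=0.5)
oversized = Proposal(venue="VENUE_A", quantity=5_000, price=50.0)
clamped = gate(oversized, limits, market_volume=20_000,
               venue_filled=1_000, parent_quantity=10_000)
```

The gate contains no learned component, so it can be unit-tested exhaustively and reasoned about independently of whatever model sits upstream.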

Degrade to something boring

Agents fail in unfamiliar ways: a data feed goes stale, the input drifts outside the training distribution, an inference call times out. A safe agent treats low confidence and missing data as first-class signals and falls back to a vetted, conservative heuristic, such as a simple adaptive schedule, a hold, or a no-op. The system should always have a boring, predictable behaviour to retreat to, and it should retreat automatically rather than waiting for a human to notice.
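
The fallback logic can be sketched as follows. This is an illustration under assumptions: the model is taken to return a quantity together with a confidence score, and the thresholds (`max_age_seconds`, `min_confidence`) and type names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    mid_price: Optional[float]  # None when the feed has no usable price
    age_seconds: float          # time since the feed last updated


@dataclass(frozen=True)
class Action:
    kind: str       # "trade" or "hold"
    quantity: float
    source: str     # "model" or "fallback"


HOLD = Action(kind="hold", quantity=0.0, source="fallback")

# Assumed model interface: returns (quantity, confidence in [0, 1]).
Model = Callable[[Snapshot], Tuple[float, float]]


def decide(snapshot: Snapshot, model: Model,
           max_age_seconds: float = 1.0, min_confidence: float = 0.7) -> Action:
    """Use the model only when inputs are fresh and confidence is adequate."""
    if snapshot.mid_price is None or snapshot.age_seconds > max_age_seconds:
        return HOLD  # missing or stale data: do not consult the model at all
    try:
        quantity, confidence = model(snapshot)
    except Exception:
        return HOLD  # any inference failure, including a timeout, degrades
    if confidence < min_confidence:
        return HOLD  # low confidence is treated as a signal, not ignored
    return Action(kind="trade", quantity=quantity, source="model")
```

Every path that is not a confident, well-fed model decision lands on the same boring action, and the `source` field records which path was taken.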

Make every decision explainable

A black box that cannot say why it did something will never get past a risk committee, and rightly so. We pair the decision layer with a reasoning layer — often an LLM — that produces a plain-language rationale and post-hoc analysis for each significant action. For an execution agent that means a transaction-cost-analysis report explaining the venues chosen, the impact paid, and the alternative paths considered. This is not decoration; it is what lets compliance, risk, and the desk trust the system enough to widen its mandate over time.
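
One way to make the rationale auditable is to store it next to the facts it explains. The sketch below assumes a flat record written as one JSON line per decision; the field names are hypothetical, and the rationale string would come from whatever reasoning layer is in use.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class DecisionRecord:
    order_id: str
    chosen_venue: str
    alternatives: List[str]       # venues considered but not chosen
    estimated_impact_bps: float   # impact estimate at decision time
    gate_verdict: str             # "accepted", "clamped", or "rejected"
    rationale: str                # plain-language explanation


def to_audit_line(record: DecisionRecord) -> str:
    """Serialize one decision as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = DecisionRecord(
    order_id="order-001",
    chosen_venue="VENUE_A",
    alternatives=["VENUE_B", "VENUE_C"],
    estimated_impact_bps=1.8,
    gate_verdict="clamped",
    rationale="Size reduced to stay under the notional limit.",
)
audit_line = to_audit_line(record)
```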

Validate in shadow before you validate with money

Before an agent touches real risk, it should run in shadow mode against the incumbent system, making decisions that are recorded but not executed, so you can compare outcomes order-for-order. We typically run several weeks of shadow operation and only promote the agent once it demonstrably beats the baseline on the metric that matters. Fault injection (killed feeds, lagged venues, adversarial inputs) then checks that the gate and the fallback hold under stress, not just on the happy path.
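
An order-for-order comparison might look like the sketch below. It assumes a single per-order cost metric in basis points where lower is better, and that the shadow agent's cost is an estimate, since its decisions were never executed. The function name and report fields are illustrative.

```python
from statistics import mean
from typing import Dict


def shadow_report(baseline_bps: Dict[str, float],
                  shadow_bps: Dict[str, float]) -> dict:
    """Compare per-order cost (lower is better) on orders both systems saw."""
    common = sorted(baseline_bps.keys() & shadow_bps.keys())
    if not common:
        raise ValueError("no overlapping orders to compare")
    # Positive improvement means the shadow agent was cheaper on that order.
    improvements = [baseline_bps[o] - shadow_bps[o] for o in common]
    return {
        "orders": len(common),
        "mean_improvement_bps": mean(improvements),
        "win_rate": sum(d > 0 for d in improvements) / len(improvements),
    }


report = shadow_report(
    baseline_bps={"a": 5.0, "b": 4.0, "c": 3.0},
    shadow_bps={"a": 4.0, "b": 4.5, "c": 2.0, "d": 1.0},
)
```

Orders seen by only one system (`"d"` here) are excluded, so the comparison stays like-for-like.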

Latency budgets are part of the design

In trading, a decision that arrives late is a wrong decision. The reasoning and training can be heavy, but the inference on the hot path must be fast — often distilled into a compact model that runs in single-digit milliseconds. The architecture separates the slow, offline learning loop from the fast, online decision loop so that explainability and adaptation never come at the cost of execution speed.
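
The control flow of a latency budget can be sketched as below. This is illustrative only: a Python thread pool would not meet a single-digit-millisecond budget in practice, and `decide_within_budget`, `slow_model`, and the budget values are assumptions chosen to show the pattern of discarding a late answer.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared pool so a late inference call does not block the next decision.
_executor = ThreadPoolExecutor(max_workers=4)


def decide_within_budget(infer, features, budget_seconds, fallback):
    """Return (action, source); a late model answer is discarded."""
    future = _executor.submit(infer, features)
    try:
        return future.result(timeout=budget_seconds), "model"
    except FutureTimeout:
        # The worker thread cannot be killed; its result is simply ignored.
        future.cancel()
        return fallback, "fallback"


def slow_model(features):
    time.sleep(0.3)  # simulates an inference call that blows the budget
    return "buy"


on_time = decide_within_budget(lambda f: "buy", {}, 0.5, "hold")
too_late = decide_within_budget(slow_model, {}, 0.02, "hold")
```

The important property is that the caller's wait is bounded by the budget, whatever the model does.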

Used this way, agents stop being a leap of faith. They become an engineering component with a defined envelope, predictable failure modes, and a full audit trail. That is the only form in which they belong anywhere near real capital.

Adrian Vance

Founder & Managing Partner

Founder of Web3Software. Twelve years building distributed systems and capital-markets infrastructure, the last six dedicated to blockchain, on-chain settlement, and quantitative trading platforms for institutional clients.
