From one answer to a loop of actions

A chatbot answers in one shot. An agent does something harder: it can call tools, read what they return, decide what to do next, and loop until a task is actually done. That loop — reason, act, observe — is most of what “agentic AI” means, and understanding it explains both why agents are powerful and why they fail in characteristic, expensive ways.

What makes a model an agent

A plain language model maps a prompt to a response and stops. An agent wraps the same model in a loop and gives it tools — functions it can call, like searching the web, running code, querying a database, or editing a file. The model no longer just answers; it decides what to do next, takes an action, sees the result, and continues. The model supplies the judgment; the loop and the tools supply the ability to act on the world.

The ReAct loop: reason, act, observe

The pattern most agents follow was crystallized by the ReAct paper (Yao et al., 2022): interleave reasoning and acting. The model reasons through a step (“I need the current price, so I'll search”), emits an action (a tool call with arguments), the system runs the tool and feeds the observation (the result) back into the context, and the model reasons again with that new information. The cycle repeats until the model decides it has enough to answer.
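The reason / act / observe cycle can be sketched in a few lines. This is a minimal illustration, not a real agent: `scripted_model` is a hypothetical stand-in for an LLM call, `search_price` is a made-up tool with placeholder data, and the "Thought:" / "Observation:" prefixes are just one convenient way to lay out the context.

```python
from typing import Callable

# Hypothetical tool: a stand-in for a real price lookup (placeholder data).
def search_price(item: str) -> str:
    prices = {"widget": "4.20"}
    return prices.get(item, "not found")

TOOLS: dict[str, Callable[[str], str]] = {"search_price": search_price}

def scripted_model(context: list[str]) -> dict:
    """Stand-in for an LLM call: picks the next step from the context so far."""
    observations = [line for line in context if line.startswith("Observation: ")]
    if not observations:
        # Nothing observed yet, so ask for a tool call.
        return {"thought": "I need the current price, so I'll search.",
                "action": "search_price", "argument": "widget"}
    # An observation is in context, so the model can answer.
    return {"thought": "I have the price.",
            "answer": observations[-1].removeprefix("Observation: ")}

def react_loop(task: str, max_steps: int = 5) -> str:
    context = [f"Task: {task}"]
    for _ in range(max_steps):
        step = scripted_model(context)                     # reason
        context.append(f"Thought: {step['thought']}")
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["action"]](step["argument"])   # act
        context.append(f"Observation: {result}")           # observe
    return "stopped: step limit reached"

print(react_loop("What does a widget cost?"))
```

Note that the model never sees the tool itself, only the text of the observation that the loop appends to the context.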

This is why agents can handle tasks a single model call can't: they gather information incrementally and correct course based on real feedback, rather than committing to an answer up front. It pairs naturally with test-time compute: both spend extra inference to reach a better result.

Step through a real loop

Words make the loop sound tidy; watching it run shows why it earns its cost. Below is one agent answering a question it can't answer in a single pass — it needs two facts it doesn't have, hits a real tool error partway, and recovers. Step through it and watch three things: which of the three phases the model is in, the context filling up step by step, and the iteration counter climbing.

The loop · every turn the model does exactly one of these

Reason → Act → Observe ↻ until Answer

The task

Is the Anaheim depot’s current scissor-lift day rate higher than it was in Q2, and by how much?

A single model call can’t answer this: it doesn’t know either price. Watch the agent get them.

Starting state: 0 loop iterations, ~84 tokens of context.

The trajectory runs ten steps: Reason (1), Act (2), Observe (3), Reason (4), Act (5), Observe (6), Reason (7), Act (8), Observe (9), Answer (10).

Trajectory is scripted to illustrate the loop; token counts are a rough chars-÷-4 estimate, not a real tokenizer. A live agent emits the same reason / act / observe structure.

How function calling actually works

Under the hood, tool use is less magical than it looks. The model is given a list of tools with names, descriptions, and argument schemas (typically JSON Schema). When it wants to use one, it doesn't run anything — it emits a structured request: the tool name and a JSON object of arguments. Your code (or the API runtime) parses that, actually executes the function, and returns the result as a new message. The model was trained — via fine-tuning on tool-use traces — to produce well-formed calls and to interpret the results.
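A small sketch makes the division of labor concrete. Everything named here is an assumption for illustration: the `get_day_rate` tool, its placeholder return value, and the exact message shape are hypothetical, and real APIs differ in field names. The part that carries over is the flow: the model's output is only a JSON request, and ordinary code parses it, checks it, runs the function, and wraps the result as a new message.

```python
import json

# Hypothetical tool description: a name, a description, and a JSON Schema for arguments.
TOOL_SPEC = {
    "name": "get_day_rate",
    "description": "Look up the day rate for a piece of equipment at a depot.",
    "parameters": {
        "type": "object",
        "properties": {"depot": {"type": "string"}, "equipment": {"type": "string"}},
        "required": ["depot", "equipment"],
    },
}

def get_day_rate(depot: str, equipment: str) -> dict:
    # Placeholder value, not real pricing data.
    return {"depot": depot, "equipment": equipment, "day_rate": 100.0}

def tool_message(name, payload: dict) -> dict:
    """Wrap a tool result as a message that can be appended to the context."""
    return {"role": "tool", "name": name, "content": json.dumps(payload)}

def handle_tool_call(raw_model_output: str) -> dict:
    """Parse the model's structured request, run the tool, and return the result message."""
    call = json.loads(raw_model_output)
    name, args = call.get("name"), call.get("arguments", {})
    if name != TOOL_SPEC["name"]:
        return tool_message(name, {"error": f"unknown tool: {name}"})
    required = TOOL_SPEC["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        return tool_message(name, {"error": f"missing arguments: {missing}"})
    # Only now does anything actually execute, and it is our code doing it.
    result = get_day_rate(**{key: args[key] for key in required})
    return tool_message(name, result)

# What the model emits is just text in an agreed format:
raw = '{"name": "get_day_rate", "arguments": {"depot": "Anaheim", "equipment": "scissor lift"}}'
print(handle_tool_call(raw))
```

Returning errors as tool messages, rather than raising, gives the model a chance to read the failure and correct its next call.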

So “the model called an API” really means “the model produced text saying which API to call, and trusted infrastructure called it.” That separation is also the security boundary: the model proposes, your code disposes, and whatever you let it call is exactly the blast radius — which is why tool allowlisting and sandboxing matter.
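Allowlisting can be as simple as checking the requested name against a fixed set before dispatching. The sketch below assumes two hypothetical tools, `read_file` and `delete_file`, that only return strings and touch nothing on disk; a real deployment would pair a check like this with sandboxing of whatever the permitted tools do.

```python
from typing import Callable

# Hypothetical tools that only describe what they would do.
def read_file(path: str) -> str:
    return f"(contents of {path})"

def delete_file(path: str) -> str:
    return f"deleted {path}"

# Everything the codebase knows how to run...
REGISTRY: dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "delete_file": delete_file,
}

# ...versus the subset this particular agent may request.
ALLOWED = frozenset({"read_file"})

def dispatch(name: str, argument: str, allowed: frozenset = ALLOWED) -> str:
    """Run a requested tool only if it is on the allowlist."""
    tool = REGISTRY.get(name)
    if name not in allowed or tool is None:
        return f"error: tool '{name}' is not permitted"
    return tool(argument)

print(dispatch("read_file", "notes.txt"))
print(dispatch("delete_file", "notes.txt"))
```

The model can still ask for `delete_file`; the request simply never reaches the function.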

The failure modes — and the cost trap

Loops that act on the world fail in ways one-shot answers can't. Error cascades: a wrong early step poisons every later one, since the bad observation stays in context. Loops: the agent retries the same failing action forever, or oscillates between two. Context bloat: every observation is appended, so long agent runs fill the context window and degrade.
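The retry and oscillation cases are cheap to detect from the action history alone. The helper below is a rough heuristic, assuming each action is recorded as a (tool name, arguments) pair; `is_stuck` and its threshold are illustrative names, and a task that legitimately repeats a call (polling, for example) would need a looser rule.

```python
from collections import Counter

def is_stuck(actions: list[tuple[str, str]], max_repeats: int = 3) -> bool:
    """True if any identical (tool, arguments) pair has been issued max_repeats times or more."""
    counts = Counter(actions)
    return any(n >= max_repeats for n in counts.values())

# Retrying the same failing call over and over:
retrying = [("search", "scissor lift rate")] * 3

# Oscillating between two calls: each one still repeats.
oscillating = [("open", "a.txt"), ("open", "b.txt")] * 2 + [("open", "a.txt")]

# A healthy run makes different calls.
progressing = [("search", "current rate"), ("search", "Q2 rate"), ("calc", "diff")]

print(is_stuck(retrying), is_stuck(oscillating), is_stuck(progressing))
```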

And the one that bites hardest in production: cost blowups. Each loop iteration is a full model call, and tokens accumulate across the whole trajectory, so an agent left to run unbounded can burn an enormous bill on a single task. Hard caps on iterations, token budgets, and timeouts aren't polish — they're load-bearing. An agent without a budget ceiling is a runaway waiting to happen.
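Both halves of that argument fit in a short sketch: why tokens pile up, and what a ceiling looks like. It assumes the full context is re-sent on every call with no caching, and the `Budget` class, its default limits, and `trajectory_input_tokens` are hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass, field

class BudgetExceeded(RuntimeError):
    """Raised when a run hits one of its hard limits."""

@dataclass
class Budget:
    max_iterations: int = 10
    max_tokens: int = 20_000
    max_seconds: float = 60.0
    iterations: int = 0
    tokens: int = 0
    started: float = field(default_factory=time.monotonic)

    def charge(self, tokens_used: int) -> None:
        """Record one model call and stop the run if any limit is crossed."""
        self.iterations += 1
        self.tokens += tokens_used
        if self.iterations > self.max_iterations:
            raise BudgetExceeded("iteration cap reached")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token budget exhausted")
        if time.monotonic() - self.started > self.max_seconds:
            raise BudgetExceeded("timeout")

def trajectory_input_tokens(prompt_tokens: int, tokens_per_step: int, steps: int) -> int:
    """Total input tokens when call i re-reads the prompt plus everything added so far."""
    return sum(prompt_tokens + i * tokens_per_step for i in range(steps))

print(trajectory_input_tokens(84, 200, 10))
```

With an 84-token starting context and an assumed 200 tokens added per step, ten calls read 9,840 input tokens in total, because each call pays again for everything before it. Calling `budget.charge(...)` once per iteration inside the loop turns a runaway into an exception the caller can handle.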

Multi-agent orchestration

Beyond a single loop, you can compose multiple agents: a planner that breaks a task into subtasks, specialist agents that each own one, and a synthesizer that combines results. Done well, this parallelizes work and lets each agent keep a focused context. Done carelessly, it multiplies every failure mode above — more loops, more tokens, more places for an error to cascade — at multiplied cost. The discipline is the same: bound it, scope each agent tightly, and verify outputs rather than trusting them.
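The planner / specialist / synthesizer shape can be sketched with plain functions. In this illustration every role is a hypothetical stub: a real planner and specialists would each be model calls with their own loops, `verify` would be a real check rather than a string test, and the cap on subtasks stands in for the bounding discussed above.

```python
from concurrent.futures import ThreadPoolExecutor

def planner(task: str) -> list[str]:
    """Stub planner: splits a task on ' and ' to get subtasks."""
    return [part.strip() for part in task.split(" and ") if part.strip()]

def specialist(subtask: str) -> str:
    """Stub specialist: sees only its own subtask, which keeps its context focused."""
    return f"result for '{subtask}'"

def verify(output: str) -> bool:
    """Stub check: accept only non-empty outputs in the expected form."""
    return output.startswith("result for ")

def synthesizer(outputs: list[str]) -> str:
    return "; ".join(outputs)

def orchestrate(task: str, max_subtasks: int = 4) -> str:
    subtasks = planner(task)[:max_subtasks]          # bound the fan-out
    with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
        outputs = list(pool.map(specialist, subtasks))  # run specialists in parallel
    checked = [out for out in outputs if verify(out)]   # verify rather than trust
    return synthesizer(checked)

print(orchestrate("get current rate and get Q2 rate"))
```

`pool.map` returns results in the order the subtasks were submitted, so the synthesizer sees them in plan order even though they ran concurrently.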

The throughline: an agent is a model plus a loop plus tools plus guardrails. The model is the easy part. The loop is where the capability comes from — and where the cost and the failures come from too.
