Build with AI Agents

The Five Levers

What you can change to improve agent behavior

When an agent misses, the fix usually maps to one of five levers. Understanding these gives you a vocabulary for diagnosing problems and choosing interventions.

1. Observability

Can the agent see what it needs to see? This includes access to source files, schemas, logs, and runtime state.

Most hallucinations trace back here. The agent invents an API field because the schema wasn't in scope. It guesses a file path because the directory structure wasn't visible. It fabricates a CLI flag because the help text wasn't accessible.

The fix isn't "try harder"—it's making the source of truth discoverable. Put the schema in the repo. Add a manifest that lists valid paths. Ensure documentation is where the agent will look for it.
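One way to make valid paths discoverable is a generated manifest. The sketch below is a minimal illustration, not a prescribed format: the function names, the manifest shape, and the default file suffixes are all assumptions made for the example.

```python
from pathlib import Path


def build_manifest(root: Path, suffixes: tuple[str, ...] = (".json", ".md", ".py")) -> dict:
    """List the files under `root` that the agent may treat as real.

    The suffix filter is a hypothetical default; adjust it to your repo.
    """
    paths = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes
    )
    return {"root": root.name, "paths": paths}


def unknown_paths(manifest: dict, referenced: list[str]) -> list[str]:
    """Return the referenced paths that do not appear in the manifest."""
    known = set(manifest["paths"])
    return [p for p in referenced if p not in known]
```

Checking an agent's proposed file references with `unknown_paths` turns a guessed path into an explicit, inspectable miss instead of a silent invention.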

Reach for this when: the agent confidently invents things that don't exist, or misses context that seems "obvious" to you.

2. Instructions

Does the agent know what "good" looks like? Instructions encode norms, constraints, and conventions.

This is where you capture the implicit rules that live in your head. "We always use named exports." "Error messages should include the operation that failed." "Never commit directly to main." These feel obvious—until the agent violates them because nobody wrote them down.

Instructions work best when they're specific and scoped. A wall of text gets ignored. A short rule at the right moment changes behavior. Put instructions close to where they matter: in the repo, in the project config, in a skill definition.
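To make "specific and scoped" concrete, here is a small sketch that attaches rules to directories and returns only the ones that apply to a given file. The `SCOPED_RULES` mapping and the `rules_for` helper are hypothetical; the rule texts are the examples quoted above, and the directory names are invented for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical rules keyed by the directory they are scoped to ("" is the repo root).
SCOPED_RULES = {
    "": ["Never commit directly to main."],
    "src": ["We always use named exports."],
    "src/api": ["Error messages should include the operation that failed."],
}


def rules_for(path: str, scoped_rules: dict = SCOPED_RULES) -> list[str]:
    """Return the rules whose scope contains `path`, broadest scope first."""
    parents = [p.as_posix() for p in PurePosixPath(path).parents]
    # For a relative path the outermost parent is "."; map it to the root scope "".
    scopes = ["" if p == "." else p for p in reversed(parents)]
    rules: list[str] = []
    for scope in scopes:
        rules.extend(scoped_rules.get(scope, []))
    return rules
```

A file under `src/api` picks up all three rules, while a file under `docs` sees only the root rule, so the agent is not handed a wall of text that mostly does not apply.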

Reach for this when: the agent does something "wrong" that isn't technically incorrect—it just violates a norm you haven't articulated.

3. Tools

Can the agent take the right actions? Tools extend what the agent can do—fetching data, running commands, interacting with systems.

Sometimes the agent knows what it needs but can't get there. It needs to query a database, hit an API, run a build, or check the state of a service. Without tools, it's forced to guess or ask you to do it manually.

Good tools are focused and composable. They do one thing well and return structured output the agent can reason about. A tool that dumps too much data is almost as bad as no tool—the signal gets lost in noise.
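As a rough sketch of a focused tool with structured, bounded output, the function below searches log text for a keyword and returns a small dictionary rather than the raw log. The name `find_errors`, its parameters, and the result fields are assumptions chosen for the example.

```python
from dataclasses import dataclass, asdict


@dataclass
class LogMatch:
    line_number: int
    text: str


def find_errors(log_text: str, keyword: str = "ERROR", limit: int = 5) -> dict:
    """Return at most `limit` matching lines plus counts the agent can reason about."""
    matches = [
        LogMatch(line_number=i, text=line.strip())
        for i, line in enumerate(log_text.splitlines(), start=1)
        if keyword in line
    ]
    return {
        "keyword": keyword,
        "total_matches": len(matches),
        # Tell the agent explicitly when results were cut off.
        "truncated": len(matches) > limit,
        "matches": [asdict(m) for m in matches[:limit]],
    }
```

The `limit` and `truncated` fields keep the signal visible: the agent learns how many matches exist without having to read all of them.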

Reach for this when: the agent needs runtime information it can't get from static files, or when manual copy-paste is becoming a bottleneck.

4. Guardrails

What should the agent not do? Guardrails constrain behavior, preventing classes of mistakes before they happen.

Freedom is expensive. An agent with unlimited scope will eventually wander into dangerous territory: modifying files it shouldn't touch, running commands with side effects, making changes that are hard to reverse.

Guardrails work by shrinking the action space. Allowlists beat blocklists: it's easier to say "only touch these directories" than to enumerate everything that's off-limits. The best guardrails are invisible when the agent is doing the right thing and loud when it isn't.
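An allowlist for file writes can be sketched in a few lines. The directory names in `ALLOWED_DIRS` and both function names are hypothetical, and a real setup would enforce this wherever the agent's write action is actually executed.

```python
from pathlib import PurePosixPath

# Hypothetical allowlist: the only top-level directories the agent may write to.
ALLOWED_DIRS = ("src", "tests")


def is_write_allowed(path: str, allowed_dirs: tuple[str, ...] = ALLOWED_DIRS) -> bool:
    """Return True only for relative paths inside an allowlisted directory."""
    candidate = PurePosixPath(path)
    parts = candidate.parts
    # Reject absolute paths and any attempt to climb out with "..".
    if candidate.is_absolute() or ".." in parts:
        return False
    # Require a file inside an allowed directory, not the directory itself.
    return len(parts) > 1 and parts[0] in allowed_dirs


def guard_write(path: str) -> str:
    """Pass allowed paths through silently; fail loudly on everything else."""
    if not is_write_allowed(path):
        raise PermissionError(f"write blocked: {path!r} is outside {ALLOWED_DIRS}")
    return path
```

Because the check names what is permitted rather than what is forbidden, a new sensitive file elsewhere in the repo is protected by default.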

Reach for this when: the agent keeps making the same category of mistake, or when the cost of a wrong action is high enough that you'd rather prevent it than detect it.

5. Verification / Evals

How does the agent know it succeeded? Verification provides feedback loops—tests, linters, visual checks, evals.

Without verification, the agent is flying blind. It can write code that looks plausible, claim it's done, and move on—never knowing whether the change actually worked. This is how "taste-misses" and "process-misses" happen: technically complete, practically broken.

Verification closes the loop. Tests prove behavior. Linters enforce style. Type checkers catch structural errors. Visual evals catch UI regressions. The more immediate the feedback, the better—a failure the agent sees during the task is worth ten it discovers later.
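A feedback loop can be as simple as running each check and handing the agent a pass/fail summary before it declares the task done. This is a sketch under assumptions: `CheckResult`, `run_checks`, and `summarize` are made-up names, and the two commands in `CHECKS` are stand-ins for whatever test, lint, or type-check commands a project really uses.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_check(name: str, command: list[str], timeout: float = 60.0) -> CheckResult:
    """Run one command; a zero exit code counts as a pass."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # A missing or hung tool is reported as a failure, not a crash.
        return CheckResult(name, False, str(exc))
    return CheckResult(name, proc.returncode == 0, (proc.stdout + proc.stderr).strip())


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    return [run_check(name, command) for name, command in checks.items()]


def summarize(results: list[CheckResult]) -> str:
    """One line per check, short enough to read during the task."""
    return "\n".join(f"{'PASS' if r.passed else 'FAIL'} {r.name}" for r in results)


# Hypothetical stand-in checks; swap in your real test, lint, and type-check commands.
CHECKS = {
    "passes": [sys.executable, "-c", "print('ok')"],
    "fails": [sys.executable, "-c", "raise SystemExit(1)"],
}
```

Calling `summarize(run_checks(CHECKS))` gives the agent a result it can act on immediately, which is the point of keeping feedback close to the task.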

Reach for this when: the agent claims "done" but the result doesn't match expectations, or when you're catching errors in review that should have been caught automatically.

The Loop

  1. Miss: Capture what went wrong

  2. Diagnosis: What didn't it see?

  3. Primitive: Which lever to pull

  4. Artifact: Encode the fix

  5. Gate: Enforce when ready
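The five steps of the loop can be tracked as a single record per miss. The sketch below is one possible shape, not a defined format: `Lever`, `MissRecord`, and `next_step` are hypothetical names, and the field names simply mirror the step names above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Lever(Enum):
    OBSERVABILITY = "observability"
    INSTRUCTIONS = "instructions"
    TOOLS = "tools"
    GUARDRAILS = "guardrails"
    VERIFICATION = "verification"


@dataclass
class MissRecord:
    miss: str                      # step 1: what went wrong
    diagnosis: str = ""            # step 2: what the agent lacked
    lever: Optional[Lever] = None  # step 3: which lever to pull
    artifact: str = ""             # step 4: where the fix is encoded
    gated: bool = False            # step 5: whether the fix is enforced

    def next_step(self) -> str:
        """Name the first step of the loop that is still incomplete."""
        if not self.diagnosis:
            return "Diagnosis"
        if self.lever is None:
            return "Primitive"
        if not self.artifact:
            return "Artifact"
        if not self.gated:
            return "Gate"
        return "Done"
```

Keeping the record explicit makes it harder to stop at diagnosis: a miss is not closed until an artifact exists and, when ready, a gate enforces it.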