Directing agents effectively

Scope tasks precisely, apply the one-goal principle, checkpoint with git, and recognize when to interrupt a running agent.

Using AI | Advanced | 13 min read
By the end of this lesson you will be able to:
  • Write an agent task instruction that is scoped precisely enough to produce a reviewable result
  • Explain the one-goal-at-a-time principle and why compound tasks cause drift
  • Use git checkpoints to make agent runs safe and reversible
  • Recognize the signals that indicate an agent is going off track and intervene effectively

Knowing what an agent can do is only half the skill. The other half is knowing how to tell it what to do in a way that produces useful results. Agents follow instructions with machine precision — which means ambiguous instructions produce precisely the wrong output, at machine speed, across many files.

This lesson is about the craft of giving agents work.

From prompting to directing

The prompting skills from earlier in this track transfer directly to agent direction: be specific about the task, supply the context the agent cannot infer, say what you do and do not want. But agent direction has three concerns that conversational prompting does not:

Scope. An agent with file system and shell access can make changes you did not intend. A well-scoped instruction limits the blast radius. "Add a --dry-run flag to the CLI in cli.py, then update the test in tests/test_cli.py to cover it" is better than "add a dry run mode." The first instruction names the files and the scope. The second leaves the agent to decide what "mode" means, which files to touch, and whether to update documentation, the README, the changelog, and the tests while it is at it.

Reversibility. You need a way to undo what the agent did if you do not like it. This is what git is for.

Observation. You need to watch what the agent is doing and be ready to interrupt it. Agents are not always right. An agent that is confidently wrong and running unchecked will be wrong at scale.

The one-goal principle

The single most reliable habit in agent direction is this: give the agent one goal at a time.

It sounds limiting. It is not. A task like "add the feature, write the tests, update the documentation, and refactor the related function" is four goals, not one. Each goal is achievable. Bundled together, they create a task where:

  • The agent must decide the right order (it may guess wrong).
  • A failure in one step can contaminate others.
  • The diff is large and hard to review.
  • Halfway through, the agent may drift from the original intent while trying to satisfy all four goals simultaneously.

The alternative is to run four sequential tasks. This takes slightly longer to direct. The resulting diffs are small and focused. When one step produces a bad result, you revert just that step and try again — not the whole thing.
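The sequential approach can be sketched as a small loop. This is only an illustration under stated assumptions: `run_goals_one_at_a_time` and its four hooks (`checkpoint`, `run_agent`, `review`, `revert`) are hypothetical names, not part of any agent tool. In practice the hooks might wrap a git commit, an agent invocation, your own diff review, and a git reset.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    goal: str
    accepted: bool


def run_goals_one_at_a_time(
    goals: List[str],
    checkpoint: Callable[[str], None],
    run_agent: Callable[[str], None],
    review: Callable[[str], bool],
    revert: Callable[[], None],
) -> List[StepResult]:
    """Run each goal as its own agent task, checkpointing before each one.

    All four callables are hypothetical hooks supplied by the caller.
    review(goal) returns True when the resulting diff is acceptable.
    """
    results = []
    for goal in goals:
        checkpoint(goal)  # also records any work accepted in earlier steps
        run_agent(goal)
        if review(goal):
            results.append(StepResult(goal, accepted=True))
        else:
            # Only this step is undone; earlier accepted steps are kept.
            revert()
            results.append(StepResult(goal, accepted=False))
            break  # stop so the rejected goal can be re-scoped and retried
    return results


if __name__ == "__main__":
    log = []
    outcome = run_goals_one_at_a_time(
        goals=["add the feature", "write the tests", "update the documentation"],
        checkpoint=lambda g: log.append(f"checkpoint before: {g}"),
        run_agent=lambda g: log.append(f"agent: {g}"),
        review=lambda g: g != "write the tests",  # pretend the second diff is rejected
        revert=lambda: log.append("revert"),
    )
    for step in outcome:
        print(step.goal, "accepted" if step.accepted else "rejected")
```

The loop stops at the first rejected step on purpose: the accepted work before it is already checkpointed, so only the failed goal needs a new instruction.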

"One goal" does not mean "one line of code." A goal can be substantial: "implement the full authentication middleware, including the token validation logic and the error responses for expired and invalid tokens." That is complex work, but it is one coherent thing with a clear definition of done. The test is whether you can describe what "done" looks like before the agent starts.

What a good task instruction contains

Apply the same Task/Context structure from the Prompt Crafting module, but with two additional fields: scope and definition of done.

Task:    What do you want the agent to do?
Context: What does it need to know that it cannot read from the files?
Scope:   Which files or components should it touch? Which should it leave alone?
Done:    How will we know the task is complete?

In practice, this looks like:

Task: Add a --output flag to the CLI that writes results to a file instead of stdout.

Context: The CLI currently always prints to stdout. The output format is plain text, one result per line.

Scope: Only cli.py and tests/test_cli.py. Do not touch the parser module or the README.

Done: python -m reporter --output report.txt creates the file, and pytest tests/test_cli.py -v passes.

That instruction is longer to write. It is also unambiguous. The agent knows what to build, where to build it, what to leave alone, and how to verify its work.
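If you write instructions like this often, the four fields can be kept in a small structure that refuses to render when one is blank. The sketch below is one possible shape, not a required tool: the `AgentTask` class and its method names are made up for illustration, and the field values are copied from the example above.

```python
from dataclasses import dataclass


@dataclass
class AgentTask:
    """The four fields of a task instruction (class name is illustrative)."""
    task: str
    context: str
    scope: str
    done: str

    def missing_fields(self) -> list:
        """Return the names of fields left blank, so gaps are caught early."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self) -> str:
        """Format the instruction as labeled paragraphs for pasting into an agent."""
        missing = self.missing_fields()
        if missing:
            raise ValueError(f"incomplete task instruction, missing: {missing}")
        return (
            f"Task: {self.task}\n\n"
            f"Context: {self.context}\n\n"
            f"Scope: {self.scope}\n\n"
            f"Done: {self.done}"
        )


instruction = AgentTask(
    task="Add a --output flag to the CLI that writes results to a file instead of stdout.",
    context="The CLI currently always prints to stdout. The output format is plain text, one result per line.",
    scope="Only cli.py and tests/test_cli.py. Do not touch the parser module or the README.",
    done="python -m reporter --output report.txt creates the file, and pytest tests/test_cli.py -v passes.",
)
print(instruction.render())
```

Raising on a blank field is a deliberate choice: an instruction with no scope or no definition of done is exactly the kind that drifts.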

Checkpointing with git

Before every non-trivial agent task, commit your current state:

git add -p          # stage thoughtfully; new files need an explicit git add <path>
git commit -m "checkpoint before agent: add --output flag"

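This is your undo button. If the agent's work is not what you wanted, you can review it or discard it:

git status --short     # list modified and newly created files
git diff HEAD          # review exactly what changed in tracked files
git reset --hard HEAD  # discard all changes to tracked files, staged or not
git clean -fd          # delete untracked files too (preview with git clean -nd)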

The thirty seconds this costs is one of the highest-value habits in agentic coding. Agents move fast. A task that touches ten files can diverge significantly from your intent before you have a chance to review the first file.

Do not rely on the agent to remember where things stood before it started, and do not count on it to keep a reliable undo history of its own. Git is your history. Use it.
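The checkpoint habit is easy to wrap in a helper so it takes one call instead of two commands. The following is a minimal sketch, assuming git is installed and the script runs inside a repository. The function names are hypothetical, and it uses `git add -u` (all tracked changes, no prompting) in place of the interactive `git add -p`, which is a substitution made here for scripting, not something the workflow requires.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def run_git(args: Sequence[str]) -> None:
    """Run one git command in the current directory and fail loudly on error."""
    subprocess.run(["git", *args], check=True)


def checkpoint(label: str, run: Runner = run_git) -> None:
    """Commit the current state of tracked files before handing work to an agent.

    `git add -u` stages every tracked change without prompting and ignores
    untracked files, so check `git status` first if that is too broad.
    """
    run(["add", "-u"])
    # --allow-empty keeps the checkpoint usable even when nothing has changed.
    run(["commit", "--allow-empty", "-m", f"checkpoint before agent: {label}"])


def discard_agent_changes(run: Runner = run_git) -> None:
    """Return tracked files to the last commit.

    Assumes the agent made no commits of its own, so HEAD is still the
    checkpoint. Untracked files the agent created are left in place.
    """
    run(["reset", "--hard", "HEAD"])
```

A typical session would call `checkpoint("add --output flag")`, run the agent, review `git diff HEAD`, and call `discard_agent_changes()` only if the result is unwanted. The `run` parameter exists so the command sequence can be inspected or tested without touching a real repository.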

Recognizing and interrupting drift

Agents do not always stay on the path you intended. Common drift patterns:

Scope creep. The agent notices related code while working on the assigned task and decides to "improve" it. This is often benign but always increases review surface. If you see the agent touching files outside the scope you specified, it is time to interrupt.

Rabbit holes. The agent encounters an unexpected error and spends multiple tool calls trying to solve it — not the original task, but the obstacle. Sometimes this is right. Sometimes it is wasted effort on a problem you could solve in ten seconds with a clarification.

Over-engineering. The agent produces a solution that is technically correct but far more complex than the task warranted. Three hundred lines of code for a twenty-line feature is a signal to stop, revert, and give a more constrained instruction.

Silent assumption changes. The agent makes the requested change but also changes a related thing it noticed — renaming a variable, restructuring a function, updating a comment. These are hard to catch without a careful diff review.
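Two of these patterns, scope creep and over-engineering, leave measurable traces in the diff. As a rough sketch, the text output of `git diff --numstat HEAD` can be checked against the files you named in the scope and against a size limit you choose. The function names and the `max_added` threshold are assumptions for illustration. Note that this output does not list new untracked files, and it cannot detect silent assumption changes inside an in-scope file; those still need a careful read of the diff.

```python
from typing import Iterable, List, Tuple


def parse_numstat(output: str) -> List[Tuple[int, int, str]]:
    """Parse `git diff --numstat` output into (added, deleted, path) tuples.

    Each line is "<added> TAB <deleted> TAB <path>"; binary files report "-"
    for both counts, which is treated as 0 here.
    """
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        rows.append((
            0 if added == "-" else int(added),
            0 if deleted == "-" else int(deleted),
            path,
        ))
    return rows


def drift_report(numstat_output: str, allowed: Iterable[str], max_added: int) -> dict:
    """Flag files outside the agreed scope and an unexpectedly large diff."""
    allowed_set = set(allowed)
    rows = parse_numstat(numstat_output)
    total_added = sum(added for added, _, _ in rows)
    return {
        "out_of_scope": [path for _, _, path in rows if path not in allowed_set],
        "total_added": total_added,
        "too_large": total_added > max_added,
    }


# Sample numstat text for a run that was scoped to cli.py and tests/test_cli.py.
sample = "12\t3\tcli.py\n40\t0\ttests/test_cli.py\n7\t2\tparser.py\n"
print(drift_report(sample, allowed=["cli.py", "tests/test_cli.py"], max_added=100))
# prints {'out_of_scope': ['parser.py'], 'total_added': 59, 'too_large': False}
```

A non-empty `out_of_scope` list is the cue to interrupt and restate the scope.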

When you see drift starting, interrupt immediately. In Claude Code, press Ctrl-C. Then give a correction:

"Stop. You are modifying parser.py but the task is limited to cli.py only. Please revert parser.py and continue only in cli.py."

The cost of a mid-task interruption is small. The cost of discovering unwanted changes after a twenty-step run is much higher.

Directing agents effectively

  1. A developer asks an agent to "add the feature, write tests, update the docs, and clean up the related functions." What is the primary risk of this instruction?
  2. Which of the following are signs that an agent is drifting from the intended task? Select all that apply.
  3. True or false: Committing a git checkpoint before an agent task is only necessary for large tasks; small tasks with a clear scope do not need one.

Where to go next

You know how to give agents good instructions and how to keep runs safe. The final lesson in this module covers the other side of the loop: reviewing agent work — reading the diff, running the tests, and being the quality gate that keeps agentic coding reliable.
