Multi-step agent workflows
Orchestrate complex tasks across many files and phases by treating the agent as an engineer you manage, not a tool you run once.
- Break a complex task into phases and run the agent on each phase independently
- Use the output of one agent task as the structured input to the next
- Describe the orchestrator mental model and which decisions must stay with the human
- Recognise when a task is too large for a single agent run, and know how to split it
The Agentic Coding module taught you the fundamental loop for a single task: write a good instruction, run the agent, review the diff, verify the tests. That loop works well for features that fit inside one focused session.
Real projects do not always fit inside one focused session.
A production feature might require updating a database schema, writing a migration, generating model code from the schema, updating API handlers to use the new model, and writing integration tests that cover the whole path. That is five distinct phases, each depending on the previous one. Running all five as one agent instruction produces a diff that is very large and very hard to review, and mistakes made in early phases compound through the later ones.
The pattern for this is phased orchestration.
The orchestrator mental model
Think of yourself as a project manager and the agent as an engineer. A good project manager does not hand an engineer an entire feature specification and say "build this." They break the work into deliverables, hand one deliverable at a time, review each before the next starts, and make the architectural decisions themselves.
The agent is the engineer. You are the PM. The work gets done faster when the PM is actively managing the handoffs — not absent.
What you must never delegate to the agent:
- Architectural decisions. Which approach to use, how components are structured, which technology to select.
- Security choices. How authentication works, what gets logged, which data is exposed.
- Understanding what the code does. You need to be able to read and explain the code the agent produced. If you cannot, you cannot review it or debug it.
Everything else can be delegated, in well-scoped pieces.
The phased approach
Here is the multi-step workflow pattern, concrete and repeatable:
Phase 1: Design. Ask the agent to produce a design artifact — a schema, an interface, an API contract, a list of functions — but not the implementation. Review this artifact. It is small, conceptual, and easy to catch problems in. This is where you catch architectural mistakes before they are written into code.
Phase 2: Implement from the design. Give the agent the design from Phase 1 as explicit context. "Here is the schema we agreed on. Implement the migration." The agent now has a precise target, not an open-ended specification.
Phase 3: Test. Give the agent the implementation from Phase 2. "Here is the migration code. Write integration tests that verify the migration runs cleanly on a fresh database." The agent has the actual code to test, not a description of it.
After every phase: Review the output before starting the next phase. This is not a final step but a gate between phases, and it is the step most people skip. If you skip it, the Phase 3 tests verify whatever Phase 2 implemented, including Phase 2's mistakes.
Passing the output of one phase as the explicit context for the next phase is what makes this reliable. Do not assume the agent remembers what it produced in Phase 1 when you start Phase 2. Copy the artifact into your next instruction, or reference the specific file it is in.
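The handoff described above can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any particular agent tool: the function name `build_phase_instruction` and the idea of saving the reviewed artifact to a file are assumptions made for the example.

```python
from pathlib import Path


def build_phase_instruction(task: str, artifact_path: Path, label: str) -> str:
    """Return an instruction that embeds the reviewed artifact verbatim.

    Hypothetical helper: it assumes you saved the reviewed output of the
    previous phase to a file before starting the next phase.
    """
    artifact = artifact_path.read_text(encoding="utf-8").strip()
    if not artifact:
        # An empty artifact means the previous phase was never completed.
        raise ValueError(f"{artifact_path} is empty; finish the previous phase first")
    # The artifact is pasted in full, so the agent does not have to recall it.
    return f"Here is the {label} we agreed on:\n\n{artifact}\n\n{task}"
```

Calling `build_phase_instruction("Implement the migration.", Path("schema.sql"), "schema")` would produce an instruction that opens with the agreed schema and ends with the task, which mirrors the Phase 2 wording above. The file name `schema.sql` is also only an example.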
A concrete example
Suppose you are adding a new entity, Tag, to a web application. Here is how the phased approach looks:
Phase 1 instruction:
Design the data model for a Tag entity. A tag has a name (string, unique per user), a colour (hex string, optional), and a creation timestamp. Output a SQL schema definition only: no migration, no application code, no tests. Just the CREATE TABLE statement.
Review the schema. Does it have the right columns, types, and constraints? Is the primary key strategy consistent with the rest of your schema? Fix any issues now, before writing a line of application code.
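One way to make that review concrete is to load the proposed schema into a throwaway database and probe its constraints. The sketch below uses Python's standard `sqlite3` module. The schema is a hypothetical Phase 1 artifact: the column names, the `user_id` column, and the SQLite types are assumptions for illustration, and your real schema will follow your own database and conventions.

```python
import sqlite3

# Hypothetical Phase 1 artifact; names and types are illustrative only.
SCHEMA = """
CREATE TABLE tag (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    name       TEXT    NOT NULL,
    colour     TEXT,
    created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (user_id, name)
);
"""


def check_schema(schema: str) -> dict:
    """Load the schema into an in-memory database and probe its constraints."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = [row[1] for row in conn.execute("PRAGMA table_info(tag)")]
        conn.execute("INSERT INTO tag (user_id, name) VALUES (1, 'work')")
        # Same name for a different user must be allowed.
        conn.execute("INSERT INTO tag (user_id, name) VALUES (2, 'work')")
        try:
            # Same name for the same user must be rejected.
            conn.execute("INSERT INTO tag (user_id, name) VALUES (1, 'work')")
            duplicate_rejected = False
        except sqlite3.IntegrityError:
            duplicate_rejected = True
        return {"columns": columns, "duplicate_rejected": duplicate_rejected}
    finally:
        conn.close()
```

A check like this answers "is the name unique per user?" by running the schema rather than by reading it. It does not replace reading the schema yourself, and it will not catch behaviour that differs between SQLite and your production database.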
Phase 2 instruction:
Here is the schema we agreed on: [paste the schema].
Write the SQLAlchemy model class for Tag in models/tag.py. Follow the patterns in models/user.py. Do not create the migration yet.
Review the model. Does it match the schema? Are the relationships defined correctly? Does it follow the same conventions as the rest of the models directory?
Phase 3 instruction:
Here is the Tag model: [paste models/tag.py].
Write the Alembic migration that creates this table. The migration should be in migrations/versions/. Check the existing migrations in that directory for the correct format.
Review the migration. Does it produce the table the schema describes? Does the downgrade() function correctly drop the table?
Each phase produces a small, reviewable artifact. Mistakes are caught before they compound.
When a task is too large
How do you know when a task needs to be split into phases? Look for these signals:
- The instruction takes more than a paragraph to write. Long instructions often describe multiple distinct deliverables bundled together.
- The expected diff touches more than three or four files. Larger diffs are harder to review, and review is your quality gate.
- The task involves a dependency between components. "Write the API endpoint that uses the new model" depends on the model being correct first. Make correctness a checkpoint, not an assumption.
- You are not sure what "done" looks like until you see Phase 1. This is actually the most common case. A design phase resolves ambiguity before the implementation starts.
Phased orchestration takes more agent interactions than one large prompt. It usually does not take more total time once you account for the cost of debugging large, compounded mistakes. The discipline of small phases is an investment that pays off in shorter reviews and less debugging.
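The signals above can be written down as a rough checklist. The sketch below is a heuristic only: the `TaskEstimate` fields, the function name, and the default threshold of four files are assumptions drawn from the wording of the list, not measured values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskEstimate:
    """Your own rough estimate of a task before handing it to the agent."""
    instruction: str
    expected_files: int
    depends_on_unreviewed_work: bool
    done_is_clear: bool


def split_signals(task: TaskEstimate, max_files: int = 4) -> List[str]:
    """Return the reasons, if any, to split the task into phases."""
    signals = []
    # Paragraphs are separated by blank lines in the instruction text.
    paragraphs = [p for p in task.instruction.split("\n\n") if p.strip()]
    if len(paragraphs) > 1:
        signals.append("instruction is longer than one paragraph")
    if task.expected_files > max_files:
        signals.append("expected diff touches too many files to review well")
    if task.depends_on_unreviewed_work:
        signals.append("task depends on a component that is not yet reviewed")
    if not task.done_is_clear:
        signals.append("'done' is unclear until a design exists")
    return signals
```

An empty list suggests the task fits a single focused run. Any entry is a prompt to consider a design phase first, and the final judgement stays with you.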
Chaining outputs
The key technique is treating each phase's output as a first-class artifact:
- Ask the agent to produce only the artifact for this phase.
- Review and, if necessary, edit that artifact.
- In the next phase instruction, paste or reference the artifact explicitly.
Never assume the agent's context reliably connects Phase 2 to Phase 1. Context windows are large, but even within a single session the agent may summarise earlier outputs rather than recall them precisely. Explicit reference is more reliable than implicit continuity.
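The three steps can also be modelled as a small gate in code, so that the next instruction cannot be built until the previous artifact has been reviewed. This is a sketch under assumptions: the `Phase` class, its fields, and `next_instruction` are hypothetical names, and the `approved` flag stands in for your own human review.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    task: str
    artifact: str = ""      # text the agent produced, possibly edited by you
    approved: bool = False  # set only after a human has reviewed the artifact


def next_instruction(previous: Phase, current: Phase) -> str:
    """Build the instruction for `current` from the reviewed output of `previous`."""
    if not previous.artifact:
        raise RuntimeError(f"'{previous.name}' has produced no artifact yet")
    if not previous.approved:
        raise RuntimeError(f"Review '{previous.name}' before starting '{current.name}'")
    # Paste the artifact explicitly instead of relying on the agent's memory.
    return (
        f"Here is the reviewed output of {previous.name}:\n\n"
        f"{previous.artifact}\n\n"
        f"{current.task}"
    )
```

The design choice worth noting is that approval is explicit state rather than an assumption: the chain fails loudly when a review was skipped, instead of silently passing an unreviewed artifact forward.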
Where to go next
Phased orchestration makes complex tasks manageable. The next lesson extends your control further — hooks and automation let you build an environment that enforces your standards automatically, so the agent operates within guardrails without you having to specify them in every instruction.