Lab: Build a feature with an agent
End-to-end practice — write a CLAUDE.md, give the agent a feature request, review the diff, fix failures, and reflect.
- Run the complete agentic coding loop from setup to verified feature
- Write a CLAUDE.md that gives an agent accurate project context before it starts
- Review an agent-generated diff systematically and identify any issues
- Reflect on where the agent added value and where human judgment was required
This lab is not a reading exercise. You will use a real agent on a real project. Every skill from this module — memory files, task scoping, diff review, test verification — gets applied in sequence. Work through each step before moving to the next.
The starter project
The project is a small Python CLI tool called counter. It reads a text file and
counts how many times each word appears, printing a sorted frequency table to
stdout.
Create this structure on your machine:

    counter/
        counter.py
        tests/
            test_counter.py
        requirements.txt

counter.py:

    import sys
    import re
    from collections import Counter


    def count_words(text: str) -> dict[str, int]:
        words = re.findall(r"[a-z]+", text.lower())
        return dict(Counter(words))


    def format_table(counts: dict[str, int]) -> str:
        sorted_items = sorted(counts.items(), key=lambda x: (-x[1], x[0]))
        lines = [f"{count:>6} {word}" for word, count in sorted_items]
        return "\n".join(lines)


    def main():
        if len(sys.argv) != 2:
            print("Usage: python counter.py <file>", file=sys.stderr)
            sys.exit(1)
        path = sys.argv[1]
        with open(path) as f:
            text = f.read()
        print(format_table(count_words(text)))


    if __name__ == "__main__":
        main()

tests/test_counter.py:

    from counter import count_words, format_table


    def test_count_words_basic():
        result = count_words("the cat sat on the mat")
        assert result["the"] == 2
        assert result["cat"] == 1


    def test_count_words_ignores_case():
        result = count_words("The THE the")
        assert result["the"] == 3


    def test_format_table_order():
        counts = {"apple": 3, "banana": 1, "cherry": 3}
        table = format_table(counts)
        lines = table.strip().split("\n")
        # apple and cherry both have count 3; apple comes first alphabetically
        assert "apple" in lines[0]
        assert "cherry" in lines[1]
        assert "banana" in lines[2]


    def test_format_table_empty():
        assert format_table({}) == ""

requirements.txt:
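
    pytest

Verify the project works:

    cd counter
    pip install -r requirements.txt
    pytest tests/ -v

All four tests should pass before you continue. If pytest instead reports ModuleNotFoundError: No module named 'counter', run python -m pytest tests/ -v, which adds the current directory to the import path.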
Step 1: Write your CLAUDE.md
Before running any agent, write a CLAUDE.md in the project root. Use what you learned in the agent memory files lesson. Your file should cover:
- What the project does (one paragraph)
- How to run the tool and how to run the tests
- Conventions you want the agent to follow (function size, type hints, docstrings)
- Things the agent should not do (modify tests/test_counter.py without asking, add dependencies not in requirements.txt)
Take your time here. A good memory file prevents the most common agent mistakes before they happen.
There is no single correct CLAUDE.md — the right answer is one that accurately describes your project and clearly states the constraints that matter to you. If you are not sure what to include, re-read the memory files lesson and use the starter template there.
Step 2: Commit a checkpoint
Before giving the agent any work, commit what you have:
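
    git init
    git add .
    git commit -m "initial counter project"

This is your rollback point. If the agent run goes sideways, git checkout . returns every tracked file to this state, provided nothing has been staged since the commit. It does not remove new files the agent created; delete those by hand or with git clean -fd.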
Step 3: Give the agent a feature request
Launch Claude Code in the project directory:

    claude

Give it this task:

    Add a --top N flag to the CLI. When provided, the output should show
    only the N most frequent words instead of all of them. For example:
    python counter.py --top 5 file.txt prints only the five most frequent
    words.

    Scope: change application code in counter.py only. Add a test for the
    new flag in tests/test_counter.py. The existing tests must still pass.

    Done when: python counter.py --top 5 somefile.txt produces five lines
    (for a file with at least five distinct words), and pytest tests/ -v
    passes.

Watch the agent work. Notice:
- Which files it reads before writing anything
- Whether it looks at the existing tests before writing new ones
- Whether it stays within the scope you gave it
Do not intervene unless you see clear drift. Let it finish.
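Once the agent has finished, scope can also be checked mechanically rather than by eye. The sketch below is one possible way to do that, assuming the checkpoint commit from Step 2 exists. The names out_of_scope, changed_paths, and ALLOWED are hypothetical and exist only for this illustration; the two git commands list tracked files that differ from HEAD and new untracked files, respectively.

```python
import subprocess

# Files the task allowed the agent to touch (an assumption based on the scope above).
ALLOWED = {"counter.py", "tests/test_counter.py"}


def out_of_scope(changed: list[str], allowed: set[str]) -> list[str]:
    """Return the changed paths that are not in the allowed set, sorted."""
    return sorted(path for path in changed if path not in allowed)


def changed_paths() -> list[str]:
    """List tracked files that differ from HEAD plus new untracked files."""
    commands = [
        ["git", "diff", "--name-only", "HEAD"],
        ["git", "ls-files", "--others", "--exclude-standard"],
    ]
    paths: list[str] = []
    for command in commands:
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        paths.extend(line for line in result.stdout.splitlines() if line)
    return paths


# Example with a hand-written list; inside the repository, pass changed_paths() instead.
print(out_of_scope(["counter.py", "requirements.txt", "utils.py"], ALLOWED))
# ['requirements.txt', 'utils.py']
```

Without a .gitignore, cache directories such as __pycache__ are likely to appear in the result as well, so treat the output as a prompt for a closer look rather than a verdict.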
Step 4: Review the diff
The agent has finished. Now review its work before accepting it.

    git diff HEAD

git diff does not list newly created, untracked files, so run git status as well. Then work through the red flag checklist from the reviewing agent work lesson:
- Did any error handling get removed?
- Are there any hardcoded values that should come from the argument?
- Were any identifiers renamed without cause?
- Did the agent modify any files outside the scope you specified?
- Does the new code handle --top 0? --top 100 on a file with only 3 words? A negative number?
Write down at least one observation — something you would not have known without reading the diff carefully.
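The edge cases in the last checklist item are worth trying directly, because one common way to implement --top N is to slice the sorted list, and Python slicing accepts values that make no sense as a count. The sketch below is an assumption about what the agent's code might look like, not the expected solution: top_n and positive_int are hypothetical names, and the sort key is copied from format_table.

```python
import argparse


def top_n(counts: dict[str, int], n: int) -> list[tuple[str, int]]:
    """Hypothetical helper: the n most frequent (word, count) pairs."""
    # Same ordering as format_table: highest count first, then alphabetical.
    ranked = sorted(counts.items(), key=lambda x: (-x[1], x[0]))
    return ranked[:n]


def positive_int(value: str) -> int:
    """Hypothetical argparse type that rejects zero and negative values."""
    number = int(value)  # a ValueError here is reported by argparse as an invalid value
    if number < 1:
        raise argparse.ArgumentTypeError(f"must be at least 1, got {number}")
    return number


counts = {"the": 3, "cat": 2, "mat": 1}

print(top_n(counts, 0))    # [] : an empty table, which may or may not be what you want
print(top_n(counts, 100))  # all three pairs: slicing past the end is safe
print(top_n(counts, -1))   # [('the', 3), ('cat', 2)] : silently drops the last word
```

If the diff slices without validating N, a negative value produces a plausible-looking but wrong table. Passing something like positive_int as the type of the argparse argument is one way to reject such input before it reaches the slice.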
Step 5: Run the tests

    pytest tests/ -v

If all tests pass: good. Check that the new test actually covers the new behaviour and is not trivially easy to satisfy.
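A test can pass without proving much. The sketch below contrasts a trivially satisfiable test with one that pins the behaviour down. Here top_n is a hypothetical stand-in for whatever function the agent actually added, defined only so the example runs on its own.

```python
def top_n(counts: dict[str, int], n: int) -> list[tuple[str, int]]:
    """Hypothetical stand-in for the agent's implementation."""
    ranked = sorted(counts.items(), key=lambda x: (-x[1], x[0]))
    return ranked[:n]


def test_top_weak():
    # Passes for almost any implementation, including one that ignores n
    # or returns the least frequent words.
    result = top_n({"the": 3, "cat": 2, "mat": 1}, 2)
    assert result is not None


def test_top_strong():
    # Checks the length, the selection, and the tie-break order.
    counts = {"apple": 3, "banana": 1, "cherry": 3, "date": 2}
    assert top_n(counts, 2) == [("apple", 3), ("cherry", 3)]
    assert top_n(counts, 3) == [("apple", 3), ("cherry", 3), ("date", 2)]
```

When you read the agent's new test, ask which of these two it resembles: would it still pass if the feature were broken in an obvious way?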
If tests fail: do not immediately ask the agent to fix them. First read the failure output yourself and identify what went wrong. Then give the agent a precise correction:
"The test test_top_flag is failing with [error message]. The problem is [what you observed]. Fix only the function [name] in counter.py."
A precise failure description produces a precise fix. "The tests are failing, please fix" produces an agent that changes things until the tests pass — which is not the same as understanding and fixing the problem.
Step 6: Try the feature manually
Create a small test file:

    echo "the cat sat on the mat the cat" > sample.txt
    python counter.py sample.txt
    python counter.py --top 2 sample.txt

Does --top 2 show exactly two lines? Does the tool handle missing files gracefully? Does --top without a number produce a useful error?
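For reference when judging the manual run, the sketch below shows what the unmodified tool prints for the sample text, and how argparse, one likely choice for parsing the flag, reacts to a missing number. The two counting functions are copied from the starter counter.py (with format_table condensed); build_parser is a hypothetical sketch, and the agent may have parsed the flag differently.

```python
import argparse
import re
from collections import Counter


def count_words(text: str) -> dict[str, int]:
    words = re.findall(r"[a-z]+", text.lower())
    return dict(Counter(words))


def format_table(counts: dict[str, int]) -> str:
    sorted_items = sorted(counts.items(), key=lambda x: (-x[1], x[0]))
    return "\n".join(f"{count:>6} {word}" for word, count in sorted_items)


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the agent's version may differ."""
    parser = argparse.ArgumentParser(prog="counter.py")
    parser.add_argument("file")
    parser.add_argument("--top", type=int, metavar="N", default=None)
    return parser


table = format_table(count_words("the cat sat on the mat the cat"))
print(table)
# Full table, five lines:
#      3 the
#      2 cat
#      1 mat
#      1 on
#      1 sat

# With --top 2, only the first two lines should remain.
print("\n".join(table.split("\n")[:2]))

args = build_parser().parse_args(["--top", "2", "sample.txt"])
print(args.top, args.file)  # 2 sample.txt

# "--top" with no number: argparse prints a usage error and exits with status 2.
try:
    build_parser().parse_args(["--top"])
except SystemExit as exc:
    print("exit status:", exc.code)  # exit status: 2
```

If the agent parsed sys.argv by hand instead, the missing-number case is exactly where a raw traceback is most likely to appear.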
Step 7: Reflect
Before you close the session, answer these three questions, ideally in writing (a comment in your CLAUDE.md or a note in a file):
- What did the agent do well? Where did it save you time, produce accurate code, or make a reasonable decision you would have made yourself?
- Where did you have to intervene or correct it? What did the agent miss, misunderstand, or do in a way you did not want?
- What would have happened if you had not reviewed the diff? Is there anything in the diff that would have caused a problem if you had shipped it unreviewed?
The pattern you are looking for: the agent is fast and often right; your review is the mechanism that makes "often" into "reliably."
Where to go next
You have completed the Agentic Coding module. The next module, Advanced Agent Patterns, goes deeper — multi-step workflows, hooks and automation, MCP servers, and the security model you need to run agents safely on real projects.