Agentic Delegation
Agentic delegation means giving Claude a multi-step task and letting it execute autonomously: making decisions, calling tools, and producing results without a prompt from you at each step. This unlocks significant leverage but requires careful scoping, guardrails, and a systematic review process.
What Makes a Task Suitable for Agentic Delegation
Not all tasks benefit from agentic execution. Good and poor candidates differ along several characteristics:
Good candidates
- Tasks with a clear, verifiable success condition
- Tasks where steps are well-understood and repeatable
- Tasks where mistakes are low-cost or reversible
- Tasks that would take many manual prompts if done step-by-step
- Tasks in a bounded domain (e.g., process all files in this folder)
Poor candidates
- Tasks with ambiguous success criteria (Claude can't tell when it's done)
- Tasks where mistakes are expensive, irreversible, or affect others
- Tasks requiring value judgements or subjective decisions mid-way
- Tasks touching production systems, financial transactions, or personal data
- Tasks where you'd want to review every sub-decision before proceeding
Defining Scope: What Claude Can and Cannot Do
Clear scope boundaries are the most important part of an agentic task prompt. State both what Claude should do AND what it must not do:
- Permitted actions: "You may read and write files in the /output folder. You may call the listed APIs with the provided credentials. You may create new files but not delete existing ones."
- Hard stops: "Do not send any emails. Do not push to any remote branch. Do not modify files in /config."
- Decision boundaries: "If you encounter a file type I haven't listed, stop and ask rather than guessing."
- Ambiguity handling: "When you're unsure whether an action is within scope, ask rather than proceed."
Claude Code and API-based agents can enforce scope through tool permissions: expose only the tools Claude is authorised to use. Permission boundaries are more reliable than instruction-only scope controls, because a tool that is not exposed cannot be called.
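As an illustration of turning the example scope statements into a check a tool wrapper could run before acting, here is a minimal sketch. The function name is_action_allowed, the action names, and the policy constants are hypothetical and chosen only to mirror the /output and /config examples above; they are not part of Claude Code or any API.

```python
from pathlib import PurePosixPath

# Hypothetical scope policy mirroring the example prompts above.
WRITABLE_ROOT = PurePosixPath("/output")    # read, write, and create allowed here
PROTECTED_ROOT = PurePosixPath("/config")   # hard stop: never touch
ALLOWED_ACTIONS = {"read", "write", "create"}  # "delete" is deliberately absent


def _is_within(path: PurePosixPath, root: PurePosixPath) -> bool:
    """Return True if path equals root or sits underneath it."""
    return path == root or root in path.parents


def is_action_allowed(action: str, path: str) -> bool:
    """Decide whether a file action falls inside the delegated scope."""
    target = PurePosixPath(path)
    # Reject parent-directory hops so "/output/../config" cannot slip through.
    if ".." in target.parts:
        return False
    if action not in ALLOWED_ACTIONS:
        return False
    if _is_within(target, PROTECTED_ROOT):
        return False
    return _is_within(target, WRITABLE_ROOT)
```

A wrapper around each file tool could call is_action_allowed first and refuse (or stop and ask) when it returns False, so the boundary holds even if the prompt instructions are misread.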
Checkpoints and Approval Gates
For longer agentic tasks, build in explicit checkpoints rather than running fully end-to-end:
- Phase checkpoints: "After completing the research phase, summarise what you found and wait for my approval before proceeding to drafting."
- Count-based checkpoints: "Process 5 files and show me the results. Wait for my approval before processing the rest."
- Exception-based checkpoints: "If you encounter an error or anything unexpected, stop and report rather than continuing."
- Confidence-based checkpoints: "If your confidence in an action drops below [threshold], ask for input rather than guessing."
Checkpoints add latency but significantly reduce the cost of course-correction. Catching a problem at a checkpoint after 10% of the work costs far less than redoing the whole task from the beginning.
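The count-based checkpoint can be sketched as a small driver loop. This is one possible shape, assuming the work can be expressed as a per-item function; run_with_checkpoint, process, and approve are hypothetical names, and approve stands in for whatever human approval step you use.

```python
from typing import Callable, Iterable, List


def run_with_checkpoint(
    items: Iterable[str],
    process: Callable[[str], str],
    approve: Callable[[List[str]], bool],
    sample_size: int = 5,
) -> List[str]:
    """Process a small sample, pause for approval, then process the rest."""
    pending = list(items)

    # Phase 1: handle only the first few items.
    results = [process(item) for item in pending[:sample_size]]

    # Checkpoint: show the sample results and wait for a decision.
    if not approve(results):
        return results  # stop early; nothing beyond the sample was touched

    # Phase 2: approval granted, so finish the remaining items.
    results.extend(process(item) for item in pending[sample_size:])
    return results
```

If the reviewer rejects the sample, only sample_size items need to be redone, which is the cost saving the checkpoint is meant to buy.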
Reviewing Agentic Outputs Systematically
Agentic outputs require more structured review than single-turn responses, because multiple decisions were made autonomously:
- Ask Claude to produce a summary of what it did — every action taken, every decision made, in order
- Review the action log against your intended scope: did it stay within boundaries?
- Spot-check a sample of the outputs (not just the final result) — intermediate steps reveal reasoning errors
- Verify the success condition: does the output actually meet the stated goal?
- Check for unintended side effects: files created, APIs called, state changed beyond the intended scope
Asking Claude "What decisions did you make that I should review?" often surfaces the highest-risk choices for targeted scrutiny.
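Checking the action log against scope can be partly mechanical. The sketch below assumes the log has already been captured as structured records; the Action fields, the tool names, and the out_of_scope helper are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Action:
    step: int      # position in the run
    tool: str      # hypothetical tool name, e.g. "write_file"
    target: str    # what the tool acted on, e.g. a file path


def out_of_scope(
    log: List[Action], allowed_tools: Set[str], allowed_prefix: str
) -> List[Action]:
    """Return every logged action that used an unlisted tool or an outside target."""
    return [
        action
        for action in log
        if action.tool not in allowed_tools
        or not action.target.startswith(allowed_prefix)
    ]


log = [
    Action(1, "read_file", "/output/a.txt"),
    Action(2, "send_email", "bob@example.com"),
    Action(3, "write_file", "/config/app.yaml"),
]
flagged = out_of_scope(log, {"read_file", "write_file"}, "/output/")
```

Here flagged holds steps 2 and 3: one used a tool that was never permitted, the other wrote outside the permitted folder. A filter like this narrows the review; it does not replace spot-checking the outputs themselves.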
Failure Recovery: When an Agentic Task Goes Wrong
Agentic tasks fail differently from single-turn interactions. Common failure modes, and how to recover from each:
- Scope violation: Claude acted outside the defined boundaries. Assess what was done, undo any actions that exceeded scope if possible, and add more explicit scope constraints to the next attempt.
- Wrong direction: Claude completed the task but misunderstood the goal. Clarify the success condition and retry — the action log from the failed run is useful input for the next prompt.
- Partial completion: Claude stopped mid-task. Determine the last successful step, provide its output as context, and resume from there rather than starting over.
- Compounding errors: An early mistake caused downstream problems. Roll back to a known good state (if possible), identify the root cause, fix the constraint or instruction that allowed the early error, then restart.
After any failure, update the task spec before retrying — the failure reveals a gap in your scope definition or success criteria.
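For the partial-completion case, resuming rather than restarting comes down to finding the last step that succeeded and continuing from the next one. A minimal sketch, assuming the plan is an ordered list of step names and each step's outcome was recorded; both helper names are hypothetical.

```python
from typing import List, Optional, Tuple

StepResult = Tuple[str, bool]  # (step name, succeeded?)


def last_successful_step(results: List[StepResult]) -> Optional[int]:
    """Index of the last step in the unbroken run of successes, or None."""
    last = None
    for index, (_, succeeded) in enumerate(results):
        if not succeeded:
            break  # anything after the first failure is not trusted
        last = index
    return last


def remaining_steps(plan: List[str], results: List[StepResult]) -> List[str]:
    """Steps still to run when resuming from the last successful one."""
    last = last_successful_step(results)
    return list(plan) if last is None else plan[last + 1:]


plan = ["fetch", "parse", "transform", "write"]
results = [("fetch", True), ("parse", True), ("transform", False)]
todo = remaining_steps(plan, results)
```

In this example todo is ["transform", "write"], so the retry prompt can supply the output of "parse" as context and ask Claude to continue from "transform".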
Checklist: Do You Understand This?
- Good agentic candidates: clear success condition, reversible mistakes, bounded domain, repetitive multi-step work
- Define scope explicitly — permitted actions, hard stops, and how to handle ambiguity — before delegating
- Use phase or count-based checkpoints to catch errors early rather than running fully end-to-end
- Review agentic outputs via the action log: what decisions were made, did they stay in scope, any unintended side effects?
- After any failure, update the task spec to close the gap that allowed the failure before retrying