Building Effective Agents · 2026.04

Agentic AI
Patterns Guide

Eight core patterns for designing production LLM systems.
When to use which, how to combine them, and what their failure modes are.

8 core patterns · 3 multi-agent topologies · 10+ hand-crafted diagrams

Workflow vs Agent

Workflows orchestrate LLMs along predefined paths: predictable and easy to debug.
Agents let the LLM choose its own tools and path: flexible, but more expensive and error-prone.

Most problems are solved with a single LLM call + RAG + a well-written prompt. Add complexity only when there is a measurable performance gain. This guide is based on Anthropic's "Building effective agents" taxonomy.

🤖
See also
Want to learn what agents are and how they've evolved from the ground up? The AI Agent Complete Guide covers 7 maturity levels and Memory/RAG/Guardrails architecture in depth.
Agent Guide →

8 Core Patterns

Diagram legend: Input / User · LLM Call · Decision / Check · Tool / Worker · Memory / Storage · Success Output · Error / Exit
00
Augmented LLM — the foundation block
Foundation

The smallest unit of an agentic system: LLM + Tools + Memory + Retrieval. Every pattern below reduces to a question of how to compose multiple instances of this block.

🧱 Building Block · Hub & Spokes The atomic unit for every pattern
🔧
Tools
function calls
web db api
💾
Memory
state
short long
🧠
LLM
reasoning core
📚
Retrieval
RAG · search
vector bm25
Core Principle
The LLM decides when and how to invoke tools / memory / retrieval on its own
Interface
Function calling · Tool use API · Retrieval augmentation
Why It Matters
Every pattern below is a question of how to compose this block
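The Augmented LLM block can be sketched as a loop in which the model decides for itself when to pull in retrieval. A minimal Python sketch, with `fake_llm` and `retrieve` as stubs standing in for a real model API and a real vector/BM25 search (all names here are illustrative assumptions):

```python
def fake_llm(prompt: str, context: list[str]) -> dict:
    """Stub model: requests retrieval once, then answers from context."""
    if not context:
        return {"action": "retrieve", "query": "earbuds specs"}
    return {"action": "answer", "text": f"answer grounded in {len(context)} retrieved docs"}

def retrieve(query: str) -> list[str]:
    """Stub retrieval (RAG) -- would be vector or BM25 search in practice."""
    return [f"doc about {query}"]

def augmented_llm(prompt: str) -> str:
    memory: list[str] = []  # short-term working memory
    while True:
        step = fake_llm(prompt, memory)
        if step["action"] == "retrieve":  # the LLM decides when to use retrieval
            memory.extend(retrieve(step["query"]))
        else:
            return step["text"]
```

The key design point the sketch preserves: the loop does not hard-code a retrieval step; the model's own output chooses the next action.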
01
Prompt Chaining
Sequential Simple

Break a task into sequential steps, where each step's output becomes the next step's input. Insert gates (validators) between steps to halt or retry on failure.

📝 Sequential Pipeline Example: Marketing Copy Generation
📥
Brief
input
New earbuds · Gen Z target
1
📋
Outline
LLM · structure
Hook / features / CTA structure
🚦
Gate
check
length tone keywords
↩️
Retry
on fail
Regenerate outline or abort
pass ✓
2
✍️
Draft
LLM · write
Writes draft from outline
3
🔍
Refine
LLM · polish
Grammar & tone polish
Copy
final
Finished copy
When to use
When the task decomposes into a fixed set of steps
Canonical example
Marketing copy: outline → draft → translate → review
Failure mode
Errors in early steps propagate downstream → gates are mandatory
02
Routing
Classifier Simple

Classify the input and route it to the right specialized handler. Each handler stays focused on its domain, so quality goes up.

🧭 Classifier + Specialized Handlers Example: Customer Support Ticket Routing
Incoming tickets
💳 "I paid — can I get a refund?"
🐛 "Login button doesn't work 500 error"
❓ "Where can I change my account password?"
🧭
Router LLM
intent classifier
Haiku · lightweight
💳 Billing / Refund
💰
Billing Handler
Sonnet
Queries payment DB · cites refund policy
conf
94%
🐛 Technical / Bug
⚙️
Tech Handler
Sonnet + Tools
Pulls logs · asks for repro steps
conf
88%
❓ General / FAQ
📖
FAQ Handler
Haiku + RAG
FAQ vector search · concise answers
conf
79%
When to use
When inputs fall into clearly distinct categories that need different handling
Canonical example
Customer support: refund / technical / general → specialized agents
Tip
Small model (Haiku) for Router, right-sized model for Handlers → lower cost
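The routing shape can be sketched as classifier-then-dispatch. Here a keyword matcher stands in for the small router LLM, and the handlers are stub lambdas (both assumptions for illustration):

```python
def classify(ticket: str) -> str:
    """Stub intent classifier -- a lightweight router LLM in practice."""
    t = ticket.lower()
    if "refund" in t or "paid" in t:
        return "billing"
    if "error" in t or "doesn't work" in t:
        return "tech"
    return "faq"

# Each handler stays focused on its own domain.
HANDLERS = {
    "billing": lambda t: "billing: checked payment DB, cited refund policy",
    "tech":    lambda t: "tech: pulled logs, asked for repro steps",
    "faq":     lambda t: "faq: searched the FAQ index",
}

def route(ticket: str) -> str:
    return HANDLERS[classify(ticket)](ticket)
```

Because dispatch is a dict lookup, adding a new category means adding one handler entry plus one classifier label, without touching the other handlers.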
03
Parallelization
Parallel Sectioning · Voting

Split the task and run pieces in parallel, then aggregate. Sectioning = split into independent subtasks. Voting = run the same task N times and take the majority.

Parallel Execution Two variants: Sectioning / Voting
1. Sectioning Independent aspects Review different aspects in parallel, then merge
📄
PR Diff
input
+124 / -87 lines
🔒
Security Review
SQL injection, XSS, auth check
🚀
Performance Review
N+1, Big-O, memory leaks
🎨
Style Review
naming, lint, readability
📊
Merged Report
aggregated
Integrated issues per aspect
2. Voting Higher reliability Run the same task N times → majority vote
📝
Claim
fact-check
"Vaccines cause autism"
Run 1
🧠
FALSE · Wakefield paper retracted
Run 2
🧠
FALSE · Large CDC study
Run 3
🧠
UNCLEAR · insufficient info
Run 4
🧠
FALSE · meta-analysis result
🗳️
Vote
3/4 FALSE
Final: FALSE
confidence 75%
When to use
When subtasks are independent or you need higher reliability
Canonical example
Code review: security / performance / style inspected in parallel
Benefit
Lower latency + separate contexts let each aspect focus deeply
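The voting variant can be sketched with a thread pool and a majority count. The canned `VERDICTS` list stands in for four independent LLM runs (an assumption; real runs would sample the model with temperature):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

VERDICTS = ["FALSE", "FALSE", "UNCLEAR", "FALSE"]  # stand-ins for 4 model runs

def check_claim(run_id: int) -> str:
    return VERDICTS[run_id]  # stub: one fact-check run

def vote(n_runs: int = 4) -> tuple[str, float]:
    with ThreadPoolExecutor() as pool:  # runs are independent, so parallelize
        results = list(pool.map(check_claim, range(n_runs)))
    verdict, count = Counter(results).most_common(1)[0]
    return verdict, count / n_runs  # majority verdict + agreement as confidence
```

With the verdicts above, `vote()` yields FALSE at 3/4 agreement, matching the 75% confidence in the diagram.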
04
Orchestrator-Workers
Dynamic Advanced

A central Orchestrator LLM dynamically decomposes a task at runtime and delegates subtasks to workers. Best for problems where subtasks cannot be defined in advance.

🎯 Dynamic Decomposition Example: A Coding Agent Fixing a Bug
👤 "Fix the 500 error in the login API. Tests must pass."
📋 Shared Context
✓ auth.py:42 found
✓ test_auth.py loaded
~ edit applied
· pytest pending
🎯
Orchestrator LLM
plans & delegates
💭 "Find relevant files → read tests → find root cause → fix → verify"
1
🔍
Grep Worker
find
grep "login"
→ auth.py
2
📖
Read Worker
inspect
auth.py
test_auth.py
3
✏️
Edit Worker
patch
Add null check
4
⚙️
Bash Worker
verify
pytest -v
PR Created
done
tests: 23/23 pass
↻ On failure, Orchestrator re-plans
When to use
When subtasks are determined at runtime and can be parallelized
Canonical example
Claude Code sub-agents, research agents
Key distinction
Parallelization = fixed split. Orchestrator-Workers = dynamic split
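The dynamic-split idea can be sketched as: a planning step produces the worker sequence at runtime, and results accumulate in a shared context. `plan` is a stub for the orchestrator LLM, and the worker lambdas are illustrative assumptions:

```python
def plan(task: str) -> list[tuple[str, str]]:
    """Stub planner -- a real orchestrator LLM emits this dynamically."""
    return [("grep", "login"), ("read", "auth.py"),
            ("edit", "add null check"), ("bash", "pytest -v")]

WORKERS = {
    "grep": lambda arg: f"found auth.py via grep {arg!r}",
    "read": lambda arg: f"read {arg}",
    "edit": lambda arg: f"applied patch: {arg}",
    "bash": lambda arg: f"ran {arg}: 23/23 pass",
}

def orchestrate(task: str) -> list[str]:
    shared_context: list[str] = []  # results visible to later steps
    for worker, arg in plan(task):
        shared_context.append(WORKERS[worker](arg))
    return shared_context
```

The contrast with Parallelization lives entirely in `plan`: a fixed split would hard-code that list, while an orchestrator regenerates it (and can re-plan on failure).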
04+
Multi-agent Topologies — Orchestrator extension
LangGraph Extension

Orchestrator-Workers is one form of the Supervisor topology. In practice, three control structures exist: Supervisor (central coordination) · Swarm (peer handoff) · Hierarchical (multi-layer).

🕸️ Multi-agent Control Topologies LangGraph's standard 3 categories
🎯 Supervisor Centralized
🎯
Super
👷
W1
👷
W2
👷
W3
A central supervisor decomposes tasks, assigns each to a worker, and merges results.
Use when: clear role division · central oversight needed · result aggregation
🐝 Swarm Peer handoff
🤖
A
🤖
B
🤖
C
🤖
D
No central coordinator — agents hand off control to each other ("you take it") when specialization is needed.
Use when: exploratory tasks · next step depends on interim results · dynamic collaboration
🏛️ Hierarchical Multi-layer
🎯
Top
🎯
A
🎯
B
👷
👷
👷
👷
Multiple layers of supervisors: the top layer orchestrates teams while lower layers execute. Scales to large systems.
Use when: large multi-domain systems · team-level specialization · separation of concerns
Supervisor
Our Orchestrator-Workers. Coding agents, Claude Code sub-agents
Swarm
Customer support multi-specialist, multi-role RPG, OpenAI Swarm SDK
Hierarchical
Enterprise agent platforms (accounting / legal / CX teams running in parallel)
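The Swarm topology's peer handoff can be sketched without any central coordinator: each agent either finishes or names the peer that should take over. The agent functions below are stubs for illustration, not any SDK's API:

```python
def billing_agent(msg: str) -> tuple[str, str]:
    if "legal" in msg:
        return ("handoff", "legal")  # "you take it" -- peer control transfer
    return ("done", "billing resolved")

def legal_agent(msg: str) -> tuple[str, str]:
    return ("done", "legal resolved")

AGENTS = {"billing": billing_agent, "legal": legal_agent}

def swarm(msg: str, start: str = "billing", max_hops: int = 5) -> str:
    agent = start
    for _ in range(max_hops):  # hop cap prevents handoff ping-pong
        status, payload = AGENTS[agent](msg)
        if status == "done":
            return f"{agent}: {payload}"
        agent = payload
    raise RuntimeError("handoff loop exceeded max_hops")
```

A Supervisor version would replace the handoff with a central loop that picks the next agent itself; that single difference is why swarms are harder to debug.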
05
Evaluator-Optimizer
Iterative Feedback Loop

One LLM generates (Generator), another provides feedback (Evaluator). Iterate until quality criteria are met.

🔄 Generator ⇄ Evaluator Loop Example: EN→KO Technical Document Translation
✍️
Generator
Translator LLM
Re-translate with feedback
EN: "This async function..."
KO: "이 비동기 함수는..."
draft feedback
🔍
Evaluator
Rubric-based
RUBRIC
☐ Terminology accuracy
☐ Tone consistency
☐ Natural Korean
☐ Meaning preservation
Iteration — Quality Score Progression
v1
54%
Term mismatch · uneven tone
v2
78%
Terms fixed · paragraph 3 awkward
v3
96% ✓
Passed · accepted as final
When to use
No single correct answer, but quality criteria are clear
Canonical example
Translation, essay refinement, code refactoring
Caveat
Cap max iterations (e.g. 3) to avoid infinite loops
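The Generator ⇄ Evaluator loop, with the iteration cap the caveat calls for, can be sketched as follows. `generate` and `evaluate` are stubs (assumptions); the scripted scores mimic the 54 → 78 → 96 progression above:

```python
SCORES = iter([54, 78, 96])  # stand-in for rubric-based scoring runs

def generate(source: str, feedback: str) -> str:
    """Stub generator -- would re-translate incorporating the feedback."""
    return f"translation of [{source}] addressing: {feedback or 'n/a'}"

def evaluate(draft: str) -> tuple[int, str]:
    """Stub evaluator -- returns a rubric score plus feedback text."""
    score = next(SCORES)
    return score, "" if score >= 90 else "fix terminology and tone"

def refine_loop(source: str, threshold: int = 90, max_iters: int = 3) -> tuple[str, int]:
    feedback = ""
    for _ in range(max_iters):  # hard cap: no infinite generate/critique loops
        draft = generate(source, feedback)
        score, feedback = evaluate(draft)
        if score >= threshold:
            return draft, score
    return draft, score  # best effort once the cap is hit
```

Returning the last draft after the cap, rather than raising, is a judgment call: for translation a 78% draft is usually more useful than an error.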
06
Autonomous Agent
Autonomous High Complexity

The LLM decides on its own which tools to use and what path to take; feedback from the environment shapes its plan until a termination condition is met. Flexible, but expensive and risky.

🤖 ReAct Loop · Think-Act-Observe Example: Research Agent
👤 "Summarize top 3 LLMs from 2026 benchmarks in a table"
1
🧠
Reason
Think
"Find benchmark sites first"
2
⚙️
Act
Tool call
web_search("LLM bench 2026")
3
👁️
Observe
Env feedback
10 results · top 3 found
4
💾
Update
Memory
Save step 4
↻ loop · until goal met (step 4 / max 20)
📦 Available Tool Pool
🔍web_search 📄fetch_page 📊create_table 💾save_file 🧮calculator 🐍python_exec
✅ Success Exit
Goal reached → emit final answer
🙋 Human Check-in
Stuck · before critical decisions
⛔ Fail-safe
Max steps · cost budget exceeded
When to use
Path is completely open and autonomy is a core value
Canonical example
Claude Code, Computer Use, research agents
Required guardrails
Max steps · cost budget · human check-in · test sandbox
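The ReAct loop plus two of the required guardrails (step cap, cost budget) can be sketched as below. The scripted `SCRIPT` list stands in for model decisions, and the per-step cost is an invented constant, both assumptions for illustration:

```python
SCRIPT = [
    ("act", "web_search('LLM bench 2026')"),   # Reason -> Act
    ("act", "create_table(top3)"),             # Act again after observing
    ("finish", "table of top 3 LLMs"),         # termination condition met
]

def agent(goal: str, max_steps: int = 20, budget: float = 1.0):
    cost, trace = 0.0, []
    for step in range(max_steps):              # guardrail 1: max steps
        cost += 0.05                           # assumed per-step cost
        if cost > budget:                      # guardrail 2: cost budget
            return "fail-safe: budget exceeded", trace
        kind, payload = SCRIPT[step]
        if kind == "finish":
            return payload, trace              # success exit
        trace.append(payload)                  # observe + update memory
    return "fail-safe: max steps exceeded", trace
```

A production loop would add the other two guardrails the list above names: a human check-in when stuck and a sandbox around tool execution.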
07
Human-in-the-Loop — Pause · Approve · Resume
Production-grade Safety

The agent pauses before critical decisions, waits for a human to review / approve / edit, and then resumes. The #1 production pattern for balancing autonomy and safety. LangGraph implements this as checkpoint + interrupt.

🙋 Pause → Human Review → Resume Example: Approve before an External API Call
🤖
Agent
running
"Ready to send email to 50 people"
⏸️
Pause
state saved
checkpoint #42
👤
Human Review
Review decision
can edit
✓ Approve ✎ Edit ✗ Reject
▶️
Resume
continue
Apply decision & continue
When to pause: 💸 Payment · purchase 📧 External send 🗑️ Destructive op ⚖️ Legal decision 🔐 Permission change 🤔 Low confidence
When to use
Any situation where the cost of a mistake is large and irreversible
Canonical example
Merging a PR · wire transfer · sending email · booking · deleting resources
Implementation
State serialization (checkpoint) · async wait · notifications · resumable message queue

Which Pattern to Use?

Rule of Thumb

Patterns near the top of the tree are simpler, cheaper, and more predictable; patterns near the bottom are more complex and expensive, but more capable. Always start at the top and descend only when you hit real limits.

🌲 Pattern Selection Tree Answer each question top-down
Q1 Is a single LLM call enough?
YES NO
Single LLM + RAG
simplest
Most cases end here
Q2 Predictable workflow?
YES · fixed NO · dynamic
Q3 Sequential?
📝
Prompt Chaining
if YES
🧭
Routing
if branching
Parallelization
if parallel
🔄
Evaluator-Optimizer
if iterative
Q4 Autonomy level?
🎯
Orchestrator-Workers
dynamic split
Subtasks decided at runtime
🤖
Autonomous Agent
fully autonomous
LLM decides the path itself

Cost & Latency Comparison

| Pattern | Complexity | Cost | Latency | Predictability |
| --- | --- | --- | --- | --- |
| Single LLM | ⭐ | 💰 | ⚡⚡⚡ | ✅✅✅ |
| Prompt Chaining | ⭐⭐ | 💰💰 | ⚡⚡ | ✅✅✅ |
| Routing | ⭐⭐ | 💰💰 | ⚡⚡ | ✅✅ |
| Parallelization | ⭐⭐ | 💰💰💰 | ⚡⚡⚡ | ✅✅ |
| Orchestrator-Workers | ⭐⭐⭐ | 💰💰💰 | — | — |
| Evaluator-Optimizer | ⭐⭐⭐ | 💰💰💰 | — | ✅✅ |
| Autonomous Agent | ⭐⭐⭐⭐ | 💰💰💰💰 | — | — |
| Human-in-the-Loop | ⭐⭐⭐ | 💰💰 | 🕒 (await human) | ✅✅✅ |

Choosing a Multi-agent Topology

When scaling Orchestrator-Workers, pick one of three structures — based on scale and level of autonomy.

| Topology | Control | Autonomy | Debugging | Scale |
| --- | --- | --- | --- | --- |
| 🎯 Supervisor | Centralized | Low | Easy | Small · Mid |
| 🐝 Swarm | Peer handoff | High | Hard | Mid |
| 🏛️ Hierarchical | Multi-layer | Medium | Medium | Large |

Real-world Use Case Mapping

Prompt Chaining
Marketing Copy Generation
Brief → outline → draft → edit → translate. Each stage has a quality gate.
Routing
Customer Support Automation
Classify refund / tech / general tickets and route. Haiku for Router, Sonnet for Handlers.
Parallelization
Automated Code Review
Run security / performance / style reviewers in parallel; merge into one report. Specialized prompts per aspect.
Orchestrator-Workers
Coding Agent
Like Claude Code: dynamically spawn Grep/Read/Edit/Bash workers at runtime.
Evaluator-Optimizer
High-quality Translation
Generator translates; Evaluator scores terminology, tone, naturalness. Max 3 iterations.
Autonomous Agent
Browser Automation
Computer Use: give a goal and let it click, type, and observe autonomously.
🐝 Swarm Topology
Multi-role Customer Support
Billing → Tech → Legal agents hand off to each other autonomously. No central supervisor.
🏛️ Hierarchical Topology
Enterprise Agent Platform
Top supervisor → accounting/legal/CX team supervisors → team workers. Large-scale separation of concerns.
Human-in-the-Loop
Payment & Send Approval
Agent pauses just before sending 50 emails → human reviews → resume. For anywhere the cost of a mistake is large.

Hybrid Combinations

In practice, combinations are more common than single patterns:

  • Router → Orchestrator: route by input type, then decompose dynamically
  • Chain + Evaluator: quality gate at every chain step
  • Agent + Parallelization: agent invokes parallel sub-agents
  • Agent + HITL: run autonomously, but require approval right before irreversible ops — the production standard
  • Hierarchical + Swarm: hierarchical at the top, swarm within each team — enterprise-scale multi-agent
  • Orchestrator + Evaluator: evaluator verifies worker results; re-delegates on failure
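The Agent + HITL combination above, the production standard, can be sketched as a tool-dispatch wrapper: the agent runs autonomously, but any tool on an irreversible list must clear an approver first. The tool names and the approver callback are illustrative assumptions:

```python
IRREVERSIBLE = {"send_email", "delete_resource", "make_payment"}

def run_tool(name: str, approver) -> str:
    """Dispatch a tool call; pause for approval before irreversible ops."""
    if name in IRREVERSIBLE and not approver(name):
        return f"{name}: rejected by human"
    return f"{name}: executed"

# An agent loop would call run_tool freely; only listed ops hit the human.
auto_approve = lambda name: True
log = [run_tool(t, auto_approve) for t in ["web_search", "send_email"]]
```

Keeping the irreversible list in one place, rather than scattering approval checks through agent logic, is what makes the intervention points designable up front.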

Design Principles

🎯 Start simple. If a single LLM call solves it, stop there. Every layer of pattern makes debugging harder.

📏 Measure before adding complexity. Only add complexity when it yields a measurable performance gain.

🧱 Augmented LLM is the building block. Every pattern is a question of how to compose "LLM + Tools + Memory + Retrieval".

🚨 Autonomous agents require guardrails. Max steps, cost budget, human check-in, sandbox — all four, no exceptions.

💡 Transparency is trust. Log and show what the agent is doing and why. Black boxes don't survive production.

🔄 Evaluation-driven development. Build the eval set before the agent. Without it you can't detect improvement or regression.

🧠 Context Engineering. Deciding what goes into the context window is the biggest bottleneck for long-horizon agents. Design what memory to compress, summarize, or discard. (Harrison Chase, 2026)

🙋 HITL for irreversible actions. Payment · send · delete · permission change — autonomy is tempting, but Pause → Approve → Resume is the production standard. Design the intervention points up front.

💾 State persistence = an agent's undo. Saving every step with checkpoints lets you rewind on failure and time-travel during debugging.