Eight core patterns for designing production LLM systems.
When to use which, how to combine them, and what their failure modes are.
Workflows orchestrate LLMs along predefined paths. Predictable and easy to debug.
Agents let the LLM decide tools and paths by itself. Flexible, but more expensive and error-prone.
Most problems are solved with a single LLM call + RAG + a well-written prompt. Add complexity only when there is a measurable performance gain. This guide is based on Anthropic's "Building effective agents" taxonomy.
The smallest unit of an agentic system: LLM + Tools + Memory + Retrieval. Every pattern below reduces to the question of how to compose multiple of these blocks.
Break a task into sequential steps, where each LLM call's output becomes the next call's input. Insert gates (validators) between steps to halt or retry on failure.
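The chain-with-gates flow can be sketched as below. This is a minimal illustration, not a library API: `llm` is a hypothetical stub standing in for a real model call, and `gate_nonempty` is a deliberately trivial validator.

```python
from typing import Callable

def llm(prompt: str) -> str:
    # Hypothetical stub standing in for a real model call.
    if prompt.startswith("Outline:"):
        return "1. intro 2. body 3. conclusion"
    return f"Draft based on [{prompt}]"

def gate_nonempty(text: str) -> bool:
    # Validator gate: halt the chain if a step produced nothing usable.
    return bool(text.strip())

def chain(task: str, steps: list[Callable[[str], str]],
          gate: Callable[[str], bool]) -> str:
    out = task
    for step in steps:
        out = step(out)          # each step consumes the previous output
        if not gate(out):
            raise ValueError("gate failed: halt or retry this step")
    return out

result = chain(
    "Write a post about RAG",
    [lambda t: llm("Outline: " + t),
     lambda outline: llm("Expand: " + outline)],
    gate_nonempty,
)
```

In production the gate would check schema validity, length limits, or policy compliance rather than mere non-emptiness.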
Classify the input and route it to the right specialized handler. Each handler stays focused on its domain, so quality goes up.
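Routing reduces to a classifier feeding a handler table. In this sketch the classifier is a hypothetical keyword stub; in practice it would itself be an LLM call returning a category label.

```python
def classify(query: str) -> str:
    # Hypothetical stand-in for an LLM classification call.
    if "refund" in query.lower():
        return "billing"
    if "password" in query.lower():
        return "account"
    return "general"

# Each handler stays focused on its own domain.
HANDLERS = {
    "billing": lambda q: f"[billing specialist] {q}",
    "account": lambda q: f"[account specialist] {q}",
    "general": lambda q: f"[generalist] {q}",
}

def route(query: str) -> str:
    return HANDLERS[classify(query)](query)
```

A fallback category (`general` here) matters: misrouted queries should degrade gracefully, not crash.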
Split the task into pieces and run them in parallel, then aggregate. Sectioning = split into independent subtasks. Voting = run the same task N times and take the majority.
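The voting variant can be sketched with a thread pool and a counter. `llm_vote` is a hypothetical deterministic stub; a real implementation would call the model N times at temperature > 0.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def llm_vote(prompt: str, seed: int) -> str:
    # Hypothetical stub: real calls would sample the model independently.
    return "safe" if seed % 3 else "unsafe"

def majority_vote(prompt: str, n: int = 5) -> str:
    # Run the same task n times in parallel, then take the majority answer.
    with ThreadPoolExecutor() as pool:
        votes = list(pool.map(lambda i: llm_vote(prompt, i), range(n)))
    return Counter(votes).most_common(1)[0][0]
```

Use an odd N to avoid ties, and remember that voting multiplies cost by N while latency stays near a single call.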
A central Orchestrator LLM dynamically decomposes a task at runtime and delegates subtasks to workers. Best for problems where subtasks cannot be defined in advance.
Orchestrator-Workers is one form of the Supervisor topology. In practice, three control structures exist: Supervisor (central coordination) · Swarm (peer handoff) · Hierarchical (multi-layer).
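The Supervisor-style orchestrator loop can be sketched in three phases: decompose, delegate, synthesize. Both `orchestrator_plan` and `worker` are hypothetical stubs for what would be separate LLM calls.

```python
def orchestrator_plan(task: str) -> list[str]:
    # Hypothetical stand-in: a real orchestrator LLM emits subtasks at runtime,
    # so the plan can differ per input rather than being fixed in code.
    return [f"research: {task}", f"draft: {task}", f"review: {task}"]

def worker(subtask: str) -> str:
    # Hypothetical worker LLM call, focused on a single subtask.
    return f"done({subtask})"

def orchestrate(task: str) -> str:
    subtasks = orchestrator_plan(task)       # 1. decompose dynamically
    results = [worker(s) for s in subtasks]  # 2. delegate to workers
    return " | ".join(results)               # 3. synthesize the results
```

The key difference from prompt chaining is that the subtask list is produced at runtime, not hard-coded in advance.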
One LLM generates (Generator), another provides feedback (Evaluator). Iterate until quality criteria are met.
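The generate-evaluate loop can be sketched as follows. `generate` and `evaluate` are hypothetical stubs for two separate LLM roles; the loop feeds the evaluator's feedback back into the generator until the criteria pass or the round budget runs out.

```python
def generate(prompt: str, feedback: str) -> str:
    # Hypothetical generator stub: revises the draft when given feedback.
    return prompt + (" [revised]" if feedback else "")

def evaluate(draft: str) -> tuple[bool, str]:
    # Hypothetical evaluator stub: (passed, feedback).
    if "[revised]" in draft:
        return True, ""
    return False, "needs revision"

def refine(prompt: str, max_rounds: int = 3) -> str:
    feedback = ""
    for _ in range(max_rounds):        # bounded iteration: never loop forever
        draft = generate(prompt, feedback)
        ok, feedback = evaluate(draft)
        if ok:
            return draft
    return draft                       # best effort after budget exhausted
```

The round cap matters: without it, a strict evaluator and a weak generator can loop indefinitely.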
The LLM decides on its own which tools to use and what path to take. Feedback from the environment shapes its plan until a termination condition. Flexible but expensive and risky.
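The core agent loop is: ask the model for an action, execute the tool, feed the result back, repeat until the model finishes or a hard step cap fires. Everything below is a hypothetical sketch, including the `FINISH`/`CALL` action format.

```python
def agent_llm(history: list[str]) -> str:
    # Hypothetical stub deciding the next action from the transcript.
    if any("result:" in h for h in history):
        return "FINISH: answer ready"
    return "CALL search"

TOOLS = {"search": lambda: "result: 3 documents found"}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = [task]
    for _ in range(max_steps):            # hard step cap: guaranteed termination
        action = agent_llm(history)
        if action.startswith("FINISH"):
            return action
        tool = action.split()[1]
        history.append(TOOLS[tool]())     # environment feedback shapes the plan
    return "FINISH: step budget exhausted"
```

Note the two termination conditions: the model's own FINISH signal, and the step cap that backstops it when the model never converges.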
The agent pauses before critical decisions, waits for a human to review / approve / edit, and then resumes. The #1 production pattern for balancing autonomy and safety. LangGraph implements this as checkpoint + interrupt.
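The pause → approve → resume shape can be sketched framework-free by injecting the approval step as a callback (LangGraph's checkpoint + interrupt mechanism serves the same role; this is not its API). `propose_action` is a hypothetical stub.

```python
def propose_action(task: str) -> dict:
    # Hypothetical stub for the agent's proposed critical action.
    return {"action": "send_email", "to": "customer@example.com"}

def run_with_approval(task: str, approve) -> str:
    action = propose_action(task)
    # Pause before the irreversible step and hand control to a human.
    # `approve` could block on a review UI; here it is a plain callback.
    if not approve(action):
        return "aborted by reviewer"
    return f"executed {action['action']}"
```

In a real system the pause persists the agent's state so the human can respond hours later and the run resumes from the checkpoint, not from scratch.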
The higher you go, the simpler · cheaper · more predictable. The lower you go, the more complex · expensive · but more capable. Always try from the top, and only descend when you hit real limits.
| Pattern | Complexity | Cost | Latency | Predictability |
|---|---|---|---|---|
| Single LLM | ⭐ | 💰 | ⚡⚡⚡ | ✅✅✅ |
| Prompt Chaining | ⭐⭐ | 💰💰 | ⚡⚡ | ✅✅✅ |
| Routing | ⭐⭐ | 💰💰 | ⚡⚡ | ✅✅ |
| Parallelization | ⭐⭐ | 💰💰💰 | ⚡⚡⚡ | ✅✅ |
| Orchestrator-Workers | ⭐⭐⭐ | 💰💰💰 | ⚡ | ✅ |
| Evaluator-Optimizer | ⭐⭐⭐ | 💰💰💰 | ⚡ | ✅✅ |
| Autonomous Agent | ⭐⭐⭐⭐ | 💰💰💰💰 | ⚡ | ❌ |
| Human-in-the-Loop | ⭐⭐⭐ | 💰💰 | 🕒 (await human) | ✅✅✅ |
When scaling Orchestrator-Workers, pick one of three structures — based on scale and level of autonomy.
| Topology | Control | Autonomy | Debugging | Scale |
|---|---|---|---|---|
| 🎯 Supervisor | Centralized | Low | Easy | Small · Mid |
| 🐝 Swarm | Peer handoff | High | Hard | Mid |
| 🏛️ Hierarchical | Multi-layer | Medium | Medium | Large |
In practice, combinations of patterns are more common than any single pattern on its own.
🎯 Start simple. If a single LLM call solves it, stop there. Every layer of pattern makes debugging harder.
📏 Measure before adding complexity. Only add complexity when it yields a measurable performance gain.
🧱 Augmented LLM is the building block. Every pattern is a question of how to compose "LLM + Tools + Memory + Retrieval".
🚨 Autonomous agents require guardrails. Max steps, cost budget, human check-in, sandbox — all four, no exceptions.
💡 Transparency is trust. Log and show what the agent is doing and why. Black boxes don't survive production.
🔄 Evaluation-driven development. Build the eval set before the agent. Without it you can't detect improvement or regression.
🧠 Context Engineering. Deciding what goes into the context window is the biggest bottleneck for long-horizon agents. Design what memory to compress, summarize, or discard. (Harrison Chase, 2026)
🙋 HITL for irreversible actions. Payment · send · delete · permission change — autonomy is tempting, but Pause → Approve → Resume is the production standard. Design the intervention points up front.
💾 State persistence = an agent's undo. Saving every step with checkpoints lets you rewind on failure and time-travel during debugging.
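The checkpoint-and-rewind idea can be sketched with an in-memory store (an assumption for brevity; production systems persist checkpoints to a database, as LangGraph's checkpointer does).

```python
import copy

class CheckpointedAgent:
    # Minimal sketch: snapshot state before every step so any step can be undone.
    def __init__(self, state: dict):
        self.state = state
        self.checkpoints: list[dict] = []

    def step(self, update: dict) -> None:
        # Save a deep copy before mutating, so rewinds are exact.
        self.checkpoints.append(copy.deepcopy(self.state))
        self.state.update(update)

    def rewind(self, n: int = 1) -> None:
        # Time-travel: restore the state from n steps ago.
        for _ in range(n):
            self.state = self.checkpoints.pop()

agent = CheckpointedAgent({"step": 0})
agent.step({"step": 1})
agent.step({"step": 2, "tool": "search"})
agent.rewind()  # undo the last step
```

The deep copy is the important detail: checkpointing a shared mutable reference would let later steps silently corrupt earlier snapshots.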