Foundations · Updated July 3, 2026
AI agents are classified in two ways. The classic computer-science taxonomy has five types, ordered by decision-making sophistication: simple reflex, model-based reflex, goal-based, utility-based, and learning agents. The practical 2026 taxonomy groups LLM-powered agents by autonomy (copilot vs. autonomous), domain (coding, sales, support, research, general), and architecture (single-agent vs. multi-agent).
| Type | How it decides | Everyday example |
|---|---|---|
| Simple reflex agent | Fixed condition-action rules on the current input only | A thermostat; a spam filter keyed on exact rules |
| Model-based reflex agent | Rules plus an internal model of state it can't directly see | A robot vacuum mapping rooms it has already cleaned |
| Goal-based agent | Searches and plans action sequences that reach an explicit goal | GPS navigation planning a route to a destination |
| Utility-based agent | Scores possible outcomes and picks the highest-utility one | A trading system weighing risk against expected return |
| Learning agent | Improves its own decision-making from feedback over time | A recommendation system that adapts to your behavior |
The textbook taxonomy — from Russell and Norvig's standard AI text — orders agents by how sophisticated their decision-making is. A simple reflex agent maps what it perceives right now to an action using fixed rules; it has no memory. A model-based reflex agent adds an internal model of the world, so it can act on state it can't directly observe. A goal-based agent plans: it searches possible action sequences for one that reaches an explicit goal. A utility-based agent goes further, scoring outcomes on a utility function so it can trade off competing goals rather than just satisfying one. A learning agent wraps any of these with a feedback loop that improves its own performance over time.
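The first three rungs of that ladder can be sketched in a few lines of Python. This is an illustrative toy, not code from any product in this index: the thermostat setpoint, the room names, and the `roads` graph are all made-up assumptions, and breadth-first search stands in for whatever planner a real goal-based agent would use.

```python
from collections import deque

def simple_reflex_thermostat(temp_c, setpoint_c=20.0):
    """Simple reflex: the action depends only on the current reading."""
    return "heat_on" if temp_c < setpoint_c else "heat_off"

class ModelBasedVacuum:
    """Model-based reflex: rules plus remembered state the sensors can't show."""

    def __init__(self):
        self.cleaned = set()  # internal model: rooms already cleaned

    def act(self, room):
        if room in self.cleaned:
            return "skip"
        self.cleaned.add(room)
        return "clean"

def goal_based_route(graph, start, goal):
    """Goal-based: search for an action sequence that reaches the goal (BFS)."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # no route exists

# Hypothetical road map used only for this example.
roads = {"home": ["a", "b"], "a": ["office"], "b": ["c"], "c": ["office"]}

vacuum = ModelBasedVacuum()
print(simple_reflex_thermostat(18.0))            # heat_on
print(vacuum.act("kitchen"), vacuum.act("kitchen"))  # clean skip
print(goal_based_route(roads, "home", "office"))  # ['home', 'a', 'office']
```

The thermostat would give the same answer no matter what happened a minute ago, the vacuum's answer depends on its own history, and the route planner looks ahead over sequences of moves before committing to one.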
This ladder still matters because it describes capability, not technology. When you evaluate any modern agent — even an LLM-powered one — you're implicitly asking where it sits: does it just react (reflex), does it track state (model-based), does it plan toward goals, does it weigh trade-offs (utility), and does it improve with use (learning)? Most production LLM agents in this index are goal-based agents with learning components: they plan multi-step tasks toward an explicit objective, and the best ones incorporate feedback from tests, replies, or corrections.
For anyone choosing an agent today, three practical axes matter more than the textbook five. The first is autonomy: copilot-style agents (Cursor, GitHub Copilot) work under continuous human supervision, while autonomous agents (Devin, Claude Code, Manus) take a whole task and return a finished result. Neither is 'better' — supervision suits high-stakes, ambiguous work; autonomy suits well-scoped, verifiable work.
The second axis is domain. Production agents in 2026 are specialists: coding agents (Claude Code, Devin, Cursor), sales agents and AI SDRs (Clay, Ava by Artisan, AiSDR), customer-support agents (Intercom Fin, Sierra, Decagon), research agents (Elicit, GPT Researcher), website builders (Lovable, v0, Bolt), and general-purpose agents (Manus, ChatGPT agent).
The third axis is architecture: single-agent systems put one model in a tool loop, while multi-agent systems (Atoms) divide a task across specialized planner, executor, and critic roles. Multi-agent designs can outperform single agents on complex, separable work, but they cost more tokens and add coordination failure modes.
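To make "one model in a tool loop" concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `scripted_policy` is a hard-coded stand-in for a model call, `word_count` is a toy tool, and no real agent framework or vendor API is being reproduced.

```python
def run_single_agent(policy, tools, task, max_steps=10):
    """One policy in a loop: choose a tool, run it, feed the observation back."""
    history = [("task", task)]
    for _ in range(max_steps):
        action, arg = policy(history)
        if action == "finish":
            return arg, history
        observation = tools[action](arg)
        history.append((action, observation))
    raise RuntimeError("step budget exhausted")

def scripted_policy(history):
    """Stand-in for an LLM call: a fixed rule over what has been observed so far."""
    last_kind, last_value = history[-1]
    if last_kind == "task":
        return "word_count", last_value
    return "finish", f"The task has {last_value} words."

# Hypothetical tool registry with a single toy tool.
tools = {"word_count": lambda text: len(text.split())}

answer, trace = run_single_agent(
    scripted_policy, tools, "summarize the quarterly report"
)
print(answer)  # The task has 4 words.
```

The step budget is the important design choice: without `max_steps`, a policy that never returns `finish` would loop forever.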
Match the type to the task's stakes and shape. For reversible, well-scoped digital work — refactors with tests, list enrichment, report drafting — an autonomous goal-based agent gives the most leverage. For work that's costly to get wrong or hard to specify — architecture decisions, sensitive customer moments, financial actions — choose a supervised copilot or an agent with explicit human checkpoints, like Claude Code's permission prompts or Robinhood's guard-railed trading account.
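An explicit human checkpoint can be as simple as a gate in front of irreversible actions. The sketch below shows the general pattern only; it is not how Claude Code's permission prompts or any other product's guardrails are implemented, and the action names and helper functions are hypothetical.

```python
# Hypothetical action names; a real deployment would define its own list.
IRREVERSIBLE = {"delete_branch", "send_payment"}

def run_with_checkpoint(action, execute, approve):
    """Execute reversible actions directly; ask a human before irreversible ones."""
    if action in IRREVERSIBLE and not approve(action):
        return f"blocked: {action}"
    return execute(action)

def execute(action):
    return f"done: {action}"  # stand-in for a real tool call

def deny_all(action):
    return False  # stand-in for a human reviewer who rejects the request

def approve_all(action):
    return True  # stand-in for a human reviewer who accepts the request

print(run_with_checkpoint("run_tests", execute, deny_all))     # done: run_tests
print(run_with_checkpoint("send_payment", execute, deny_all))  # blocked: send_payment
```

Reversible work never waits on a person, so the agent keeps its leverage on the well-scoped tasks while the costly actions stay supervised.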
Domain fit usually decides more than raw capability: a mid-tier support agent wired into your helpdesk and policies beats a frontier general agent that isn't. And be skeptical of 'learning agent' marketing — most current agents improve between model releases, not continuously in deployment; genuine in-deployment learning (like an AI SDR adapting to reply data) is worth verifying before you pay for it.
Real, verified agents from our index that illustrate the concept above.
Terminal-native autonomous coding agent from Anthropic
The autonomous AI software engineer you assign tickets to
AI-first code editor with a powerful built-in agent mode
General AI agent that plans and executes whole tasks in the cloud
A team of AI agents that builds and ships full apps from a prompt
The market-leading AI support agent, priced per resolution
AI research agents over 100+ data sources for outbound
The classic taxonomy: simple reflex agents (fixed rules, no memory), model-based reflex agents (rules plus an internal world model), goal-based agents (plan actions toward an explicit goal), utility-based agents (score outcomes and pick the best trade-off), and learning agents (improve from feedback over time).
ChatGPT's base chat product is not an agent: it responds rather than acts. ChatGPT's agent mode, however, is a goal-based agent: given a task, it plans steps and uses a virtual computer to browse, fill forms, and produce results, with checkpoints before consequential actions.
Devin and Claude Code are both goal-based autonomous agents with learning elements: they plan multi-step work toward an explicit objective (a passing test suite, a completed ticket), act through tools, and adjust from feedback like failing tests. Devin adds a fire-and-forget delegation model; Claude Code is terminal-native.
A single-agent system puts one model in a loop with tools. A multi-agent system divides the task across specialized roles — planner, executors, critic — that coordinate. Multi-agent setups can handle complex, separable work better but multiply token costs and add coordination failure modes.
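A toy version of the planner, executor, and critic split shows where the extra cost comes from. The three role functions below are hypothetical stand-ins for separate model calls, and the `calls` counter is only a rough proxy for token spend.

```python
def planner(task):
    """Split a task into independent subtasks (here: comma-separated items)."""
    return [part.strip() for part in task.split(",") if part.strip()]

def executor(subtask):
    """Do the work for one subtask (here: a trivial transformation)."""
    return subtask.upper()

def critic(subtask, result):
    """Accept a result only if it passes a simple check against the subtask."""
    return len(result) == len(subtask) and result.isupper()

def run_multi_agent(task):
    """Coordinate the roles and count how many role invocations were needed."""
    calls = 1  # one planner call
    results = []
    for subtask in planner(task):
        result = executor(subtask)
        calls += 1
        accepted = critic(subtask, result)
        calls += 1
        if not accepted:
            raise ValueError(f"critic rejected result for {subtask!r}")
        results.append(result)
    return results, calls

results, calls = run_multi_agent("fix login, update docs")
print(results)  # ['FIX LOGIN', 'UPDATE DOCS']
print(calls)    # 5
```

Two subtasks take five role invocations here (one plan, two executions, two reviews), and every hand-off between roles is a place where a rejected or malformed result has to be handled.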
Yes, utility-based agents are in real use, most visibly in trading and recommendation systems, where the agent scores possible actions against a utility function (expected return vs. risk, engagement vs. diversity). Agentic trading tools like Composer encode this as rules-based strategies that weigh outcomes rather than chase a single goal.
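As a rough illustration of utility-based choice, the sketch below scores each option as expected return minus a risk penalty and picks the highest score. The linear utility function, the `risk_aversion` parameter, and the numbers are all assumptions made up for this example; this is not how Composer or any real trading system works.

```python
def utility(expected_return, risk, risk_aversion=0.5):
    """Score an outcome: reward expected return, penalize risk."""
    return expected_return - risk_aversion * risk

def choose(options, risk_aversion=0.5):
    """Pick the option whose (expected_return, risk) pair scores highest."""
    return max(options, key=lambda name: utility(*options[name], risk_aversion))

# Hypothetical options as (expected_return, risk) pairs.
options = {
    "bonds": (0.03, 0.01),
    "index_fund": (0.07, 0.08),
    "single_stock": (0.12, 0.30),
}

print(choose(options))                     # index_fund
print(choose(options, risk_aversion=0.0))  # single_stock
print(choose(options, risk_aversion=2.0))  # bonds
```

A goal-based agent would stop at the first option that meets a target return; the utility-based one changes its answer as the trade-off weight changes.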
A learning agent is one that improves its own decision-making from experience, not just one built on a trained model. True in-deployment learning is rarer than marketing suggests: most agents improve between model releases. Verify claims like 'learns from your replies' (some AI SDRs genuinely do) before paying for them.
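What in-deployment learning means mechanically can be shown with a very small sketch: an agent that keeps a running average of the reward each action has earned and shifts toward the better one. The action names and reply rates are invented for illustration, and real systems would need exploration and noisy feedback handling that this toy omits.

```python
class LearningAgent:
    """Keeps a running average reward per action and prefers the best one."""

    def __init__(self, actions):
        self.value = {a: 0.0 for a in actions}  # estimated reward per action
        self.count = {a: 0 for a in actions}    # times each action was tried

    def choose(self):
        # Try every action once, then exploit the best estimate so far.
        for action, n in self.count.items():
            if n == 0:
                return action
        return max(self.value, key=self.value.get)

    def learn(self, action, reward):
        # Incremental mean update: new = old + (reward - old) / n.
        self.count[action] += 1
        n = self.count[action]
        self.value[action] += (reward - self.value[action]) / n

# Hypothetical, fixed reply rates for two email styles.
reply_rate = {"short_email": 0.2, "long_email": 0.6}

agent = LearningAgent(["short_email", "long_email"])
for _ in range(6):
    action = agent.choose()
    agent.learn(action, reply_rate[action])

print(agent.choose())  # long_email
print(agent.count)     # {'short_email': 1, 'long_email': 5}
```

The test for a vendor's claim is the same as for this toy: the agent's behavior on your data should differ after feedback from what it was before, without a new model release in between.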
Start with a domain-specific, goal-based agent for a bounded, verifiable task: support ticket deflection, list enrichment, or well-scoped coding tickets. Keep a human approving irreversible actions. Broad autonomous generalists and multi-agent systems are step two, once you've built evaluation muscle.
Yes, the classic taxonomy still applies as a capability lens. Most LLM agents are goal-based (they plan toward objectives); the best add utility-style trade-off reasoning and learning loops. The taxonomy describes what an agent can decide, independent of whether it's built on rules, search, or a language model.