AI Agents Explained: What Actually Changed in 2026
A calm, plain-English explainer on AI agents: how they differ from chatbots, the core loop, real examples, why 2026 feels different, and honest limits.

A few weeks ago I asked an AI tool to fix a bug in one of my projects. I didn't paste the code. I just said: "The contact form isn't sending emails, figure out why." It opened the project, read through the files, found the place where the email service was being called, noticed the API key was missing, checked the configuration, wrote a fix, ran the tests, saw two of them fail, adjusted the fix, ran them again, and told me it was done. I watched it work for about four minutes. I typed one sentence.
That is the thing people mean when they say "AI agent." Not a smarter chatbot. Something that takes a goal and then does a sequence of things to reach it, checking its own work along the way. The difference sounds small. In practice it changes what these tools are for.
A chatbot answers. An agent acts.
Let me draw the line clearly, because the words get muddy.
A chatbot is a conversation. You send a message, it sends one back. ChatGPT in its simplest form, the thing most people met first, is this: a very good text predictor that responds to what you typed. It doesn't go anywhere. It doesn't check anything. It produces an answer based on what it already knows and then stops. If the answer is wrong, it has no way of finding out.
An agent is a loop. You give it a goal, and instead of answering once, it works in steps. It can decide what to do, do it, look at what happened, and decide what to do next. The key word is next. A chatbot has no "next." An agent's whole nature is that it keeps going until the job is done or it gives up.
If you've read How Large Language Models Actually Work, you already know the engine underneath both of these is the same: a model that predicts text. An agent is that same engine, wrapped in a system that lets it take actions and feed the results back in. The model didn't suddenly become a different creature. We built a harness around it.
The core loop: goal, plan, tool, observe, repeat
Almost every agent runs on one cycle. Once you see it, you'll spot it everywhere.
- Goal. You state what you want. "Find the cheapest flight to Nairobi next month." "Refactor this function." "Summarize what these twenty reviews are complaining about."
- Plan. The model breaks the goal into steps. Not a perfect plan, just a first move. "I should search for flights first."
- Tool use. This is the part that's new. The agent doesn't just think about searching, it actually calls a search tool, or runs code, or hits an API, or reads a file. It reaches out of the conversation and touches the real world.
- Observe. The tool returns a result. A list of flights. An error message. A test that failed. The agent reads it.
- Repeat. Based on what it observed, it decides the next step. Maybe the search came back empty, so it tries different dates. Maybe the code threw an error, so it fixes the error. It loops back to planning and goes again.
The loop ends when the agent decides the goal is met, or when it hits a limit you set. That's the whole architecture. Everything fancy is built on this simple spine.
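If it helps to see the loop as code, here is a minimal sketch in Python. It is an illustration, not how any particular product is built. The names `fake_model`, `search_flights`, and `max_steps` are made up for this example, the flight result is invented, and the "model" is a hard-coded function so the whole thing runs without any AI service.

```python
from typing import Callable

# Hypothetical tool: a real one would query a flight API.
def search_flights(dates: str) -> list[str]:
    return [] if dates == "June 1-7" else ["Example Air, $640"]

TOOLS: dict[str, Callable[[str], list[str]]] = {"search_flights": search_flights}

def fake_model(goal: str, history: list[dict]) -> dict:
    """Stand-in for the language model: picks the next step from what it has seen."""
    if not history:
        return {"action": "search_flights", "input": "June 1-7"}   # plan: a first move
    last = history[-1]
    if last["result"] == []:
        return {"action": "search_flights", "input": "June 8-14"}  # empty result: try other dates
    return {"action": "finish", "input": last["result"][0]}        # goal met

def run_agent(goal: str, max_steps: int = 5) -> str:
    history: list[dict] = []
    for _ in range(max_steps):                             # the limit you set
        step = fake_model(goal, history)                   # plan
        if step["action"] == "finish":
            return step["input"]
        result = TOOLS[step["action"]](step["input"])      # tool use
        history.append({"step": step, "result": result})   # observe
    return "gave up: step limit reached"

print(run_agent("Find the cheapest flight to Nairobi next month"))
```

The first search comes back empty, the second finds a flight, and the third pass through the loop finishes. Drop `max_steps` to 1 and the same agent gives up instead, which is the "limit you set" doing its job.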
What the tools actually are
When I say an agent "uses tools," I mean something concrete. A tool is a function the model is allowed to call. The developer defines a small menu: here is a search function, here is one that reads files, here is one that runs code, here is one that sends an email. The model, mid-thought, can say "call the search function with these words," and the system runs it for real and hands back the output.
This is the bridge between language and action. The model produces text; the harness around it watches for "I want to use this tool," runs the actual tool, and pastes the result back into the conversation so the model can read it. Round and round.
If you've read How APIs Actually Work, tools are basically the model calling APIs on its own. That's the unlock. A model that can call APIs can do almost anything a program can do.
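Here is one way to picture the harness side of that bridge, as a small Python sketch. The JSON shape (`"tool"` and `"args"`) is something I invented for illustration, and `read_file` and `add` are placeholder tools. Real systems each define their own request format.

```python
import json

# The "menu": hypothetical tools the developer chooses to expose.
def read_file(path: str) -> str:
    return f"(contents of {path})"   # placeholder; a real tool would open the file

def add(a: float, b: float) -> float:
    return a + b

TOOLS = {"read_file": read_file, "add": add}

def handle_model_output(text: str) -> str:
    """Run a tool if the model asked for one; return what gets pasted back."""
    try:
        request = json.loads(text)
    except json.JSONDecodeError:
        return text                  # ordinary prose, no tool requested
    if not isinstance(request, dict) or "tool" not in request:
        return text
    tool = TOOLS.get(request["tool"])
    if tool is None:                 # the model named a tool that isn't on the menu
        return f"error: unknown tool {request['tool']!r}"
    result = tool(**request.get("args", {}))
    return f"tool result: {result}"

print(handle_model_output('{"tool": "add", "args": {"a": 2, "b": 3}}'))
```

Notice that the model never runs anything itself. It only writes text, and the harness decides whether that text counts as a tool request and what to do about it.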
Real examples, not hype
Let me ground this in things that exist today.
Coding agents. The bug-fixing story I opened with. These read a codebase, write changes across many files, run tests, and iterate. They're the most mature category because the feedback is so clean: code either compiles and passes tests, or it doesn't. The agent gets an honest signal every loop.
Research agents. You ask a question that needs digging. The agent runs several searches, opens pages, reads them, notices gaps, runs more searches to fill them, and writes up an answer with sources. It's doing what a careful person does with twelve browser tabs, except it doesn't lose the thread.
Customer-support agents. A customer writes in with a problem. The agent looks up their order in the real system, checks the return policy, maybe issues a refund through an actual API, and replies. The leap from old chatbots is that it can change things, not just recite a help article.
Why 2026 feels different
People have been promising agents for years. Most early attempts were frustrating demos. So what changed? A few things stacked up at once.
The models got more reliable. Earlier models could plan a task but would lose the plot halfway through, or hallucinate a tool that didn't exist. Newer ones follow multi-step instructions and use tools far more dependably. The loop only works if each step is mostly trustworthy, and we crossed a threshold where it usually is.
Tool use became native. Instead of bolting tools on with brittle tricks, models are now trained to call them as a first-class skill. They're better at choosing the right tool and formatting the call correctly.
Context got longer. An agent has to hold the whole task in its head: the goal, what it's tried, what came back. Bigger context windows mean it can run more steps without forgetting the beginning.
Standard tool protocols arrived. This is quieter but important. Things like MCP (the Model Context Protocol) give tools a common shape, so an agent that speaks the protocol can plug into any tool that speaks it too, without custom wiring for each pairing. It's the difference between every appliance needing its own outlet and everyone agreeing on one plug. That standardization is what turns a demo into an ecosystem.
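To make "a common shape" concrete, here is a rough Python sketch of what that looks like on the wire, as I understand the public MCP specification: a tool is described by a name, a description, and a JSON Schema for its inputs, and a tool call is a JSON-RPC 2.0 request with the method `tools/call`. The `search_flights` tool and the `make_tool_call` helper are hypothetical, and the spec itself is the authority on exact field names.

```python
import json

# A tool description in the general shape an MCP server advertises.
# "search_flights" is a hypothetical tool used only for illustration.
tool_description = {
    "name": "search_flights",
    "description": "Search for flights to a destination",
    "inputSchema": {
        "type": "object",
        "properties": {"destination": {"type": "string"}},
        "required": ["destination"],
    },
}

def make_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request asking a server to run one tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

print(make_tool_call(1, tool_description["name"], {"destination": "Nairobi"}))
```

The point is less the exact fields than the fact that every tool is described and called the same way, so the agent doesn't need to learn a new dialect for each one.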
None of these is a miracle on its own. Together they moved agents from "impressive when it works" to "useful most of the time."
The honest limits
I'd be doing you a disservice if I stopped at the good part. Agents are genuinely useful and genuinely flawed, both at once.
They get stuck. An agent in a loop can keep trying the same broken approach, confidently, ten times in a row. It doesn't always recognize a dead end. A human would stop and rethink; an agent sometimes just keeps banging on the door.
Reliability isn't there yet. "Usually right" is wonderful for drafting and terrible for anything where being wrong is expensive. An agent that books the wrong flight, deletes the wrong file, or refunds the wrong customer creates real damage. The longer the chain of steps, the more chances for one bad step to derail everything.
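A bit of arithmetic shows why long chains are risky. This sketch assumes each step succeeds with the same probability and that failures are independent, which is a simplification: real agents can sometimes notice and repair a bad step. The 95% figure is an illustrative number, not a measurement of any model.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Chance that every step in a chain succeeds, if steps fail independently."""
    return per_step ** steps

for steps in (1, 5, 20, 50):
    print(steps, round(chain_success(0.95, steps), 2))
```

Under those assumptions, a step that works 95% of the time gives you a clean 20-step run only about 36% of the time. Small per-step error rates compound quickly.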
Cost adds up. Every loop is a fresh call to the model, and a task might take dozens. A single chatbot reply is cheap. An agent grinding through a hard problem for ten minutes can cost real money, and you don't always know the bill until it's done.
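Here is a rough way to see why the bill grows faster than the step count. The sketch assumes a common setup where each call re-sends the whole history so far and the history grows by a fixed amount per step. Every number in it, including the price, is made up for illustration, and it ignores output tokens and any caching discounts a provider might offer.

```python
def total_input_tokens(steps: int, base_tokens: int, added_per_step: int) -> int:
    """Tokens sent to the model over a whole run, when each call re-sends
    the full history and the history grows by a fixed amount per step."""
    return sum(base_tokens + i * added_per_step for i in range(steps))

# Hypothetical price, not any provider's real rate.
PRICE_PER_MILLION_TOKENS = 3.00   # dollars per million input tokens

for steps in (1, 10, 40):
    tokens = total_input_tokens(steps, base_tokens=2_000, added_per_step=1_500)
    cost = tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS
    print(steps, tokens, round(cost, 2))
```

With these made-up numbers, forty steps send 625 times as many tokens as one step, not forty times as many. That gap is why a long agent run can surprise you.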
They still need a human in the loop. The smart pattern right now isn't "let it run free." It's "let it propose, you approve." Give it the boring legwork, keep your hand near the off switch for anything that matters. I let coding agents write and test, but I read the changes before they ship.
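"Let it propose, you approve" can be as simple as a gate in front of the risky tools. This is a minimal Python sketch under my own assumptions: the split into `SAFE_TOOLS` and `RISKY_TOOLS` and the tool names are hypothetical, and the tools don't really run, they just report what would have happened.

```python
from typing import Callable

# Hypothetical split: which tools may run freely and which need a person's sign-off.
SAFE_TOOLS = {"read_file", "run_tests"}
RISKY_TOOLS = {"delete_file", "issue_refund"}

def execute(tool: str, arg: str, approve: Callable[[str, str], bool]) -> str:
    """Run safe tools directly; ask a human before running risky ones."""
    if tool in SAFE_TOOLS:
        return f"ran {tool}({arg})"
    if tool in RISKY_TOOLS and approve(tool, arg):
        return f"ran {tool}({arg}) after approval"
    return f"blocked {tool}({arg})"   # declined requests and unknown tools both land here

def ask_on_terminal(tool: str, arg: str) -> bool:
    """Ask the person at the keyboard; anything other than 'y' counts as no."""
    return input(f"Allow {tool}({arg})? [y/N] ").strip().lower() == "y"
```

Calling `execute("issue_refund", "order 1234", ask_on_terminal)` pauses and waits for a yes before anything happens, while `execute("run_tests", "all", ask_on_terminal)` goes straight through. The default is the important design choice: when in doubt, the gate says no.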
What this means for ordinary people and workers
I don't think the honest takeaway is "agents will replace everyone tomorrow." I dug into that question separately in Will AI Replace Developers?, and the short version is: the work changes shape more than it disappears.
For ordinary people, agents are starting to handle the tedious multi-step errands of digital life: comparing options, filling forms, chasing down information. For workers, the most valuable skill is shifting. It's less about doing every step yourself and more about defining the goal clearly, knowing what good output looks like, and catching the moments the agent gets it wrong. That's a real skill, and it's not going away. Someone still has to decide what's worth doing and whether it was done right.
My honest advice: try one. Watch it work on something low-stakes. You'll learn more in ten minutes of watching an agent loop than in any explainer, including this one.
Frequently asked questions
- What's the simplest difference between a chatbot and an AI agent?
  A chatbot answers one message and stops. An agent takes a goal and works in a loop: it plans, uses tools to take real actions, looks at the results, and keeps going until the job is done. A chatbot talks; an agent acts.
- What does it mean when an agent "uses tools"?
  A tool is a function the agent is allowed to call, like searching the web, running code, reading a file, or hitting an API. Instead of only producing text, the agent can trigger these real actions and read what comes back, which is how it gets things done in the world rather than just describing them.
- Are AI agents reliable enough to trust on their own?
  Not fully, and not yet. They're useful for drafting, research, and repetitive multi-step tasks, but they can get stuck, make confident mistakes, and rack up cost on long jobs. The safe pattern is to let the agent propose and do the legwork while a human reviews anything that actually matters before it ships.
- Why do people say 2026 is different for agents?
  Several things matured at once: models got more reliable at following multi-step instructions, tool use became a native trained skill, context windows got long enough to hold a whole task, and standard tool protocols like MCP made it easy to plug agents into many tools. Together they moved agents from shaky demos to genuinely useful.
- Do I need to be a programmer to use an AI agent?
  No. Coding agents are popular because the feedback is clean, but research agents, support agents, and everyday task agents are built for anyone. The skill that matters most isn't coding, it's describing your goal clearly and being able to recognize whether the result is actually good.
Further reading on this site
- How Large Language Models Actually Work
- Will AI Replace Developers?
- How APIs Actually Work: A Plain-English Guide
If this was useful, subscribe to the newsletter and I'll send the next plain-English explainer straight to you.