What Is an AI Agent — structure and how it works
AI Agent란 무엇인가 — 구조와 작동 원리
“AI agent” is everywhere, but it is still hard to define. This note separates agents from plain LLM calls, walks through the usual loop, memory, tools, and common frameworks—so you can reason about new stacks without re-learning from zero.
1. Agent vs LLM — what’s different?
An LLM answers once per prompt. An agent keeps acting until a goal is met. The core difference is the loop.
| LLM (single shot) | AI agent | |
|---|---|---|
| Execution | One input → one output | Repeat until the goal is satisfied |
| Tools | None | APIs, DBs, code runners, … |
| Memory | Context window only | Short-term + long-term + vector stores |
| Planning | None | Decomposes goals into subtasks |
| Autonomy | Low | Higher—chooses next actions |
For “Summarize last week’s news,” a bare LLM answers from training data. An agent can search the web, read results, and iterate until the summary is grounded.
2. Agent loop — ReAct pattern
The dominant pattern is ReAct (Reasoning + Acting): think, act, observe, repeat.
Sketch in code:
while not goal_achieved:
thought = llm.think(context, goal)
action = parse_action(thought)
result = execute_tool(action)
context.add(result)
goal_achieved = llm.evaluate(context)
return context.final_answer()
3. Memory layout
Production agents mix three memory styles.
Short-term memory
The live context window: chat turns and tool traces. Ephemeral to the session.
Long-term memory
Durable records—runs, preferences, facts—in Postgres, Redis, etc.
Semantic memory
Embeddings for “find similar” retrieval—pgvector, Pinecone, …
4. Tools (function calling)
Tools let the model delegate what text alone cannot do. The LLM emits structured intents; your runtime validates and executes.
Information
- Web search
- Crawl / fetch
- SQL
- Files
Compute
- Python sandbox
- Math
- Vision APIs
- External HTTP
Comms
- Slack / Telegram
- Calendar
- Push notifications
Control
- Browser automation
- Shell (careful!)
- Other agents
- Workflow triggers
Example tool schema (Claude-style):
tools = [{
"name": "web_search",
"description": "Search the web for recent information",
"input_schema": {
"type": "object",
"properties": { "query": {"type": "string", "description": "Search query"} },
"required": ["query"]
}
}]
response = anthropic.messages.create(
model=“claude-sonnet-4-20250514”,
tools=tools,
messages=[{“role”: “user”, “content”: “Summarize recent AI agent news”}]
)
5. Agent shapes
6. Framework cheat sheet
| Framework | Language | Notes |
|---|---|---|
| LangChain | Python / JS | Largest ecosystem; agents, chains, memory, tools. |
| LlamaIndex | Python | Strong for RAG + structured data connectors. |
| AutoGen | Python | Microsoft—multi-agent chat workflows. |
| CrewAI | Python | Role-based teams of agents. |
| LangGraph | Python | Graph-shaped control flow on top of LangChain ideas. |
| n8n | No-code | Visual orchestration; quick integrations. |
7. Design checklist
Iteration caps
Agents can spin—set max steps (often 10–20) and graceful failure paths.
Tool errors
External APIs fail; teach the loop to retry, switch tools, or exit safely.
Context growth
Summarize or prune old observations so you stay inside token limits.
Cost
Each loop tick can invoke a large model—cache, route to smaller models, batch when possible.
Human-in-the-loop
Gate payments, outbound email, destructive writes behind explicit approval.
Agents are still a fast-moving layer. If you internalize loop + tools + memory, swapping frameworks stops feeling like starting over.