AI Agent LLM ReAct Architecture

“AI agent” is everywhere, but it is still hard to define. This note separates agents from plain LLM calls, walks through the usual loop, memory, tools, and common frameworks—so you can reason about new stacks without re-learning from zero.

1. Agent vs LLM — what’s different?

An LLM answers once per prompt. An agent keeps acting until a goal is met. The core difference is the loop.

LLM (single shot)AI agent
ExecutionOne input → one outputRepeat until the goal is satisfied
ToolsNoneAPIs, DBs, code runners, …
MemoryContext window onlyShort-term + long-term + vector stores
PlanningNoneDecomposes goals into subtasks
AutonomyLowHigher—chooses next actions

For “Summarize last week’s news,” a bare LLM answers from training data. An agent can search the web, read results, and iterate until the summary is grounded.

2. Agent loop — ReAct pattern

The dominant pattern is ReAct (Reasoning + Acting): think, act, observe, repeat.

1. Perceive
Take input—user message, webhook, schedule, …
2. Think
The LLM analyzes state and picks the next action.
3. Act
Invoke tools—search, HTTP, SQL, code, …
4. Observe
Append tool output to working memory / context.
5. Evaluate
Check goal satisfaction; if not, go back to Think.
6. Output
Return, persist, or notify.

Sketch in code:

while not goal_achieved:
    thought = llm.think(context, goal)
    action  = parse_action(thought)
    result  = execute_tool(action)
    context.add(result)
    goal_achieved = llm.evaluate(context)

return context.final_answer()

3. Memory layout

Production agents mix three memory styles.

Short-term memory

The live context window: chat turns and tool traces. Ephemeral to the session.

Long-term memory

Durable records—runs, preferences, facts—in Postgres, Redis, etc.

Semantic memory

Embeddings for “find similar” retrieval—pgvector, Pinecone, …

4. Tools (function calling)

Tools let the model delegate what text alone cannot do. The LLM emits structured intents; your runtime validates and executes.

Information

  • Web search
  • Crawl / fetch
  • SQL
  • Files

Compute

  • Python sandbox
  • Math
  • Vision APIs
  • External HTTP

Comms

  • Email
  • Slack / Telegram
  • Calendar
  • Push notifications

Control

  • Browser automation
  • Shell (careful!)
  • Other agents
  • Workflow triggers

Example tool schema (Claude-style):

tools = [{
    "name": "web_search",
    "description": "Search the web for recent information",
    "input_schema": {
        "type": "object",
        "properties": { "query": {"type": "string", "description": "Search query"} },
        "required": ["query"]
    }
}]

response = anthropic.messages.create( model=“claude-sonnet-4-20250514”, tools=tools, messages=[{“role”: “user”, “content”: “Summarize recent AI agent news”}] )

5. Agent shapes

ReAct agent
Default single-agent loopTool-using loop for one objective. Common in LangChain / LlamaIndex starters.
Plan-and-execute
Plan first, then run stepsGood for long pipelines—planner emits a task list, executor runs items.
Multi-agent
Several agents collaborateOrchestrator routes work to specialists—AutoGen / CrewAI patterns.
Autonomous agent
Long-running, low touchScheduled jobs, monitoring, always-on pipelines.

6. Framework cheat sheet

FrameworkLanguageNotes
LangChainPython / JSLargest ecosystem; agents, chains, memory, tools.
LlamaIndexPythonStrong for RAG + structured data connectors.
AutoGenPythonMicrosoft—multi-agent chat workflows.
CrewAIPythonRole-based teams of agents.
LangGraphPythonGraph-shaped control flow on top of LangChain ideas.
n8nNo-codeVisual orchestration; quick integrations.
Pick n8n for simple automation, LangChain/LangGraph when you need code-level control, CrewAI/AutoGen when multiple personas must coordinate.

7. Design checklist

Iteration caps

Agents can spin—set max steps (often 10–20) and graceful failure paths.

Tool errors

External APIs fail; teach the loop to retry, switch tools, or exit safely.

Context growth

Summarize or prune old observations so you stay inside token limits.

Cost

Each loop tick can invoke a large model—cache, route to smaller models, batch when possible.

Human-in-the-loop

Gate payments, outbound email, destructive writes behind explicit approval.


Agents are still a fast-moving layer. If you internalize loop + tools + memory, swapping frameworks stops feeling like starting over.