Module 22 · Section 22.1

Agent Frameworks

LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Claude Agent SDK, Smolagents, PydanticAI, and Google ADK compared
★ Big Picture

The agent framework landscape has matured rapidly, offering multiple approaches to the same fundamental challenge: orchestrating LLM-powered reasoning, tool use, and multi-step execution. Each framework makes different trade-offs between abstraction level, flexibility, and ease of use. LangGraph provides low-level graph primitives for maximum control. CrewAI offers high-level role-based collaboration out of the box. AutoGen focuses on conversational multi-agent patterns. Native provider SDKs (OpenAI, Anthropic, Google) give direct, minimal-overhead access to each model's strengths. Understanding these trade-offs is essential for choosing the right tool for your specific use case.

1. The Agent Framework Landscape

Before diving into individual frameworks, it helps to understand what an agent framework actually provides. At minimum, every framework handles three concerns: defining how agents reason and act, managing the state that flows between steps, and orchestrating the execution loop that drives everything forward. Where frameworks differ is in how much structure they impose on each of these concerns and how they handle advanced requirements like persistence, streaming, and human oversight.
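The three concerns can be made concrete without any framework at all. The sketch below is a toy illustration, not any framework's implementation: `fake_llm` and `run_tool` are stand-ins for a real model call and real tool execution.

```python
def fake_llm(state):
    """Stand-in for a model call: decide the next action from current state."""
    if len(state["steps"]) < 2:
        return {"action": "search", "input": state["goal"]}
    return {"action": "finish", "input": None}

def run_tool(action, tool_input):
    """Stand-in for tool execution."""
    return f"results for {tool_input!r}"

def run_agent(goal, max_steps=10):
    state = {"goal": goal, "steps": []}        # state that flows between steps
    for _ in range(max_steps):                 # the execution loop
        decision = fake_llm(state)             # reasoning: decide what to do
        if decision["action"] == "finish":
            return state
        observation = run_tool(decision["action"], decision["input"])
        state["steps"].append((decision["action"], observation))
    return state

final = run_agent("quantum computing advances")
```

Every framework in this section is, at bottom, a more capable version of this loop: it differs in how much structure it imposes on the state, the reasoning step, and the loop itself.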

[Figure: the abstraction spectrum, low-level to high-level — native SDKs (OpenAI, Anthropic), LangGraph (graph primitives), PydanticAI (type-safe agents), AutoGen/AG2 (conversational agents), CrewAI (role-based crews). Low-level benefits: maximum control, no hidden behavior, provider-specific features, minimal dependencies, easier debugging. High-level benefits: faster prototyping, built-in patterns, less boilerplate, multi-agent ready, opinionated defaults.]
Figure 22.1: Agent frameworks span a spectrum from low-level SDKs to high-level orchestration platforms

2. LangGraph: Graph-Based Agent Orchestration

LangGraph models agent workflows as directed graphs where nodes are functions and edges define the flow of execution. State flows through the graph as a TypedDict, and each node receives and returns updates to that shared state. Conditional edges let you branch based on the current state, enabling dynamic routing. LangGraph's checkpoint system serializes state at each step, supporting time-travel debugging, resumption after failure, and human-in-the-loop interruption.

2.1 Core Concepts: Nodes, Edges, and State

from typing import TypedDict, Annotated, Literal
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages

# State is a TypedDict with reducer annotations
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]  # append-only message list
    next_action: str
    iteration_count: int

# Nodes are plain functions that receive and return state
# (`llm`, `parse_action`, and `execute_tool` are assumed helpers defined elsewhere)
def reasoning_node(state: AgentState) -> dict:
    """Analyze the situation and decide what to do next."""
    messages = state["messages"]
    response = llm.invoke(messages)
    return {
        "messages": [response],
        "next_action": parse_action(response),
        "iteration_count": state["iteration_count"] + 1
    }

def tool_node(state: AgentState) -> dict:
    """Execute the selected tool and return results."""
    tool_call = state["messages"][-1].tool_calls[0]
    result = execute_tool(tool_call)
    return {"messages": [result]}

# Conditional edges route based on state
def should_continue(state: AgentState) -> Literal["tools", "end"]:
    if state["iteration_count"] > 10:
        return "end"
    if state["next_action"] == "finish":
        return "end"
    return "tools"

# Build the graph
graph = StateGraph(AgentState)
graph.add_node("reason", reasoning_node)
graph.add_node("tools", tool_node)
graph.set_entry_point("reason")
graph.add_conditional_edges("reason", should_continue, {
    "tools": "tools",
    "end": END
})
graph.add_edge("tools", "reason")  # Loop back after tool execution

app = graph.compile()
◆ Key Insight

LangGraph's add_messages reducer is critical. Without it, returning messages from a node would overwrite the entire message list. The reducer annotation tells LangGraph to append new messages instead. This pattern of annotated reducers extends to any state field where you need merge semantics rather than replacement.
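The reducer idea can be illustrated without LangGraph at all. A reducer is just a function that merges a node's partial update into the existing value of a field; fields without a reducer are replaced wholesale. A minimal sketch (hypothetical names, not LangGraph internals):

```python
import operator

# Map each state field to a reducer; fields without one are replaced.
REDUCERS = {"messages": operator.add}  # append-style merge, like add_messages

def apply_update(state, update):
    """Merge a node's partial update into state, field by field."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(state[key], value) if reducer else value
    return merged

state = {"messages": [("user", "hi")], "iteration_count": 0}
state = apply_update(state, {"messages": [("ai", "hello")], "iteration_count": 1})
# messages now holds both entries; iteration_count was simply replaced
```

This is why nodes can return only the fields they changed: the merge semantics live in the annotations, not in the node code.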

2.2 Checkpointing and Persistence

from langgraph.checkpoint.sqlite import SqliteSaver

# Enable checkpointing for state persistence
# (in recent langgraph-checkpoint-sqlite releases, from_conn_string is a
# context manager: `with SqliteSaver.from_conn_string(...) as checkpointer:`)
checkpointer = SqliteSaver.from_conn_string("checkpoints.db")
app = graph.compile(checkpointer=checkpointer)

# Each invocation uses a thread_id for state isolation
config = {"configurable": {"thread_id": "user-session-42"}}

# Run the graph; state is checkpointed after each node
result = app.invoke(
    {"messages": [("user", "Research quantum computing advances")],
     "next_action": "",
     "iteration_count": 0},
    config
)

# Resume later from the same thread
state = app.get_state(config)
print(state.values["iteration_count"])  # See where we left off
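Conceptually, a checkpointer is a map from thread_id to a history of state snapshots, with resumption loading the latest snapshot for that thread. The toy class below illustrates that model only; it is not LangGraph's implementation.

```python
import copy

class InMemoryCheckpointer:
    """Toy checkpointer: one snapshot history per thread_id."""
    def __init__(self):
        self._threads = {}

    def save(self, thread_id, state):
        # Deep-copy so later mutations don't rewrite saved history
        self._threads.setdefault(thread_id, []).append(copy.deepcopy(state))

    def latest(self, thread_id):
        history = self._threads.get(thread_id, [])
        return history[-1] if history else None

cp = InMemoryCheckpointer()
cp.save("user-session-42", {"iteration_count": 1})
cp.save("user-session-42", {"iteration_count": 2})
cp.save("other-thread", {"iteration_count": 9})

resumed = cp.latest("user-session-42")  # thread isolation: other-thread is untouched
```

Keeping the full history rather than only the latest snapshot is what enables time-travel debugging: any earlier checkpoint can be replayed.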

3. CrewAI: Role-Based Agent Collaboration

CrewAI takes a fundamentally different approach by modeling agents as team members with defined roles, goals, and backstories. You compose a "crew" of agents and assign them tasks with specific expected outputs. The framework handles delegation, context sharing, and execution order. CrewAI supports both sequential and hierarchical process modes, where a manager agent can delegate subtasks to specialist agents.

from crewai import Agent, Task, Crew, Process

# Define agents with roles and expertise
researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive, accurate information on the given topic",
    backstory="You are an experienced research analyst with a talent for "
              "finding reliable sources and synthesizing complex information.",
    tools=[search_tool, web_scraper],
    llm="gpt-4o",
    verbose=True
)

writer = Agent(
    role="Technical Writer",
    goal="Create clear, engaging content from research findings",
    backstory="You are a skilled technical writer who transforms complex "
              "research into accessible, well-structured articles.",
    llm="gpt-4o",
    verbose=True
)

# Define tasks with expected outputs
research_task = Task(
    description="Research the latest advances in {topic}. "
                "Focus on key breakthroughs from the past 6 months.",
    expected_output="A detailed report with at least 5 key findings, "
                    "each supported by sources.",
    agent=researcher
)

writing_task = Task(
    description="Write a technical blog post based on the research findings.",
    expected_output="A 1000-word blog post with introduction, key sections, "
                    "and conclusion.",
    agent=writer,
    context=[research_task]  # Receives output from research_task
)

# Assemble and run the crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,  # or Process.hierarchical
    verbose=True
)

result = crew.kickoff(inputs={"topic": "quantum error correction"})

4. AutoGen/AG2: Conversational Multi-Agent Patterns

AutoGen (now evolving as AG2) pioneered conversational multi-agent interaction. Its core abstraction is the conversable agent: agents communicate by sending messages to each other, much like participants in a group chat. AutoGen provides built-in agent types, including AssistantAgent (LLM-powered) and UserProxyAgent (executes code and relays human input), plus a GroupChat container that, together with a GroupChatManager, coordinates multi-party conversations. A distinctive feature is built-in Docker sandbox support for safe code execution.

from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

# LLM-powered assistant
assistant = AssistantAgent(
    name="research_assistant",
    system_message="You are a helpful research assistant. Analyze data "
                   "and provide insights. Write Python code when needed.",
    llm_config={"model": "gpt-4o", "temperature": 0}
)

# Proxy that executes code and relays human feedback
user_proxy = UserProxyAgent(
    name="executor",
    human_input_mode="TERMINATE",  # Ask human only at end
    code_execution_config={
        "work_dir": "workspace",
        "use_docker": "python:3.11"  # Sandboxed execution
    },
    max_consecutive_auto_reply=5
)

# GroupChat for multi-agent conversation
critic = AssistantAgent(
    name="critic",
    system_message="You review analysis for accuracy and suggest improvements.",
    llm_config={"model": "gpt-4o"}
)

group_chat = GroupChat(
    agents=[user_proxy, assistant, critic],
    messages=[],
    max_round=12,
    speaker_selection_method="auto"  # LLM picks next speaker
)
manager = GroupChatManager(groupchat=group_chat)

# Kick off the conversation
user_proxy.initiate_chat(
    manager,
    message="Analyze the dataset in data.csv and create visualizations."
)
⚠ Warning

AutoGen's use_docker parameter is essential for production use. Without it, code generated by the LLM executes directly on your host machine with full access to the filesystem and network. Always use Docker sandboxes when running untrusted or LLM-generated code. Set human_input_mode="ALWAYS" during development to review every action before execution.

5. Native Provider SDKs

5.1 OpenAI Agents SDK

The OpenAI Agents SDK provides a lightweight, opinionated framework built directly on OpenAI's API. It introduces the concept of an Agent with instructions, tools, and handoff capabilities. Agents can transfer control to other agents via handoffs, enabling multi-agent workflows without external orchestration frameworks.

from agents import Agent, Runner, function_tool  # pip install openai-agents

# Define tools as decorated functions
@function_tool
def search_knowledge_base(query: str) -> str:
    """Search the internal knowledge base for relevant documents."""
    results = vector_store.similarity_search(query, k=5)
    return "\n".join(doc.page_content for doc in results)

# Define the downstream agent first so it can serve as a handoff target
writer_agent = Agent(
    name="writer",
    instructions="You write clear, structured content based on research."
)

research_agent = Agent(
    name="researcher",
    instructions="You research topics using the knowledge base. "
                 "Hand off to the writer when research is complete.",
    tools=[search_knowledge_base],
    handoffs=[writer_agent]  # Handoffs reference Agent objects, not names
)

# Run the agent loop
result = Runner.run_sync(research_agent, "Write about quantum computing")
print(result.final_output)

5.2 Anthropic Claude Agent SDK

Anthropic's Claude Agent SDK emphasizes safety and control. It provides a structured agent loop with built-in support for tool use, extended thinking, and computer use. The SDK is designed to give developers fine-grained control over the agent's behavior while leveraging Claude's native tool use capabilities.
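The loop the SDK wraps follows Claude's native tool-use message shape: the model returns tool_use content blocks, the caller executes each tool, and the results go back as tool_result blocks until the model stops requesting tools. A framework-free sketch of that shape, with stub_model standing in for a real Messages API call (the dict layouts mirror the API; the stub itself is hypothetical):

```python
def stub_model(messages):
    """Stand-in for a Claude call: request a tool once, then answer."""
    saw_result = any(block["type"] == "tool_result"
                     for m in messages if isinstance(m["content"], list)
                     for block in m["content"])
    if not saw_result:
        return {"stop_reason": "tool_use",
                "content": [{"type": "tool_use", "id": "t1",
                             "name": "search", "input": {"query": "qec"}}]}
    return {"stop_reason": "end_turn",
            "content": [{"type": "text", "text": "done"}]}

TOOLS = {"search": lambda query: f"results for {query}"}

def agent_loop(user_prompt, max_turns=5):
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        reply = stub_model(messages)
        messages.append({"role": "assistant", "content": reply["content"]})
        if reply["stop_reason"] != "tool_use":
            return reply["content"][0]["text"]
        # Execute each requested tool; return results as tool_result blocks
        results = [{"type": "tool_result", "tool_use_id": block["id"],
                    "content": TOOLS[block["name"]](**block["input"])}
                   for block in reply["content"] if block["type"] == "tool_use"]
        messages.append({"role": "user", "content": results})
    return None

answer = agent_loop("Research quantum error correction")
```

The SDK's value is everything around this loop: permissioning which tools may run, streaming intermediate output, and bounding how long the loop can continue.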

5.3 Other Frameworks at a Glance

Smolagents (by Hugging Face) provides a minimalist agent framework that supports both tool-calling and code-based agents. It focuses on simplicity and works well with open-source models. PydanticAI brings type safety to agent development, using Pydantic models for structured inputs and outputs. Google ADK (Agent Development Kit) integrates with Google's Gemini models and provides tools for building agents that work within the Google Cloud ecosystem.
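PydanticAI's core idea, validating model output against a schema so malformed responses fail loudly instead of propagating, can be shown with Pydantic alone. A hypothetical sketch (this is not the PydanticAI API, and `llm_output` is canned stand-in text):

```python
import json
from pydantic import BaseModel, ValidationError

class ResearchFinding(BaseModel):
    claim: str
    source: str
    confidence: float  # expected in [0.0, 1.0]

# Imagine this is raw text returned by an LLM
llm_output = '{"claim": "QEC overhead dropped", "source": "arXiv", "confidence": 0.8}'

try:
    finding = ResearchFinding.model_validate(json.loads(llm_output))
except ValidationError:
    # A type-safe agent would retry or surface the error here
    # rather than pass unvalidated output downstream
    raise
```

PydanticAI wires this validation into the agent loop itself, so retries on schema failure happen automatically.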

6. Framework Comparison

| Framework | Abstraction | Multi-Agent | Checkpointing | Best For |
| --- | --- | --- | --- | --- |
| LangGraph | Graph primitives | Custom via subgraphs | Built-in (SQLite, Postgres) | Complex, custom workflows |
| CrewAI | Role-based agents | Native (crews, delegation) | Limited | Team collaboration patterns |
| AutoGen/AG2 | Conversational agents | Native (GroupChat) | Manual | Code execution, group chat |
| OpenAI Agents SDK | Handoff-based | Via handoffs | Platform-managed | OpenAI-native workflows |
| Claude Agent SDK | Tool-use loop | Manual orchestration | Manual | Safety-critical agents |
| PydanticAI | Type-safe agents | Manual orchestration | Manual | Structured, validated outputs |
| Smolagents | Minimal wrapper | Basic | None | Quick prototyping, open models |
| Google ADK | Tool-use agents | Via sub-agents | Session-based | Gemini and Google Cloud |
ⓘ Note

Framework choice is not permanent. Many production systems combine frameworks. You might use LangGraph for your core workflow orchestration while using native SDKs for individual agent nodes. The key is understanding what each framework provides so you can mix and match effectively. Start with the simplest approach that meets your requirements and add complexity only when needed.

7. Lab: Build the Same Agent in Three Frameworks

The best way to understand framework trade-offs is to build the same agent in multiple frameworks. Let us build a simple research agent that takes a topic, searches for information, and produces a summary. We will compare how LangGraph, CrewAI, and the OpenAI native SDK handle this identical task.

[Figure: a user query ("Research topic X") fans out to three implementations — LangGraph, CrewAI, and the OpenAI SDK — each backed by the same search and LLM tools; the resulting summaries are compared on lines of code, execution time, flexibility, and debuggability.]
Figure 22.2: The lab exercise builds identical functionality in three frameworks for direct comparison
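To keep the comparison fair, the tool layer should be identical across all three implementations. A hypothetical shared stub (the canned snippets and function names are placeholders; swap in a real search API and an LLM call for the lab):

```python
def search(topic: str, k: int = 3) -> list[str]:
    """Stub search tool shared by all three implementations."""
    canned = {
        "quantum error correction": [
            "Surface codes remain the leading QEC approach.",
            "Logical qubit error rates improved in recent experiments.",
            "Decoder latency is a key engineering bottleneck.",
        ],
    }
    return canned.get(topic.lower(), [f"No results for {topic!r}"])[:k]

def summarize(snippets: list[str]) -> str:
    """Stub summarizer; in the lab, this becomes an LLM call."""
    return " ".join(snippets)

summary = summarize(search("quantum error correction"))
```

With the tools held constant, any difference in output quality, code volume, or debugging experience is attributable to the framework, not the tooling.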
◆ Key Insight

When comparing frameworks, measure what matters for your use case. For a quick prototype, CrewAI gets you running fastest. For a production system that needs fine-grained state management and resumability, LangGraph is more appropriate. For minimal dependencies and tight provider integration, native SDKs win. There is no universally "best" framework; there is only the best framework for your specific requirements and constraints.

Knowledge Check

1. What is the purpose of the add_messages annotation in LangGraph state?

Show Answer
The add_messages annotation acts as a reducer that tells LangGraph to append new messages to the existing list rather than replacing the entire list. Without it, returning a messages field from a node would overwrite all previous messages, losing conversation history.

2. How does CrewAI's context parameter on a Task work?

Show Answer
The context parameter accepts a list of other Task objects whose outputs should be passed as context to the current task. This creates a dependency chain where one agent's output feeds into another agent's input, enabling sequential collaboration without manual message passing.

3. What are the three human_input_mode options in AutoGen, and when would you use each?

Show Answer
"ALWAYS" asks for human input after every agent response, best for development and debugging. "TERMINATE" asks for human input only when the conversation ends or a termination condition is met, suitable for supervised production use. "NEVER" runs fully autonomously without human intervention, appropriate only for well-tested, low-risk workflows.

4. Why might you choose native provider SDKs over a framework like LangGraph or CrewAI?

Show Answer
Native SDKs offer maximum control with no hidden behavior, minimal dependencies, full access to provider-specific features (like extended thinking or computer use), easier debugging since there are fewer abstraction layers, and typically faster iteration during development. They are ideal when you need tight integration with a specific provider's capabilities or when the overhead of a framework is not justified by your use case.

5. What role does Docker play in AutoGen's code execution, and why is it important?

Show Answer
Docker provides a sandboxed execution environment that isolates LLM-generated code from the host system. This is critical because LLM-generated code can be unpredictable and potentially harmful, whether through bugs or adversarial prompt injection. The Docker container limits filesystem access, network access, and system resources, preventing code from damaging the host machine or exfiltrating data.

Key Takeaways