Prelude
We have built agents with both of these tools. Not toy examples, not hello-world demos, but production agents that handle real user requests, call real APIs, and fail in real ways. That experience is what this guide draws from.
When Anthropic released the Agent SDK, the first reaction was scepticism. After using LangChain for over a year with chains, agents, and LangGraph workflows running in production, why rewrite anything?
The answer turned out to be nuanced. Not everything was rewritten. But several agents moved to the Agent SDK, while others stayed on LangChain. The decision was never about which tool is "better" in the abstract. It was always about which tool fits the specific problem.
This guide walks through that decision process. We will show you both architectures, build the same agent in both frameworks, and give you a clear framework for choosing between them. If you have already read our guide on building a custom Claude agent, this will extend that knowledge into a comparative context. If you are starting fresh, this guide stands on its own.
The Problem
The AI agent ecosystem in 2026 has a paradox. There are more tools than ever for building agents, but the choice between them is harder than ever. Every framework claims to be the simplest, the most powerful, the most production-ready. The marketing is indistinguishable.
The real differences are architectural. They show up when you try to do something specific. When you need to add a guardrail. When you need to swap a model. When you need to debug a tool call that failed at 3am. When you need to hand off from one agent to another mid-conversation.
The two frameworks developers ask about most are Anthropic's Agent SDK and LangChain. They are genuinely different in philosophy, architecture, and intended use case. Picking the wrong one costs you weeks of refactoring. Picking the right one saves you months.
Here is what we learned.
The Journey
What the Claude Agent SDK Actually Is
The Claude Agent SDK is Anthropic's official library for building agentic applications with Claude. It is available in both Python and TypeScript. The core idea is straightforward.
You define an Agent with a name, a model, instructions, and a set of tools. You hand it to a runner along with a user message. The SDK handles everything else.
"Everything else" means the agentic loop. The SDK sends your message to Claude. Claude decides whether to respond directly or call a tool. If it calls a tool, the SDK executes that tool, sends the result back to Claude, and lets Claude decide again. This loop continues until Claude produces a final response with no more tool calls.
Here is what a minimal agent looks like in Python.
from agents import Agent, Runner, function_tool

@function_tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 18°C, partly cloudy"

agent = Agent(
    name="Weather Agent",
    model="claude-sonnet-4-6",
    instructions="You help users check the weather. Be concise.",
    tools=[get_weather]
)

result = Runner.run_sync(agent, "What's the weather in London?")
print(result.final_output)
That is the entire thing. No chain configuration, no prompt template assembly, no executor setup. You define what the agent knows, what it can do, and what it should say. The SDK runs the loop.
The SDK also provides guardrails (functions that validate inputs and outputs), handoffs (letting one agent delegate to another), and model routing (using different Claude models for different tasks). These are built-in, first-class features rather than plugins you bolt on.
What LangChain Is
LangChain is a framework-agnostic orchestration library for building applications with large language models. It supports dozens of model providers, hundreds of integrations, and multiple agent architectures.
LangChain's core abstractions are chains, agents, tools, memory, and retrievers. A chain is a sequence of operations. An agent is a chain that uses an LLM to decide which tools to call. Tools are functions the agent can invoke. Memory stores conversation history. Retrievers fetch relevant documents.
Here is the same weather agent in LangChain.
from langchain_anthropic import ChatAnthropic
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 18°C, partly cloudy"

llm = ChatAnthropic(model="claude-sonnet-4-6")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You help users check the weather. Be concise."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, [get_weather], prompt)
executor = AgentExecutor(agent=agent, tools=[get_weather])

result = executor.invoke({"input": "What's the weather in London?"})
print(result["output"])
More code. More abstractions. But also more explicit control over the prompt template, the chat history placeholder, and the agent scratchpad where tool call results accumulate.
LangChain's power comes from this explicitness. You can swap ChatAnthropic for ChatOpenAI and the rest of the code stays the same. You can add memory, retrievers, output parsers, and custom callbacks at every stage. You can build complex multi-step workflows with LangGraph, which models agent behaviour as a state machine.
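The reason provider swapping works is that every model sits behind a common interface, and downstream code depends only on that interface. Here is a framework-free sketch of the idea; the class names are illustrative, and LangChain's real base classes are far richer than this.

```python
# Illustrative sketch of the unified-interface idea behind provider
# swapping. These classes are stand-ins, not LangChain's actual API.
from abc import ABC, abstractmethod

class ChatModel(ABC):
    @abstractmethod
    def invoke(self, prompt: str) -> str: ...

class FakeAnthropic(ChatModel):
    def invoke(self, prompt: str) -> str:
        return f"[claude] {prompt}"

class FakeOpenAI(ChatModel):
    def invoke(self, prompt: str) -> str:
        return f"[gpt] {prompt}"

def build_pipeline(llm: ChatModel):
    # Everything downstream depends only on the ChatModel interface,
    # so swapping providers is a one-line change at construction time.
    return lambda text: llm.invoke(text).upper()

pipeline = build_pipeline(FakeAnthropic())
print(pipeline("hello"))
# Swapping in FakeOpenAI() requires no change to the pipeline itself.
```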
Architecture Comparison
The fundamental architectural difference is this. The Agent SDK gives you an opinionated agentic loop. LangChain gives you composable building blocks to construct your own.
With the Agent SDK, the loop is internal. You call Runner.run_sync() and the SDK manages the cycle of message, tool call, execution, and response. You influence the loop through instructions, tool definitions, and guardrails. But you do not write the loop itself.
With LangChain, the loop is external. The AgentExecutor runs a loop, but you can replace it with LangGraph for full control over state transitions, branching, and error recovery. You can define exactly what happens when a tool fails, when the model hallucinates, or when you need human approval mid-workflow.
Think of it this way. The Agent SDK is a car with an automatic gearbox. You steer, you brake, you accelerate. The gearbox handles the shifting. LangChain is a car with a manual gearbox and a toolkit for building your own transmission if you want one.
Neither is inherently better. The automatic gearbox is faster for most driving. The manual gearbox gives you more control in specific situations.
Building a Tool-Using Agent Side by Side
Here is something more realistic. A research agent that can search the web, read documents, and summarise findings. Both implementations are shown below so you can see the practical differences.
Agent SDK version.
from agents import Agent, Runner, function_tool
import httpx

@function_tool
def search_web(query: str) -> str:
    """Search the web for information on a topic."""
    response = httpx.get(
        "https://api.search.example/v1/search",
        params={"q": query, "limit": 5}
    )
    results = response.json()["results"]
    return "\n".join(
        f"- {r['title']}: {r['snippet']}" for r in results
    )

@function_tool
def read_url(url: str) -> str:
    """Read the content of a web page."""
    response = httpx.get(url)
    return response.text[:5000]

@function_tool
def save_summary(title: str, content: str) -> str:
    """Save a research summary to the database."""
    # Database logic here
    return f"Summary '{title}' saved successfully."

research_agent = Agent(
    name="Research Agent",
    model="claude-sonnet-4-6",
    instructions="""You are a research assistant. When given a topic:
    1. Search the web for relevant sources
    2. Read the most promising results
    3. Synthesise a summary with citations
    4. Save the summary""",
    tools=[search_web, read_url, save_summary]
)

result = Runner.run_sync(
    research_agent,
    "Research the current state of quantum computing in 2026"
)
print(result.final_output)
LangChain version.
from langchain_anthropic import ChatAnthropic
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
import httpx

@tool
def search_web(query: str) -> str:
    """Search the web for information on a topic."""
    response = httpx.get(
        "https://api.search.example/v1/search",
        params={"q": query, "limit": 5}
    )
    results = response.json()["results"]
    return "\n".join(
        f"- {r['title']}: {r['snippet']}" for r in results
    )

@tool
def read_url(url: str) -> str:
    """Read the content of a web page."""
    response = httpx.get(url)
    return response.text[:5000]

@tool
def save_summary(title: str, content: str) -> str:
    """Save a research summary to the database."""
    return f"Summary '{title}' saved successfully."

llm = ChatAnthropic(model="claude-sonnet-4-6")
tools = [search_web, read_url, save_summary]

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a research assistant. When given a topic:
    1. Search the web for relevant sources
    2. Read the most promising results
    3. Synthesise a summary with citations
    4. Save the summary"""),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({
    "input": "Research the current state of quantum computing in 2026"
})
print(result["output"])
The tool definitions are nearly identical. The difference is in the orchestration. The Agent SDK version has less boilerplate. The LangChain version gives you verbose=True for built-in step-by-step logging and explicit control over the prompt template.
Performance and Overhead
We benchmarked both frameworks on the same task, running a three-tool agent through ten sequential conversations. The results were instructive.
The Agent SDK added roughly 2-5ms of overhead per tool call beyond the API latency. Its loop is thin. It serialises the tool call, executes your function, and sends the result back. There is very little happening between those steps.
LangChain added 10-30ms per tool call. This is not because LangChain is slow. It is because LangChain does more. It runs callbacks at each stage, maintains chain state, handles memory updates, and passes through output parsers.
Each of those is useful. Each adds a small cost.
For most applications, this difference is irrelevant. The API call itself takes 500ms to 3 seconds. An extra 25ms of framework overhead is noise. But if you are running agents at scale with hundreds of concurrent sessions, or if you are building latency-sensitive applications, the Agent SDK's minimal overhead is a genuine advantage.
Memory usage told a similar story. The Agent SDK's footprint is small because it does less. LangChain's footprint is larger because it maintains more state, more callbacks, and more abstractions.
In our benchmarks, a single Agent SDK agent consumed approximately 15MB of memory at rest, while a comparable LangChain agent with callbacks and memory configured consumed roughly 40MB. At scale, with hundreds of concurrent agents, that difference compounds significantly.
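If you want to reproduce this kind of measurement for your own stack, the method is simple: time the full dispatch path around a tool whose own cost is known, and the difference is framework overhead. The harness below is a stdlib-only sketch of that method, not the benchmark used above; the `dispatch` function is a stand-in for whichever framework layer you are measuring.

```python
# Sketch of a per-call overhead measurement. `dispatch` stands in for
# the framework's serialisation/callback layer; swap in a real call
# path to measure an actual framework.
import time

def tool(city: str) -> str:
    time.sleep(0.001)  # stand-in for real tool work (~1ms)
    return f"Weather in {city}"

def dispatch(tool_fn, **kwargs):
    # Stand-in for the layer under test.
    payload = dict(kwargs)
    return tool_fn(**payload)

N = 200
start = time.perf_counter()
for _ in range(N):
    dispatch(tool, city="London")
total = time.perf_counter() - start

# Mean time includes the tool's own ~1ms; subtract the known tool cost
# to isolate dispatch overhead.
print(f"mean per-call time: {total / N * 1000:.3f} ms")
```

Run the same harness against each framework's dispatch path and the per-call difference is the number that matters.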
Multi-Model Support
This is where LangChain has a clear structural advantage. LangChain supports dozens of model providers through a unified interface. You can swap Claude for GPT-4, Gemini, Llama, Mistral, or any other supported model by changing one line of code.
# Switch from Claude to GPT-4
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
# Everything else stays the same
The Agent SDK is Claude-only by design. It is built to take full advantage of Claude's specific capabilities, including extended thinking, tool use patterns, and the way Claude structures its responses. You cannot plug in a different model.
If your application must support multiple model providers, or if you want the flexibility to switch models without rewriting your agent logic, LangChain is the correct choice. There is no workaround for this in the Agent SDK.
If your application is built around Claude and you want the tightest possible integration with Claude's features, the Agent SDK is the correct choice. You get access to Claude-specific optimisations that a generic interface cannot provide.
Guardrails and Safety
The Agent SDK has built-in guardrail support. You define input guardrails that run before the agent processes a message and output guardrails that run before the agent returns a response.
from agents import Agent, Runner, InputGuardrail, GuardrailFunctionOutput

async def check_for_pii(ctx, agent, input_text):
    """Block requests that contain personal information."""
    pii_patterns = ["social security", "credit card", "passport number"]
    contains_pii = any(p in input_text.lower() for p in pii_patterns)
    return GuardrailFunctionOutput(
        output_info={"contains_pii": contains_pii},
        tripwire_triggered=contains_pii
    )

agent = Agent(
    name="Safe Agent",
    model="claude-sonnet-4-6",
    instructions="You are a helpful assistant.",
    input_guardrails=[
        InputGuardrail(guardrail_function=check_for_pii)
    ]
)
When the guardrail triggers, the agent stops immediately. No tool calls execute. No response is generated. The caller gets a clear signal that the input was rejected.
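The tripwire pattern itself is framework-independent and worth understanding on its own: run every input guardrail before the agent loop starts, and refuse to proceed if any trips. The sketch below shows that control flow in plain Python; the names `GuardrailTripped` and `run_with_guardrails` are illustrative, not the SDK's API.

```python
# Framework-free sketch of the tripwire pattern. Names here are
# illustrative; the Agent SDK's own exception and entry point differ.

class GuardrailTripped(Exception):
    """Raised before any tool runs or any response is generated."""

def run_with_guardrails(guardrails, agent_fn, user_input: str):
    # All input guardrails run first; the agent never starts if one trips.
    for check in guardrails:
        tripped, info = check(user_input)
        if tripped:
            raise GuardrailTripped(info)
    return agent_fn(user_input)

def pii_guardrail(text: str):
    patterns = ["social security", "credit card", "passport number"]
    hit = any(p in text.lower() for p in patterns)
    return hit, {"contains_pii": hit}

try:
    run_with_guardrails([pii_guardrail], lambda t: f"echo: {t}",
                        "my credit card number is ...")
except GuardrailTripped as e:
    print("rejected:", e)
```

The caller gets an exception, not a degraded response, which makes the rejection impossible to silently ignore.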
LangChain does not have built-in guardrails in the same way. You can achieve similar results through custom callbacks, middleware, or by using external tools like Guardrails AI or NeMo Guardrails. These work well but require additional dependencies and configuration.
The practical difference is integration depth. Agent SDK guardrails are part of the agent definition. They run in the same context as the agent. They have access to the same state. LangChain guardrails are bolted on, which gives you more flexibility in how you implement them but requires more setup.
Observability and Debugging
When an agent fails at 3am, you need to understand what happened. Both frameworks provide observability tools, but they approach the problem differently.
The Agent SDK provides lifecycle hooks. You can attach functions that run at each stage of the agentic loop, capturing tool calls, model responses, and execution timing. This data is structured and predictable because the loop itself is structured and predictable.
from agents import Agent, Runner, RunHooks, RunContextWrapper

class LoggingHooks(RunHooks):
    async def on_tool_start(self, context, agent, tool):
        print(f"[TOOL START] {tool.name}")

    async def on_tool_end(self, context, agent, tool, result):
        print(f"[TOOL END] {tool.name}: {result[:100]}")

result = await Runner.run(
    agent,
    "Check the weather",
    run_hooks=LoggingHooks()
)
LangChain provides callbacks and integrates with LangSmith for production observability. LangSmith gives you a full trace of every chain execution, including token counts, latencies, tool inputs and outputs, and model responses. It is a hosted service with a dashboard, search, and alerting.
from langchain_core.callbacks import BaseCallbackHandler

class LoggingCallback(BaseCallbackHandler):
    def on_tool_start(self, serialized, input_str, **kwargs):
        print(f"[TOOL START] {serialized['name']}")

    def on_tool_end(self, output, **kwargs):
        print(f"[TOOL END] {output[:100]}")

executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[LoggingCallback()]
)
If you want self-hosted observability, both frameworks support it. If you want a managed observability platform, LangSmith is mature and well-integrated. The Agent SDK ecosystem does not yet have an equivalent managed platform, though you can build comprehensive logging with the hooks system.
Agent Handoffs and Multi-Agent Systems
Both frameworks support multi-agent patterns, but the mechanisms differ.
The Agent SDK has a first-class concept of handoffs. An agent can delegate to another agent mid-conversation. The second agent takes over, handles the request, and returns control. This is defined declaratively in the agent configuration.
billing_agent = Agent(
    name="Billing Agent",
    model="claude-sonnet-4-6",
    instructions="You handle billing questions.",
    tools=[check_balance, process_refund]
)

support_agent = Agent(
    name="Support Agent",
    model="claude-sonnet-4-6",
    instructions="You handle general support. For billing questions, hand off to the billing agent.",
    handoffs=[billing_agent]
)
When the support agent determines a question is about billing, it hands off to the billing agent automatically. The billing agent has its own tools, its own instructions, and its own model configuration. The handoff is invisible from the user's perspective.
LangChain achieves multi-agent patterns through LangGraph. LangGraph models the workflow as a state machine where nodes are agents or functions and edges define transitions.
from langgraph.graph import StateGraph, MessagesState

graph = StateGraph(MessagesState)
graph.add_node("support", support_node)
graph.add_node("billing", billing_node)
graph.add_conditional_edges(
    "support",
    route_to_agent,
    {"billing": "billing", "support": "support", "end": "__end__"}
)
LangGraph is more powerful for complex workflows. You can define cycles, conditional branches, parallel execution, and human-in-the-loop approval steps. The Agent SDK's handoff model is simpler but covers the most common multi-agent patterns with far less code.
When to Use the Agent SDK
Use the Agent SDK when you are building Claude-first applications. If Claude is your primary model and you do not need to swap it for another provider, the SDK gives you the tightest integration with the least boilerplate.
Use it when you want simple, reliable agent patterns. The built-in agentic loop handles the common case well. Tool calling, multi-turn conversations, guardrails, and handoffs cover most production agent architectures.
Use it when you want minimal overhead. The SDK is thin. It does not impose abstractions you do not need. If you are building a focused agent that does one thing well, the SDK stays out of your way.
Use it when you need production reliability from day one. The SDK is maintained by Anthropic. It is tested against Claude's actual behaviour. When Claude's tool calling format changes, the SDK updates to match. You do not need to wait for a third-party integration to catch up.
If you are working with MCP servers and extensions, the Agent SDK's tool model maps naturally to MCP's tool protocol. Tools defined for the SDK can be adapted to work as MCP tools with minimal changes.
When to Use LangChain
Use LangChain when you need multi-model support. If your application must work with Claude, GPT-4, Gemini, and open-source models through the same interface, LangChain is built for this.
Use it when you need complex workflows. LangGraph gives you state machines, conditional routing, parallel execution, and human-in-the-loop patterns that go beyond what the Agent SDK's agentic loop supports.
Use it when you are prototyping rapidly. LangChain's massive ecosystem of integrations means you can connect to almost any API, database, or service with a pre-built module. Vector stores, document loaders, output parsers, and memory implementations are all available as drop-in components.
Use it when you have an existing LangChain codebase. Migration has a real cost. If your agents are already built on LangChain and working in production, there is no urgent reason to rewrite them. LangChain supports Claude as a first-class model through langchain-anthropic.
Use it when you need mature observability tooling. LangSmith provides production-grade tracing, evaluation, and monitoring out of the box. Building equivalent tooling on top of the Agent SDK is possible but requires more work.
Using Both Together
Here is something that is not obvious from the documentation. You can use both. LangChain can use Claude as its underlying model through the langchain-anthropic package. And you can build some agents with the Agent SDK and others with LangChain within the same application.
This is a common pattern in production. Simple, focused agents use the Agent SDK. Complex multi-step workflows use LangGraph. They communicate through shared databases and message queues, not through framework-level integration.
Each agent is a service. The framework it uses is an implementation detail.
Here is a concrete example. A customer support system with three agents. The triage agent uses the Agent SDK. It reads the incoming message, classifies it, and hands off to a specialist agent.
The specialist agents also use the Agent SDK because they are focused, single-purpose tools with clear instructions and a small set of tools each.
But the reporting agent that analyses conversation trends, generates weekly summaries, and produces visualisations uses LangGraph. It needs to orchestrate multiple steps with conditional branching. It queries a database, runs statistical analysis, generates charts with a code interpreter tool, and compiles everything into a report. LangGraph's state machine model is ideal for this kind of multi-step pipeline.
The two systems share a PostgreSQL database. The Agent SDK agents write conversation records. The LangGraph agent reads them for analysis. Neither system knows or cares about the other's framework. The boundary is the database, not the framework.
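The database-as-boundary pattern is easy to demonstrate. The sketch below uses stdlib sqlite3 as a stand-in for the PostgreSQL instance described above, and the table and column names are illustrative; the point is that writer and reader share only a schema, never a framework.

```python
# Sketch of the database-as-boundary pattern. sqlite3 stands in for
# PostgreSQL; table and column names are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE conversations (
    agent TEXT, topic TEXT, resolved INTEGER)""")

# The Agent SDK agents write conversation records...
db.execute("INSERT INTO conversations VALUES ('billing', 'refund', 1)")
db.execute("INSERT INTO conversations VALUES ('support', 'login', 0)")
db.commit()

# ...and the LangGraph reporting agent reads them for analysis.
rows = db.execute(
    "SELECT agent, COUNT(*) FROM conversations "
    "GROUP BY agent ORDER BY agent"
).fetchall()
print(rows)  # neither side knows which framework produced the data
```

Because the contract is the schema, either side can be rewritten in a different framework without touching the other.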
This is the pragmatic approach. Do not pick a side. Pick the right tool for each specific problem.
Ecosystem and Community
One factor that deserves its own section is community size. LangChain has a significantly larger community. More tutorials, more Stack Overflow answers, more GitHub examples, more third-party integrations. When you get stuck, the odds of finding someone who has solved your specific problem are higher with LangChain.
The Agent SDK community is smaller but growing quickly. It has the advantage of being maintained by Anthropic directly, which means documentation is authoritative and issues get addressed by the people who build Claude. The trade-off is fewer community-contributed examples and integrations.
For teams evaluating both options, this is a practical consideration. A junior developer will find more LangChain learning resources. A senior developer will appreciate the Agent SDK's clean API and minimal surface area. Consider your team's experience level and your organisation's support expectations when making the choice.
The open-source ecosystem around each tool also matters. LangChain has LangSmith for observability, LangServe for deployment, and LangGraph for complex workflows. These form a cohesive platform.
The Agent SDK is more focused, providing the core agent loop and leaving deployment, observability, and workflow orchestration to your existing infrastructure. Both approaches have merit. The platform approach reduces decisions. The focused approach reduces lock-in.
The Lesson
The choice between the Agent SDK and LangChain is not about quality. Both are well-maintained, well-documented, and used in production by thousands of developers. The choice is about fit.
When building a Claude-powered agent that needs to be reliable, fast, and simple, reach for the Agent SDK. When building a complex workflow that might need multiple models, elaborate state management, or dozens of integrations, reach for LangChain.
The worst decision is to pick based on hype. We have seen teams adopt LangChain because "everyone uses it" and then struggle with abstractions they do not need. We have seen teams avoid the Agent SDK because it is "too simple" and then rebuild its features from scratch on top of LangChain.
Start with the problem. What does your agent need to do? How many models does it need? How complex is the workflow? How important is minimal overhead? Answer those questions and the framework choice becomes obvious.
Conclusion
We wrote this guide because too many comparisons benchmark hello-world examples and declare a winner. Real agent development is messier than that. The right framework depends on your team, your requirements, and your existing infrastructure.
The Agent SDK is excellent for what it does. It gives you a clean, minimal, Claude-optimised way to build agents with tools, guardrails, and handoffs. To go deeper, our guide on building a custom Claude agent walks through a complete production example.
LangChain is excellent for what it does. It gives you a flexible, model-agnostic, ecosystem-rich way to build complex LLM applications. Its community is enormous and its integration library is unmatched.
Use the one that fits. Or use both. The agent does not care which framework built it. The user does not care which framework built it. What matters is that the agent works, that it works reliably, and that you can maintain it when things go wrong.