High-performance Rust accelerators for LangGraph applications. Drop-in components that provide up to 700x speedups for checkpoint operations and 10-50x speedups for state management.
LangGraph is great for building AI agents, but production workloads often hit performance bottlenecks:
- Checkpoint serialization - Python's deepcopy is slow for complex state
- State management at scale - High-frequency updates accumulate overhead
- Repeated LLM calls - Identical prompts waste API costs
Fast-LangGraph solves these by reimplementing critical paths in Rust while maintaining full API compatibility.
pip install fast-langgraphor
uv add fast-langgraphFast-LangGraph offers two types of acceleration:
Enable transparent acceleration with a single environment variable or function call. No code changes required to your existing LangGraph application.
# Option 1: Environment variable (recommended for production)
export FAST_LANGGRAPH_AUTO_PATCH=1
python your_app.py# Option 2: Explicit patching at startup
import fast_langgraph
fast_langgraph.shim.patch_langgraph()
# Your existing LangGraph code runs faster automaticallyWhat gets accelerated automatically:
| Component | Speedup | Description |
|---|---|---|
| Executor Caching | 2.3x | Reuses ThreadPoolExecutor across invocations |
| apply_writes | 1.2x | Rust-based channel batch updates |
Combined automatic speedup: ~2.8x for typical graph invocations.
Check acceleration status:
import fast_langgraph
fast_langgraph.shim.print_status()For maximum performance, use Rust components directly. These require small code changes but provide the largest speedups.
from fast_langgraph import (
RustSQLiteCheckpointer, # 5-6x faster checkpointing
cached, # LLM response caching
langgraph_state_update, # Fast state merging
)| Component | Speedup | When to Use |
|---|---|---|
RustSQLiteCheckpointer |
5-6x | State persistence |
@cached decorator |
10x+ | Repeated LLM calls (with 90% hit rate) |
langgraph_state_update |
13-46x | High-frequency state updates |
# At the top of your application
import fast_langgraph
fast_langgraph.shim.patch_langgraph()
# Rest of your code unchanged - runs 2-3x faster
from langgraph.graph import StateGraph
# ...Drop-in replacement for LangGraph's SQLite checkpointer:
from fast_langgraph import RustSQLiteCheckpointer
# 5-6x faster than the default checkpointer
checkpointer = RustSQLiteCheckpointer("state.db")
graph = graph.compile(checkpointer=checkpointer)Cache LLM responses to avoid redundant API calls:
from fast_langgraph import cached
@cached(max_size=1000)
def call_llm(prompt):
return llm.invoke(prompt)
# First call: hits the API (~500ms)
response = call_llm("What is LangGraph?")
# Second identical call: returns from cache (~0.01ms)
response = call_llm("What is LangGraph?")
# Check cache statistics
print(call_llm.cache_stats())
# {'hits': 1, 'misses': 1, 'size': 1}Efficient state merging for high-frequency updates:
from fast_langgraph import langgraph_state_update
new_state = langgraph_state_update(
current_state,
{"messages": [new_message]},
append_keys=["messages"]
)Find bottlenecks with minimal overhead:
from fast_langgraph.profiler import GraphProfiler
profiler = GraphProfiler()
with profiler.profile_run():
result = graph.invoke(input_data)
profiler.print_report()These are the operations where Rust provides the most dramatic improvements:
| Operation | Speedup | Best Use Case |
|---|---|---|
| Checkpoint Serialization | 43-737x | State persistence (scales with state size) |
| Sustained State Updates | 13-46x | Long-running graphs with many steps |
| E2E Graph Execution | 2-3x | Production workloads with checkpointing |
| Feature | Performance | Use Case |
|---|---|---|
| Complex Checkpoint (250KB) | 737x faster than deepcopy | Large agent state |
| Complex Checkpoint (35KB) | 178x faster | Medium state |
| LLM Response Caching | 10x speedup (90% hit rate) | Repeated prompts, RAG |
| Function Caching | 1.6x speedup | Expensive computations |
| In-Memory Checkpoint | 1.4 us/op | Fast state snapshots |
| LangGraph State Update | 1.4 us/op | High-frequency updates |
Note: Rust excels at complex state operations. For simple dict operations, Python's built-in dict (implemented in C) is already highly optimized. See BENCHMARK.md for detailed results.
- Python 3.9+
- Works with any LangGraph version
Authoritative docs live under documentation/docs/ and power the MkDocs site.
- Getting Started - Installation + first run
- User Guide - Automatic/manual acceleration details
- API Reference - Python + Rust surface area
- Architecture - Internal design + trade-offs
- Contributing - Tooling, tests, and release flow
- Author: Dipankar Sarkar (me@dipankar.name)
- Organization: Neul Labs
- Repository: https://github.com/neul-labs/fast-langgraph
- License: MIT
See the examples/ directory for complete working examples:
function_cache_example.py- Caching patternsprofiler_example.py- Performance analysisstate_merge_example.py- State manipulation
Contributions welcome! See documentation/docs/development/contributing.md for setup instructions.
MIT