Library and SDK Integration
The snippets below are the integration patterns listed in the Headroom README. They show the shape of each integration; confirm exact signatures, imports, and options against the Headroom docs before wiring them into production.
Use the library path when you own the request code and want compression inline rather than through a proxy.
1. Inline compressionโ
Python:
from headroom import compress
compressed = compress(messages, model="claude-sonnet-4-5")
TypeScript:
import { compress } from 'headroom-ai'
const compressed = await compress(messages, { model })
2. Wrap a provider SDKโ
Wrapping the SDK client applies compression to every call without changing your call sites:
from headroom import withHeadroom
from anthropic import Anthropic
client = withHeadroom(Anthropic())
The README lists both withHeadroom(new Anthropic()) and withHeadroom(new OpenAI()) for the JS SDKs.
3. Framework adaptersโ
| Your setup | Hook in with |
|---|---|
| Any Python app | compress(messages, model=โฆ) |
| Any TypeScript app | await compress(messages, { model }) |
| Anthropic / OpenAI SDK | withHeadroom(new Anthropic()) ยท withHeadroom(new OpenAI()) |
| Vercel AI SDK | wrapLanguageModel({ model, middleware: headroomMiddleware() }) |
| LiteLLM | litellm.callbacks = [HeadroomCallback()] |
| LangChain | HeadroomChatModel(your_llm) |
| Agno | HeadroomAgnoModel(your_model) |
| ASGI apps | app.add_middleware(CompressionMiddleware) |
| Multi-agent | SharedContext().put / .get |
| MCP clients | headroom mcp install |
Framework adapters are not in [all]. Install the one you need, e.g. pip install "headroom-ai[langchain]" (also [agno], [strands], [anyllm], [bedrock]). See Install and CLI.
4. Multi-agent shared contextโ
For workflows that span several agents, SharedContext passes compressed context between them:
from headroom import SharedContext
ctx = SharedContext()
ctx.put("plan", plan_data)
plan = ctx.get("plan")
The shared store carries agent provenance and auto-deduplicates, which is what lets Claude, Codex, and Gemini reuse each other's context instead of re-deriving it. This is the same store described on Architecture and modes.
5. Hooking into the pipelineโ
If you need custom behavior at a specific compression stage, the request lifecycle exposes on_pipeline_event(...) and compression hooks as extension seams. Prefer these over patching internals โ the core files stay orchestration-first and provider specifics live under headroom/providers/.