Library and SDK Integration

Source scope as of July 1, 2026

The snippets below are the integration patterns listed in the Headroom README. They show the shape of each integration; confirm exact signatures, imports, and options against the Headroom docs before wiring them into production.

Use the library path when you own the request code and want compression inline rather than through a proxy.

1. Inline compression

Python:

from headroom import compress

compressed = compress(messages, model="claude-sonnet-4-5")

TypeScript:

import { compress } from 'headroom-ai'

const compressed = await compress(messages, { model })
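
When compression sits inline in your request path, it can help to decide up front what happens if the library is missing or a call fails. The following is a minimal Python sketch, not part of Headroom: compress_or_passthrough, load_compress, and the compress_fn test parameter are hypothetical names, and the fallback policy (send the original messages) is an assumption. Only the compress(messages, model=...) call shape comes from the snippet above, and whatever compress returns is passed through untouched.

```python
from __future__ import annotations

import logging
from typing import Any, Callable, Dict, List, Optional

Message = Dict[str, Any]
logger = logging.getLogger(__name__)


def load_compress() -> Optional[Callable[..., Any]]:
    """Return headroom.compress if the package is importable, else None."""
    try:
        # Import path as shown in the README snippet above.
        from headroom import compress
    except ImportError:
        return None
    return compress


def compress_or_passthrough(
    messages: List[Message],
    model: str,
    compress_fn: Optional[Callable[..., Any]] = None,
) -> Any:
    """Call compress(messages, model=...) and fall back to the input on failure.

    compress_fn is a hypothetical injection point for tests; by default the
    real function is looked up lazily.
    """
    fn = compress_fn if compress_fn is not None else load_compress()
    if fn is None:
        return messages  # library not installed: send the request uncompressed
    try:
        return fn(messages, model=model)
    except Exception:
        # Assumption: a compression failure should not block the request.
        logger.exception("compression failed; sending original messages")
        return messages
```

Swallowing every exception is a deliberate choice for this sketch; a stricter service may prefer to re-raise so that compression regressions are visible.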

2. Wrap a provider SDK

Wrapping the SDK client applies compression to every call without changing your call sites:

from headroom import withHeadroom
from anthropic import Anthropic

client = withHeadroom(Anthropic())

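The README lists both withHeadroom(new Anthropic()) and withHeadroom(new OpenAI()) for the JS SDKs. The Python snippet above mirrors that form; because the camelCase name comes from the JS examples, confirm the exact Python export in the Headroom docs before relying on it.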
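
To make "without changing your call sites" concrete, here is a generic sketch of the wrapper pattern in Python. It is not Headroom's implementation: CompressingClient and _CompressingMessages are hypothetical names, the compress function is injected by the caller, and the sketch assumes a client whose calls look like client.messages.create(model=..., messages=...).

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
CompressFn = Callable[..., List[Message]]


class _CompressingMessages:
    """Proxy for client.messages that compresses before delegating."""

    def __init__(self, inner: Any, compress_fn: CompressFn) -> None:
        self._inner = inner
        self._compress = compress_fn

    def create(self, **kwargs: Any) -> Any:
        # Rewrite only the messages argument; everything else is forwarded.
        if "messages" in kwargs:
            kwargs["messages"] = self._compress(
                kwargs["messages"], model=kwargs.get("model")
            )
        return self._inner.create(**kwargs)

    def __getattr__(self, name: str) -> Any:
        # Any attribute other than create() passes straight through.
        return getattr(self._inner, name)


class CompressingClient:
    """Hypothetical wrapper: same surface as the client, compressed create()."""

    def __init__(self, client: Any, compress_fn: CompressFn) -> None:
        self._client = client
        self.messages = _CompressingMessages(client.messages, compress_fn)

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not defined on the wrapper itself.
        return getattr(self._client, name)
```

Because the proxy forwards unknown attributes, existing code that calls wrapped.messages.create(...) keeps working, and only the messages payload differs by the time it reaches the underlying client.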

3. Framework adapters

| Your setup | Hook in with |
| --- | --- |
| Any Python app | compress(messages, model=…) |
| Any TypeScript app | await compress(messages, { model }) |
| Anthropic / OpenAI SDK | withHeadroom(new Anthropic()) · withHeadroom(new OpenAI()) |
| Vercel AI SDK | wrapLanguageModel({ model, middleware: headroomMiddleware() }) |
| LiteLLM | litellm.callbacks = [HeadroomCallback()] |
| LangChain | HeadroomChatModel(your_llm) |
| Agno | HeadroomAgnoModel(your_model) |
| ASGI apps | app.add_middleware(CompressionMiddleware) |
| Multi-agent | SharedContext().put / .get |
| MCP clients | headroom mcp install |

Adapters are separate installs

Framework adapters are not included in the [all] extra. Install the one you need, e.g. pip install "headroom-ai[langchain]" (other extras: [agno], [strands], [anyllm], [bedrock]). See Install and CLI.
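
Since adapters install separately, a startup check can surface a missing dependency before the first request does. This is a small sketch using only the standard library; missing_modules is a hypothetical helper, and the module names in the usage example are assumptions (headroom matches the import in the snippets above, while the module an adapter extra provides may be named differently).

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; adjust to the adapter you actually installed.
    missing = missing_modules(["headroom", "langchain"])
    if missing:
        print("Not installed:", ", ".join(missing))
```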

4. Multi-agent shared context

For workflows that span several agents, SharedContext passes compressed context between them:

from headroom import SharedContext

ctx = SharedContext()
ctx.put("plan", plan_data)
plan = ctx.get("plan")

The shared store carries agent provenance and auto-deduplicates, which is what lets Claude, Codex, and Gemini reuse each other's context instead of re-deriving it. This is the same store described on Architecture and modes.
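
To illustrate what provenance and deduplication can mean for a shared store, here is a toy in-memory stand-in. It is not Headroom's SharedContext and says nothing about how that class is implemented: ToySharedContext, the agent parameter, written_by, and payload_count are all hypothetical, and hashing canonical JSON is just one assumed way to detect identical payloads.

```python
import hashlib
import json
from typing import Any, Dict, Tuple


class ToySharedContext:
    """In-memory illustration of a deduplicating store with provenance."""

    def __init__(self) -> None:
        self._payloads: Dict[str, Any] = {}           # digest -> stored value
        self._index: Dict[str, Tuple[str, str]] = {}  # key -> (digest, agent)

    @staticmethod
    def _digest(value: Any) -> str:
        # Canonical JSON so that equal payloads hash identically.
        blob = json.dumps(value, sort_keys=True, default=str)
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

    def put(self, key: str, value: Any, agent: str = "unknown") -> bool:
        """Store value under key; return True only if the payload was new."""
        digest = self._digest(value)
        is_new = digest not in self._payloads
        if is_new:
            self._payloads[digest] = value
        self._index[key] = (digest, agent)
        return is_new

    def get(self, key: str) -> Any:
        digest, _agent = self._index[key]
        return self._payloads[digest]

    def written_by(self, key: str) -> str:
        """Name of the agent that last wrote this key."""
        return self._index[key][1]

    def payload_count(self) -> int:
        """Number of distinct payloads actually held."""
        return len(self._payloads)
```

In this toy version, two agents that store the same plan under different keys share one payload, and each key still records which agent wrote it.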

5. Hooking into the pipeline

If you need custom behavior at a specific compression stage, the request lifecycle exposes on_pipeline_event(...) and compression hooks as extension seams. Prefer these over patching internals: the core files stay orchestration-first, and provider specifics live under headroom/providers/.
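
The signature of on_pipeline_event(...) is not shown on this page, so the sketch below does not call it. It only illustrates the general extension-seam idea in Python: a pipeline reports each stage to registered listeners, so custom behavior lives in a callback rather than in a patched internal. ToyPipeline, on_event, and the event fields are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Message = Dict[str, Any]
Event = Dict[str, Any]
Stage = Callable[[List[Message]], List[Message]]
Listener = Callable[[Event], None]


class ToyPipeline:
    """Runs stages in order and notifies listeners after each one."""

    def __init__(self, stages: List[Tuple[str, Stage]]) -> None:
        self._stages = stages
        self._listeners: List[Listener] = []

    def on_event(self, listener: Listener) -> None:
        # The seam: callers register behavior instead of editing run().
        self._listeners.append(listener)

    def run(self, messages: List[Message]) -> List[Message]:
        for name, stage in self._stages:
            before = len(messages)
            messages = stage(messages)
            event = {"stage": name, "before": before, "after": len(messages)}
            for listener in self._listeners:
                listener(event)
        return messages


def drop_empty(messages: List[Message]) -> List[Message]:
    """Example stage: remove messages with empty content."""
    return [m for m in messages if m.get("content")]
```

A listener here could log per-stage message counts or feed metrics, and the orchestration code in run() never has to change to support it.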

6. References