Library and SDK Integration

Source scope as of July 1, 2026

The snippets below are the integration patterns listed in the Headroom README. They show the shape of each integration; confirm exact signatures, imports, and options against the Headroom docs before wiring them into production.

Use the library path when you own the request code and want compression inline rather than through a proxy.

1. Inline compression

Python:

from headroom import compress

compressed = compress(messages, model="claude-sonnet-4-5")

TypeScript:

import { compress } from 'headroom-ai'

const compressed = await compress(messages, { model })

2. Wrap a provider SDK

Wrapping the SDK client applies compression to every call without changing your call sites:

from headroom import withHeadroom
from anthropic import Anthropic

client = withHeadroom(Anthropic())

The README lists both withHeadroom(new Anthropic()) and withHeadroom(new OpenAI()) for the JS SDKs.

3. Framework adapters

Your setup	Hook in with
Any Python app	`compress(messages, model=…)`
Any TypeScript app	`await compress(messages, { model })`
Anthropic / OpenAI SDK	`withHeadroom(new Anthropic())` · `withHeadroom(new OpenAI())`
Vercel AI SDK	`wrapLanguageModel({ model, middleware: headroomMiddleware() })`
LiteLLM	`litellm.callbacks = [HeadroomCallback()]`
LangChain	`HeadroomChatModel(your_llm)`
Agno	`HeadroomAgnoModel(your_model)`
ASGI apps	`app.add_middleware(CompressionMiddleware)`
Multi-agent	`SharedContext().put / .get`
MCP clients	`headroom mcp install`

Adapters are separate installs

Framework adapters are not in [all]. Install the one you need, e.g. pip install "headroom-ai[langchain]" (also [agno], [strands], [anyllm], [bedrock]). See Install and CLI.

4. Multi-agent shared context

For workflows that span several agents, SharedContext passes compressed context between them:

from headroom import SharedContext

ctx = SharedContext()
ctx.put("plan", plan_data)
plan = ctx.get("plan")

The shared store carries agent provenance and auto-deduplicates, which is what lets Claude, Codex, and Gemini reuse each other's context instead of re-deriving it. This is the same store described on Architecture and modes.

5. Hooking into the pipeline

If you need custom behavior at a specific compression stage, the request lifecycle exposes on_pipeline_event(...) and compression hooks as extension seams. Prefer these over patching internals — the core files stay orchestration-first and provider specifics live under headroom/providers/.

1. Inline compression​

2. Wrap a provider SDK​

3. Framework adapters​

4. Multi-agent shared context​

5. Hooking into the pipeline​

6. References​

1. Inline compression

2. Wrap a provider SDK

3. Framework adapters

4. Multi-agent shared context

5. Hooking into the pipeline

6. References