DeepSeek Guide
DeepSeek is one consumer app, one official (China-hosted) API, a set of MIT-licensed open weights, and a broad third-party hosting ecosystem that runs those same weights elsewhere. This guide maps those surfaces, shows the OpenAI-compatible quickstart, and is explicit about the data-residency trade-offs that matter for a regulated EU context.
Based on official DeepSeek sources (api-docs.deepseek.com, huggingface.co/deepseek-ai, the DeepSeek privacy policy). The current generation is DeepSeek-V4 (released April 24, 2026). The legacy API model IDs `deepseek-chat` and `deepseek-reasoner` still resolve but are scheduled to retire on 2026-07-24; migrate to the V4 IDs. Pricing and model specs change; re-check the live pricing page before quoting figures.
1. The mental model
| Surface | What it is for | Primary user |
|---|---|---|
| DeepSeek app (chat.deepseek.com) | Free consumer chat with a DeepThink toggle (off = fast, on = reasoning), web search, and file upload | End users |
| DeepSeek API (platform.deepseek.com) | Pay-as-you-go developer API; OpenAI- and Anthropic-compatible | Developers |
| Open weights (Hugging Face, MIT license) | Download to self-host or fine-tune | Self-hosters, researchers, compliance-driven teams |
| Third-party routing (OpenRouter, Together, Fireworks, DeepInfra, Bedrock, Azure) | Run the open weights on non-China (US/EU) infrastructure | Teams that cannot send data to China, or want one API across models |
2. Model lineup (current, 2026)
The current generation is DeepSeek-V4 (released April 24, 2026), a hybrid model: reasoning ("thinking") and fast ("non-thinking") behavior are controlled by a mode, not by switching models. This unifies what used to be split between a chat model and the standalone R1 reasoning model.
| Model | Total / active params | Context | Max output | License |
|---|---|---|---|---|
| DeepSeek-V4-Pro | 1.6T / 49B active (MoE) | 1M tokens | 384K tokens | MIT |
| DeepSeek-V4-Flash | 284B / 13B active (MoE) | 1M tokens | 384K tokens | MIT |
The V4-Pro card describes three reasoning modes: Non-Think, Think High, Think Max.
- Multimodality lives in separate models (DeepSeek-OCR / OCR-2, Janus-Pro, DeepSeek-VL2). Treat the `deepseek-v4-*` text API as text-in / text-out.
- DeepSeek-Coder is no longer a separate hosted model; coding capability is folded into the general models. The older DeepSeek-Coder-V2 open weights remain downloadable but are legacy.
- The V4-Flash parameter count is reported as 284B total / 13B active in the official release note; some listings showed conflicting figures, so confirm on the live model card.
3. The API
The DeepSeek API is OpenAI-compatible (and Anthropic-compatible), so it is a drop-in replacement for existing codebases: you change `base_url` and the API key. The base URLs are:
- OpenAI format: `https://api.deepseek.com` (also `https://api.deepseek.com/v1`)
- Anthropic format: `https://api.deepseek.com/anthropic`
Current model identifiers: `deepseek-v4-flash` and `deepseek-v4-pro`. The legacy `deepseek-chat` and `deepseek-reasoner` still work but retire on 2026-07-24 at 15:59 UTC (they currently map to the non-thinking and thinking modes of V4-Flash, respectively). Prompt caching is automatic and makes cached input roughly 50× cheaper than uncached on V4-Flash (and cheaper still on V4-Pro, at the snapshot prices in section 5).
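The legacy-to-V4 mapping described above can be captured in a small migration helper. This is a minimal sketch, not an official tool: the function and constant names are hypothetical, and the mode strings are descriptive labels only. The request parameter that actually selects thinking mode is not covered in this guide, so check the API docs before relying on it.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

# Retirement time for the legacy IDs, as stated in this guide.
LEGACY_RETIREMENT = datetime(2026, 7, 24, 15, 59, tzinfo=timezone.utc)

# Legacy ID -> (V4 model ID, mode it currently maps to, per this guide).
# The mode string is a human-readable label, NOT an API parameter.
LEGACY_TO_V4 = {
    "deepseek-chat": ("deepseek-v4-flash", "non-thinking"),
    "deepseek-reasoner": ("deepseek-v4-flash", "thinking"),
}


def migrate_model_id(model_id: str) -> Tuple[str, Optional[str]]:
    """Return (v4_model_id, mode_label); mode_label is None for non-legacy IDs."""
    if model_id in LEGACY_TO_V4:
        return LEGACY_TO_V4[model_id]
    return model_id, None


def legacy_ids_retired(now: Optional[datetime] = None) -> bool:
    """True once the stated retirement time has passed (timezone-aware input)."""
    now = now or datetime.now(timezone.utc)
    return now >= LEGACY_RETIREMENT
```

Running `migrate_model_id` over the model names in an existing config is one way to find call sites that still depend on the legacy IDs before the cutoff.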
```bash
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d '{"model": "deepseek-v4-pro", "messages": [{"role": "user", "content": "Hello!"}]}'
```
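The same call can be sketched in Python using only the standard library. The helper names `build_chat_request` and `send` are hypothetical, and the response shape in the trailing comment assumes the OpenAI-style format implied by the compatibility claim above; it is an illustration, not the official SDK path.

```python
import json
import urllib.request

BASE_URL = "https://api.deepseek.com"  # OpenAI-format base URL from this guide


def build_chat_request(prompt: str, model: str = "deepseek-v4-pro",
                       api_key: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST to /chat/completions."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Usage (needs network access and a real key in DEEPSEEK_API_KEY):
#   import os
#   reply = send(build_chat_request("Hello!", api_key=os.environ["DEEPSEEK_API_KEY"]))
#   print(reply["choices"][0]["message"]["content"])  # assumed OpenAI-style shape
```

Separating request construction from sending keeps the payload and headers inspectable in tests without touching the network.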
4. Data residency and compliance
The official API and consumer app are operated from mainland China (per DeepSeek's privacy policy). For a regulated EU B2B context (GDPR, known in German as DSGVO, and customers who are professional-secrecy holders), the hosted API should be treated as not suitable for sensitive or personal data without prior legal review. As precedent, Italy's data-protection authority ordered a block on processing Italian users' data in January 2025.
Two ways to keep data out of China:
- Self-host the MIT-licensed open weights on your own EU/US infrastructure (e.g. vLLM), so prompts and outputs never leave your environment.
- Route via a US/EU-hosted third party: OpenRouter, Together, Fireworks, DeepInfra, AWS Bedrock, or Azure AI Foundry. Expect higher per-token cost than the official API in exchange for residency and SLAs.
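Because every option above is reached by swapping the base URL, a residency rule can be enforced in code before any request is built. The sketch below is one possible guard, not a compliance control: `check_endpoint` and `CHINA_HOSTED` are hypothetical names, the host list covers only the official API host named in this guide, and the `localhost:8000` URL in the usage note assumes a locally running OpenAI-compatible server such as vLLM.

```python
from urllib.parse import urlparse

# Hosts this guide describes as operated from mainland China.
CHINA_HOSTED = {"api.deepseek.com"}


def check_endpoint(base_url: str, contains_sensitive_data: bool) -> str:
    """Return base_url if policy allows it, otherwise raise ValueError."""
    host = (urlparse(base_url).hostname or "").lower()
    if contains_sensitive_data and host in CHINA_HOSTED:
        raise ValueError(f"{host} is China-hosted; refusing to send sensitive data")
    return base_url
```

For example, `check_endpoint("http://localhost:8000/v1", True)` passes the URL through, while `check_endpoint("https://api.deepseek.com", True)` raises. A third-party provider's host would pass this check too, so its actual hosting region still has to be verified separately.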
Content behavior (factual, third-party): independent red-team studies report that DeepSeek models decline or steer answers on topics that are politically sensitive in China. Because this is applied at the model level (fine-tuning), it can persist even in self-hosted open weights; it is not only an app-layer filter. These are third-party findings, not official DeepSeek statements.
5. Pricing (official, snapshot)
Per 1M tokens, USD, from the official pricing page. Third-party providers set their own (usually higher) prices. Off-peak discounts are not currently offered.
| Model | Input (cache hit) | Input (cache miss) | Output |
|---|---|---|---|
| deepseek-v4-flash | $0.0028 | $0.14 | $0.28 |
| deepseek-v4-pro | $0.003625 | $0.435 | $0.87 |
The standout lever is the cache-hit input rate (about 50× cheaper than a cache miss on V4-Flash and about 120× on V4-Pro, from the table above): structure long, stable system prompts or RAG context to hit the cache.
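To see how much the cache-hit rate matters for a given workload, the table can be turned into a small estimator. This is a sketch using the snapshot prices above, which will drift; the function names are hypothetical, and how many input tokens actually hit the cache is something you would measure from real usage, not something this code can predict.

```python
# USD per 1M tokens, copied from the snapshot table in this guide.
PRICES = {
    "deepseek-v4-flash": {"hit": 0.0028, "miss": 0.14, "out": 0.28},
    "deepseek-v4-pro": {"hit": 0.003625, "miss": 0.435, "out": 0.87},
}


def estimate_cost(model: str, cached_in: int, uncached_in: int, output: int) -> float:
    """Estimated USD cost for one request, given token counts per category."""
    p = PRICES[model]
    total = cached_in * p["hit"] + uncached_in * p["miss"] + output * p["out"]
    return total / 1_000_000


def cache_discount(model: str) -> float:
    """How many times cheaper a cache-hit input token is than a cache miss."""
    p = PRICES[model]
    return p["miss"] / p["hit"]
```

As an illustration on V4-Flash: a request with a 100,000-token cached prefix, 1,000 fresh input tokens, and 1,000 output tokens comes to about $0.0007, versus about $0.0144 if all 101,000 input tokens miss the cache.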
6. Decision guide
| Situation | Choose… |
|---|---|
| cost is the priority, data is non-sensitive, China residency is acceptable | the official DeepSeek API (cheapest, full feature set, OpenAI drop-in) |
| data must stay out of China; you want redundancy / SLAs / a unified multi-model API | third-party routing (Together / Fireworks / DeepInfra / OpenRouter / Bedrock / Azure) |
| strict compliance (GDPR/DSGVO, etc.) or zero per-token cost at scale, and you have GPUs | self-host the MIT open weights (note: V4-Pro is a 1.6T-param MoE, which means real infra and ops) |
7. Official links
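To put "real infra and ops" in numbers, the weight footprint can be estimated from the total parameter counts in the model table. This is back-of-the-envelope arithmetic under stated assumptions: the bytes-per-parameter value (1 for an 8-bit format, 2 for a 16-bit one) and the 80 GB accelerator size are illustrative choices, not published deployment specs, and the result ignores KV cache, activations, and runtime overhead. In a typical MoE serving setup all experts stay resident, so the total count is what drives memory, not the active count.

```python
import math

# Total parameter counts from the model table in this guide.
TOTAL_PARAMS = {"DeepSeek-V4-Pro": 1.6e12, "DeepSeek-V4-Flash": 284e9}


def weight_memory_gb(model: str, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return TOTAL_PARAMS[model] * bytes_per_param / 1e9


def min_accelerators(model: str, bytes_per_param: float, memory_gb: float) -> int:
    """Lower bound on devices needed just to hold the weights."""
    return math.ceil(weight_memory_gb(model, bytes_per_param) / memory_gb)
```

Under these assumptions, V4-Pro at 1 byte per parameter needs about 1,600 GB for weights alone, or at least 20 devices with 80 GB each, while V4-Flash needs about 284 GB, or at least 4. Real deployments need headroom beyond this floor.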
- DeepSeek (corporate)
- DeepSeek app (chat)
- API docs home
- Models & pricing
- API changelog
- DeepSeek-V4 release note
- Hugging Face: deepseek-ai (open weights, MIT)
- Privacy policy