Benchmarks · Apr 25, 2026 · 6 min read

Why Claude Code on DeepSeek V3 costs 12× less than Claude direct

Run the same agent loop, swap the model, watch your bill collapse. We benchmark DeepSeek V3 vs Claude 3.5 Sonnet on a real coding task and break down the per-turn math.

qlaud team · Engineering

On a typical Claude Code coding turn we measured ~2,400 input tokens and ~700 output tokens. Multiply that by Claude 3.5 Sonnet's direct API price and you get $0.0177 per turn. Run the same turn through qlaud → DeepSeek V3 and you pay $0.00152 — roughly 12× less. Across a 100-turn debugging session, that's the difference between $1.77 and 15 cents.

The math

The numbers below assume the input + output token counts above (a representative coding turn — Claude Code reads a few files, writes a patch, runs a test). All prices are the customer-facing rate (upstream cost × qlaud's 1.07 markup).

| Model (via qlaud) | Input $/1M | Output $/1M | Per-turn cost | 100-turn session |
|---|---|---|---|---|
| claude-3-5-sonnet-20241022 | $3.21 | $16.05 | $0.0189 | $1.89 |
| deepseek-v3 | $0.29 | $1.18 | $0.00152 | $0.15 |
| deepseek-r1 (reasoning) | $0.59 | $2.34 | $0.00305 | $0.31 |
| llama-3.3-70b via Groq | $0.63 | $0.85 | $0.00211 | $0.21 |
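The per-turn figures above are easy to reproduce yourself. A minimal sketch, with the table's qlaud rates hard-coded:

```python
# Per-turn cost = input_tokens * input_rate + output_tokens * output_rate,
# where rates are quoted per 1M tokens. Rates copied from the table above.
PRICES = {  # model -> (input $/1M, output $/1M)
    "claude-3-5-sonnet-20241022": (3.21, 16.05),
    "deepseek-v3": (0.29, 1.18),
    "deepseek-r1": (0.59, 2.34),
    "llama-3.3-70b": (0.63, 0.85),
}

def turn_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one agent turn at the listed per-1M-token rates."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# The representative coding turn from the benchmark: ~2,400 in, ~700 out.
sonnet = turn_cost("claude-3-5-sonnet-20241022", 2400, 700)
v3 = turn_cost("deepseek-v3", 2400, 700)
print(f"sonnet per turn: ${sonnet:.4f}")   # $0.0189
print(f"v3 per turn:     ${v3:.5f}")       # $0.00152
print(f"ratio:           {sonnet / v3:.1f}x")
```

Plug in your own measured token counts — agent turns with big file reads skew much heavier on input, which favors DeepSeek even more.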

Why the gap is so big

Anthropic prices Claude 3.5 Sonnet at the top of the closed-frontier-model tier — those models pay for the largest training runs and the biggest research teams. DeepSeek prices V3 at the floor of what their inference economics actually cost, because the open-weights distribution model means they don't need to amortize an exclusive product moat through API margins.

For most coding tasks you don't need the marginal frontier capability — you need decent reasoning + good code synthesis + tool-call reliability. DeepSeek V3 hits that bar today.

The setup, end-to-end

Two env vars in your shell:

export ANTHROPIC_BASE_URL=https://api.qlaud.ai
export ANTHROPIC_API_KEY=ak_live_<your_key>
claude

Claude Code reads those two vars and hits us instead of api.anthropic.com. Internally we translate the Anthropic Messages format → OpenAI Chat Completions (DeepSeek's native shape) → Cloudflare AI Gateway → DeepSeek. The streaming response comes back through the same pipe. Tool calls, system prompts, max_tokens, stop_sequences, reasoning content — all preserved.
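Conceptually, the Messages → Chat Completions step looks something like this. This is a simplified sketch, not qlaud's actual translator — real handling of tool definitions and typed content blocks is more involved — but the field mappings follow the two public API shapes:

```python
def anthropic_to_openai(body: dict) -> dict:
    """Map an Anthropic Messages request onto an OpenAI Chat Completions
    request (the shape DeepSeek's API speaks natively). Sketch only."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI-style APIs expect it as the first message.
    if "system" in body:
        messages.append({"role": "system", "content": body["system"]})
    for m in body["messages"]:
        content = m["content"]
        # Anthropic content may be a list of typed blocks; flatten text blocks.
        if isinstance(content, list):
            content = "".join(b["text"] for b in content if b.get("type") == "text")
        messages.append({"role": m["role"], "content": content})
    out = {
        "model": body["model"],
        "messages": messages,
        "max_tokens": body.get("max_tokens", 1024),
        "stream": body.get("stream", False),
    }
    # Anthropic's stop_sequences maps onto the OpenAI stop parameter.
    if "stop_sequences" in body:
        out["stop"] = body["stop_sequences"]
    return out
```

The streaming direction is the mirror image: Chat Completions deltas get re-emitted as Anthropic SSE content-block events.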

When you still want Claude

DeepSeek V3 isn't Claude 3.5 Sonnet. The places we've seen Claude pull ahead in real Claude Code sessions:

  • Agent loops with 20+ tool calls in a row — Claude is more reliable about re-reading the spec each turn.
  • Subtle multi-file refactors where one change implies edits in three other files — Claude tends to find the implied changes; DeepSeek sometimes stops at the first file.
  • Anything that benefits from the larger Claude context window (200K vs 64K).

The clean answer: route by task. Use DeepSeek V3 for the 80%, fall back to Claude 3.5 Sonnet for the 20%. Both run through qlaud with the same API key — no SDK juggling, no rebuilding your agent loop.
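Because the model is just a string in the request, routing needs no SDK changes. A minimal sketch of the heuristic — the thresholds here are illustrative, not qlaud recommendations:

```python
def pick_model(expected_tool_calls: int, files_touched: int,
               context_tokens: int) -> str:
    """Route to DeepSeek V3 by default; fall back to Claude for the cases
    where it measurably pulls ahead. Thresholds are illustrative only."""
    if expected_tool_calls >= 20:   # long agent loops with heavy tool use
        return "claude-3-5-sonnet-20241022"
    if files_touched >= 3:          # subtle multi-file refactors
        return "claude-3-5-sonnet-20241022"
    if context_tokens > 64_000:     # beyond DeepSeek V3's context window
        return "claude-3-5-sonnet-20241022"
    return "deepseek-v3"
```

Both strings go to the same endpoint with the same key; only the bill changes.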

Get started

Sign up for qlaud, top up $5, point Claude Code at us. The first $5 will run you a few days of DeepSeek V3 sessions. If you don't love it, top-ups don't auto-renew — just walk away.

#Claude Code · #DeepSeek · #cost · #open models · #agentic coding

Frequently asked questions

Is DeepSeek V3 actually as good as Claude 3.5 Sonnet for coding?

On day-to-day refactoring, code review, test generation and Python/TS/Rust tasks: very close. On long-horizon agentic loops with heavy tool use, Claude still has an edge. Most users land on a hybrid — DeepSeek for the bulk, Claude for the hardest 10%.

Will Claude Code break if I point it at qlaud?

No. Claude Code reads ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY from your environment. We expose the Anthropic Messages API verbatim, including streaming SSE, tool use, and reasoning content blocks. Same UX, different upstream.

How do reasoning models work over the Anthropic surface?

DeepSeek-R1 emits reasoning_content as a separate stream channel. Our SSE translator opens an Anthropic 'thinking' content block for those deltas, so Claude Code renders the chain-of-thought in its UI just like it would for a real Claude reasoning model.
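In SSE terms, that translation is roughly the following. The event and field names come from the two public API shapes; the function itself is a sketch (a real translator also manages content_block_start/stop events around the deltas):

```python
import json

def reasoning_delta_to_anthropic(chunk: dict, block_index: int) -> str:
    """Turn one OpenAI-style streaming chunk carrying reasoning_content
    into an Anthropic content_block_delta SSE event with a thinking_delta."""
    text = chunk["choices"][0]["delta"]["reasoning_content"]
    event = {
        "type": "content_block_delta",
        "index": block_index,
        "delta": {"type": "thinking_delta", "thinking": text},
    }
    # SSE framing: named event, JSON data line, blank-line terminator.
    return f"event: content_block_delta\ndata: {json.dumps(event)}\n\n"
```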
