HiveLLM
AI Gateway · Smart Routing

One API.
100+ LLMs unified.

Connect to OpenAI, Claude, Gemini, DeepSeek, Qwen and 40+ other providers through a single endpoint. Automatic retry, quota control and cost analytics, all in one place.

Works with OpenAI / Claude / Gemini SDKs · 99.9% uptime

50+ Providers
100+ Billable models
50+ API routes
99.9% Uptime

Connected

All major LLMs · One gateway

OpenAI
Claude
Gemini
DeepSeek
Qwen
Doubao
MiniMax
Moonshot
Mistral
Cohere
Grok
Yi
Zhipu
Meta

Core features

Built for production AI workloads

01

Smart routing · auto retry

Weighted multi-channel routing with automatic failover, rate limiting, graceful degradation and quota isolation.

OpenAI Claude Gemini DeepSeek Qwen
02

Transparent billing

Per-token billing in real time. Cache-hit discounts detected automatically. One line per call.

Real-time Per-token Cache discount Tiered
03

Zero migration cost

Use the OpenAI / Claude / Gemini SDKs you already have. Just point baseURL at us.

baseURL: https://api.hivellm.io/v1
04

Enterprise-grade security

Token groups, IP allow-list, model allow/deny lists and a full audit log.

IP allow-list Token groups Audit log Quota isolation

High concurrency

Auto load balancing. 8000+ QPS per instance.

Observable

Tokens, requests, latency and errors in one dashboard.

Multi-tenant

Team / project / user — three-tier permission isolation.

Self-hostable

Run on your own infra, fully under your control. No vendor lock-in.
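
The "Smart routing · auto retry" feature above describes weighted routing with automatic failover. As a rough illustration of that idea only, here is a toy client-side sketch. It is not HiveLLM's implementation: the names `route` and `Channel` and the channel callables are hypothetical, and a real gateway would also handle rate limits, timeouts and health checks.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical channel shape: (name, weight, send function).
Channel = Tuple[str, float, Callable[[str], str]]


def route(prompt: str, channels: Sequence[Channel], rng=random) -> Tuple[str, str]:
    """Pick a channel by weight; on failure, fail over to the remaining ones."""
    remaining: List[Channel] = list(channels)
    errors: List[str] = []
    while remaining:
        weights = [weight for _, weight, _ in remaining]
        # Weighted random pick among the channels not yet tried.
        idx = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        name, _, send = remaining.pop(idx)
        try:
            return name, send(prompt)
        except Exception as exc:  # any upstream error triggers failover
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all channels failed: " + "; ".join(errors))
```

Each channel is tried at most once per call, so a request either succeeds on some channel or fails with a summary of every upstream error.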

Transparent pricing

All models · up to 45% off

Every LLM, billed per token in real time. Per-vendor discounts are shown below; prices follow upstream list prices.

Price per 1M tokens — CNY (USD reference)

Claude

15 models · 25% off
  • claude-opus-4-7
    Text in ¥35.5 $5 out ¥177.5 $25
  • claude-opus-4-6
    Text in ¥35.5 $5 out ¥177.5 $25
  • claude-sonnet-4-6
    Text in ¥21.3 $3 out ¥106.5 $15
  • claude-opus-4-5-20251101
    Text in ¥35.5 $5 out ¥177.5 $25
  • claude-haiku-4-5-20251001
    Text in ¥7.1 $1 out ¥35.5 $5
  • claude-haiku-4-5-20251001-thinking
    Text in ¥7.1 $1 out ¥35.5 $5
  • claude-3-7-sonnet-20250219-thinking
    Text in ¥21.3 $3 out ¥106.5 $15
  • claude-opus-4-1-20250805
    Text in ¥106.5 $15 out ¥532.5 $75
  • claude-3-7-sonnet-20250219
    Text in ¥21.3 $3 out ¥106.5 $15
  • claude-opus-4-20250514
    Text in ¥106.5 $15 out ¥532.5 $75
  • claude-sonnet-4-20250514-thinking
    Text in ¥21.3 $3 out ¥106.5 $15
  • claude-sonnet-4-5-20250929-thinking
    Text in ¥21.3 $3 out ¥106.5 $15
  • claude-3-haiku-20240307
    Text in ¥1.78 $0.25 out ¥8.88 $1.25
  • claude-sonnet-4-5-20250929
    Text in ¥21.3 $3 out ¥106.5 $15
  • claude-opus-4-1-20250805-thinking
    Text in ¥106.5 $15 out ¥532.5 $75

Gemini

13 models · 25% off
  • gemini-3.1-flash-image-preview
    Text in ¥3.55 $0.5 out ¥21.3 $3
    Image in out ¥426 $60
  • gemini-3.1-pro-preview
    Text · ≤200K tokens in ¥14.2 $2 out ¥85.2 $12
    Text · >200K tokens in ¥28.4 $4 out ¥127.8 $18
  • gemini-3-flash-preview
    Text in ¥3.55 $0.5 out ¥21.3 $3
  • gemini-3-pro-image-preview
    Text in ¥14.2 $2 out ¥85.2 $12
    Image in out ¥852 $120
  • gemini-2.5-flash
    Text in ¥2.13 $0.3 out ¥17.75 $2.5
  • gemini-2.5-flash-preview-09-2025
    Text in ¥2.13 $0.3 out ¥17.75 $2.5
  • gemini-3-pro-preview
    Text · ≤200K tokens in ¥14.2 $2 out ¥85.2 $12
    Text · >200K tokens in ¥28.4 $4 out ¥127.8 $18
  • gemini-2.0-flash
    Text in ¥0.71 $0.1 out ¥2.84 $0.4
  • gemini-2.5-flash-lite
    Text in ¥0.71 $0.1 out ¥2.84 $0.4
  • gemini-2.5-flash-image
    Image in ¥2.13 $0.3 out ¥213 $30
  • gemini-2.5-pro
    Text · ≤200K tokens in ¥8.88 $1.25 out ¥71 $10
    Text · >200K tokens in ¥17.75 $2.5 out ¥106.5 $15
  • gemini-2.5-flash-lite-preview-09-2025
    Text in ¥0.71 $0.1 out ¥2.84 $0.4
  • gemini-2.5-flash-image-preview
    Image in ¥2.13 $0.3 out ¥213 $30

OpenAI

15 models · 45% off
  • gpt-5.5-pro
    Text in ¥213 $30 out ¥1278 $180
  • gpt-5.5
    Text in ¥35.5 $5 out ¥213 $30
  • gpt-5.4-pro
    Text in ¥213 $30 out ¥1278 $180
  • gpt-5.4
    Text in ¥17.75 $2.5 out ¥106.5 $15
  • gpt-5.2
    Text in ¥12.43 $1.75 out ¥99.4 $14
  • gpt-5-pro
    Text in ¥106.5 $15 out ¥852 $120
  • gpt-5.1
    Text in ¥8.88 $1.25 out ¥71 $10
  • gpt-5
    Text in ¥8.88 $1.25 out ¥71 $10
  • gpt-5-mini
    Text in ¥1.77 $0.25 out ¥14.2 $2
  • gpt-5-nano
    Text in ¥0.355 $0.05 out ¥2.84 $0.4
  • gpt-4.1-mini
    Text in ¥2.84 $0.4 out ¥11.36 $1.6
  • gpt-4o-2024-08-06
    Text in ¥17.75 $2.5 out ¥71 $10
  • gpt-4.1-2025-04-14
    Text in ¥14.2 $2 out ¥56.8 $8
  • gpt-4.1-nano
    Text in ¥0.71 $0.1 out ¥2.84 $0.4
  • gpt-4o-mini
    Text in ¥2.84 $0.4 out ¥10.65 $1.5

Prices follow upstream list prices. Final pricing is as shown in the console at top-up time.
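
Since prices are quoted per 1M tokens, the cost of a single call is a short calculation. The sketch below uses three USD reference figures copied from the table above as-is. It does not apply the per-vendor discount, because the page does not say how the discount is applied, and the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# USD reference prices per 1M tokens, copied from the table above: (input, output).
PRICES_USD = {
    "claude-sonnet-4-6": (Decimal("3"), Decimal("15")),
    "gemini-2.5-flash": (Decimal("0.3"), Decimal("2.5")),
    "gpt-5-mini": (Decimal("0.25"), Decimal("2")),
}
PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one call from its token counts (no discount applied)."""
    price_in, price_out = PRICES_USD[model]
    return (price_in * input_tokens + price_out * output_tokens) / PER_MILLION


# 12,000 input tokens and 800 output tokens on claude-sonnet-4-6:
# 12,000 / 1M * $3 + 800 / 1M * $15 = $0.036 + $0.012 = $0.048
print(estimate_cost_usd("claude-sonnet-4-6", 12_000, 800))
```

`Decimal` is used instead of floats so small per-call amounts add up without rounding drift. The amount actually billed is whatever the console shows.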

Plug-and-play

Change one line. Ship.

Drop-in replacement for OpenAI, Claude and Gemini — SDKs and REST.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.hivellm.io/v1",   # ← only line that changes
    api_key="hivellm-xxxxx",
)

resp = client.chat.completions.create(
    model="gpt-4o",                          # or any of 100+ models
    messages=[{"role": "user", "content": "hi"}],
)
print(resp.choices[0].message.content)
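
For the REST path, the same call can be sketched with the standard library alone. This assumes the gateway accepts what the OpenAI SDK sends when `base_url` is overridden: a POST to `<base_url>/chat/completions` with a Bearer token. The helper names `build_chat_request` and `send_chat_request` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.hivellm.io/v1"  # same base URL as the SDK example


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(req: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the first choice's text (assumes an OpenAI-format response)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

A call would look like `send_chat_request(build_chat_request("hivellm-xxxxx", "gpt-4o", "hi"))`, with a real key from the console in place of the placeholder.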