AI Infrastructure for Companies

The AI layer your product runs on.

White-label inference, multi-model consensus blends, and dynamic resilience routing — built for AI companies that cannot afford downtime. Every blend spans multiple providers, and the architecture is designed to absorb provider failures at the blend layer rather than surface them through your API.

Infrastructure for AI companies. Not a developer gateway.

OpenRouter connects developers to models. Calculus gives AI companies the layer they need to ship products — blending, branding, healing, and opacity included.

🎭

White-Label by Design

Your brand, your model names. On Epic, responses return yourco-pro — not calculus-pro. Your customers never see us.

🧠

Multi-Model Consensus

15 hybrid blends combine frontier models with proprietary consensus voting. 90/10 or 50/50 weighting. Blend-native architecture is a core design principle, not an add-on.

🛡️

Full Provider Opacity

We strip all upstream fingerprints — headers, model names, trace IDs, error messages. Your stack is invisible to customers and competitors.

🔧

Response Healing

Malformed JSON from any model is automatically detected and repaired before delivery. Five-pass healing pipeline handles the common failure modes that break integrations.

🔒

Zero Data Retention — Two Layers

No storage at the Calculus layer. When ZDR is active, we route exclusively through enterprise inference infrastructure that operates under provider-level data processing agreements prohibiting training on customer data. Provider-level ZDR, not just gateway-level.

🔄

Blend-Native Resilience

Every blend spans multiple providers. The architecture is designed so provider degradation is absorbed at the blend layer rather than surfaced to callers. Automatic weight redistribution on outage is on the roadmap for Q2 2026.

Two ways to access models

Use Calculus tiers for intelligent auto-routing across providers, or request named models directly for pinned access to specific frontier models with live market pricing.

~400ms — Fast tier latency
2M — Max context tokens
11 — Named models live
10+ — Providers in pool
6h — Price scan cadence
99.5%+ — Target availability
Calculus Tier Models — intelligent auto-routing · provider-agnostic · automatic fallback
Each entry: model ID — use case; context · latency · max output · features · price per 1M (in / out) · access tier.

calculus-mini — Ultra-low latency. Classification, extraction, short Q&A, intent detection.
131k ctx · ~400ms · 4k out · $0.50 / $1.50 · Vibe

calculus-fast — Fast general-purpose. Chatbots, summaries, structured output, tool use.
131k ctx · ~800ms · 8k out · vision · $0.50 / $1.50 · Vibe

calculus-standard — Balanced quality and cost. Complex instructions, multi-turn, JSON mode, function calling.
256k ctx · ~1.5s · 16k out · vision · fn calls · $3.00 / $10.00 · Elite

calculus-coder — Optimized for software engineering. Code gen, debugging, architecture, PR review, test writing.
256k ctx · ~2s · 32k out · fn calls · multi-lang · $4.00 / $14.00 · Elite

calculus-analyst — Long-context specialist. Document review, financial analysis, research synthesis. Up to 2M context.
2M ctx · ~2.5s · 16k out · vision · doc review · $3.50 / $12.00 · Elite

calculus-pro — Frontier reasoning. Critical decisions, multi-step chains, complex analysis. Top-tier models only.
400k ctx · ~3s · 32k out · vision · enforced · $7.00 / $22.00 · Epic

calculus-ultra — Maximum intelligence, highest-priority routing. Extended output, deep research, complex code chains.
400k ctx · ~3s · 64k out · vision · priority · $7.00 / $22.00 · Epic

Named Models — Live Pricing — pinned to specific model · dynamic pricing updated every 6h · tier-gated

Request any model by its exact name. Pricing tracks the live market — we scan competitor rates every 6 hours and update automatically. Elite tier unlocks value models; Epic unlocks the full catalog.

Each entry: model (id) — notes; context · max output · features · price per 1M in / out · access tier.

Claude Sonnet 4.5 (claude-sonnet-4.5) — Anthropic's frontier reasoning model; strong for complex analysis.
200k ctx · 8k out · vision · fn calls · $3.45 / $17.25 · Epic

GPT-4.1 (gpt-4.1) — OpenAI GPT-4.1; strong coding and instruction following.
1M ctx · 32k out · vision · fn calls · $2.40 / $9.60 · Epic

DeepSeek R1 (deepseek-r1) — Open-weight chain-of-thought reasoning: math, logic, research. MIT license.
131k ctx · 32k out · reasoning · $0.91 / $3.25 · Elite

Kimi K2.5 (kimi-k2) — Moonshot AI 1T-parameter MoE; exceptional multilingual and long-context reasoning.
256k ctx · 32k out · vision · 1T params · $0.69 / $2.75 · Epic

Qwen3 235B (qwen3-235b) — Alibaba 235B MoE; 100+ language support, strong math and coding. Apache 2.0.
131k ctx · 32k out · 235B MoE · 100+ lang · $0.57 / $2.28 · Elite

GLM-5 (glm-5) — ZhipuAI 754B model; ultra-deep reasoning and document analysis.
203k ctx · 32k out · 754B · $0.52 / $1.82 · Elite

MiniMax M2.5 (minimax-m2) — 2M context window: book-length documents, large codebases, long-horizon tasks.
2M ctx · 8k out · $0.52 / $2.86 · Elite

DeepSeek V3 (deepseek-v3) — Top open-weight coding and instruction model; excellent price-to-performance. MIT license.
131k ctx · 8k out · fn calls · $0.43 / $1.20 · Elite

Gemini 2.5 Flash (gemini-2.5-flash) — Google Gemini 2.5 Flash; fast, cost-effective, multimodal.
1M ctx · 65k out · vision · $0.36 / $3.00 · Elite

Llama 3.3 70B (llama-3.3-70b) — Meta Llama 3.3 70B; ultra-low latency, ideal for real-time applications. Llama community license.
131k ctx · 8k out · ultra-fast · $0.13 / $0.42 · Elite

Llama 4 Scout (llama-4-scout) — Meta Llama 4 Scout; MoE architecture, best open-weight for RAG and long docs. Llama community license.
1M ctx · 16k out · vision · MoE · $0.11 / $0.42 · Elite

Pricing updated automatically via live market scan every 6 hours. All named model access via authorized inference infrastructure.

Your stack. Invisible.

We actively strip all provider fingerprints before returning responses. Your customers and competitors cannot determine which models or infrastructure handle your workload.

Epic: White-Label Model Aliases

On Epic, the model namespace is yours. Instead of returning calculus-pro, we return whatever alias you configure — your company name, your product name, your brand.

DEFAULT (Calculus branding)
"model": "calculus-pro"
"id": "x-calculus-request-id"
EPIC WHITE-LABEL (your branding)
"model": "joeai-pro"
"id": "x-joeai-request-id"
  • Custom model prefix — yourco-fast, yourco-pro, yourco-ultra
  • Custom trace ID header — x-yourco-request-id
  • Your customers see your brand at every layer of the response
  • Fully OpenAI-compatible — drop-in for any SDK
  • Configured once at account level — no per-request overhead
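
As a concrete sketch of the above (the yourco alias is hypothetical; Epic accounts configure their own), a white-labeled call through the standard OpenAI SDK:

# Sketch: Epic white-label through the OpenAI SDK. The "yourco" prefix is
# a hypothetical alias; each Epic account configures its own at the account level.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai.calculusresearch.io/api",
    api_key="calc-k1-your-key-here",
)

response = client.chat.completions.create(
    model="yourco-pro",  # your alias, resolving to the underlying tier
    messages=[{"role": "user", "content": "Hello"}],
)

print(response.model)  # "yourco-pro", never "calculus-pro"
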

Calculus Hybrid Blends

Blends serve two purposes simultaneously: they improve output quality through consensus voting, and they reduce single points of failure. Every blend spans multiple providers — the architecture is designed so provider outages are absorbed at the blend layer, not surfaced to callers. Automatic weight redistribution ships Q2 2026.

NORMAL OPERATION
blend-ultra-reason (50/50)
Claude Opus 4.6 — 50% + Grok-3 — 50%
✓ Consensus vote → structured output
PROVIDER OUTAGE DETECTED (roadmap, Q2 2026)
blend-ultra-reason (auto-reweighted)
Claude Opus 4.6 — down → Grok-3 — 100%
✓ Zero errors returned to caller
Weighting Options
Weights are your uptime architecture, not just a quality setting.
99/1 Warm standby via live traffic — 1% of requests exercise the backup continuously, so failover is instant and verified, not cold. Available when blend health monitoring ships (Q2 2026).
90/10 Primary model dominant with meaningful secondary signal. Speed and quality of a single model, resilience of a blend.
50/50 Equal consensus voting — both models answer, best response wins. Maximum accuracy and depth for critical workloads.
Custom Epic tier — set any weight, tune per workload, per customer, per use case. Full uptime architecture control.
Each blend routes your prompt, runs parallel calls when needed, applies consensus voting, and returns clean structured output. Automatic weight redistribution on provider outage is in development (Q2 2026) — when live, provider failures will be absorbed with no errors surfaced to callers.
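
For intuition, the reweighting arithmetic that roadmap item describes is small enough to sketch. This is illustrative only, not the production router:

# Illustrative only: redistributes blend weight from unhealthy providers
# to healthy ones, proportional to the remaining weights.
def reweight(weights: dict[str, float], healthy: set[str]) -> dict[str, float]:
    live = {m: w for m, w in weights.items() if m in healthy}
    total = sum(live.values())
    if total == 0:
        raise RuntimeError("no healthy providers in blend")
    return {m: w / total for m, w in live.items()}

# A 50/50 blend with one provider down becomes 100% on the survivor:
print(reweight({"claude-opus-4.6": 0.5, "grok-3": 0.5}, healthy={"grok-3"}))
# -> {'grok-3': 1.0}
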
Creative & Strategy
Each entry: blend (id) — description; 90/10 and 50/50 model compositions.

Grok/Claude Consensus (blend-grok-claude) — Bold ideas meet meticulous refinement. Grok-3 drives creative spark; Claude Opus 4.6 polishes with precision.
90/10: Grok-3-fast + Claude Opus 4.6 · 50/50: Grok-3 + Claude Opus 4.6

Grok Creative Spark (blend-grok-creative) — Witty, uncensored ideation with professional polish. Perfect for content, storytelling, and marketing.
90/10: Grok-3-fast + Claude Haiku 4.5 · 50/50: Grok-3 + Claude Haiku 4.5 + GPT-5-mini

Reasoning & Analysis

Ultra-Reason Trio (blend-ultra-reason) — Deep, nuanced analysis through three frontier minds. Ideal for research, complex problem-solving, and debates.
90/10: Claude Opus 4.6 + Grok-3 · 50/50: Claude Opus 4.6 + Grok-3 + GLM-5

Claude/GPT Hybrid (blend-claude-gpt) — Careful step-by-step reasoning fused with speed and creativity. Best for reports, emails, and professional deliverables.
90/10: Claude Opus 4.6 + GPT-5.2 · 50/50: Claude Opus 4.6 + GPT-5.2 (equal)

GPT-5 Heavy Verify (blend-gpt5-verify) — Frontier-level intelligence with built-in cross-checks that catch hallucinations on high-stakes queries.
90/10: GPT-5.2 + Grok-3-beta · 50/50: GPT-5 + Grok-3 + Claude Sonnet 4

Agentic & Engineering

GPT-5 Agent Stack (blend-gpt5-agent) — Autonomous agent workflows with powerful tool-calling and multimodal support.
90/10: GPT-5 + Gemini 2.5-flash · 50/50: GPT-5 + Gemini 2.5-flash + Grok-3-fast

Claude Engineering Focus (blend-claude-eng) — Expert-level code review and large-scale development. Handles massive codebases with 1M+ context.
90/10: Claude Opus 4.6 + Llama-4-maverick · 50/50: Claude Opus 4.6 + Llama-4-maverick + GLM-5

Speed & Efficiency

Kimi Speed Blend (blend-kimi-speed) — Blazing-fast responses with strong depth. The everyday workhorse for high-volume use.
90/10: Kimi-K2.5 + MiniMax-M2.5 · 50/50: Kimi-K2.5 + MiniMax-M2.5 + Grok-3-mini

MiniMax Efficiency King (blend-minimax-eff) — Maximum performance per dollar. Perfect for long sessions and cost-sensitive scaling.
90/10: MiniMax-M2.5 + Kimi-K2.5 · 50/50: MiniMax-M2.5 + Kimi-K2.5 + Qwen3-235B

Cost-Optimized Daily Driver (blend-daily-driver) — Best balance of intelligence and affordability for general use and chat.
90/10: Kimi-K2.5 + GLM-5 · 50/50: Kimi-K2.5 + GLM-5 + Grok-3-mini

Long-Context & Multimodal

Gemini Long-Context Beast (blend-gemini-long) — Handles massive documents, books, or codebases with perfect structure and recall.
90/10: Gemini 2.5-flash (1M) + Claude Sonnet 4 · 50/50: Gemini 2.5-flash + Claude Sonnet 4 + Llama-4-maverick

MoE Speed Blend (blend-moe-speed) — Ultra-long context powerhouse with efficient mixture-of-experts scaling.
90/10: Llama-4-maverick + GLM-5 · 50/50: Llama-4-maverick + GLM-5 + Gemini 2.5-flash

Gemini Vision + Claude (blend-gemini-vision) — Advanced image, diagram, and visual analysis paired with expert text reasoning.
90/10: Gemini 2.5-flash (vision) + Claude Opus 4.6 · 50/50: Gemini 2.5-flash + Claude Opus 4.6 + GPT-4o

xAI + Google Turbo (blend-xai-google) — Real-time, fun, and multimodal responses with lightning speed.
90/10: Grok-3 + Gemini 2.5-flash-lite · 50/50: Grok-3 + Gemini 2.5-flash-lite + Kimi-K2.5

Qwen3 Heavy Lift (blend-qwen3-heavy) — Massive reasoning capacity at competitive cost. Excellent for heavy analytical workloads.
90/10: Qwen3-235B + MiniMax-M2.5 · 50/50: Qwen3-235B + MiniMax-M2.5 + Claude Sonnet 4

Simple. Usage-based. No lock-in.

Pay only for what you use. No monthly minimums on Vibe and Elite. Enterprise contracts available.

Vibe
$0 to start
Fast and mini tier models only
  • calculus-mini + calculus-fast
  • 20 requests/minute
  • OpenAI-compatible endpoint
  • Usage dashboard
  • ✗ Elite / Epic models
  • ✗ Named model access
  • ✗ SLA guarantee

Epic
Pay-as-you-go
Full catalog including frontier models
  • All Elite models
  • calculus-pro + calculus-ultra
  • 150–300 requests/minute
  • Named models: full catalog, incl. Claude Sonnet 4.5 + GPT-4.1
  • Bucket enforcement guarantees
  • Dedicated routing priority
  • Custom rate limits available
  • White-label model aliases (your brand, not ours)

Built to stay up when providers go down

Each request tries the next provider in your tier's chain if the primary fails. You get responses — not errors.
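
In miniature, the failover idea looks like this (an illustrative sketch, not our implementation):

# Illustrative only: walk the tier's provider chain until one succeeds,
# so the caller sees a response instead of an error.
from typing import Callable

def call_with_fallback(prompt: str, chain: list[Callable[[str], str]]) -> str:
    errors: list[Exception] = []
    for provider in chain:
        try:
            return provider(prompt)
        except Exception as exc:
            errors.append(exc)
    raise RuntimeError(f"all providers in chain failed: {errors}")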

10+ — Providers in pool
11 — Named models live
2M — Max context tokens
6h — Price scan cadence
99.5%+ — Target availability

Production features. Not afterthoughts.

All current features ship in the base product. No add-on pricing, no professional services engagement required. Roadmap items (blend health monitoring, auto-reweighting) are noted where applicable.

🔒

Zero Data Retention (ZDR) — Two Layers

No prompts or completions are stored at the Calculus layer. Enable ZDR account-wide in settings or per request via the X-Calculus-ZDR: true header.

When ZDR is active, we enforce provider-level ZDR by routing exclusively through our enterprise inference network — infrastructure operating under provider-level data processing agreements that prohibit training on customer data. Your prompts are not stored or used at any layer of the stack.

✓ Calculus layer  ·  ✓ Provider layer (via enterprise DPA)  ·  ✓ Per-request  ·  ✓ Account-wide
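
For SDK callers, the same per-request header can be attached with the OpenAI Python client's extra_headers option. A minimal sketch:

from openai import OpenAI

client = OpenAI(
    base_url="https://ai.calculusresearch.io/api",
    api_key="calc-k1-your-key-here",
)

# ZDR enforced for this request only, via the per-request header.
response = client.chat.completions.create(
    model="calculus-standard",
    messages=[{"role": "user", "content": "Sensitive prompt"}],
    extra_headers={"X-Calculus-ZDR": "true"},
)
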
🔄

Blend-Native Resilience — Roadmap: Q2 2026

Blends are architected to span multiple providers simultaneously. The next phase adds continuous health monitoring and automatic weight redistribution — when a provider goes offline, its blend weight moves to healthy models in real time. This is the Q2 2026 roadmap target.

A 50/50 blend would become 100% on the surviving model. When the provider recovers, weights restore automatically. This is currently in development and targeted for Q2 2026.

⏳ Health monitoring — in development  ·  ⏳ Auto-reweighting — in development  ·  ✓ Multi-provider blend architecture  ·  ✓ Consensus voting live
🔧

Response Healing

Malformed JSON from any upstream model is automatically detected and repaired before the response is returned. Five-pass healing pipeline handles the failure modes that typically break integrations — even at 3am.

✓ JSON repair  ·  ✓ Schema validation  ·  ✓ Zero client-side handling
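
As a toy illustration of the kind of repairs the pipeline applies (the real pipeline runs five passes; this sketch shows three common ones):

# Illustrative only: strip code fences, drop trailing commas, then validate.
import json, re

def heal(raw: str):
    s = raw.strip()
    s = re.sub(r"^```(?:json)?\s*|\s*```$", "", s)   # strip markdown fences
    s = re.sub(r",\s*([}\]])", r"\1", s)             # drop trailing commas
    return json.loads(s)                             # parse / validate

print(heal('```json\n{"ok": true,}\n```'))  # -> {'ok': True}
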

Prompt Caching

Warm-context reuse on supported adapters. Long system prompts cached across requests — cost and latency reduction on repeated contexts. Tracked separately in /api/usage as cached tokens.

✓ Automatic  ·  ✓ Cost-tracked  ·  ✓ No config required
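
A minimal sketch of reading usage from /api/usage with Python's requests; the response field names shown are assumptions, not the documented schema:

import requests

resp = requests.get(
    "https://ai.calculusresearch.io/api/usage",
    headers={"Authorization": "Bearer calc-k1-your-key-here"},
)
usage = resp.json()
# "cached_tokens" is an assumed field name for illustration; check the
# dashboard or API docs for the actual schema.
print(usage.get("cached_tokens"))
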
🏛️

SOC 2 Type II — In Progress

Security controls, access management, encryption, and data handling policies are being formally audited to SOC 2 Type II standards — the gold standard for enterprise procurement sign-off.

Enterprise customers requiring compliance documentation before signing: a controls summary and architecture review are available under NDA. Contact api@calculusresearch.io to begin the process.

⏳ Audit in progress  ·  ✓ Controls summary under NDA  ·  ✓ Architecture review available
🔗

Framework Integrations

Any framework that wraps the OpenAI SDK works out of the box. Change one URL. Nothing else.

OpenAI SDK · LangChain · Vercel AI SDK · PydanticAI · LlamaIndex · CrewAI · AutoGen · Claude Code MCP
📡

Observability

Per-model token tracking, latency histograms, per-key spend breakdown, and CSV export via /api/usage. Webhook delivery for spend threshold alerts.

✓ Real-time  ·  ✓ Per-model  ·  ✓ Exportable

Works with every major framework

Drop-in replacement for any OpenAI-compatible SDK or tool. Change one line — your base URL. Everything else is identical.

# Python — tier model (auto-routing)
from openai import OpenAI

client = OpenAI(
    base_url="https://ai.calculusresearch.io/api",
    api_key="calc-k1-your-key-here",
)

response = client.chat.completions.create(
    model="calculus-standard",
    messages=[{"role": "user", "content": "Hello"}],
)

# Python — named model (pinned to specific model)
response = client.chat.completions.create(
    model="claude-sonnet-4.5",  # ← exact model name
    messages=[{"role": "user", "content": "Analyze this document..."}],
)
# Other named models: gpt-4.1, kimi-k2, deepseek-r1, gemini-2.5-flash,
# llama-4-scout, qwen3-235b, glm-5, minimax-m2, llama-3.3-70b

# LangChain (Python)
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="calculus-pro",
    openai_api_base="https://ai.calculusresearch.io/api",
    openai_api_key="calc-k1-your-key-here",
)

// Vercel AI SDK (TypeScript)
import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

const calculus = createOpenAI({
  baseURL: 'https://ai.calculusresearch.io/api',
  apiKey: 'calc-k1-your-key-here',
});

const result = await generateText({
  model: calculus('calculus-pro'),
  prompt: 'Your prompt here',
});

# ZDR — Zero Data Retention per-request
curl https://ai.calculusresearch.io/api/chat \
  -H "Authorization: Bearer calc-k1-your-key-here" \
  -H "X-Calculus-ZDR: true" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-r1","messages":[{"role":"user","content":"..."}]}'

Common questions

What is Calculus AI and who is it for?
Calculus AI is AI infrastructure for companies building AI products — not a developer API playground. We provide a single OpenAI-compatible endpoint with white-label branding, multi-model consensus blending, provider opacity, Zero Data Retention, and named access to frontier models. If you're building an AI product and need your stack to be invisible to your customers, that's us.
How is this different from OpenRouter or direct provider APIs?
OpenRouter routes each request to a single model. We run multi-model consensus blends — parallel calls, consensus voting, structured output. OpenRouter has no white-label capability; on our Epic tier your customers see your brand, not ours. Direct provider APIs expose your stack (Anthropic headers, OpenAI fingerprints); we strip all of that. We're the layer for companies that need infrastructure, not access.
What does white-label mean exactly?
On Epic tier, every response returns your model names (yourco-pro instead of calculus-pro) and your trace ID header (x-yourco-request-id). Configured once at the account level. Your customers, your investors, your competitors — none of them can determine which models or infrastructure power your product.
What is your Zero Data Retention policy?
ZDR operates at two layers. At the Calculus layer: no prompts or completions are stored, period. At the provider layer: when ZDR is active (via X-Calculus-ZDR: true header or account setting), we route exclusively through enterprise inference infrastructure operating under provider-level data processing agreements that prohibit training on your data. Your prompts are not stored or used at any layer of the stack.
Which SDKs and frameworks work with Calculus AI?
Any framework that wraps the OpenAI SDK works with a single base URL change. This includes the OpenAI Python SDK, OpenAI Node SDK, LangChain, Vercel AI SDK, PydanticAI, LlamaIndex, CrewAI, AutoGen, and Claude Code MCP. Change one line — base_url="https://ai.calculusresearch.io/api" — and everything else stays identical.
What happens if a provider goes down mid-blend?
Automatic blend weight redistribution is in development, targeted for Q2 2026. When live: a provider outage will trigger instant reweighting to healthy models with no errors surfaced to callers — a 50/50 blend becomes 100% on the survivor. Today, blends deliver quality improvements via consensus voting across multiple models. For maximum uptime right now, named models with client-side retry logic are recommended.
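
In that spirit, a minimal client-side fallback sketch (the model order is an arbitrary illustrative choice):

from openai import OpenAI

client = OpenAI(
    base_url="https://ai.calculusresearch.io/api",
    api_key="calc-k1-your-key-here",
)

# Try named models in order until one responds.
for model in ["claude-sonnet-4.5", "gpt-4.1", "deepseek-v3"]:
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Hello"}],
        )
        break
    except Exception:
        continue
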
What are Hybrid Blends?
Hybrid Blends are proprietary multi-model ensembles — not raw model access. Each blend routes your prompt to multiple frontier models in parallel, applies consensus voting (90/10 dominant or 50/50 equal weighting), and returns clean structured output. 15 blends across Creative, Reasoning, Agentic, Speed, and Long-Context tiers. No other API gateway offers this as a native product.
Are you SOC 2 certified?
SOC 2 Type II audit is currently in progress. Enterprise customers requiring compliance documentation before signing can request a controls summary and architecture review under NDA. Email api@calculusresearch.io to start the process.

Ready to build your AI product?

Enterprise access provisioned within 24 hours. White-label config, ZDR setup, and onboarding call included on Epic.


Calculus Research  ·  Confidential  ·  Not for redistribution