Deploy ML models
The inference platform that handles optimization, scaling, and global distribution. Push your model, get an endpoint.
Free tier included. No credit card required.
How it works
Three steps to production
Upload Model
Push any PyTorch, TensorFlow, or ONNX model via CLI, SDK, or drag-and-drop. We handle containerization.
Auto-Optimize
Automatic quantization, graph optimization, and hardware-specific compilation. Zero config required.
Deploy Globally
One command deploys to 40+ edge regions with auto-scaling, health checks, and zero-downtime rollouts.
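To give a feel for what the quantization step involves, here is a minimal sketch of symmetric int8 quantization in plain Python. It illustrates the general technique and is not the platform's actual optimizer; `quantize_int8` and `dequantize` are hypothetical helper names chosen for this example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative only)."""
    max_abs = max(abs(w) for w in weights)
    # Fall back to a scale of 1.0 so an all-zero tensor does not divide by zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half a quantization step, which is the accuracy traded for storing weights in 8 bits instead of 32.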
Platform
Built for scale
Real-time monitoring
Track latency, throughput, error rates, and cost per inference across all deployed models. Set alerts, compare experiments, and drill into individual requests.
Global edge network
Automatic geo-routing to 40+ regions. Lowest latency for every request, everywhere.
Auto-generated SDKs
Type-safe clients generated from your model schema. Python, TypeScript, Go, Rust.
Enterprise-grade security
SOC 2 Type II certified. End-to-end encryption in transit and at rest, RBAC, audit logs, VPC peering, and air-gapped deployment options.
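The monitoring metrics listed above (latency percentiles and error rates) can be derived from raw request records. The sketch below uses only Python's standard library; the `(latency_ms, succeeded)` record format and the `summarize` function are assumptions made for illustration, not the platform's data model.

```python
import statistics


def summarize(requests):
    """Summarize a list of (latency_ms, succeeded) tuples (hypothetical format)."""
    latencies = sorted(latency for latency, _ in requests)
    # With n=100, quantiles() returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    errors = sum(1 for _, ok in requests if not ok)
    return {
        "p50_ms": statistics.median(latencies),
        "p99_ms": cuts[98],
        "error_rate": errors / len(requests),
    }


sample = [(28, True), (31, True), (33, True), (36, True), (120, False)]
stats = summarize(sample)
```

The median stays at 33 ms even though one request took 120 ms, which is why p50 is usually reported alongside a tail percentile such as p99.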
Performance
32ms p50 latency.
40+ regions. Zero config.
Our inference engine is optimized at every layer, from custom CUDA kernels to smart request batching. Serve millions of requests with sub-50ms latency across 40+ global regions.
Inference latency (lower is better)
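Request batching, mentioned above, groups pending requests so that one model call serves several inputs. A minimal sketch of size-limited batching follows; `make_batches` is a hypothetical helper and `run_model` is a stand-in for a real forward pass, not part of any SDK.

```python
def make_batches(requests, max_batch_size):
    """Group pending requests into batches of at most max_batch_size."""
    return [
        requests[i:i + max_batch_size]
        for i in range(0, len(requests), max_batch_size)
    ]


def run_model(batch):
    # Stand-in for a real forward pass: one call handles the whole batch.
    return [len(text) for text in batch]


pending = ["a", "bb", "ccc", "dddd", "eeeee"]
batches = make_batches(pending, max_batch_size=2)
# Flatten per-batch results back into one output per request, in order.
outputs = [out for batch in batches for out in run_model(batch)]
```

Production batchers typically also cap how long a request may wait for a batch to fill, which this sketch leaves out.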
Developer experience
Ship in minutes
The Python SDK that gets out of your way.
import cortex

# Initialize the client
client = cortex.Client("ctx_sk_...")

# Deploy a model
deployment = client.deploy(
    model="./models/classifier-v3",
    gpu="a100",
    replicas=3,
    regions=["us-east", "eu-west", "ap-south"]
)

# Run inference
result = deployment.predict(
    input={"text": "Analyze this document..."},
    stream=True
)
for chunk in result:
    print(chunk.output, end="")

Trusted by 50K+ developers at
Predictable pricing
Start free. Scale when you're ready.
Starter
$0 forever
Pro
$49/mo
Most chosen
Enterprise
Custom
Start deploying models today
Free tier. No credit card. Production-ready in minutes.