Inference API
Pay per token across all platform models. OpenAI-compatible endpoints, multi-region routing.
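As a rough illustration of per-token pricing, the sketch below estimates the cost of a single request. It assumes rates are quoted per million tokens and that input and output tokens may carry different rates; neither detail is stated here, and the function name and the numbers are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

def token_cost(input_tokens, output_tokens,
               input_rate_per_million, output_rate_per_million):
    """Estimate request cost from token counts.

    Assumption: rates are quoted per one million tokens, with separate
    input and output rates. Check the published model rates for the
    actual units.
    """
    total = (Decimal(input_tokens) * input_rate_per_million
             + Decimal(output_tokens) * output_rate_per_million)
    return total / TOKENS_PER_MILLION

# Hypothetical rates, for illustration only.
cost = token_cost(1200, 300, Decimal("0.50"), Decimal("1.50"))
print(cost)
```

Decimal is used instead of float so that small per-token amounts do not pick up binary rounding error when summed over many requests.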
GPU Compute
Priced per GPU-hour and billed per second of running time. The same rate applies to dedicated inference, workspace instances, and clusters.
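To make the relationship between an hourly rate and per-second billing concrete, here is a minimal sketch. The function name and the rate are hypothetical, and no rounding is applied because the rounding policy is not stated here.

```python
from decimal import Decimal

SECONDS_PER_HOUR = Decimal(3600)

def gpu_cost(running_seconds, gpu_count, rate_per_gpu_hour):
    """Prorate an hourly GPU rate over the seconds actually run.

    Assumption: the charge is simply
    seconds * GPUs * hourly rate / 3600, with no minimum charge
    and no rounding.
    """
    return (Decimal(running_seconds) * gpu_count * rate_per_gpu_hour
            / SECONDS_PER_HOUR)

# Hypothetical example: 2 GPUs running for 75 minutes at 2.00 per GPU-hour.
cost = gpu_cost(75 * 60, 2, Decimal("2.00"))
print(cost)
```

Under these assumptions, stopping an instance partway through an hour is charged only for the seconds used rather than the full hour.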
Storage
Persistent storage attached to your GPU instances and clusters. Cloud Drives are single-instance (ReadWriteOnce); Shared Filesystems mount across multiple instances (ReadWriteMany). Billed per second of attached time.