Real-time compute orchestration
Stateful reliability and durable compute
Keep workloads running through infrastructure failures. Workloads are saved at configurable intervals, enabling them to be resumed automatically after an infrastructure or workload failure. Suspend work indefinitely without incurring compute costs or losing progress, and guarantee eventual completion by resuming when price/performance conditions are right.
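The idea behind this is periodic checkpointing of workload state. As a rough, product-independent sketch in plain Python (the system itself saves container state transparently; the file name, interval, and state layout below are illustrative only), a workload that saves at a fixed interval can restart from its last checkpoint after a failure:

```python
import os
import pickle

CHECKPOINT_PATH = "state.pkl"   # illustrative location for saved state
CHECKPOINT_EVERY = 1000         # stand-in for a configurable save interval

def load_state():
    # Resume from the last checkpoint if one exists; otherwise start fresh.
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH, "rb") as f:
            return pickle.load(f)
    return {"step": 0}

def save_state(state):
    # Write to a temp file and rename, so a crash mid-save never
    # leaves a corrupt checkpoint behind.
    tmp = CHECKPOINT_PATH + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT_PATH)

state = load_state()
while state["step"] < 100_000:
    state["step"] += 1          # stand-in for real work
    if state["step"] % CHECKPOINT_EVERY == 0:
        save_state(state)
```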

Elastic and dynamic scaling in real time
Scale workloads up and down with higher performance, better utilization, and faster response times than previously possible. Dynamic resizing of compute resources lets workloads scale elastically and dynamically across instances, clusters, regions, and clouds. Preempt and save workloads quickly to downscale resources without losing progress or performance, as sketched below.
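One way to picture preempt-and-save is a graceful-shutdown hook that persists progress when the orchestrator reclaims capacity. The sketch below is illustrative only (the system saves container state transparently rather than relying on application signal handlers); STATE_PATH and the step counter are made-up names:

```python
import json
import signal
import sys
import time

STATE_PATH = "progress.json"    # illustrative location
state = {"step": 0}

def save_and_exit(signum, frame):
    # On preemption (e.g. SIGTERM from the scheduler), save progress before
    # the node is reclaimed; the workload resumes from this state later.
    with open(STATE_PATH, "w") as f:
        json.dump(state, f)
    sys.exit(0)

signal.signal(signal.SIGTERM, save_and_exit)

while True:
    state["step"] += 1          # stand-in for real work
    time.sleep(0.01)
```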
Seamless and transparent integration
Kubernetes-aware SMR that works with all your cloud-native tooling, including Terraform and Helm. Broad container and runtime support, including Kata, Podman, and containerd. Both CPU and GPU containerized workloads are supported.
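As one illustration of what Kubernetes-level integration can look like, the sketch below uses the standard kubernetes Python client to tag a workload with a checkpoint-interval annotation. The annotation key, deployment name, and interval value are purely hypothetical and are not the product's actual configuration surface:

```python
from kubernetes import client, config

# Hypothetical annotation key, used purely for illustration.
CHECKPOINT_ANNOTATION = "example.com/checkpoint-interval"

config.load_kube_config()
apps = client.AppsV1Api()

# Strategic-merge patch adding the annotation to the pod template,
# suggesting how a workload might opt in to periodic checkpointing.
apps.patch_namespaced_deployment(
    name="gpu-training-job",    # illustrative workload name
    namespace="default",
    body={
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {CHECKPOINT_ANNOTATION: "300s"}
                }
            }
        }
    },
)
```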
Get Started
Play in the sandbox

We’ve deployed a test cluster for you to play with, where you can interact and experiment with the system.
Sandbox

Get a demo

Connect

API Reference & Guides

Documentation