Zero Drift
Promote validated runtimes from eval to serving by switching a single immutable reference.
Deterministic, reproducible CUDA ML/AI runtimes on NVIDIA GPUs. Define ML/AI runtimes as code. Promote by switching a pinned reference. Revert the same way.
The CUDA Kickstart Program ships a reference architecture and validated Flox environments for common serving patterns (diffusion, LLM inference, Triton, PyTorch inference), plus GPU-architecture-specific builds of PyTorch, torchvision, torchaudio, vLLM, ONNX Runtime, and other core frameworks. Teams ship these as slim OCI images or run directly on Kubernetes from a pinned Flox environment reference.
Deterministic SBOMs and provenance computed from each runtime environment's dependency graph.
Smaller artifacts and faster rollouts via GPU-specific builds, with the option to generate optimized OCI images.

If you’re completely new to Flox, start with our Flox in 5 minutes guide.
Get access to validated Flox environments for common GPU serving patterns, GPU-specific builds of ML frameworks like PyTorch, ONNX Runtime, and other resources.
Get concrete recommendations for your stack, deployment targets, and GPU architectures, along with guidance on adopting Flox in your environment.
CUDA-accelerated ML workloads depend on a fragile matrix of tightly coupled CUDA user-space libraries, Python runtimes, native libraries, and serving frameworks. Teams have historically used OCI images to isolate these dependencies, but the container rebuild → push → pull → test loop slows them down. Flox gives teams a declarative, reproducible alternative that eliminates image rebuild loops and provides a reviewable diff across OS, Python, and CUDA dependencies, simplifying CVE patching, runtime validation, and audits.
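A runtime's dependencies are declared in the environment's manifest. The fragment below is an illustrative sketch only: the package names and attribute paths are placeholders, and the authoritative schema is in the Flox documentation. What matters is that a dependency change is an edit to this file, reviewable as a plain diff.

```toml
# .flox/env/manifest.toml — illustrative sketch, not a validated environment
version = 1

[install]
# Package names here are placeholders; real attribute paths come from the
# Flox catalog and are pinned exactly in the accompanying lockfile.
python3.pkg-path = "python3"
cudatoolkit.pkg-path = "cudatoolkit"

[vars]
CUDA_VISIBLE_DEVICES = "0"

[options]
systems = ["x86_64-linux"]
```

Reviewing a dependency change then means diffing this manifest and its lockfile, rather than inspecting a rebuilt image layer.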
A deployment blueprint for deterministic CUDA ML environments across GPU fleets: promotion gates, rollback mechanics, and reproducible runtime definitions.
Ready-to-run environments for common serving patterns: diffusion, LLM inference, Triton, PyTorch inference, and more.
Deterministic, reproducible builds for GPU-architecture-specific artifacts (PyTorch, vLLM, ONNX Runtime, llama.cpp, and more) to reduce artifact size and rollout time across GPU fleets.
How declarative, deterministic runtime environments help highly regulated, latency-sensitive orgs move CUDA ML/AI workloads from R&D to production. Dependency changes are atomic edits to an environment definition. Promotion and rollback are reference switches, not container rebuilds. Deterministic SBOMs accelerate CVE triage and response.
Walkthroughs and implementation notes for operating reproducible CUDA stacks.
Each Flox environment resolves from a pinned environment definition and its lockfile to an immutable, hash-addressed dependency set. That yields a tamper-evident chain from what you declared (the environment definition and lockfile) to what runs (the realized runtime packages). SBOMs are derived from the dependency graph itself, which makes it easier to map CVEs to what's actually running in production; remediation becomes an edit to the pinned runtime definition plus a reference promotion.
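Because the lockfile pins exact package versions, CVE triage can be a mechanical lookup against it. The sketch below uses a simplified, hypothetical lockfile shape (the real Flox lockfile is richer); `locked_packages` and `affected` are illustrative helper names, not Flox APIs.

```python
# Sketch: map a CVE advisory to what's actually pinned in a runtime's
# lockfile. The lockfile schema here is simplified for illustration.
import json

def locked_packages(lockfile_text: str) -> dict:
    """Return {package-name: version} from a (simplified) lockfile."""
    lock = json.loads(lockfile_text)
    return {p["name"]: p["version"] for p in lock["packages"]}

def affected(lock_pkgs: dict, advisory: dict) -> bool:
    """True if the advisory's package@version is pinned in this runtime."""
    return lock_pkgs.get(advisory["package"]) == advisory["version"]

example_lock = json.dumps({
    "packages": [
        {"name": "openssl", "version": "3.0.12"},
        {"name": "pytorch", "version": "2.3.1"},
    ]
})

pkgs = locked_packages(example_lock)
print(affected(pkgs, {"package": "openssl", "version": "3.0.12"}))  # True
print(affected(pkgs, {"package": "openssl", "version": "3.1.0"}))   # False
```

Since every serving node resolves from the same hash-addressed set, a "not affected" answer from the lockfile holds for the whole fleet, not just one host.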
Deterministic SBOMs per runtime reference/generation
Faster CVE triage by mapping alerts to environment refs/hashes
Patch by editing the declared runtime, promote by switching the ref
Roll back instantly by reverting the ref
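The promote/rollback mechanic above is just an atomic pointer switch. In practice the pointer is whatever your deployment reads to pick a pinned Flox environment reference (for example a git ref or registry tag); the file-based pointer and function name below are hypothetical, a minimal sketch of the mechanic rather than Flox tooling.

```python
# Sketch: promotion and rollback as one atomic ref switch.
import json
import os
import tempfile

def switch_ref(pointer_path: str, new_ref: str):
    """Atomically point serving at new_ref; return the previous ref."""
    previous = None
    if os.path.exists(pointer_path):
        with open(pointer_path) as f:
            previous = json.load(f)["ref"]
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(pointer_path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump({"ref": new_ref}, f)
    os.replace(tmp, pointer_path)  # atomic rename: readers never see a partial write
    return previous

switch_ref("serving.json", "env-v1")         # initial deploy
prev = switch_ref("serving.json", "env-v2")  # promote validated runtime
switch_ref("serving.json", prev)             # rollback is the same switch
```

Because both directions are the same operation on an immutable reference, rollback carries no rebuild step and no risk of reconstructing a slightly different runtime.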
See how financial services teams use Flox and Nix to ship CUDA ML stacks with reproducible environments, atomic rollbacks, and provable software supply chains.
“Flox removes the risk of environment drift by letting you replicate your exact production environment during development, regardless of architecture differences between OSes.”