Pods & Pixels

Running Stateful ML Workloads in EKS with Persistent Volumes on Amazon FSx for Lustre

Christopher Adamson
Feb 05, 2026

Modern machine learning pipelines demand not only powerful compute but also fast access to large volumes of data. Whether you’re training models on massive image datasets, running simulations, or processing real-time streams, the storage layer can quickly become a performance bottleneck. Kubernetes on Amazon EKS offers flexibility and scalability for running containerized workloads, but stateful ML workloads (e.g., checkpointing, shared datasets, multi-epoch retraining) require more than ephemeral volumes, because their data has to outlive any individual pod.
