Pods & Pixels
Scaling and Automating High-Performance ML Pipelines with FSx on EKS
So far, you’ve built an ML training architecture on EKS using FSx for Lustre, optimized it for high-throughput I/O, and validated that it supports…
Mar 11 • Christopher Adamson
Tuning FSx for Lustre to Maximize Machine Learning I/O Efficiency
Running ML workloads at scale is not just about having enough compute; it is also about feeding your GPUs or CPUs data fast enough to avoid idle cycles.
Mar 10 • Christopher Adamson
Running Stateful ML Training Jobs with Mounted Lustre Volumes
With Amazon FSx for Lustre now successfully integrated into your EKS cluster as a PersistentVolume, the real power of this architecture begins to shine.
Mar 9 • Christopher Adamson
Setting Up Amazon FSx for Lustre with Kubernetes Persistent Volumes
Now that we’ve designed our architecture and understood the role Amazon FSx for Lustre plays in stateful ML workloads, it’s time to bring the storage…
Mar 6 • Christopher Adamson
February 2026
Running Stateful ML Workloads in EKS with Persistent Volumes on Amazon FSx for Lustre
Modern machine learning pipelines demand not only powerful compute resources but also exceptionally fast access to large volumes of data.
Feb 5 • Christopher Adamson
Advanced Traffic Shaping and Failure Recovery
You’ve now automated canary deployments using Flagger, AWS App Mesh, Helm, and GitHub Actions, with full observability via Prometheus and Grafana.
Feb 4 • Christopher Adamson
Observability with Prometheus, Grafana, and Flagger Metrics
No canary deployment strategy is complete without robust observability.
Feb 3 • Christopher Adamson
CI/CD Pipeline Integration (GitHub Actions + Helm)
With your EKS cluster, App Mesh, and Flagger all wired up for automated canary deployments, the next step is to integrate this system into a CI/CD…
Feb 2 • Christopher Adamson
January 2026
Deploying a Sample Application with Canary Routing
Now that your EKS cluster, App Mesh controller, and Flagger are ready, it’s time to deploy a real workload and configure it for automated canary…
Jan 30 • Christopher Adamson
Automated Canary Deployments on Amazon EKS using Flagger and AWS App Mesh
Deploying new versions of applications without introducing regressions is a constant challenge.
Jan 29 • Christopher Adamson
Hybrid Mesh and Disaster Recovery Strategies
In this final part, we’ll take your cross-cluster service mesh to the next level by extending the mesh to hybrid environments, including on-premises…
Jan 28 • Christopher Adamson
Enabling Cross-Cluster Routing and Observability
Now that our services are communicating across clusters using App Mesh and Cloud Map, it’s time to elevate the setup with production-grade features…
Jan 27 • Christopher Adamson