Commit 78f5988

update readme.adoc
1 parent 93c9a2d commit 78f5988

1 file changed: +2 -2 lines changed

lustre/SageMaker-training-using-FSxL-on-EKS/readme.adoc

Lines changed: 2 additions & 2 deletions
@@ -5,13 +5,13 @@

 image:FSx-SageMaker-EKS-Tutorial.png[alt="Amazon EFS", align="left",width=420]

-This tutorial covers how to use *Amazon FSx for Lustre persistent deployment option*, a high-performance, highly available, scalable file storage for *machine learning* workloads on *Kubernetes containers*.
+== Introduction

 Organizations are modernizing their applications by adopting containers and microservices-based architectures. Many customers are deploying high-performance workloads on containers to power microservices architecture, and require access to low latency and high throughput shared storage from these containers. Because containers are transient in nature, these long-running applications also require data to be stored in durable storage. *Amazon FSx for Lustre (FSx for Lustre)* provides the world's most popular high-performance file system, now fully managed and integrated with Amazon S3. It offers a POSIX-compliant, fast parallel file system to enable peak performance and highly durable storage for your Kubernetes workloads. By getting rid of the traditional complexity of setting up and managing Lustre file systems, FSx for Lustre allows you to spin up a high-performance file system in minutes. FSx for Lustre provides sub-millisecond latencies, up to hundreds of gigabytes per second of throughput, and millions of IOPS. Customers use FSx for Lustre for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, and financial modeling.

 *Kubernetes* is an open-source container-orchestration system for automating the deployment, scaling, and management of containerized applications. AWS makes it easy to run Kubernetes without needing to install and operate your own Kubernetes control plane or worker nodes using our managed service *Amazon Elastic Kubernetes Service (Amazon EKS)*. Amazon EKS runs Kubernetes control plane instances across multiple Availability Zones to ensure high availability. Amazon EKS automatically detects and replaces unhealthy control plane instances, and it provides automated version upgrades and patching for them.

-In this tutorial, I will focus on FSx for Lustre and cover how to provision *Amazon FSx for Lustre persistent file system* with Amazon EKS cluster, and accelerate your machine learning training using Amazon FSx and *Amazon SageMaker*. High performance workloads running on EKS clusters that require fast, highly available persistent storage can benefit from using Amazon FSx for Lustre. I will cover training a gradient-boosting model on the link::https://en.wikipedia.org/wiki/[MNIST_database[Modified National Institute of Standards and Technology (MNIST) dataset] using the Amazon SageMaker training operator. The MNIST dataset contains images of handwritten digits from 0 to 9 and is a popular machine learning problem. The MNIST dataset contains 60,000 training images and 10,000 test images.
+This tutorial covers how to use *Amazon FSx for Lustre persistent deployment option*, a high-performance, highly available, scalable file storage for *machine learning* workloads on *Kubernetes containers*. I will show you how to provision *Amazon FSx for Lustre persistent file system* with Amazon EKS cluster, and accelerate your machine learning training using Amazon FSx and *Amazon SageMaker*. High performance workloads running on EKS clusters that require fast, highly available persistent storage can benefit from using Amazon FSx for Lustre. I will cover training a gradient-boosting model on the *Modified National Institute of Standards and Technology (MNIST) dataset* using the Amazon SageMaker training operator. The MNIST dataset contains images of handwritten digits from 0 to 9 and is a popular machine learning problem. The MNIST dataset contains 60,000 training images and 10,000 test images.

 == Basic Components of Kubernetes Containers
