|
| 1 | += Configure Amazon SageMaker operator for Kubernetes |
| 2 | +:toc: |
| 3 | +:icons: |
| 4 | +:linkattrs: |
| 5 | +:imagesdir: ../resources/images |
| 6 | + |
| 7 | + |
| 8 | +== Summary |
| 9 | + |
| 10 | +This section covers configuring *Amazon SageMaker Operator for Kubernetes*. First, we will set up IAM roles and permissions for the SageMaker operator, and then set up the operator on the Kubernetes cluster. |
| 11 | + |
| 12 | + |
| 13 | +== Duration |
| 14 | + |
| 15 | +NOTE: It will take approximately 10 minutes to complete this section. |
| 16 | + |
| 17 | + |
| 18 | +== Step-by-step Guide |
| 19 | + |
| 20 | +IMPORTANT: Read through all steps below before continuing. |
| 21 | + |
| 22 | +For the operator to access your SageMaker resources, you first need to configure a Kubernetes service account with an OIDC-authenticated role that has the proper permissions. For more information, see link:https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html[Enabling IAM Roles for Service Accounts on your Cluster]. |
| 23 | + |
| 24 | +=== Set up IAM roles and permissions for SageMaker Operator: |
| 25 | + |
| 26 | +1. Associate an IAM OpenID Connect (OIDC) provider with your EKS cluster so that it can authenticate with AWS resources. Start by setting the variables shown below: |
| 27 | ++ |
| 28 | +[source,bash,subs="verbatim,quotes"] |
| 29 | +---- |
| 30 | +# Set the AWS region and EKS cluster name |
| 31 | +export CLUSTER_NAME=<EKS CLUSTER NAME> |
| 32 | +export AWS_REGION=<AWS Region> |
| 33 | +---- |
| 34 | ++ |
| 35 | +=============================== |
| 36 | +*Example*: |
| 37 | +
|
| 38 | +export CLUSTER_NAME="FSxL-Persistent-Cluster" |
| 39 | +
|
| 40 | +export AWS_REGION="us-east-1" |
| 41 | +=============================== |
| 42 | ++ |
| 43 | +Next, run the command shown below: |
| 44 | ++ |
| 45 | +[source,bash,subs="verbatim,quotes"] |
| 46 | +---- |
| 47 | +eksctl utils associate-iam-oidc-provider --cluster ${CLUSTER_NAME} --region ${AWS_REGION} --approve |
| 48 | +
|
| 49 | +---- |
| 50 | ++ |
| 51 | + |
| 52 | +You should see output as shown below: |
| 53 | ++ |
| 54 | +[source,bash,subs="verbatim,quotes"] |
| 55 | +---- |
| 56 | +[i] eksctl version 0.16.0 |
| 57 | +[i] using region us-east-1 |
| 58 | +[i] will create IAM Open ID Connect provider for cluster "FSxL-Persistent-Cluster" in "us-east-1" |
| 59 | +[✔] created IAM Open ID Connect provider for cluster "FSxL-Persistent-Cluster" in "us-east-1" |
| 60 | +
|
| 61 | +---- |
| 62 | ++ |
| 63 | + |
| 64 | +Now that your Kubernetes cluster in EKS has an OIDC identity provider, you can create a role and give it permissions. |
| 65 | + |
| 66 | +2. Obtain the OIDC issuer URL using the command below: |
| 67 | + |
| 68 | ++ |
| 69 | +[source,bash,subs="verbatim,quotes"] |
| 70 | +---- |
| 71 | +aws eks describe-cluster --name ${CLUSTER_NAME} --region ${AWS_REGION} --query cluster.identity.oidc.issuer --output text |
| 72 | +---- |
| 73 | ++ |
| 74 | + |
| 75 | +The command returns a URL in the form shown below. If the output is `None`, check that your AWS CLI version matches one listed in the Prerequisites. |
| 76 | ++ |
| 77 | +[source,bash,subs="verbatim,quotes"] |
| 78 | +---- |
| 79 | +https://oidc.eks.${AWS_REGION}.amazonaws.com/id/{Your OIDC ID} |
| 80 | +---- |
| 81 | ++ |
| 82 | + |
| 83 | +=============================== |
| 84 | +*Example*: https://oidc.eks.us-east-1.amazonaws.com/id/*3A12CB18E75BD1133D141F6C7B5FB266* |
| 85 | +=============================== |
| 86 | + |
| 87 | +3. Next, use the OIDC ID returned by the previous command to create your role. Create a new file named *trust.json* with the content shown below, replacing *AWS account ID*, *EKS cluster region*, and *OIDC ID* with your own values: |
| 88 | ++ |
| 89 | +[source,json] |
| 90 | +---- |
| 91 | +{ |
| 92 | + "Version": "2012-10-17", |
| 93 | + "Statement": [{ |
| 94 | + "Effect": "Allow", |
| 95 | + "Principal": { |
| 96 | + "Federated": "arn:aws:iam::<AWS account ID>:oidc-provider/oidc.eks.<EKS cluster region>.amazonaws.com/id/<OIDC ID>" |
| 97 | + }, |
| 98 | + "Action": "sts:AssumeRoleWithWebIdentity", |
| 99 | + "Condition": { |
| 100 | + "StringEquals": { |
| 101 | + "oidc.eks.<EKS cluster region>.amazonaws.com/id/<OIDC ID>:aud": "sts.amazonaws.com", |
| 102 | + "oidc.eks.<EKS cluster region>.amazonaws.com/id/<OIDC ID>:sub": "system:serviceaccount:sagemaker-k8s-operator-system:sagemaker-k8s-operator-default" |
| 103 | + } |
| 104 | + } |
| 105 | + }] |
| 106 | +} |
| 107 | +---- |
| 108 | ++ |
| 109 | + |
| 110 | +=============================== |
| 111 | +*Example*: |
| 112 | +[source,json] |
| 113 | +---- |
| 114 | +{ |
| 115 | + "Version": "2012-10-17", |
| 116 | + "Statement": [{ |
| 117 | + "Effect": "Allow", |
| 118 | + "Principal": { |
| 119 | + "Federated": "arn:aws:iam::012345678910:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/3A12CB18E75BD1133D141F6C7B5FB266" |
| 120 | + }, |
| 121 | + "Action": "sts:AssumeRoleWithWebIdentity", |
| 122 | + "Condition": { |
| 123 | + "StringEquals": { |
| 124 | + "oidc.eks.us-east-1.amazonaws.com/id/3A12CB18E75BD1133D141F6C7B5FB266:aud": "sts.amazonaws.com", |
| 125 | + "oidc.eks.us-east-1.amazonaws.com/id/3A12CB18E75BD1133D141F6C7B5FB266:sub": "system:serviceaccount:sagemaker-k8s-operator-system:sagemaker-k8s-operator-default" |
| 126 | + } |
| 127 | + } |
| 128 | + }] |
| 129 | +} |
| 130 | +---- |
| 131 | +=============================== |
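The substitutions in step 3 can also be scripted. The sketch below is one possible approach, not part of the original workshop: `render_trust_policy` is a hypothetical helper that takes an AWS account ID and the OIDC issuer URL from step 2 and prints the trust policy. It assumes the issuer URL begins with `https://`, as in the example above.

```shell
# Sketch: build the trust.json content from an account ID and an OIDC issuer URL.
# render_trust_policy is a hypothetical helper, not part of the AWS CLI or eksctl.
render_trust_policy() {
  account_id="$1"
  issuer_url="$2"
  # The IAM OIDC provider path is the issuer URL without the "https://" scheme.
  provider="${issuer_url#https://}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "${provider}:aud": "sts.amazonaws.com",
        "${provider}:sub": "system:serviceaccount:sagemaker-k8s-operator-system:sagemaker-k8s-operator-default"
      }
    }
  }]
}
EOF
}

# Example usage (requires AWS credentials, so it is left commented out):
# ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
# ISSUER_URL=$(aws eks describe-cluster --name "${CLUSTER_NAME}" --region "${AWS_REGION}" \
#   --query cluster.identity.oidc.issuer --output text)
# render_trust_policy "${ACCOUNT_ID}" "${ISSUER_URL}" > trust.json
```

Generating the file this way avoids copy-paste mistakes in the three places where the provider path appears. Review the resulting *trust.json* before creating the role.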
| 132 | + |
| 133 | + |
| 134 | +4. Create a new IAM role that can be assumed by the cluster service accounts. The output from the command will contain the role ARN: |
| 135 | ++ |
| 136 | +[source,bash,subs="verbatim,quotes"] |
| 137 | +---- |
| 138 | +aws iam create-role --role-name <rolename> --assume-role-policy-document file://trust.json --output=text |
| 139 | +---- |
| 140 | ++ |
| 141 | +=============================== |
| 142 | +*Example*: |
| 143 | +
|
| 144 | +aws iam create-role --role-name *eks-fsx-cluster-role* --assume-role-policy-document file://trust.json --output=text |
| 145 | +
|
| 146 | +ROLE *arn:aws:iam::012345678910:role/eks-fsx-cluster-role* 2020-04-07T21:11:34Z / AROAQMDZVU6IANBHRF4QQ eks-fsx-cluster-role |
| 147 | +ASSUMEROLEPOLICYDOCUMENT 2012-10-17 |
| 148 | +STATEMENT sts:AssumeRoleWithWebIdentity Allow |
| 149 | +STRINGEQUALS sts.amazonaws.com system:serviceaccount:sagemaker-k8s-operator-system:sagemaker-k8s-operator-default |
| 150 | +PRINCIPAL arn:aws:iam::012345678910:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/3A12CB18E75BD1133D141F6C7B5FB266 |
| 151 | +=============================== |
| 152 | + |
| 153 | + |
| 154 | +5. Give this new role access to Amazon SageMaker by attaching the *AmazonSageMakerFullAccess* managed policy using the command below: |
| 155 | ++ |
| 156 | +[source,bash,subs="verbatim,quotes"] |
| 157 | +---- |
| 158 | +aws iam attach-role-policy --role-name <rolename> --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess |
| 159 | +---- |
| 160 | ++ |
| 161 | +=============================== |
| 162 | +*Example*: |
| 163 | +
|
| 164 | +aws iam attach-role-policy --role-name eks-fsx-cluster-role --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess |
| 165 | +=============================== |
| 166 | + |
| 167 | + |
| 168 | + |
| 169 | +=== Set up the operator on the Kubernetes cluster: |
| 170 | + |
| 171 | +6. Download the installer YAML for Amazon SageMaker Operators for Kubernetes from the link:https://github.com/aws/amazon-sagemaker-operator-for-k8s[GitHub repo]. This file configures your Kubernetes cluster with the custom resource definitions and the operator controller service. Use the command below: |
| 172 | ++ |
| 173 | +[source,bash,subs="verbatim,quotes"] |
| 174 | +---- |
| 175 | +wget https://raw.githubusercontent.com/aws/amazon-sagemaker-operator-for-k8s/master/release/rolebased/installer.yaml |
| 176 | +---- |
| 177 | ++ |
| 178 | + |
| 179 | +7. In the *installer.yaml* file, set the *eks.amazonaws.com/role-arn* annotation to the ARN of the OIDC-based role you created in step 4 (*eks-fsx-cluster-role* in the examples above). The relevant section looks like this: |
| 180 | ++ |
| 181 | +[source,yaml,subs="verbatim,quotes"] |
| 182 | +---- |
| 183 | +metadata: |
| 184 | + annotations: |
| 185 | + eks.amazonaws.com/role-arn: <arn of OIDC-based role> |
| 186 | + name: sagemaker-k8s-operator-default |
| 187 | + namespace: sagemaker-k8s-operator-system |
| 188 | +---- |
| 189 | ++ |
| 190 | + |
| 191 | +=============================== |
| 192 | +*Example*: |
| 193 | +[source,yaml] |
| 194 | +---- |
| 195 | +metadata: |
| 196 | + annotations: |
| 197 | + eks.amazonaws.com/role-arn: arn:aws:iam::012345678910:role/eks-fsx-cluster-role |
| 198 | + name: sagemaker-k8s-operator-default |
| 199 | + namespace: sagemaker-k8s-operator-system |
| 200 | +---- |
| 201 | +=============================== |
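If you prefer not to edit the file by hand, the annotation in step 7 can be rewritten with `sed`. This is a minimal sketch under one assumption: the annotation sits on a single line starting with `eks.amazonaws.com/role-arn:`, as in the snippet above. `set_role_arn` is a hypothetical helper name.

```shell
# set_role_arn FILE ARN
# Rewrites the value of every eks.amazonaws.com/role-arn annotation line in FILE.
# Hypothetical helper; assumes the annotation key and value are on one line.
set_role_arn() {
  file="$1"
  arn="$2"
  # '|' is used as the sed delimiter because ARNs contain '/'.
  # \1 keeps the original indentation and key, so only the value changes.
  sed "s|^\([[:space:]]*eks\.amazonaws\.com/role-arn:\).*|\1 ${arn}|" "${file}" > "${file}.tmp" \
    && mv "${file}.tmp" "${file}"
}

# Example usage:
# set_role_arn installer.yaml "arn:aws:iam::012345678910:role/eks-fsx-cluster-role"
```

Afterwards, `grep role-arn installer.yaml` should show your role ARN.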
| 202 | + |
| 203 | + |
| 204 | +8. On your Kubernetes cluster, install the Amazon SageMaker CRDs and set up your operators as shown below: |
| 205 | ++ |
| 206 | +[source,bash,subs="verbatim,quotes"] |
| 207 | +---- |
| 208 | +kubectl apply -f installer.yaml |
| 209 | +---- |
| 210 | ++ |
| 211 | + |
| 212 | +You will see output as shown below: |
| 213 | ++ |
| 214 | +[source,bash,subs="verbatim,quotes"] |
| 215 | +---- |
| 216 | +namespace/sagemaker-k8s-operator-system created |
| 217 | +customresourcedefinition.apiextensions.k8s.io/batchtransformjobs.sagemaker.aws.amazon.com created |
| 218 | +customresourcedefinition.apiextensions.k8s.io/endpointconfigs.sagemaker.aws.amazon.com created |
| 219 | +customresourcedefinition.apiextensions.k8s.io/hostingdeployments.sagemaker.aws.amazon.com created |
| 220 | +customresourcedefinition.apiextensions.k8s.io/hyperparametertuningjobs.sagemaker.aws.amazon.com created |
| 221 | +customresourcedefinition.apiextensions.k8s.io/models.sagemaker.aws.amazon.com created |
| 222 | +customresourcedefinition.apiextensions.k8s.io/trainingjobs.sagemaker.aws.amazon.com created |
| 223 | +serviceaccount/sagemaker-k8s-operator-default created |
| 224 | +role.rbac.authorization.k8s.io/sagemaker-k8s-operator-leader-election-role created |
| 225 | +clusterrole.rbac.authorization.k8s.io/sagemaker-k8s-operator-manager-role created |
| 226 | +clusterrole.rbac.authorization.k8s.io/sagemaker-k8s-operator-proxy-role created |
| 227 | +rolebinding.rbac.authorization.k8s.io/sagemaker-k8s-operator-leader-election-rolebinding created |
| 228 | +clusterrolebinding.rbac.authorization.k8s.io/sagemaker-k8s-operator-manager-rolebinding created |
| 229 | +clusterrolebinding.rbac.authorization.k8s.io/sagemaker-k8s-operator-proxy-rolebinding created |
| 230 | +service/sagemaker-k8s-operator-controller-manager-metrics-service created |
| 231 | +deployment.apps/sagemaker-k8s-operator-controller-manager created |
| 232 | +---- |
| 233 | ++ |
| 234 | + |
| 235 | +9. Verify that the Amazon SageMaker operators are available in your Kubernetes cluster using the command below: |
| 236 | ++ |
| 237 | +[source,bash,subs="verbatim,quotes"] |
| 238 | +---- |
| 239 | +kubectl get crd | grep sagemaker |
| 240 | +---- |
| 241 | ++ |
| 242 | + |
| 243 | +You will see output as shown below: |
| 244 | ++ |
| 245 | +[source,bash,subs="verbatim,quotes"] |
| 246 | +---- |
| 247 | +batchtransformjobs.sagemaker.aws.amazon.com 2020-04-07T21:17:59Z |
| 248 | +endpointconfigs.sagemaker.aws.amazon.com 2020-04-07T21:17:59Z |
| 249 | +hostingdeployments.sagemaker.aws.amazon.com 2020-04-07T21:17:59Z |
| 250 | +hyperparametertuningjobs.sagemaker.aws.amazon.com 2020-04-07T21:17:59Z |
| 251 | +models.sagemaker.aws.amazon.com 2020-04-07T21:17:59Z |
| 252 | +trainingjobs.sagemaker.aws.amazon.com 2020-04-07T21:17:59Z |
| 253 | +---- |
| 254 | ++ |
| 255 | + |
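The check in step 9 can be made explicit rather than visual. The sketch below reads the output of `kubectl get crd` on standard input and reports any of the six CRDs listed above that are missing. `check_sagemaker_crds` is a hypothetical helper, and the CRD list is taken from the expected output in this guide; other operator versions may install a different set.

```shell
# check_sagemaker_crds
# Reads `kubectl get crd` output on stdin; prints each expected CRD that is
# missing and returns non-zero if any are. Hypothetical helper for this guide.
check_sagemaker_crds() {
  listing=$(cat)
  missing=0
  for crd in batchtransformjobs endpointconfigs hostingdeployments \
             hyperparametertuningjobs models trainingjobs; do
    if ! printf '%s\n' "${listing}" | grep -q "^${crd}\.sagemaker\.aws\.amazon\.com"; then
      echo "missing: ${crd}.sagemaker.aws.amazon.com"
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage:
# kubectl get crd | check_sagemaker_crds && echo "all SageMaker CRDs present"
```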
| 256 | +With these operators installed, Amazon SageMaker's managed and secured ML infrastructure and its software optimizations at scale are now available as custom resources in your Kubernetes cluster. |
| 257 | + |
| 258 | +== Next section |
| 259 | + |
| 260 | +Click the button below to go to the next section. |
| 261 | + |
| 262 | +image::03-Prepare-SageMaker-Training.png[link=../03-Prepare-SageMaker-Training/, align="left",width=420] |
| 263 | + |
| 264 | + |
| 265 | + |
| 266 | + |