Install EKS and Related Infrastructure and Resources
Contents
- Prerequisites
- Create EKS Cluster
- Configure Cluster Autoscaler
- Install Amazon EBS CSI Driver
- Install Amazon EFS CSI Driver
- Create NodeGroup
Detailed Steps
For detailed explanations of the {variables}, please refer to the Terminology page.
1. Prerequisites
Ensure the environment setup is complete. Refer to 1. Setup Deployment Environment.
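Most commands on this page expand shell variables such as ${AWS_CLUSTER_NAME}. An unset variable usually produces an empty argument or a malformed resource name rather than an immediate error. The following is a minimal sketch of a pre-flight check, assuming a POSIX-compatible shell. The helper name require_vars is hypothetical and is not part of eksctl or the AWS CLI; the variable list in the usage comment is taken from the commands used later on this page.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 when all of them are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage with the variables referenced on this page:
# require_vars AWS_CLUSTER_NAME AWS_CLUSTER_VERSION AWS_DEFAULT_REGION \
#   AWS_DEFAULT_REGION_ALIAS AWS_ACCOUNT_ID INFRA_NAME || exit 1
```

Running the check once before step 2 may catch a missing export earlier than a failed eksctl call would.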
2. Create EKS Cluster
- Use the eksctl command to create a cluster.
- The following command automatically creates an EKS cluster and a VPC.
- zones: Places the VPC subnets in availability zones a and c of the region.
- vpc-cidr: Sets the IP range for the VPC.
eksctl create cluster \
--name ${AWS_CLUSTER_NAME} \
--version ${AWS_CLUSTER_VERSION} \
--region ${AWS_DEFAULT_REGION} \
--zones ${AWS_DEFAULT_REGION}a,${AWS_DEFAULT_REGION}c \
--vpc-cidr 10.0.0.0/16 \
--without-nodegroup \
--managed \
--with-oidc
- After creation, check the value of {AWS_VPC_NAME}.
- {AWS_VPC_NAME} = eksctl-eks-{AWS_DEFAULT_REGION_ALIAS}-{CLUSTER_NAME}-{DEPLOY_ENV}-{AWS_CLUSTER_VERSION_STR}-eks-master-cluster/VPC
# Use the command below to find the "VALUE" of "Key": "Name" as ${AWS_VPC_NAME}.
aws ec2 describe-vpcs --query "Vpcs[-2].Tags"
export AWS_VPC_NAME=
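The name pattern above follows the form eksctl-<cluster name>-cluster/VPC. Assuming ${AWS_CLUSTER_NAME} holds the full name passed to eksctl create cluster, the expected value can be derived rather than read from a list position, since an index such as Vpcs[-2] depends on which other VPCs exist in the account. The sketch below is illustrative only; both function names are hypothetical, and the naming convention should be confirmed against the actual tags.

```shell
# Assumption: the VPC Name tag has the form "eksctl-<cluster name>-cluster/VPC".
expected_vpc_name() {
  printf 'eksctl-%s-cluster/VPC\n' "$1"
}

# Look the VPC up by its Name tag instead of by list position.
# Not called here because it needs the AWS CLI and valid credentials.
find_vpc_id_by_name() {
  aws ec2 describe-vpcs \
    --filters "Name=tag:Name,Values=$1" \
    --query "Vpcs[0].VpcId" --output text
}

# Example usage:
# export AWS_VPC_NAME=$(expected_vpc_name "${AWS_CLUSTER_NAME}")
# find_vpc_id_by_name "${AWS_VPC_NAME}"
```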
3. Configure Cluster Autoscaler
Reference page: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
- Create the IAM policy and service account for Cluster Autoscaler using the commands below.
cat <<EOT > policy-cluster-autoscaler.json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"autoscaling:SetDesiredCapacity",
"autoscaling:TerminateInstanceInAutoScalingGroup"
],
"Resource": "*",
"Condition": {
"StringEquals": {
"aws:ResourceTag/k8s.io/cluster-autoscaler/$AWS_CLUSTER_NAME": "owned"
}
}
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeAutoScalingGroups",
"ec2:DescribeLaunchTemplateVersions",
"autoscaling:DescribeTags",
"autoscaling:DescribeLaunchConfigurations",
"ec2:DescribeInstanceTypes"
],
"Resource": "*"
}
]
}
EOT
aws iam create-policy --policy-name policy-cluster-autoscaler --policy-document file://policy-cluster-autoscaler.json
eksctl create iamserviceaccount \
--cluster=$AWS_CLUSTER_NAME \
--region=$AWS_DEFAULT_REGION \
--namespace=kube-system \
--name=cluster-autoscaler \
--attach-policy-arn=arn:aws:iam::$AWS_ACCOUNT_ID:policy/policy-cluster-autoscaler \
--role-name=role-cluster-autoscaler \
--override-existing-serviceaccounts \
--approve
- Create the Cluster Autoscaler deployment.
curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
cp cluster-autoscaler-autodiscover.yaml cluster-autoscaler.yaml
export CLUSTER_AUTOSCALER_VERSION=$(curl -sL https://registry.k8s.io/v2/autoscaling/cluster-autoscaler/tags/list | grep -o '"tags":\[.*\]' | sed 's/"tags":\[//;s/\]//' | tr ',' '\n' | tr -d '"' | grep "$AWS_CLUSTER_VERSION" | sort -V | tail -n 1)
yq e 'select(documentIndex == 5)' cluster-autoscaler.yaml > deployment.yaml
yq e -i '.spec.template.metadata.annotations."cluster-autoscaler.kubernetes.io/safe-to-evict" = "false"' deployment.yaml
yq e -i '.spec.template.spec.containers[] |= select(.name == "cluster-autoscaler").image = "registry.k8s.io/autoscaling/cluster-autoscaler:" + env(CLUSTER_AUTOSCALER_VERSION)' deployment.yaml
yq e -i '
.spec.template.spec.containers[] |=
select(.name == "cluster-autoscaler").command =
(.command | map(select(. != "--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>")) +
["--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/" + env(AWS_CLUSTER_NAME),
"--scale-down-unneeded-time=10m",
"--scale-down-utilization-threshold=0.7"])' deployment.yaml
yq eval 'select(documentIndex != 5)' cluster-autoscaler.yaml > temp.yaml
echo "---" >> temp.yaml
cat temp.yaml deployment.yaml > cluster-autoscaler.yaml
rm temp.yaml deployment.yaml
kubectl apply -f cluster-autoscaler.yaml
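The CLUSTER_AUTOSCALER_VERSION pipeline above selects the highest registry tag that contains the cluster's Kubernetes version. Because grep "$AWS_CLUSTER_VERSION" treats each dot as a wildcard and matches anywhere in the tag, a stricter selection may be preferable. The following sketch assumes tags of the form vMAJOR.MINOR.PATCH, a version value such as 1.29, and a sort that supports -V (GNU coreutils). The function name pick_autoscaler_tag is hypothetical, and the tag list is sample data rather than registry output.

```shell
# Hypothetical helper: read one tag per line on stdin and print the highest
# patch release for the given MAJOR.MINOR version (for example 1.29).
pick_autoscaler_tag() {
  # Escape the dots so that "1.29" is matched literally.
  minor=$(printf '%s' "$1" | sed 's/\./\\./g')
  # Keep only vMAJOR.MINOR.PATCH tags, then take the highest in version order.
  grep -E "^v${minor}\.[0-9]+\$" | sort -V | tail -n 1
}

# Sample tag list for illustration only.
printf '%s\n' v1.28.2 v1.29.0 v1.29.10 v1.29.3 v1.30.1 | pick_autoscaler_tag 1.29
```

With the sample list, version ordering ranks v1.29.10 above v1.29.3, which a plain lexical sort would not.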
4. Install Amazon EBS CSI Driver
Reference page: https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html
- Set the following environment variables.
export EBS_CSI_SA_ROLE_NAME=role-ebs-csidriver-${AWS_CLUSTER_NAME}
export EBS_CSI_SA_ROLE_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:role/${EBS_CSI_SA_ROLE_NAME}
- Verify that OIDC is configured for the cluster. The command should print a non-empty ID.
oidc_id=$(aws eks describe-cluster --name ${AWS_CLUSTER_NAME} --query "cluster.identity.oidc.issuer" --output text | cut -d '/' -f 5)
echo $oidc_id
- Create the IAM role for the service account that manages the EBS CSI Driver.
eksctl create iamserviceaccount \
--name ebs-csi-controller-sa \
--namespace kube-system \
--cluster ${AWS_CLUSTER_NAME} \
--attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
--approve \
--role-only \
--role-name ${EBS_CSI_SA_ROLE_NAME}
- Create the Amazon EBS CSI add-on.
eksctl create addon --name aws-ebs-csi-driver --cluster ${AWS_CLUSTER_NAME} --service-account-role-arn ${EBS_CSI_SA_ROLE_ARN} --force
- Verify that the ebs-csi-controller is created.
kubectl get deploy -n kube-system
Example output:
NAME READY UP-TO-DATE AVAILABLE AGE
ebs-csi-controller 2/2 2 2 30s
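The OIDC verification step in this section takes the fifth "/"-separated field of the issuer URL. The sketch below shows why that field holds the provider ID, using a placeholder URL in the https://oidc.eks.<region>.amazonaws.com/id/<ID> format. The helper name and the sample ID are hypothetical.

```shell
# Hypothetical helper: extract the OIDC provider ID from an issuer URL.
# Splitting "https://oidc.eks.<region>.amazonaws.com/id/<ID>" on "/" yields
# "https:", "", "<host>", "id", "<ID>", so the ID is field 5.
oidc_id_from_issuer() {
  printf '%s\n' "$1" | cut -d '/' -f 5
}

# Placeholder issuer URL for illustration only.
sample_issuer="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
oidc_id_from_issuer "$sample_issuer"

# With a real cluster, an empty result from the command below would suggest
# that no IAM OIDC provider is registered for the cluster yet:
# aws iam list-open-id-connect-providers | grep "$oidc_id"
```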
5. Install Amazon EFS CSI Driver (Skip if Edge Conductor is installed on-premise)
Reference page: https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html
- Set the following environment variables.
export EFS_CSI_SA_ROLE_NAME=role-efs-csidriver-${AWS_CLUSTER_NAME}
export EFS_CSI_SA_ROLE_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:role/${EFS_CSI_SA_ROLE_NAME}
- Create the IAM role for the service account that manages the EFS CSI Driver.
eksctl create iamserviceaccount \
--name efs-csi-controller-sa \
--namespace kube-system \
--cluster ${AWS_CLUSTER_NAME} \
--role-name ${EFS_CSI_SA_ROLE_NAME} \
--role-only \
--attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEFSCSIDriverPolicy \
--override-existing-serviceaccounts \
--approve
- Retrieve the role's trust policy and update it so that any service account whose name matches efs-csi-* can assume the role.
TRUST_POLICY=$(aws iam get-role --role-name ${EFS_CSI_SA_ROLE_NAME} --query 'Role.AssumeRolePolicyDocument' | \
sed -e 's/efs-csi-controller-sa/efs-csi-*/' -e 's/StringEquals/StringLike/')
aws iam update-assume-role-policy --role-name ${EFS_CSI_SA_ROLE_NAME} --policy-document "${TRUST_POLICY}"
- Create the Amazon EFS CSI add-on.
eksctl create addon --name aws-efs-csi-driver --cluster ${AWS_CLUSTER_NAME} --service-account-role-arn ${EFS_CSI_SA_ROLE_ARN} --force
- Verify that the efs-csi-controller is created.
kubectl get deploy -n kube-system
Example output:
NAME READY UP-TO-DATE AVAILABLE AGE
efs-csi-controller 2/2 2 2 30s
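The trust-policy step in this section changes two things in the role's assume-role policy: the service account name becomes the wildcard efs-csi-*, and the StringEquals operator becomes StringLike so that the wildcard is evaluated. The sketch below applies the same two substitutions to a trimmed sample Condition block. The helper name is hypothetical, and the issuer host and ID are placeholders.

```shell
# Same substitutions as the trust-policy step, wrapped in a hypothetical
# helper that reads a policy document on stdin.
widen_efs_trust_policy() {
  sed -e 's/efs-csi-controller-sa/efs-csi-*/' -e 's/StringEquals/StringLike/'
}

# Trimmed sample of a Condition block; issuer host and ID are placeholders.
cat <<'EOT' | widen_efs_trust_policy
"Condition": {
  "StringEquals": {
    "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub": "system:serviceaccount:kube-system:efs-csi-controller-sa",
    "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:aud": "sts.amazonaws.com"
  }
}
EOT
```

Previewing the rewritten document this way before calling aws iam update-assume-role-policy makes it easier to confirm that only the intended lines changed.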
6. Create NodeGroup
- Add a node group for operating AI Conductor to the created cluster.
- For a detailed guide on creating node groups, refer to AI Conductor Manage NodeGroup.
- Create the nodegroup-aicond.yaml file that defines the node group.
cat <<EOT > nodegroup-aicond.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${AWS_CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
availabilityZones: ["${AWS_DEFAULT_REGION}a", "${AWS_DEFAULT_REGION}c"]
managedNodeGroups:
  - name: 'ng-${AWS_DEFAULT_REGION_ALIAS}-aicond-${INFRA_NAME}-controller'
    instanceType: m5.2xlarge
    # autoscaling
    minSize: 1
    maxSize: 3
    desiredCapacity: 2
    # volume
    volumeType: gp2
    volumeSize: 200
    labels: {aic-role: controller}
    propagateASGTags: true
    iam:
      withAddonPolicies:
        autoScaler: true
EOT
- Add the node group to the cluster.
eksctl create nodegroup --config-file=nodegroup-aicond.yaml
- Skip this step if Edge Conductor is installed on-premise.
- Add a node group for operating Edge Conductor to the created cluster.
- Create the nodegroup-edgecond.yaml file that defines the node group.
cat <<EOT > nodegroup-edgecond.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${AWS_CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
availabilityZones: ["${AWS_DEFAULT_REGION}a", "${AWS_DEFAULT_REGION}c"]
managedNodeGroups:
  - name: 'ng-${AWS_DEFAULT_REGION_ALIAS}-edgecond-${INFRA_NAME}-controller'
    instanceType: m5.2xlarge
    # autoscaling
    minSize: 1
    maxSize: 3
    desiredCapacity: 2
    # volume
    volumeType: gp2
    volumeSize: 200
    labels: {edgecond-role: 'ng-${AWS_DEFAULT_REGION_ALIAS}-edgecond-${INFRA_NAME}'}
    propagateASGTags: true
    iam:
      withAddonPolicies:
        autoScaler: true
EOT
- Add the node group to the cluster.
eksctl create nodegroup --config-file=nodegroup-edgecond.yaml
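Both node group definitions set minSize, maxSize, and desiredCapacity, and the desired capacity is expected to lie between the other two; Cluster Autoscaler then scales within that range. The sketch below is a small sanity check that could be run when these values are changed. The helper name check_scaling_bounds is hypothetical and is not an eksctl feature.

```shell
# Hypothetical helper: verify that min <= desired <= max.
# Arguments: minSize desiredCapacity maxSize
check_scaling_bounds() {
  min=$1; desired=$2; max=$3
  if [ "$min" -le "$desired" ] && [ "$desired" -le "$max" ]; then
    echo "ok: min=$min desired=$desired max=$max"
  else
    echo "invalid: expected min <= desired <= max, got $min/$desired/$max" >&2
    return 1
  fi
}

# Values used by both node group definitions on this page.
check_scaling_bounds 1 2 3
```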