
Resource Monitoring Installation


Table of Contents

  1. Resource Monitoring Infrastructure Setup
  2. Deploy Prometheus & Grafana


Detailed Steps

For detailed explanations of {variables}, refer to the Terminology page.

  • Set up the {variables}.

    export AWS_CLUSTER_NAME=
    export INFRA_NAME=
    export DEPLOY_ENV=
    export AWS_DEFAULT_REGION=
    export AWS_DEFAULT_REGION_ALIAS=
    export PROJECT_NODEGROUP_LABEL=mellerikat-monitor
    export PROJECT_NODEGROUP_NAME=ng-${AWS_DEFAULT_REGION_ALIAS}-${PROJECT_NODEGROUP_LABEL}
    export PROJECT_NODEGROUP_DESIRED_SIZE=1
    export PROJECT_NODEGROUP_MIN=1
    export PROJECT_NODEGROUP_MAX=2
    export PROJECT_NODEGROUP_EC2_NAME=m5.large
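The files in the later steps are written with heredocs that expand these variables, so an empty variable silently ends up in the generated file. A minimal sketch of a pre-flight check follows; the helper name require_vars is hypothetical and not part of the original procedure.

```shell
# Hypothetical helper: fail early if any listed variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run after the exports above):
# require_vars AWS_CLUSTER_NAME INFRA_NAME DEPLOY_ENV \
#   AWS_DEFAULT_REGION AWS_DEFAULT_REGION_ALIAS || echo "set the variables first"
```

The function returns a non-zero status when at least one name is empty, so it can gate the rest of a script.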

1. Resource Monitoring Infrastructure Setup

1-1. Create Resource Monitoring Node Group

For detailed instructions, refer to the Add Nodegroup page.

  • Create a create-nodegroup-monitoring.yaml file that defines the node group.


    NOTE : "propagateASGTags: true" is a mandatory setting.

    cat <<EOT > create-nodegroup-monitoring.yaml
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    managedNodeGroups:
    - amiFamily: AmazonLinux2
      desiredCapacity: ${PROJECT_NODEGROUP_DESIRED_SIZE}
      disableIMDSv1: false
      disablePodIMDS: false
      iam:
        withAddonPolicies:
          albIngress: false
          appMesh: false
          appMeshPreview: false
          autoScaler: true
          awsLoadBalancerController: false
          certManager: false
          cloudWatch: false
          ebs: false
          efs: false
          externalDNS: false
          fsx: false
          imageBuilder: false
          xRay: false
      instanceSelector: {}
      instanceType: ${PROJECT_NODEGROUP_EC2_NAME}
      labels:
        aic-role: ${PROJECT_NODEGROUP_LABEL}
        alpha.eksctl.io/cluster-name: ${AWS_CLUSTER_NAME}
        alpha.eksctl.io/nodegroup-name: ${PROJECT_NODEGROUP_NAME}
      maxSize: ${PROJECT_NODEGROUP_MAX}
      minSize: ${PROJECT_NODEGROUP_MIN}
      name: ${PROJECT_NODEGROUP_NAME}
      availabilityZones: ["${AWS_DEFAULT_REGION}a", "${AWS_DEFAULT_REGION}c"]
      privateNetworking: true
      releaseVersion: ""
      securityGroups:
        withLocal: null
        withShared: null
      ssh:
        allow: false
        publicKeyPath: ""
      tags:
        alpha.eksctl.io/nodegroup-name: ${PROJECT_NODEGROUP_NAME}
        alpha.eksctl.io/nodegroup-type: managed
      volumeIOPS: 3000
      volumeSize: 50
      volumeThroughput: 125
      volumeType: gp3
      propagateASGTags: true
    metadata:
      name: ${AWS_CLUSTER_NAME}
      region: ${AWS_DEFAULT_REGION}
    EOT
  • Create the node group with the following command:

    eksctl create nodegroup --config-file=create-nodegroup-monitoring.yaml

  • Troubleshooting: 'AccessConfig'

    If the command fails with the following error, update eksctl and then run the command again.

    error getting cluster stack template: failed to parse GetStackTemplate response: json: unknown field "AccessConfig"
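
Once eksctl finishes, the new nodes should carry the aic-role label defined in the labels block of the config file. A small sketch for listing them follows; the helper name list_monitoring_nodes is hypothetical, and it assumes kubectl is already configured for the target cluster.

```shell
# Hypothetical helper: list the nodes that carry the monitoring node group label.
# Falls back to the label value used in this guide if the variable is not set.
list_monitoring_nodes() {
  kubectl get nodes -l "aic-role=${PROJECT_NODEGROUP_LABEL:-mellerikat-monitor}" -o wide
}

# Usage:
# list_monitoring_nodes
```

An empty result would suggest that the node group has not joined the cluster yet or that the label differs from the one in the config file.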


1-2. Create Resource Monitoring Target Group

For detailed instructions, refer to the Create and Configure Target Group page.

  • Two target groups are required, one for each service:

    • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090 : Prometheus
    • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050 : Grafana

  • Set {AWS_CLUSTER_VERSION_NUM} to the cluster version with the dots removed.

    export AWS_CLUSTER_VERSION_NUM=$(echo ${AWS_CLUSTER_VERSION} | tr -d '.')

  • Create Target Group
    • Navigate to the AWS EC2 Console.
    • Click Target Groups from the left menu.
    • Click Create target group to start creating a target group.
    • Step 1: Basic configuration
      • Choose a target type: Instances
      • Target group name: tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090
      • Protocol: Port: HTTP : 30090
      • IP address type: IPv4
      • Select VPC: {AWS_VPC_NAME}
      • Protocol version: HTTP1
    • Step 2: Health checks
      • Health check protocol: HTTP
      • Path: /
    • Click Next and then Create target group to complete.
    • Create another target group similarly, with the following name and port:
      • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050
        • Protocol: Port: HTTP : 30050

  • Set Target Group Attachments

    • Navigate to the AWS EKS Console.
    • Click on {AWS_CLUSTER_NAME}.
    • Go to the Compute tab.
    • In the Node groups section of the Compute tab, click {PROJECT_NODEGROUP_NAME}.
    • In the Autoscaling group name section of the Details tab, click the Auto Scaling group (ASG) resource.
    • In the Auto Scaling groups section, under the Details tab, go to Load balancing and click Edit.
    • Step 1: Load balancing
      • Check Application, Network, or Gateway Load Balancer target groups.
      • From the Load balancers list, select the following:
        • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090
        • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050
      • Click Update to finalize the settings.
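
The target group names in this section all follow one pattern, so they can be generated rather than typed by hand. The sketch below assumes the variables from the Detailed Steps section plus AWS_CLUSTER_VERSION are exported; the helper name tg_name is hypothetical.

```shell
# Hypothetical helper: build a target group name from the pattern used above.
# $1 = service code (pm for Prometheus, gf for Grafana), $2 = NodePort.
tg_name() {
  # Remove the dots from the cluster version.
  version_num=$(printf '%s' "${AWS_CLUSTER_VERSION}" | tr -d '.')
  printf 'tg-%s-%s-%s-mkd-%s-%s-%s\n' \
    "${AWS_DEFAULT_REGION_ALIAS}" "${INFRA_NAME}" "${DEPLOY_ENV}" \
    "$1" "${version_num}" "$2"
}

# Usage:
# tg_name pm 30090   # Prometheus target group
# tg_name gf 30050   # Grafana target group
```

AWS limits target group names to 32 characters, so it may be worth checking the length of the generated name before creating the resource.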

1-3. Configure Resource Monitoring ALB

For detailed instructions, refer to the Configure ALB page.

  • Configure Listener Rules for the ALB

    • From the list of Load Balancers, click on the resource you created.
    • Under the Listeners and rules tab, click the rule count link in the Rules section (shown as "6 rules" in this example).
    • Click Add rule.
    • Create one rule for each service with the following settings:

    | Configuration          | mellerikat Prometheus     | mellerikat Grafana        |
    |------------------------|---------------------------|---------------------------|
    | Name and tags : Name   | mellerikat Prometheus     | mellerikat Grafana        |
    | Conditions             | Add condition             | Add condition             |
    | Conditions : Type      | Host header               | Host header               |
    | Conditions : Value     | aicond-prom.{DOMAIN_NAME} | aicond-mon.{DOMAIN_NAME}  |
    | Conditions             | Confirm                   | Confirm                   |
    | Conditions             | Add condition             | Add condition             |
    | Conditions : Type      | Path                      | Path                      |
    | Conditions : Value     | /*                        | /*                        |
    | Conditions             | Confirm                   | Confirm                   |
    | Conditions             | Next                      | Next                      |
    | Actions : Type         | Forward to target groups  | Forward to target groups  |
    | Actions : Target group | tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090 | tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050 |
    | Actions                | Next                      | Next                      |
    | Rule : Priority        | 600                       | 700                       |
    | Rule                   | Next                      | Next                      |
    | Create                 | Create                    | Create                    |
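
Because the rules match on the Host header, a rule can be exercised before DNS is in place by sending the header explicitly. The sketch below assumes curl is available and that the listener is reachable at the base URL you pass in; the helper name alb_status and the example base URL are hypothetical.

```shell
# Hypothetical helper: request "/" through the ALB with a given Host header
# and print only the HTTP status code.
# $1 = base URL of the listener (assumed), $2 = Host header value.
alb_status() {
  curl -s -o /dev/null -w '%{http_code}' -H "Host: $2" "$1/"
}

# Usage (replace the base URL with your load balancer's DNS name):
# alb_status "http://<alb-dns-name>" "aicond-prom.${DOMAIN_NAME}"
# alb_status "http://<alb-dns-name>" "aicond-mon.${DOMAIN_NAME}"
```

The status codes are only meaningful once Prometheus and Grafana are deployed in section 2; before that, the target groups have no healthy targets.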

1-4. Create Resource Monitoring Namespace

  • Create the namespace.

    kubectl create namespace mellerikat-monitor
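
The create command returns an error if the namespace already exists, which matters when the procedure is rerun. One possible idempotent variant is sketched below using kubectl's client-side dry run piped into apply; the helper name ensure_namespace is hypothetical.

```shell
# Hypothetical helper: create the namespace, or leave it unchanged if it exists.
# The first kubectl call only renders the manifest; the second applies it.
ensure_namespace() {
  kubectl create namespace "$1" --dry-run=client -o yaml | kubectl apply -f -
}

# Usage:
# ensure_namespace mellerikat-monitor
```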

1-5. Create Resource Monitoring Storage Class

  • Create the create-storage-class-monitoring.yaml file.

    cat <<EOT > create-storage-class-monitoring.yaml
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: prometheus-sc
    provisioner: kubernetes.io/aws-ebs
    parameters:
      type: gp2
      fsType: ext4
    reclaimPolicy: Retain
    allowVolumeExpansion: true
    volumeBindingMode: WaitForFirstConsumer
    EOT
  • Create the Storage Class.

    kubectl apply -f create-storage-class-monitoring.yaml

    # Output
    $ kubectl get storageclass -A
    NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    prometheus-sc   kubernetes.io/aws-ebs   Retain          WaitForFirstConsumer   true                   6s
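
For a scripted check, the same fields can be read with a jsonpath query instead of the table output. This is a sketch only; the helper name show_storage_class is hypothetical, and it assumes kubectl is configured for the cluster.

```shell
# Hypothetical helper: print provisioner, reclaim policy, and binding mode
# of a StorageClass on a single line (defaults to prometheus-sc).
show_storage_class() {
  kubectl get storageclass "${1:-prometheus-sc}" \
    -o jsonpath='{.provisioner} {.reclaimPolicy} {.volumeBindingMode}'
}

# Usage:
# show_storage_class
```

For the manifest above, the three values would be expected to match the PROVISIONER, RECLAIMPOLICY, and VOLUMEBINDINGMODE columns of the output.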


2. Deploy Prometheus & Grafana

2-1. Configure Helm

Refer to the Helm Installation page for more details.

  • Install Helm

    # Download Helm
    wget https://get.helm.sh/helm-v3.16.1-linux-amd64.tar.gz

    # Unpack
    tar -zxvf helm-v3.16.1-linux-amd64.tar.gz

    # Copy the binary onto the PATH (may require sudo)
    cp linux-amd64/helm /usr/local/bin/helm

    # Check Helm version
    helm version
    # version.BuildInfo{Version:"v3.16.1", GitCommit:"5a5449dc42be07001fd5771d56429132984ab3ab", GitTreeState:"clean", GoVersion:"go1.22.7"}
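
If Helm is already installed on the machine, it may be enough to confirm its major version rather than reinstall. The sketch below parses the string printed by helm version --short; the helper name helm_major is hypothetical.

```shell
# Hypothetical helper: extract the major version from a Helm version string
# such as "v3.16.1" or "v3.16.1+g5a5449d".
helm_major() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/\..*$//'
}

# Usage:
# [ "$(helm_major "$(helm version --short)")" = "3" ] || echo "unexpected Helm major version"
```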

2-2. Configure Prometheus & Grafana

  • Download the Prometheus Helm Chart.

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update

    # Download kube-prometheus-stack (pinned to the version this guide is based on)
    helm pull prometheus-community/kube-prometheus-stack --version 62.7.0

    tar xfz kube-prometheus-stack-62.7.0.tgz
    cd kube-prometheus-stack
  • Configure the values.yaml file of the Prometheus Helm Chart.

    • Apply the settings marked "# Change" or "# Add" as shown below.
    • The settings are based on chart version 62.7.0.
    alertmanager:
      # Change
      enabled: false

    ############## Prometheus ###################
    kube-state-metrics:
      # Add
      metricAnnotationsAllowList:
        - pods=[*]

    prometheus:
      prometheusSpec:
        # Change
        retention: 30d # Store data for up to 30 days

        # Change
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: prometheus-sc
              accessModes: ["ReadWriteOnce"]
              resources:
                requests:
                  storage: 50Gi

        # Change
        nodeSelector:
          aic-role: mellerikat-monitor

      # Change
      service:
        type: NodePort
        nodePort: 30090

    prometheusOperator:
      # Change
      nodeSelector:
        aic-role: mellerikat-monitor
      admissionWebhooks:
        deployment:
          # Change
          nodeSelector:
            aic-role: mellerikat-monitor
        patch:
          # Change
          nodeSelector:
            aic-role: mellerikat-monitor


    ############## Grafana ###################
    grafana:
      # Change
      defaultDashboardsEnabled: false

      # Change
      adminPassword: admin@com # Set the password of admin in Grafana

      # Add
      nodeSelector:
        aic-role: mellerikat-monitor

      # Change
      service:
        type: NodePort
        nodePort: 30050

      # Change
      persistence:
        enabled: true
        storageClassName: prometheus-sc
        accessModes:
          - ReadWriteOnce
        size: 20Gi
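
The two nodePort values must match the ports of the target groups from section 1-2 (30090 and 30050). One way to confirm that the edited values are picked up is to render the chart locally before installing. This is a sketch that assumes it runs from the kube-prometheus-stack directory; the helper name render_nodeports is hypothetical.

```shell
# Hypothetical helper: render the chart without installing it and show only
# the lines that set the two NodePorts used in this guide.
render_nodeports() {
  helm template mellerikat-monitor . -f values.yaml -n mellerikat-monitor \
    | grep -E 'nodePort: (30090|30050)'
}

# Usage (from the kube-prometheus-stack directory):
# render_nodeports
```

If either port is missing from the result, the corresponding service block in values.yaml is probably indented under the wrong key.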

2-3. Deploy Prometheus & Grafana

  • Deploy using Helm

    # path : kube-prometheus-stack
    helm install mellerikat-monitor . -f values.yaml -n mellerikat-monitor

    # Output
    NAME: mellerikat-monitor
    LAST DEPLOYED: Tue Jul 30 01:06:40 2024
    NAMESPACE: mellerikat-monitor
    STATUS: deployed
    REVISION: 1
    NOTES:
    kube-prometheus-stack has been installed. Check its status by running:
    kubectl --namespace mellerikat-monitor get pods -l "release=mellerikat-monitor"

    Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
  • To uninstall:

    # path : kube-prometheus-stack
    helm uninstall mellerikat-monitor -n mellerikat-monitor
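
To check the state of the release after installing, or to see what remains after uninstalling, the pods, services, and volume claims in the namespace can be listed together. The sketch below reuses the pod selector from the Helm NOTES output; the helper name show_monitoring_status is hypothetical, and it assumes kubectl points at the cluster where the chart was installed.

```shell
# Hypothetical helper: show pods, services, and volume claims of the release.
# $1 = namespace (defaults to the one created in section 1-4).
show_monitoring_status() {
  ns="${1:-mellerikat-monitor}"
  kubectl --namespace "$ns" get pods -l "release=mellerikat-monitor"
  kubectl --namespace "$ns" get svc
  kubectl --namespace "$ns" get pvc
}

# Usage:
# show_monitoring_status
```

Because prometheus-sc uses reclaimPolicy: Retain, volumes provisioned through it are expected to remain after helm uninstall and may need to be cleaned up manually.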