Version: docs v25.02

Installing Resource Monitoring


Table of Contents

  1. Resource Monitoring Infrastructure Installation
  2. Prometheus & Grafana Installation


Detailed Steps

See the Terminology page for a detailed description of the {variables}.

  • Set the {variables}.

    export AWS_CLUSTER_NAME=
    export INFRA_NAME=
    export DEPLOY_ENV=
    export AWS_DEFAULT_REGION=
    export AWS_DEFAULT_REGION_ALIAS=
    export PROJECT_NODEGROUP_LABEL=mellerikat-monitor
    export PROJECT_NODEGROUP_NAME=ng-${AWS_DEFAULT_REGION_ALIAS}-${PROJECT_NODEGROUP_LABEL}
    export PROJECT_NODEGROUP_DESIRED_SIZE=1
    export PROJECT_NODEGROUP_MIN=1
    export PROJECT_NODEGROUP_MAX=2
    export PROJECT_NODEGROUP_EC2_NAME=m5.large
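After filling in the variables, you can confirm that the derived nodegroup name expands as expected. The sketch below uses an illustrative region alias (`apne2` is a placeholder, not a required value):

```shell
# Illustrative check: apne2 is a placeholder region alias.
export AWS_DEFAULT_REGION_ALIAS=apne2
export PROJECT_NODEGROUP_LABEL=mellerikat-monitor
export PROJECT_NODEGROUP_NAME=ng-${AWS_DEFAULT_REGION_ALIAS}-${PROJECT_NODEGROUP_LABEL}
echo "${PROJECT_NODEGROUP_NAME}"
# ng-apne2-mellerikat-monitor
```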

1. Resource Monitoring Infrastructure Installation

1-1. Create the Resource Monitoring Nodegroup

See the Add Nodegroup page for details.

  • Create create-nodegroup-monitoring.yaml, which defines the Nodegroup.

    NOTE: the "propagateASGTags: true" setting is required.

    cat <<EOT > create-nodegroup-monitoring.yaml
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    managedNodeGroups:
    - amiFamily: AmazonLinux2
      desiredCapacity: ${PROJECT_NODEGROUP_DESIRED_SIZE}
      disableIMDSv1: false
      disablePodIMDS: false
      iam:
        withAddonPolicies:
          albIngress: false
          appMesh: false
          appMeshPreview: false
          autoScaler: true
          awsLoadBalancerController: false
          certManager: false
          cloudWatch: false
          ebs: false
          efs: false
          externalDNS: false
          fsx: false
          imageBuilder: false
          xRay: false
      instanceSelector: {}
      instanceType: ${PROJECT_NODEGROUP_EC2_NAME}
      labels:
        aic-role: ${PROJECT_NODEGROUP_LABEL}
        alpha.eksctl.io/cluster-name: ${AWS_CLUSTER_NAME}
        alpha.eksctl.io/nodegroup-name: ${PROJECT_NODEGROUP_NAME}
      maxSize: ${PROJECT_NODEGROUP_MAX}
      minSize: ${PROJECT_NODEGROUP_MIN}
      name: ${PROJECT_NODEGROUP_NAME}
      availabilityZones: ["${AWS_DEFAULT_REGION}a", "${AWS_DEFAULT_REGION}c"]
      privateNetworking: true
      releaseVersion: ""
      securityGroups:
        withLocal: null
        withShared: null
      ssh:
        allow: false
        publicKeyPath: ""
      tags:
        alpha.eksctl.io/nodegroup-name: ${PROJECT_NODEGROUP_NAME}
        alpha.eksctl.io/nodegroup-type: managed
      volumeIOPS: 3000
      volumeSize: 50
      volumeThroughput: 125
      volumeType: gp3
      propagateASGTags: true
    metadata:
      name: ${AWS_CLUSTER_NAME}
      region: ${AWS_DEFAULT_REGION}
    EOT
  • Create the Nodegroup with the command below.

    eksctl create nodegroup --config-file=create-nodegroup-monitoring.yaml

    Troubleshooting: 'AccessConfig'

    error getting cluster stack template: failed to parse GetStackTemplate response: json: unknown field "AccessConfig

    Update eksctl, then run the command again.

1-2. Create the Resource Monitoring Target Groups

See the Create and configure a target group page for details.

  • Two Target Groups are required; their purposes are as follows.

    • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090 : Prometheus
    • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050 : Grafana
    export AWS_CLUSTER_VERSION_NUM=$(echo ${AWS_CLUSTER_VERSION} | tr -d '.')
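As an illustration of how the name pieces expand, the sketch below uses hypothetical values (`1.30`, `apne2`, `demo`, and `dev` are placeholders for your own environment):

```shell
# All values below are illustrative placeholders.
AWS_CLUSTER_VERSION=1.30
AWS_CLUSTER_VERSION_NUM=$(echo "${AWS_CLUSTER_VERSION}" | tr -d '.')
AWS_DEFAULT_REGION_ALIAS=apne2
INFRA_NAME=demo
DEPLOY_ENV=dev
# Composed Prometheus target-group name:
echo "tg-${AWS_DEFAULT_REGION_ALIAS}-${INFRA_NAME}-${DEPLOY_ENV}-mkd-pm-${AWS_CLUSTER_VERSION_NUM}-30090"
# tg-apne2-demo-dev-mkd-pm-130-30090
```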

  • Create the Target Groups
    • Go to the AWS EC2 Console.
    • Click Target Groups in the left-hand menu.
    • Click the Create target group button to start creating a target group.
    • Step 1: Basic configuration
      • Choose a target type: Instances
      • Target group name: tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090
      • Protocol : Port : HTTP : 30090
      • IP address type : IPv4
      • VPC : select {AWS_VPC_NAME}
      • Protocol version : HTTP1
    • Step 2: Health checks
      • Health check protocol: HTTP
      • Path: /
    • Click the Next button.
    • Click the Create target group button to finish creating the target group.
    • Create the target group below in the same way, noting that the last five digits of the target group name must be used as its Port.
      • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050
        • Protocol : Port : HTTP : 30050

  • Attach the Target Groups

    • Go to the AWS EKS Console.
    • Click {AWS_CLUSTER_NAME}.
    • Click the Compute tab.
    • In the Node groups section of the Compute tab, click {PROJECT_NODEGROUP_NAME}.
    • In the Autoscaling group name section of the Details tab, click the ASG resource.
    • In the Load balancing section of the Auto Scaling group's Details tab, click the Edit button.
    • Step 1: Load balancing
      • Check Application, Network or Gateway Load Balancer target groups
      • Select the two target groups below.
        • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090
        • tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050
      • Click the Update button to finish.
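If you prefer scripting over the console, the same attachment can be done with the AWS CLI's `attach-load-balancer-target-groups` subcommand. The sketch below only composes and prints the command rather than running it (the ASG name and target-group ARNs are placeholders):

```shell
# Dry-run sketch: composes and prints the AWS CLI command instead of executing it.
# ASG_NAME and the ARNs are placeholders; substitute your own resources.
ASG_NAME="<your-autoscaling-group-name>"
TG_ARNS="<prometheus-target-group-arn> <grafana-target-group-arn>"
CMD="aws autoscaling attach-load-balancer-target-groups --auto-scaling-group-name ${ASG_NAME} --target-group-arns ${TG_ARNS}"
echo "${CMD}"
```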

1-3. Configure the Resource Monitoring ALB

See the Configure the ALB page for details.

  • Configure the ALB's Listener rules

    • In the Load balancers list, click the resource you created.
    • In the Listeners and rules tab, click the rule count (e.g. 6 rules) in the Rules column.
    • Click the Add rule button and create the two rules below, one per column.

    Setting                          | mellerikat Prometheus                                                                          | mellerikat Grafana
    Name and tags : Name             | mellerikat Prometheus                                                                          | mellerikat Grafana
    Conditions                       | Add condition                                                                                  | Add condition
    Conditions : rule condition type | Host header                                                                                    | Host header
    Conditions : Value               | aicond-prom.{DOMAIN_NAME}                                                                      | aicond-mon.{DOMAIN_NAME}
    Conditions                       | Confirm                                                                                        | Confirm
    Conditions                       | Add condition                                                                                  | Add condition
    Conditions : rule condition type | Path                                                                                           | Path
    Conditions : Value               | /*                                                                                             | /*
    Conditions                       | Confirm                                                                                        | Confirm
    Conditions                       | Next                                                                                           | Next
    Actions : Routing actions        | Forward to target groups                                                                       | Forward to target groups
    Actions : Target group           | tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-pm-{AWS_CLUSTER_VERSION_NUM}-30090 | tg-{AWS_DEFAULT_REGION_ALIAS}-{INFRA_NAME}-{DEPLOY_ENV}-mkd-gf-{AWS_CLUSTER_VERSION_NUM}-30050
    Actions                          | Next                                                                                           | Next
    Rule : Priority                  | 600                                                                                            | 700
    Rule                             | Next                                                                                           | Next
    Create                           | Create                                                                                         | Create

1-4. Create the Resource Monitoring Namespace

kubectl create namespace mellerikat-monitor

1-5. Create the Resource Monitoring Storage Class

  • Create create-storage-class-monitoring.yaml.

    cat <<EOT > create-storage-class-monitoring.yaml
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: prometheus-sc
    provisioner: kubernetes.io/aws-ebs
    parameters:
      type: gp2
      fsType: ext4
    reclaimPolicy: Retain
    allowVolumeExpansion: true
    volumeBindingMode: WaitForFirstConsumer
    EOT
  • Create the Storage Class.

    kubectl apply -f create-storage-class-monitoring.yaml

    # output
    $ kubectl get storageclass -A
    NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    prometheus-sc   kubernetes.io/aws-ebs   Retain          WaitForFirstConsumer   true                   6s


2. Prometheus & Grafana Installation

2-1. Configure the Helm Environment

See the Install Helm page for details.

  • Install Helm

    • Check the version on the Helm releases page.
    # download helm
    wget https://get.helm.sh/helm-v3.16.1-linux-amd64.tar.gz

    # unpack
    tar -zxvf helm-v3.16.1-linux-amd64.tar.gz

    cp linux-amd64/helm /usr/local/bin/helm

    # check helm version
    helm version
    # version.BuildInfo{Version:"v3.16.1", GitCommit:"5a5449dc42be07001fd5771d56429132984ab3ab", GitTreeState:"clean", GoVersion:"go1.22.7"}

2-2. Configure Prometheus & Grafana

  • Download the Prometheus Helm Chart.

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update

    # download kube-prometheus-stack (pin the version this guide was written against)
    helm pull prometheus-community/kube-prometheus-stack --version 62.7.0

    tar xfz kube-prometheus-stack-62.7.0.tgz
    cd kube-prometheus-stack
  • Configure the Prometheus Helm Chart's values.yaml.

    • Edit values.yaml as shown below.
    • Written against chart version 62.7.0.

    alertmanager:
      # Change
      enabled: false

    ############## Prometheus ###################
    kube-state-metrics:
      # Add
      metricAnnotationsAllowList:
        - pods=[*]

    prometheus:
      prometheusSpec:
        # Change
        retention: 30d # keep at most 30 days of data

        # Change
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: prometheus-sc
              accessModes: ["ReadWriteOnce"]
              resources:
                requests:
                  storage: 50Gi

        # Change
        nodeSelector:
          aic-role: mellerikat-monitor

      # Change
      service:
        type: NodePort
        nodePort: 30090

    prometheusOperator:
      # Change
      nodeSelector:
        aic-role: mellerikat-monitor
      admissionWebhooks:
        deployment:
          # Change
          nodeSelector:
            aic-role: mellerikat-monitor
        patch:
          # Change
          nodeSelector:
            aic-role: mellerikat-monitor

    ############## Grafana ###################
    grafana:
      # Change
      defaultDashboardsEnabled: false

      # Change
      adminPassword: admin@com # Grafana admin password

      # Add
      nodeSelector:
        aic-role: mellerikat-monitor

      # Change
      service:
        type: NodePort
        nodePort: 30050

      # Change
      persistence:
        enabled: true
        storageClassName: prometheus-sc
        accessModes:
          - ReadWriteOnce
        size: 20Gi

2-3. Deploy Prometheus & Grafana

  • Deploy with Helm

    # path : kube-prometheus-stack
    helm install mellerikat-monitor . -f values.yaml -n mellerikat-monitor

# output
    NAME: mellerikat-monitor
    LAST DEPLOYED: Tue Jul 30 01:06:40 2024
    NAMESPACE: mellerikat-monitor
    STATUS: deployed
    REVISION: 1
    NOTES:
    kube-prometheus-stack has been installed. Check its status by running:
    kubectl --namespace mellerikat-monitor get pods -l "release=mellerikat-monitor"

    Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.

  • Note: uninstalling

    # path : kube-prometheus-stack
    helm uninstall mellerikat-monitor -n mellerikat-monitor
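One caveat: the Prometheus PVC created from the `volumeClaimTemplate` is not removed by `helm uninstall`, and with `reclaimPolicy: Retain` the backing volumes also survive deletion. A cleanup sketch follows (the commands are printed rather than executed, and `<pvc-name>` is a placeholder):

```shell
# PVCs provisioned via storageSpec survive "helm uninstall"; with
# reclaimPolicy: Retain the PVs persist as well. List and delete them
# manually if the volumes should be reclaimed.
# Printed as strings here (dry run); run them against your cluster.
NS=mellerikat-monitor
echo "kubectl get pvc -n ${NS}"
echo "kubectl delete pvc <pvc-name> -n ${NS}"
```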