
EVA Agent Dependencies

This guide explains how to install the foundational dependency packages required to run the EVA Agent.

Before installing the EVA Agent itself, you must first set up the supporting infrastructure: Qdrant (a vector DB) for data storage and vLLM for model inference.




Understanding the Installation Structure

To ensure a successful installation, review the dependencies and installation order of the packages below.

  1. eva-agent-init: Defines the Storage Class (must be installed first).
  2. qdrant / vllm: Use the Storage Class defined above to store their data.
  3. eva-agent: Installed last, after the services above are fully prepared.



Prerequisites

Please verify that the required CLI tools, helm and kubectl, are installed.
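
Both helm and kubectl are used throughout this guide. A quick sketch to confirm they are on the PATH (checks the binaries only; no cluster access required):

```shell
# Check that the CLI tools used in this guide are available on the PATH.
for tool in helm kubectl; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found ($(command -v "$tool"))"
  else
    echo "$tool: NOT FOUND - install it before continuing"
  fi
done
```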




Register and Update Helm Repositories

Register all required open-source Helm repositories and update them to the latest index. Also create the namespace and service account for the EVA Agent installation in advance.

# 1. Add each repository
helm repo add qdrant https://qdrant.github.io/qdrant-helm
helm repo add vllm https://vllm-project.github.io/production-stack
helm repo add eva-agent https://mellerikat.github.io/eva-agent

# 2. Update to the latest information
helm repo update

# 3. Create the namespace and service account for EVA Agent
kubectl create namespace eva-agent
kubectl create serviceaccount sa-eva-agent -n eva-agent
kubectl -n eva-agent patch serviceaccount sa-eva-agent \
-p '{"imagePullSecrets":[{"name":"ecr-pull"}]}'



Step 1: Install eva-agent-init

This package pre-defines a common Storage Class so that Qdrant and vLLM (installed later) can reliably store data; installing it first is critical.

  • Configuration & Version Reference:
      • values template: You can check environment-specific configuration files in the GitHub Repository.
      • Chart version: Refer to the env.tpl file for the required version information.
  • Package Role:
      • Configures a dedicated storage class to ensure eva-agent-vllm and eva-agent-qdrant can store data safely.
      • Therefore, it must be installed before any other packages.

Download Configuration Files (eva-agent-init)

Download the configuration files for this step from the GitHub Repository. Choose the file that matches your environment.

Values templates are organized by EVA Agent version (image tag). Helm charts and images use separate versions.

# Example: Download k3s values
mkdir eva-agent-init
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-init/values-k3s.yaml" \
-o eva-agent-init/values-k3s.yaml

# Example: Install for k3s environment
helm install eva-agent-init eva-agent/eva-agent-init \
--version=1.0.0 \
-n eva-agent \
-f eva-agent-init/values-k3s.yaml
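
Before moving on to Step 2, it can help to confirm the release installed and the storage class exists. A sketch, guarded so it is a no-op on a machine without the CLIs (the storage class name eva-agent-sc-fs is taken from the vLLM settings table later in this guide; verify it against your values file):

```shell
# Verify the eva-agent-init release and its storage class (skips when CLIs are absent).
if command -v helm >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1; then
  helm status eva-agent-init -n eva-agent || echo "release not found"
  kubectl get storageclass eva-agent-sc-fs || echo "storage class not found (check your values file)"
else
  echo "helm/kubectl not found; run this on the cluster host"
fi
```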



Step 2: Install eva-agent-qdrant

Install the Qdrant DB to store vector data.

  • Configuration & Version Reference:
      • values template: Find the necessary templates in the GitHub Repository.
      • Chart version: Refer to the env.tpl file for the exact version required.
      • Detailed value descriptions: Detailed parameters for the Qdrant Helm chart can be found on the Artifact Hub Qdrant page.
  • Data Management Notes:
      • PVCs and PVs are dynamically created using the Storage Class defined in eva-agent-init.
      • Once created, PVs are not automatically deleted even if you uninstall the environment, in order to protect your data.
      • To completely clean up storage, you must manually delete the PVCs and PVs using kubectl.
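
A manual cleanup sketch, guarded so it does nothing without kubectl. The resource names are intentionally left as placeholders; list your PVCs first and substitute the real names:

```shell
# Manual storage cleanup for Qdrant (destructive - review before running).
if command -v kubectl >/dev/null 2>&1; then
  # List PVCs in the namespace to find the ones created for Qdrant.
  kubectl get pvc -n eva-agent || true
  # After confirming the names, uncomment and fill in:
  # kubectl delete pvc <pvc-name> -n eva-agent
  # kubectl delete pv <pv-name>
  echo "Uncomment the delete commands above after confirming the resource names."
else
  echo "kubectl not found; run this on the cluster host"
fi
```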

Update Settings for Your Environment

| Category | Name | Description | On-premise (K3s) | Cloud (AWS) |
| --- | --- | --- | --- | --- |
| Storage | persistence.size | PVC size allocated to Qdrant | 10Gi | 10Gi |
| Storage | persistence.annotations | Annotations for PVC/PV (add if needed) | | |
| Storage | persistence.storageVolumeName | Volume identifier name (depending on environment) | eva-agent-qdrant-storage | eva-agent-qdrant-storage |
| Scheduling | nodeSelector | Node label selector to schedule only on specific nodes | | eks.amazonaws.com/nodegroup: ng-an2-eva-agent-db |

Download Configuration Files (eva-agent-qdrant)

Download the values and post-renderer templates for Qdrant.

# Example: Install Qdrant on k3s
mkdir eva-agent-qdrant
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/values.yaml" \
-o eva-agent-qdrant/values.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/values-aws.yaml" \
-o eva-agent-qdrant/values-aws.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/post-renderer.sh" \
-o eva-agent-qdrant/post-renderer.sh
chmod +x eva-agent-qdrant/post-renderer.sh

# For AWS, add "-f eva-agent-qdrant/values-aws.yaml"
helm install eva-agent-qdrant qdrant/qdrant \
--version=1.15.0 \
-n eva-agent \
-f eva-agent-qdrant/values.yaml \
--post-renderer eva-agent-qdrant/post-renderer.sh




Step 3: Install eva-agent-vllm

Install vLLM, the model inference server. (Requires Agent image version 2.2-a2.0 or higher)

  • Configuration & Version Reference:
      • values template: Available in the GitHub Repository.
      • Chart version: Refer to the env.tpl file for the exact version value.
      • Detailed value descriptions: More configuration options can be found on the Artifact Hub vLLM-stack page.
  • Core Setting: Double-check that the nodeSelector setting is correct to ensure smooth dynamic allocation of PVCs/PVs.

💡 Note: PVCs/PVs are created using the Storage Class defined in eva-agent-init. Existing PVs are preserved during reinstallation, but you must use kubectl delete for manual cleanup if you wish to remove them entirely.

Update Settings for Your Environment (vLLM)

The table below is based on a server with a single Nvidia L40s GPU.

| Category | Name | Description | On-premise (K3s) | Cloud (AWS) |
| --- | --- | --- | --- | --- |
| Serving Engine | servingEngineSpec.runtimeClassName | Runtime class for GPU | "" | "" |
| Serving Engine | servingEngineSpec.modelSpec | List of models (engines) to serve | [] | [] |
| Model | servingEngineSpec.modelSpec.name | Model identifier (referenced in chart) | "qwen3-vl-8b-fp8" | "qwen3-vl-8b-fp8" |
| Model | servingEngineSpec.modelSpec.modelURL | HF model path | "Qwen/Qwen3-VL-8B-Instruct-FP8" | "Qwen/Qwen3-VL-8B-Instruct-FP8" |
| Model | servingEngineSpec.modelSpec.repository | vLLM image repo | "vllm/vllm-openai" | "vllm/vllm-openai" |
| Model | servingEngineSpec.modelSpec.tag | vLLM image tag | "v0.11.0" | "v0.11.0" |
| Auth | servingEngineSpec.modelSpec.serviceAccountName | ServiceAccount for the engine pod | sa-eva-agent | sa-eva-agent |
| Scale | servingEngineSpec.modelSpec.replicaCount | Engine replica count | 1 | 1 |
| Resources | servingEngineSpec.modelSpec.requestCPU | CPU request | 4 | 4 |
| Resources | servingEngineSpec.modelSpec.requestMemory | Memory request | "32Gi" | "28Gi" |
| Resources | servingEngineSpec.modelSpec.requestGPU | GPU request | 1 | 1 |
| Storage | servingEngineSpec.modelSpec.pvcStorage | PVC size | "30Gi" | "30Gi" |
| Storage | servingEngineSpec.modelSpec.storageClass | StorageClass name | "eva-agent-sc-fs" | "eva-agent-sc-fs" |
| Storage | servingEngineSpec.modelSpec.pvcAccessMode | PVC access mode | ["ReadWriteMany"] | ["ReadWriteMany"] |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.host | vLLM host address | "0.0.0.0" | "0.0.0.0" |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.tensorParallelSize | GPUs used per model | 1 | 1 |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.gpuMemoryUtilization | GPU memory utilization | 0.7 | 0.7 |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.maxModelLen | Max context length | 12288 | 12288 |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.dtype | Model dtype | "auto" | "auto" |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.enableChunkedPrefill | Enable chunked prefill | true | true |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.enablePrefixCaching | Enable prefix caching | true | true |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.extraArgs | Additional vllm serve args | --served-model-name qwen3-vl-8b-fp8 --kv-cache-dtype fp8 --max-num-batched-tokens 4096 | --served-model-name qwen3-vl-8b-fp8 --kv-cache-dtype fp8 --max-num-batched-tokens 4096 |
| Scheduling | servingEngineSpec.modelSpec.nodeSelectorTerms | Engine pod scheduling constraints | matchExpressions: kubernetes.io/os In [linux] | matchExpressions: eks.amazonaws.com/nodegroup In [ng-an2-eva-agent-gpu] |

⚠️ Warning: The kv-cache-dtype in vllmConfig.extraArgs depends on the Nvidia GPU architecture. For pre-Ada architectures, set it to "auto".
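
As a sketch, the relevant fragment of a vLLM values file for a pre-Ada GPU might look like the following. The field nesting is inferred from the parameter paths in the table above; confirm it against the downloaded template before use:

```yaml
servingEngineSpec:
  modelSpec:
    - name: "qwen3-vl-8b-fp8"
      vllmConfig:
        extraArgs:
          - "--served-model-name"
          - "qwen3-vl-8b-fp8"
          - "--kv-cache-dtype"
          - "auto"   # pre-Ada GPUs: use "auto" instead of "fp8"
          - "--max-num-batched-tokens"
          - "4096"
```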

Download Configuration Files (eva-agent-vllm)

Download the values and post-renderer templates for vLLM. Among the templates below, the common values.yaml and values-k3s.yaml are based on Nvidia A6000 GPUs, while values-aws.yaml is based on a single Nvidia L40s GPU server. Update the vLLM settings for your GPU specs using the table above.

# k3s & vLLM
mkdir eva-agent-vllm
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values.yaml" \
-o eva-agent-vllm/values.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values-k3s.yaml" \
-o eva-agent-vllm/values-k3s.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/post-renderer.sh" \
-o eva-agent-vllm/post-renderer.sh
chmod +x eva-agent-vllm/post-renderer.sh

# For AWS, change to "-f eva-agent-vllm/values-aws.yaml"
helm install eva-agent-vllm vllm/vllm-stack \
--version=0.1.7 \
-n eva-agent \
-f eva-agent-vllm/values.yaml \
-f eva-agent-vllm/values-k3s.yaml \
--post-renderer eva-agent-vllm/post-renderer.sh
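
Once the three dependency releases (eva-agent-init, eva-agent-qdrant, eva-agent-vllm) are installed, a quick status sketch before installing eva-agent itself (guarded so it is a no-op without the CLIs; assumes the eva-agent namespace from this guide):

```shell
# List Helm releases and pod status in the eva-agent namespace (skips when CLIs are absent).
if command -v helm >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1; then
  helm list -n eva-agent || true
  kubectl get pods -n eva-agent || true
else
  echo "helm/kubectl not found; run this on the cluster host"
fi
```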