EVA Agent Dependencies
This guide explains how to install the foundational dependency packages required to run the EVA Agent.
Before installing the EVA Agent itself, you must first set up the supporting infrastructure: Qdrant (a vector DB) for data storage and vLLM for model inference.
Understanding the Installation Structure
To ensure a successful installation, review the dependencies among the packages and the order in which they must be installed.
- eva-agent-init: Defines the Storage Class. (Must be installed first)
- qdrant / vllm: Use the storage class defined above to store data.
- eva-agent: Installed last, after the above services are fully prepared.
Prerequisites
Please verify that the required CLI tools are installed.
- kubectl: Cluster control tool [Installation](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/)
- helm: Package management tool [Installation](https://helm.sh/docs/intro/install/)
- kustomize: Configuration customization tool (required for post-rendering) [Installation](https://kubectl.docs.kubernetes.io/installation/kustomize/binaries/)
Register and Update Helm Repositories
Register all necessary open-source repositories and keep them up to date. Also set up the namespace and service account for EVA Agent installation in advance.
# 1. Add each repository
helm repo add qdrant https://qdrant.github.io/qdrant-helm
helm repo add vllm https://vllm-project.github.io/production-stack
helm repo add eva-agent https://mellerikat.github.io/eva-agent
# 2. Update to the latest information
helm repo update
# 3. Create the namespace and service account for EVA Agent
kubectl create namespace eva-agent
kubectl create serviceaccount sa-eva-agent -n eva-agent
kubectl -n eva-agent patch serviceaccount sa-eva-agent \
-p '{"imagePullSecrets":[{"name":"ecr-pull"}]}'
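After running the commands above, you can verify that the repositories, namespace, and service account are in place. Note that the patch only references an image pull secret named `ecr-pull`; the secret itself must already exist in the eva-agent namespace (created separately, e.g. with `kubectl create secret docker-registry`).

```shell
# Verify the Helm repositories were registered
helm repo list | grep -E 'qdrant|vllm|eva-agent'

# Verify the namespace and service account, and that the pull secret is attached
kubectl get namespace eva-agent
kubectl get serviceaccount sa-eva-agent -n eva-agent \
  -o jsonpath='{.imagePullSecrets[0].name}'   # should print: ecr-pull
```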
Step 1: Install eva-agent-init
This package pre-defines a common Storage Class so that Qdrant and vLLM (installed in later steps) can store data reliably.
- Configuration & Version Reference:
- values Template: You can check environment-specific configuration files in the GitHub Repository.
- Check Chart Version: Refer to the env.tpl file for the required version information.
- Package Role:
- It configures a dedicated storage class to ensure `eva-agent-vllm` and `eva-agent-qdrant` can store data safely.
- Therefore, it must be installed before any other packages.
Download Configuration Files (eva-agent-init)
Download the configuration files below from the GitHub repository. Choose the file that matches your environment.
Values templates are organized by EVA Agent version (image tag). Helm charts and images use separate versions.
- k3s values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-init/values-k3s.yaml
- AWS values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-init/values-aws.yaml
- NCP values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-init/values-ncp.yaml
# Example: Download k3s values
mkdir eva-agent-init
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-init/values-k3s.yaml" \
-o eva-agent-init/values-k3s.yaml
# Example: Install for k3s environment
helm install eva-agent-init eva-agent/eva-agent-init \
--version=1.0.0 \
-n eva-agent \
-f eva-agent-init/values-k3s.yaml
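After the install completes, it is worth confirming that the release deployed and that the expected StorageClass exists. The names created by eva-agent-init depend on your values file; `eva-agent-sc-fs` is the name referenced later in this guide's vLLM configuration table.

```shell
# Check the release status
helm status eva-agent-init -n eva-agent

# List StorageClasses and confirm the ones defined by eva-agent-init are present
kubectl get storageclass
```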
Step 2: Install eva-agent-qdrant
Install the Qdrant DB to store vector data.
- Configuration & Version Reference:
- values Template: Find the necessary templates in the GitHub Repository.
- Check Chart Version: Refer to the env.tpl file for the exact version required.
- Detailed Value Descriptions: Detailed parameters for the Qdrant Helm chart can be found on the Artifact Hub Qdrant page.
- Data Management Notes:
- PVCs and PVs are dynamically created using the Storage Class defined in `eva-agent-init`.
- Once created, PVs are not automatically deleted even if you uninstall the environment, in order to protect your data.
- To completely clean up storage, you must manually delete the PVCs and PVs using `kubectl`.
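As a sketch, a complete manual cleanup might look like the following. The resource names are placeholders; list the actual PVC and PV names first and substitute them.

```shell
# List the PVCs in the eva-agent namespace and the cluster's PVs
kubectl get pvc -n eva-agent
kubectl get pv

# Delete a PVC, then the PV it was bound to (replace the placeholders)
kubectl delete pvc <pvc-name> -n eva-agent
kubectl delete pv <pv-name>
```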
Update Settings for Your Environment
| Category | Name | Description | On-premise (K3s) | Cloud (AWS) |
|---|---|---|---|---|
| Storage | persistence.size | PVC size allocated to Qdrant | 10Gi | 10Gi |
| Storage | persistence.annotations | Annotations for PVC/PV (add if needed) | - | - |
| Storage | persistence.storageVolumeName | Volume identifier name (depends on environment) | eva-agent-qdrant-storage | eva-agent-qdrant-storage |
| Scheduling | nodeSelector | Node label selector to schedule only on specific nodes | - | eks.amazonaws.com/nodegroup: ng-an2-eva-agent-db |
Download Configuration Files (eva-agent-qdrant)
Download the values and post-renderer templates for Qdrant.
- Common values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/values.yaml
- k3s values template: Same as the common template
- AWS values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/values-aws.yaml
- Post renderer template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/post-renderer.sh
# Example: Install Qdrant on k3s
mkdir eva-agent-qdrant
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/values.yaml" \
-o eva-agent-qdrant/values.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/values-aws.yaml" \
-o eva-agent-qdrant/values-aws.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-qdrant/post-renderer.sh" \
-o eva-agent-qdrant/post-renderer.sh
chmod +x eva-agent-qdrant/post-renderer.sh
# For AWS, add "-f eva-agent-qdrant/values-aws.yaml"
helm install eva-agent-qdrant qdrant/qdrant \
--version=1.15.0 \
-n eva-agent \
-f eva-agent-qdrant/values.yaml \
--post-renderer eva-agent-qdrant/post-renderer.sh
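Once the release is up, you can check that Qdrant is healthy. The label selector and service name below assume the chart's defaults for a release named `eva-agent-qdrant`; verify them first with `kubectl get all -n eva-agent`.

```shell
# Confirm the Qdrant pod is running and ready
kubectl get pods -n eva-agent -l app.kubernetes.io/name=qdrant

# Port-forward the REST port (6333 by default) and hit the health endpoint
kubectl port-forward svc/eva-agent-qdrant 6333:6333 -n eva-agent &
curl http://localhost:6333/healthz
```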
Step 3: Install eva-agent-vllm
Install vLLM, the model inference server. (Requires Agent image version 2.2-a2.0 or higher)
- Configuration & Version Reference:
- values Template: Available in the GitHub Repository.
- Check Chart Version: Refer to the env.tpl file for the exact version value.
- Detailed Value Descriptions: More configuration options can be found on the Artifact Hub vLLM-stack page.
- Core Setting: Please double-check that the `nodeSelector` setting is correct to ensure smooth dynamic allocation of PVCs/PVs.

💡 Note: PVCs/PVs are created using the Storage Class defined in `eva-agent-init`. Existing PVs are preserved during reinstallation, but you must use `kubectl delete` for manual cleanup if you wish to remove them entirely.
Update Settings for Your Environment (vLLM)
The table below is based on a server with a single Nvidia L40s GPU.
| Category | Name | Description | On-premise(K3s) | Cloud(AWS) |
|---|---|---|---|---|
| Serving Engine | servingEngineSpec.runtimeClassName | Runtime class for GPU | "" | "" |
| Serving Engine | servingEngineSpec.modelSpec | List of models (engines) to serve | [] | [] |
| Model | servingEngineSpec.modelSpec.name | Model identifier (referenced in chart) | "qwen3-vl-8b-fp8" | "qwen3-vl-8b-fp8" |
| Model | servingEngineSpec.modelSpec.modelURL | HF model path | "Qwen/Qwen3-VL-8B-Instruct-FP8" | "Qwen/Qwen3-VL-8B-Instruct-FP8" |
| Model | servingEngineSpec.modelSpec.repository | vLLM image repo | "vllm/vllm-openai" | "vllm/vllm-openai" |
| Model | servingEngineSpec.modelSpec.tag | vLLM image tag | "v0.11.0" | "v0.11.0" |
| Auth | servingEngineSpec.modelSpec.serviceAccountName | ServiceAccount for the engine pod | sa-eva-agent | sa-eva-agent |
| Scale | servingEngineSpec.modelSpec.replicaCount | Engine replica count | 1 | 1 |
| Resources | servingEngineSpec.modelSpec.requestCPU | CPU request | 4 | 4 |
| Resources | servingEngineSpec.modelSpec.requestMemory | Memory request | "32Gi" | "28Gi" |
| Resources | servingEngineSpec.modelSpec.requestGPU | GPU request | 1 | 1 |
| Storage | servingEngineSpec.modelSpec.pvcStorage | PVC size | "30Gi" | "30Gi" |
| Storage | servingEngineSpec.modelSpec.storageClass | StorageClass name | "eva-agent-sc-fs" | "eva-agent-sc-fs" |
| Storage | servingEngineSpec.modelSpec.pvcAccessMode | PVC access mode | ["ReadWriteMany"] | ["ReadWriteMany"] |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.host | vLLM host address | "0.0.0.0" | "0.0.0.0" |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.tensorParallelSize | GPUs used per model | 1 | 1 |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.gpuMemoryUtilization | GPU memory utilization | 0.7 | 0.7 |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.maxModelLen | Max context length | 12288 | 12288 |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.dtype | Model dtype | "auto" | "auto" |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.enableChunkedPrefill | Enable chunked prefill | true | true |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.enablePrefixCaching | Enable prefix caching | true | true |
| vLLM | servingEngineSpec.modelSpec.vllmConfig.extraArgs | Additional vllm serve args | --served-model-name qwen3-vl-8b-fp8, --kv-cache-dtype fp8, --max-num-batched-tokens 4096 | --served-model-name qwen3-vl-8b-fp8, --kv-cache-dtype fp8, --max-num-batched-tokens 4096 |
| Scheduling | servingEngineSpec.modelSpec.nodeSelectorTerms | Engine pod scheduling constraints | - | matchExpressions: kubernetes.io/os In [linux] (AWS: eks nodegroup In [ng-an2-eva-agent-gpu]) |
⚠️ Warning: The `kv-cache-dtype` value in `vllmConfig.extraArgs` depends on the Nvidia GPU architecture. For pre-Ada architectures, set it to "auto".
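Putting the table above together, a values override for a server with a single L40s GPU might look like the sketch below. The field names follow the table in this guide; consult the vLLM production-stack chart documentation for the authoritative schema.

```yaml
servingEngineSpec:
  runtimeClassName: ""
  modelSpec:
    - name: "qwen3-vl-8b-fp8"
      repository: "vllm/vllm-openai"
      tag: "v0.11.0"
      modelURL: "Qwen/Qwen3-VL-8B-Instruct-FP8"
      serviceAccountName: sa-eva-agent
      replicaCount: 1
      requestCPU: 4
      requestMemory: "32Gi"        # 28Gi on AWS
      requestGPU: 1
      pvcStorage: "30Gi"
      storageClass: "eva-agent-sc-fs"
      pvcAccessMode:
        - ReadWriteMany
      vllmConfig:
        host: "0.0.0.0"
        tensorParallelSize: 1
        gpuMemoryUtilization: 0.7
        maxModelLen: 12288
        dtype: "auto"
        enableChunkedPrefill: true
        enablePrefixCaching: true
        extraArgs:
          - "--served-model-name"
          - "qwen3-vl-8b-fp8"
          - "--kv-cache-dtype"    # set to "auto" on pre-Ada GPUs
          - "fp8"
          - "--max-num-batched-tokens"
          - "4096"
```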
Download Configuration Files (eva-agent-vllm)
Download the values and post-renderer templates for vLLM.
Among the templates below, the common values.yaml and values-k3s.yaml are based on Nvidia A6000 GPUs, while values-aws.yaml is based on a server with a single Nvidia L40s GPU.
Update vLLM settings according to your GPU specs based on the table above.
- Common values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values.yaml
- k3s values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values-k3s.yaml
- AWS values template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values-aws.yaml
- Post renderer template: https://github.com/mellerikat/eva-agent/blob/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/post-renderer.sh
# Example: Download vLLM configuration files for k3s
mkdir eva-agent-vllm
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values.yaml" \
-o eva-agent-vllm/values.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/values-k3s.yaml" \
-o eva-agent-vllm/values-k3s.yaml
curl -L \
"https://raw.githubusercontent.com/mellerikat/eva-agent/chartmuseum/release/2.3-a3.0/1/eva-agent-vllm/post-renderer.sh" \
-o eva-agent-vllm/post-renderer.sh
chmod +x eva-agent-vllm/post-renderer.sh
# For AWS, change to "-f eva-agent-vllm/values-aws.yaml"
helm install eva-agent-vllm vllm/vllm-stack \
--version=0.1.7 \
-n eva-agent \
-f eva-agent-vllm/values.yaml \
-f eva-agent-vllm/values-k3s.yaml \
--post-renderer eva-agent-vllm/post-renderer.sh
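After installation, you can smoke-test the deployment through vLLM's OpenAI-compatible API. The router service name below is an assumption based on the chart's naming convention; check `kubectl get svc -n eva-agent` for the actual name.

```shell
# Find the service exposing the OpenAI-compatible API
kubectl get svc -n eva-agent

# Port-forward (replace the service name with the one listed above) and query it
kubectl port-forward svc/eva-agent-vllm-router-service 8000:80 -n eva-agent &
curl http://localhost:8000/v1/models

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3-vl-8b-fp8", "messages": [{"role": "user", "content": "Hello"}]}'
```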