EVA Agent Release Note
v2.3+a3.0 (Dec 26, 2025)
Highlights
- Large refactor to a LangGraph-based architecture: new `app/` layer (API/Core/Graph/Node/Provider/Schema) + `GraphRegistry`/`GraphExecutor` + YAML ConfigLoader with version resolution (895c9e6, 906205f, 6ed8471, 1d2a6b7).
- Visual Agent v1 overhaul: `basic_graph` runs 1-step and 2-step sub-graphs in parallel → consolidates results → optionally runs translation and image-description generation as a single end-to-end pipeline (6ed8471, 602bd70, c332d29, 9386313).
- Stronger Feedback ↔ RAG loop: add `/api/v1/agents/feedback/record` + extract vector-store helpers/DI + run vector search after visual query to correct alerts using historical feedback (38ecda7, 5b82132, d427f6b, 072f4de).
- Config system redesign: standardized `config/{agent}/{graph}/{node}/{model}/{version}.yaml` layout, add Qwen3-VL (FP8) configs, and keep legacy aliases (`qwen2.5-vl-32b-awq`) for compatibility (d3ff717, 114b51f, 9012cdc, 92aa51c).
- Packaging/deployment defaults updated: docker-compose default model switched to `Qwen3-VL-8B-Instruct-FP8` with additional vLLM performance flags; eva-agent image tag bumped to `2.3-a3.0` (948306b, 2464349, 286b14c).
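The fan-out/consolidate shape of the overhauled `basic_graph` can be sketched with plain `asyncio`; the node names, result fields, and consolidation policy below are illustrative stand-ins, not the actual graph-layer API.

```python
import asyncio

# Hypothetical stand-ins for the 1-step and 2-step sub-graphs; the real
# nodes live in the app graph layer and operate on richer state objects.
async def one_step_inference(image: str) -> dict:
    return {"mode": "1-step", "alert": False}

async def two_step_inference(image: str) -> dict:
    return {"mode": "2-step", "alert": True}

def consolidate(results: list) -> dict:
    # Illustrative policy: raise an alert if any sub-graph raised one.
    return {"alert": any(r["alert"] for r in results),
            "sources": [r["mode"] for r in results]}

async def basic_graph(image: str) -> dict:
    # Fan out both sub-graphs in parallel, then merge their outputs;
    # translation/image description would hang off the consolidated result.
    results = await asyncio.gather(one_step_inference(image),
                                   two_step_inference(image))
    return consolidate(list(results))

print(asyncio.run(basic_graph("frame.jpg")))
```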
Breaking Changes
- API schema/payload changes (important)
  - Visual query moved from the legacy single `vision.schemas.VisualQueryRequest/Response` flow to `app.schemas.visual_agent.VisualAgentRequest/Response` (`scenario_list`-based, with a unified `meta` model) (87a9d88, 8b0d795).
  - Feedback recording request schema is aligned around the unified `meta`, and the endpoint is exposed as `/api/v1/agents/feedback/record` (38ecda7, 5e7909c).
  - The unified `meta` schema is now enforced as Pydantic models in `app/schemas/common.py`, so the client `meta` payload shape is validated strictly (36bd1d7, 528b59d).
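The new request shape can be approximated with stdlib dataclasses; the real models are Pydantic classes under `app/schemas/`, and every field below other than `scenario_list` and `meta` (including the `meta` contents) is an assumption for illustration.

```python
from dataclasses import dataclass

@dataclass
class Meta:
    # Unified metadata envelope; the concrete fields live in
    # app/schemas/common.py — camera_id/timestamp here are assumptions.
    camera_id: str
    timestamp: str

@dataclass
class VisualAgentRequest:
    # scenario_list replaces the old single-scenario request shape.
    scenario_list: list
    meta: Meta
    image: str = ""  # assumption: name of the image payload field

req = VisualAgentRequest(
    scenario_list=["intrusion", "loitering"],
    meta=Meta(camera_id="cam-01", timestamp="2025-12-26T00:00:00Z"),
)
print(req.scenario_list)
```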
- Config path/file layout changes (important)
  - Legacy config trees like `config/vlm/**`, `config/enrich_scenario/**`, and `config/llm/prompts/conversational_agent_system_prompt.md` are removed/replaced; the server expects the new ConfigLoader path convention (dd066f5, de2c21f, eb784d0).
- Environment variable cleanup
  - Deprecated provider/env vars (e.g., `LLM_PROVIDER`, `LLM_MODEL`, `OLLAMA_BASE_URL`) were removed from `.env.eva-agent.sample`; deployments should align with the updated sample (ce29171, 0295587).
- Internal module path changes
  - Legacy code such as `src/messages/**` and `src/vision/detection_scenario/**` was removed and replaced by the new app layer; internal import paths changed significantly (c8c941b, 9590c9e, 983163f).
- Infra endpoint removed
  - The legacy `/infra/models` endpoint was removed (`main.py` refactor).
Features
- API Layer
  - Router modularization + docs: `app/api/v1/{visual,chat,enrich,feedback}` and dependency injection via `app/api/dependency.py` for RAG engines and vector-store config mapping (e103659, d427f6b).
  - Add a 422 validation exception handler and provide `health`/`healthz` endpoints (b86f7e3, 6f93f49).
- Visual Agent
  - Multi-scenario requests: `scenario_list` input and per-scenario results returned in `analysis_list` (64984df, 9a20f36, 87a9d88).
  - Vector-search-based alert correction: search feedback-backed few-shot points to adjust alerts; record embedding/search/correction steps in Langfuse spans (5b82132, 7c67670, 823fbd9).
  - Translation pipeline: when an alert triggers, translate only the candidates via TranslatorGraph, in parallel with image-description generation (eeaca38, a25b859, 9386313).
- Enrich Agent
  - Refactor enrich graphs to async nodes, strengthen output schema/fields, and reflect area/metadata more consistently (8d94d4e, 215e86f, 7cb6e13, 26f1554).
  - (Experimental) multi-scenario enrichment: classifier/decomposer-driven case splitting with reduced outputs (151242e, b2e2c04, 8561b5b).
- Providers/Model Catalog
  - (Internal) improve model listing per provider and consolidate catalog utilities (f49b626).
- Observability
  - Stronger Langfuse tracing for visual-agent/feedback/chat-agent: hierarchical spans, trace metadata updates, scenario context propagation (7fe4e95, 823fbd9, 072f4de, a586e70).
  - Content-aware image redaction: scrub not only base64/data URLs but also payloads that decode to known image-byte signatures (d66c356).
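The content-aware redaction can be illustrated with stdlib-only checks: decode a suspect string as base64 and look for known image magic bytes. The signature list and scrubbing behavior in d66c356 may differ; this is a minimal sketch.

```python
import base64
import binascii

# Common image magic bytes; the actual commit may check a different set.
IMAGE_SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")

def looks_like_image(payload: str) -> bool:
    """Return True if a string base64-decodes to known image bytes."""
    try:
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return False
    return raw.startswith(IMAGE_SIGNATURES)

def scrub(value: str) -> str:
    # Replace image-looking payloads before sending traces to Langfuse.
    return "[redacted image]" if looks_like_image(value) else value

png_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16).decode()
print(scrub(png_b64), scrub("hello"))
```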
- Docs
  - Add `README_DEV.md` and layer-specific guides for App/Core/Graph/Node (ee2e11e, dd066f5, ca1e1f2).
Fixes
- Visual Agent stability/correctness
  - Fix 2-step inference edge cases (exceptions), sorting, and output schema issues; stabilize scenario ordering (52749d9, eb1f359, 28e9322).
  - Fix area handling for `None` and language-related issues (e.g., forcing English for area strings) (2681f9a, fb04c00, 4f0c0cc).
  - Fix translation logic (duplicate generation/update bugs) and output formatting (5811063, d7157a2, 598cb01).
- Chat Agent UX/consistency
  - Improve unsupported request flow and prevent task type from leaking into system answers (c9111a5, ac8ca84).
  - Clarify detection interval units/messages, adjust false detection cutoff range, and fix target extraction/validation (023f53a, 7f87d11, c9acabe, dae25ea).
- RAG/Qdrant
  - Improve Qdrant URL/env handling and related utilities (e.g., base URL support) (`src/rag/vector_stores/qdrant/*` changes).
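The base-URL handling fix can be pictured as a small normalizer; the actual helper names and rules inside `src/rag/vector_stores/qdrant` are not shown in this note, so the default scheme and port below are assumptions.

```python
from urllib.parse import urlparse

def normalize_base_url(raw: str, default_port: int = 6333) -> str:
    """Normalize a Qdrant base URL taken from an env var.

    Assumptions: default to plain HTTP when no scheme is given, and to
    Qdrant's conventional port 6333 when no port is given."""
    url = raw.strip().rstrip("/")
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or default_port
    return f"{parsed.scheme}://{host}:{port}"

print(normalize_base_url("qdrant.local/"))
```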
Config & Deployment
- Docker Compose
  - Switch vLLM default model to Qwen3-VL (FP8) and add flags like KV cache dtype, chunked prefill, and prefix caching (948306b).
  - Update the eva-agent image version to `eva-agent:2.3-a3.0` (286b14c).
- Config
  - Add large sets of versioned YAML configs for Visual/Chat/Enrich/Translation agents under the new path convention (682d331, 9012cdc, 114b51f).
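The new path convention and version resolution can be sketched in a few lines; the ConfigLoader's actual resolution policy is not described in this note, so "highest numeric version wins when none is requested" is an assumption.

```python
from pathlib import PurePosixPath

def config_path(agent, graph, node, model, version):
    # New layout: config/{agent}/{graph}/{node}/{model}/{version}.yaml
    return PurePosixPath("config", agent, graph, node, model, version + ".yaml")

def resolve_version(available, requested=None):
    # Exact match when a version is requested; otherwise pick the highest
    # numeric version. The real ConfigLoader's policy may differ.
    if requested is not None:
        if requested not in available:
            raise KeyError(requested)
        return requested
    return max(available, key=lambda v: tuple(int(p) for p in v.lstrip("v").split(".")))

versions = ["2.2.0", "2.3.0", "2.1.2"]
print(config_path("visual", "basic_graph", "inference", "qwen3-vl-8b",
                  resolve_version(versions)))
```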
Tests & Quality
- Apply pre-commit formatting/style fixes (black/isort, trailing whitespace, etc.) (fdb7008).
Upgrade Notes
- API clients
  - Visual query/feedback/enrich/chat payloads are now based on the unified `meta` model; update clients using `app/schemas/*.py` as the source of truth.
  - Visual query now returns per-scenario results via `analysis_list` (driven by `scenario_list` input); update any legacy single-scenario response parsing.
- Config
  - Migrate any legacy `config/vlm/**` and `config/enrich_scenario/**` setups to the new convention: `config/{agent}/{graph}/{node}/{model}/{version}.yaml`.
- Env/Docker
  - Align environment variables with `.env.eva-agent.sample` (deprecated vars removed) and review vLLM/Agent settings after the docker-compose default model/options change.
  - For local runs (`python main.py`), the default port is `8888`; review any existing port/healthcheck assumptions.
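A quick, self-contained check for the removed variables can help during migration; the variable names come from this note, while the dotenv parsing below is deliberately minimal.

```python
# Env vars removed from .env.eva-agent.sample in this release.
DEPRECATED_VARS = {"LLM_PROVIDER", "LLM_MODEL", "OLLAMA_BASE_URL"}

def find_deprecated(env_text: str) -> list:
    """Return deprecated variable names still assigned in dotenv-style text."""
    found = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name = line.split("=", 1)[0].strip()
        if name in DEPRECATED_VARS:
            found.append(name)
    return found

sample = "LLM_PROVIDER=ollama\nAGENT_INTERNAL_PORT=8888\n# LLM_MODEL=old\n"
print(find_deprecated(sample))
```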
Changelog (selected commits)
- feat(api): add feedback recording endpoint — 38ecda7
- feat(api): inject rag deps for visual agent — d427f6b
- feat(api): add reusable vector-store helpers and visual-agent vector search — 5b82132
- refactor: chat agent to langgraph — 895c9e6
- package: update docker-compose default model & vllm config — 948306b
- feat(o11y): content-aware image redaction — d66c356
- package: update image version(tag) — 286b14c
v2.2+a2.0 (Oct 2, 2025)
Highlights
- Vision API v2 end-to-end wiring with typed envelopes and Langfuse v3 observability across visual, scenario-enrichment, and feedback flows (5c85cf9, 3f0cbd1, 28abd24, c90133a).
- New OpenAI provider and factory integration, alongside existing Ollama support (19dc1fb, ee7799e, 7a9daf9).
- RAG layer refactor: introduce AsyncEmbeddingEngine and AsyncVectorStoreEngine; drop legacy AsyncSearchEngine; add `ensure_space()` and named-vectors flow (a9df66b, 5638948, 4b571e3, d9fce6c).
- Conversation agent fully refactored to class-based tasks with improved intent/category routing and UX-oriented suggestions (f581c53, 6e45298, adf3676, 2acf2e1).
- Detection-scenario enrichment pipeline consolidated and renamed to “enrich_scenario”; camera_id handling relaxed to support pre-registration flows (3d8287f, 941db68, 2625ff8, 9c6595b).
- New, versioned configs for VLM (Qwen2.5-VL:32B v2.2.0) and RAG visual agent (v2.2.0) for consistent, reproducible environments (39aa3cd, 9d2865c, main.py).
Breaking Changes
- Vision API schemas
  - `VisualQueryRequest` requires `vlm_model_name` (f208a24).
  - Few-shot template now derives from `VLMScenarioAnalysis`; field access updated, `feedback_content` → `feedback_message` (1d50d71).
  - `ScenarioEnricher` result now wrapped; `VisualAgent` returns a typed envelope (`LanguageModelAgentResponse[T]`, `ChainBuiltResult`) (c62e2fb4, 60e2fb4, 285b09b, 45f3f25).
- Feedback API v2: request renamed and extended: add `camera_id`, `feedback` → `feedback_binary`, optional `feedback_message` (4c834ce); wiring updated in `main.py`.
- RAG/Vector Store
  - Add `AsyncVectorStoreEngine`; retire `AsyncSearchEngine`; split/rename embed protocols (a9df66b, 5638948, 4b571e3).
  - Named-vector mapping required; `ensure_space()` must be called for per-camera spaces (5c85cf9, d9fce6c).
- Conversation
  - IntentClassifier no longer pulls RAG context; prompt composer updated (477e9ed, 31050ba).
Features
- Observability
  - Langfuse v3 client integration; per-request trace IDs; prompt previews recorded post-ainvoke; image scrubbing (3f0cbd1, 28abd24, 7778b73, 2910db8, 444b936).
- Providers
  - OpenAI provider implementation with tests; provider utils to extract provider/model names (19dc1fb, ee7799e, 3f0cbd1).
- Conversation Tasks
  - New tasks: set alert interval and feedback similarity threshold (6d91649).
  - Task responses suggest next actions: e.g., start monitoring, adjust threshold (2acf2e1).
- Vision & Enrichment
  - Scenario enrichment messages and schemas consolidated under `vision/detection_scenario`; new enrich configs (3d8287f, 941db68).
  - Visual agent decoupled message builder; guard for missing few-shot template (f23f50f, 371fcf6).
- Configuration
  - Add VLM `qwen2.5vl:32b` v2.1.2 and v2.2.0; add RAG config `visual_agent` v2.2.0; Helm improvements incl. `imagePullSecrets` and env pkg CLI (39aa3cd, 9d2865c, 25a701e, f7bad38, env_pkg_cli.sh).
Fixes
- Conversation routing & UX
  - Category detection only on AI messages; quick button text stripped; remove brightness strategy (e6a194e, 4e29e76).
  - Enforce English target in set-target; prompt and validation tweaks (80e2d8b).
  - Alert interval defaults/units clarified; minimums and conversions with tests (5e7fd81, bf64aae).
  - Quick button response inclusion and tests (0bb8dd5).
- Vision/Enrichment correctness
  - Detection scenario language match with query (265bf65).
  - Persist rendered prompt in span metadata and correct trace updates (8df3c0e, ab58ade).
- Build/Tooling
  - Black language_version cleanup; pre-commit/style fixes (90b6cb3, 7a9daf9, ce32292).
Config & Deployment
- Helm
  - Add `imagePullSecrets` for the eva-agent image; add `env_pkg_cli.sh` helper; add k3s values; prune legacy values (25a701e, f7bad38, new k3s values files).
  - Update chart templates and values; secret dockerconfig template added.
- Env & Samples
- `.env.eva-agent.sample` and `.env.sample` updated; `.gitignore` expanded.
- Versioned runtime configs for VLM, RAG, and enrich_scenario aligned to the app wiring in `main.py`.
Tests & Quality
- Extensive new unit tests across conversation tasks, providers (OpenAI), RAG engines, and vision message builders; integration markers for scenario enrichment and visual agent (multiple `tests/**` additions).
- Disabled Langfuse in unit tests; updated fixtures and conftest to match new schemas (b6ae545, tests additions in this release).
Upgrade Notes
- Vision API v2
  - Ensure clients send `vlm_model_name` in `VisualQueryRequest` payloads (f208a24).
  - Few-shot payloads now use fields inherited from `VLMScenarioAnalysis`; rename `feedback_content` → `feedback_message` (1d50d71).
- Feedback API
  - Update the payload to include `camera_id`, `feedback_binary`, and optional `feedback_message` (4c834ce).
- Vector Store & Embeddings
  - Call `ensure_space()` for each camera-namespaced space before `add`/`query` operations (d9fce6c).
  - Migrate from the legacy `AsyncSearchEngine` to `AsyncVectorStoreEngine`, and from `encode()` to `embed()` methods (a9df66b, 4b571e3).
- Conversation
  - Remove any reliance on RAG context within IntentClassifier/PromptComposer (477e9ed, 31050ba).
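The required call order for the new vector-store flow can be sketched with stubs; only the method names (`ensure_space`, `embed` replacing `encode`) come from this note — the signatures, the space-naming scheme, and the stub behavior are assumptions.

```python
import asyncio

class StubEmbeddingEngine:
    # Stands in for AsyncEmbeddingEngine; embed() replaces the old encode().
    async def embed(self, text: str) -> list:
        return [float(len(text))]

class StubVectorStoreEngine:
    # Stands in for AsyncVectorStoreEngine with per-camera spaces.
    def __init__(self):
        self.spaces = {}

    async def ensure_space(self, space: str) -> None:
        self.spaces.setdefault(space, [])

    async def add(self, space: str, vector: list, payload: str) -> None:
        self.spaces[space].append((vector, payload))

async def record_feedback(camera_id: str, message: str) -> int:
    embedder, store = StubEmbeddingEngine(), StubVectorStoreEngine()
    space = f"feedback_{camera_id}"         # assumption: space naming scheme
    await store.ensure_space(space)         # must run before add/query
    vector = await embedder.embed(message)  # embed(), not legacy encode()
    await store.add(space, vector, message)
    return len(store.spaces[space])

print(asyncio.run(record_feedback("cam-01", "false alarm")))
```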
Changelog (selected commits)
- feat(api): Vision API v2 (schemas) — 091140c
- feat(api/schemas): add required `vlm_model_name` — f208a24
- feat(api/feedback): v2 request schema — 4c834ce
- refactor(vector-store,rag)!: add `AsyncVectorStoreEngine`; drop `AsyncSearchEngine` — a9df66b
- feat(rag/vector-stores): expose `ensure_space()`; implement in Qdrant — d9fce6c
- refactor(vision)!: wrap VLM result in typed envelope — 60e2fb4
- refactor(vision)!: wrap ScenarioEnricher result — c62e2fb4
- feat: implement OpenAIProvider and tests — 19dc1fb, ee7799e
- refactor(conversation): remove RAG from intent — 477e9ed
- feat(conversation): new tasks for thresholds/intervals — 6d91649
- fix: category/quick button/brightness updates — e6a194e, 4e29e76
- fix: alert interval defaults/units + tests — 5e7fd81, bf64aae
- fix: detection scenario language match — 265bf65
- o11y: Langfuse v3 wiring, prompt capture, scrubbing — 3f0cbd1, 28abd24, 7778b73
v2.1+a1.2 (Sep 10, 2025)
This release marks a significant step forward in the agent's capabilities, with a major focus on enhancing the Retrieval-Augmented Generation (RAG) system, improving observability and tracing, and streamlining the development and deployment experience.
Key Features
- Named-Vector RAG and Search: The RAG system has been fundamentally upgraded to support named vectors. This allows for more sophisticated, multi-modal search with weighted scoring across different modalities (e.g., image, text, detection scenario), leading to more relevant and accurate retrieval results.
- Feedback-Aware Alert Correction: The system can now intelligently correct alerts based on user feedback, creating a more robust and self-improving feedback loop.
- Dynamic Agent ID for LangFuse: LangFuse tracing now uses dynamic `agent_id`s based on camera information. This greatly improves the traceability of requests and observations, making it easier to debug and monitor the system's behavior in a multi-camera environment.
- Enhanced Development and Deployment:
  - Docker Optimizations: The Docker build process has been optimized for speed, and the `docker-compose` setup has been improved for a better local development experience.
  - Local Development Script: A new `run_agent.sh` script has been added to simplify running the agent locally with hot-reloading.
  - Ollama Model Warm-up: Ollama models are now pre-loaded on container start to reduce the latency of the first request.
  - EKS Deployment: Significant work has been done to enable deployment of the application to Amazon EKS.
- `GENERAL_CONVERSATION` Task Removed: The `GENERAL_CONVERSATION` task has been removed from the list of supported tasks.
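The weighted multi-modal scoring described in the named-vector item can be sketched as a simple fusion function; the modality names match the note's examples, but the weights and normalization policy are illustrative assumptions.

```python
def fuse_scores(modality_scores: dict, weights: dict) -> float:
    """Weighted fusion of per-modality similarity scores.

    Normalizes by the total weight of the modalities that actually
    returned a score, so a missing modality does not drag the result down."""
    total_weight = sum(weights.get(m, 0.0) for m in modality_scores)
    if total_weight == 0.0:
        return 0.0
    return sum(weights.get(m, 0.0) * s
               for m, s in modality_scores.items()) / total_weight

# Illustrative weights over the modalities named in this release note.
scores = {"image": 0.9, "text": 0.6, "detection_scenario": 0.8}
weights = {"image": 0.5, "text": 0.2, "detection_scenario": 0.3}
print(round(fuse_scores(scores, weights), 3))
```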
Fixes
- Qdrant Hybrid Search: Improved the reliability of score fusion in Qdrant's hybrid search by using a large prefetch to avoid missing modalities.
- Prompt Definitions: Corrected the message definition in prompts.
- Test Execution: Ensured the correct Python version is used in the test execution script.
Refactoring
- Observability: The `observability` submodule has been removed, and the tracing logic has been refactored into decorators for a cleaner separation of concerns.
- Environment Variables: Standardized and clarified environment variables, for example, by renaming `VLM_ENDPOINT_URL` to `OLLAMA_BASE_URL`.
- API Naming: Renamed the `VisualAgent`'s execution methods from `run`/`arun` to `invoke`/`ainvoke` for better consistency.
- RAG and Vector Store Codebase: The RAG and vector store codebase has been significantly refactored to support named vectors, improve modularity, and centralize data preprocessing and embedding.
Breaking Changes
- Environment Variables:
  - The `PORT` environment variable has been renamed to `AGENT_INTERNAL_PORT`.
  - The `VLM_ENDPOINT_URL` environment variable has been renamed to `OLLAMA_BASE_URL`.
- `VisualAgent` API: The `run` and `arun` methods have been removed. Use `invoke` and `ainvoke` instead.
- RAG Configuration: The RAG configuration for the visual agent now requires a `score_config` section for named-vector search.
- Search and Feedback APIs: The APIs for the `AsyncSearchEngine` and `AsyncFeedbackAgent` have been changed. Callers must now pre-embed queries and provide more explicit configuration.
v2.1+a1.1 (Aug 29, 2025)
Summary
This release introduces significant internal system optimizations and refinements across the RAG embedding system and Langfuse tracing integration, aiming to enhance performance, maintainability, and consistency.
Key Changes
- RAG Embeddings System Enhancements:
- Refactored the asynchronous embedding provider registry for improved efficiency and clarity, particularly for image embeddings.
- Streamlined embedder instantiation by removing redundant caching and threading locks, leading to a more direct and potentially faster setup.
- Reduced verbose logging during embedding model loading.
- Langfuse Tracing Integration Improvements:
- Optimized image data handling within the Langfuse tracing decorator, externalizing complex processing logic for better modularity and reduced overhead.
- Refined error logging to Langfuse, providing more precise error levels and metadata.
- Environment Variable Loading Standardization:
- Standardized environment variable loading to explicit calls at the application entry point or within test configurations, ensuring consistent and predictable environment setup across the project.
- Internal Code Cleanups:
- Implemented general code cleanups and minor structural adjustments across various utility and embedding-related modules, contributing to overall code health and maintainability.
v2.1.0 (Aug 26, 2025)
Added
- Localization for SET_TARGET task: Implemented localization for responses related to the `SET_TARGET` task.
- Target Management Operations: Introduced `add`, `delete`, and `set` operations for comprehensive target management.
- Centralized Langfuse Tracing: Integrated a centralized Langfuse tracing system for improved observability and debugging.
- Conversation History: Incorporated chat history into the conversational agent's prompt for more context-aware interactions.
- Dynamic LLM Provider: Added a dynamic LLM provider mechanism, allowing for easier switching and management of different LLM backends.
- Langfuse Image Upload & VLM Preprocessing: Enhanced Langfuse integration with image upload capabilities and improved VLM preprocessing.
- Structured Fallback & Rich Context: Implemented a structured fallback mechanism for the conversational agent and enabled passing of richer context to tasks.
- Multilingual Support: Introduced multi-language support and refined task classification for broader applicability.
- New Conversational Tasks:
  - `ANSWER_SYSTEM_QUESTION`: A new task to handle system-related queries.
  - `HANDLE_UNSUPPORTED_REQUEST`: A new task to gracefully manage unsupported user requests.
- Enhanced Task Prompting: Improved the prompting mechanism for each task type with more detailed descriptions and examples.
- Typed Few-Shot Meta & VLM v2.1.0: Added typed few-shot metadata and updated VLM configurations to version 2.1.0.
- Scenario Enhancer Pipeline: Introduced a scenario enhancer pipeline with new configurations (`v0.1.0`, `v1.0.0`).
- Detection Scenario Task: Added a new task type for handling detection scenarios.
- Brightness Control Task: Implemented a brightness control task with support for `bright_param`.
- Message Builders & Schemas: Introduced base, visual, and conversational message builders, along with a new detection schema.
- New VLM Prompt Configurations: Added several new VLM prompt configurations (`v0.1.0`, `v0.1.1`, `v0.1.2`, `v0.1.2a`, `v2.0.0`, `v2.0.1`, `v2.1.0`, `v2.1.0a`, `v2.1.1`, `v2.1.2`).
Fixed
- Scenario Enhancer Prompt: Corrected an issue in the scenario enhancer prompt.
- Search Engine Tests: Updated search engine tests for compatibility with asynchronous operations.
- F-string Formatting: Fixed f-string formatting errors in status messages.
- Docker Base Image Pinning: Pinned the Docker base image to a specific version for improved stability.
- Asynchronous Synchronization: Resolved synchronization issues in asynchronous operations.
- Agent Logic & QueryRequest Alignment: Aligned agent logic and the `QueryRequest` model with updated API formats and test expectations.
- VLM Preprocessing: Modified VLM preprocessing to use base64 pass-through/encode.
Changed
- ANSWER_SYSTEM_QUESTION Task Handling: Streamlined the handling of the `ANSWER_SYSTEM_QUESTION` task.
- QueryRequest Passing: Modified tasks to receive the full `QueryRequest` object.
QueryRequestobject. - Pre-commit Hooks: Updated pre-commit hooks to include
autoflakeand applied them repository-wide for consistent code formatting. - VLM Prompt Variants & Schema: Standardized YAML schema for VLM prompt variants.
- LM Model Requirement: Made the LM `model` a required field, extended few-shot metadata, and clarified analysis documentation.
- Debug Logging: Improved debug logging for built messages with pretty-printing.
- README Updates: Updated the `README.md` file with new information.
- Tag Mechanism & Sample Environment: Updated the tag mechanism and sample environment configurations.
- Language Detection: Replaced `langdetect` with a heuristic-based approach for language detection.
- Tracing Method: Modularized the tracing method for better organization.
- Provider Factory Function: Renamed the provider factory function for clarity.
- Default LLM Model: Changed the default LLM model to `qwen2.5vl`.
- VisualAgent Prompt Configuration: Switched `VisualAgent` to use the `v2.0.0` prompt configuration.
- Ollama Configuration: Added Ollama concurrency environment variables, dropped non-GPU reservations, and updated the default LLM.
- VLM Prompt Language/Tone: Refined language and tone rules for VLM prompts.
- Modular Task Execution: Refactored the conversational agent to implement modular task execution via a task registry.
- Asynchronous ConversationalAgent: Made the `ConversationalAgent` fully asynchronous, utilizing `run_in_threadpool` for blocking operations.
- Docstring & Comment Separation: Separated file paths from docstrings and moved them to comments.
- SearchEngine Asynchronicity: Made `SearchEngine` temporarily asynchronous.
- Bullet Caps: Tightened short-mode bullet caps to 1.
- Externalized System Prompt: Externalized the conversational agent's system prompt.
- Embedding Models Device: Changed the default device for embedding models to CPU.
- Gitignore Update: Added the `docs` directory to `.gitignore`.
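The `langdetect` replacement can be pictured as a character-range heuristic. The actual heuristic is not documented in this note, so the Hangul ranges and threshold below are assumptions chosen for illustration (Korean handling appears elsewhere in these notes).

```python
def looks_korean(text: str, threshold: float = 0.3) -> bool:
    """Heuristic language check: flag text whose Hangul ratio passes a threshold.

    Assumptions: the real heuristic, its Unicode ranges, and the 0.3
    threshold are not specified in the release note."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    hangul = sum(1 for ch in letters
                 if "\uac00" <= ch <= "\ud7a3"   # Hangul syllables
                 or "\u1100" <= ch <= "\u11ff")  # Hangul jamo
    return hangul / len(letters) >= threshold

print(looks_korean("카메라 상태를 알려줘"), looks_korean("show camera status"))
```

A heuristic like this avoids `langdetect`'s nondeterminism and per-call overhead when the only question is "Korean or English?".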
v2.0.0 (Aug 8, 2025)
This is the first official release of the EVA Agent, a conversational and visual AI agent.
Added
- Conversational Agent: Implemented a sophisticated conversational agent (`src/conversation`) with capabilities for intent classification, context building, and dynamic prompt composition.
- Visual Agent: Introduced a visual agent (`src/vision`) for image analysis tasks, leveraging a configurable Visual Language Model (VLM) backend.
- RAG Pipeline: Built a Retrieval-Augmented Generation (RAG) pipeline (`src/rag`) to enrich user queries with context from a knowledge base.
- Vector Stores: Integrated support for FAISS and Qdrant vector stores for efficient similarity search.
- LLM Provider Architecture: Designed a flexible provider model (`src/providers`) to easily switch between different LLM backends, with initial support for Ollama, Azure OpenAI, and Clova.
- FastAPI Application: Developed a robust FastAPI application (`main.py`) to serve the agent's capabilities via a RESTful API.
- Docker-Based Deployment: Created a multi-service Docker environment (`docker-compose.yml`) for easy and consistent deployment of the agent and its dependencies (Ollama, Qdrant).
- Interactive Chat: Added a script (`interactive_chat.py`) for local interactive testing of the conversational agent.
- Testing Framework: Established a comprehensive testing suite using `pytest`, with tests separated into unit and integration groups.
- CI/CD: Integrated `pre-commit` hooks to enforce code formatting standards.
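The provider architecture above can be sketched as a registry-plus-factory; the provider names come from this note, while the decorator mechanics, class names, and `complete()` method are assumptions about how `src/providers` might be shaped.

```python
class BaseProvider:
    """Common interface every LLM backend implements."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

_PROVIDERS = {}

def register(name: str):
    # Decorator-based registry; the real factory in src/providers may differ.
    def wrap(cls):
        _PROVIDERS[name] = cls
        return cls
    return wrap

@register("ollama")
class OllamaProvider(BaseProvider):
    def complete(self, prompt: str) -> str:
        return f"[ollama] {prompt}"  # stub instead of a real backend call

@register("clova")
class ClovaProvider(BaseProvider):
    def complete(self, prompt: str) -> str:
        return f"[clova] {prompt}"

def create_provider(name: str) -> BaseProvider:
    # Factory lookup: swapping backends is a one-string config change.
    return _PROVIDERS[name]()

print(create_provider("ollama").complete("hello"))
```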
Fixed
- Korean Language in Responses: Corrected an issue where the agent would occasionally return Korean text in the `structured_response`. The prompt has been updated with stricter constraints to enforce English-only output.
- Clova Provider Bug: Fixed a bug in the `ClovaProvider` where the `endpoint_url` from the configuration was being ignored.
- Docker Build and Runtime Issues: Resolved various issues to ensure a stable and reliable Docker deployment.
Changed
- Project Structure: Refactored the codebase to improve modularity and class relationships.
- API Schema: Unified and streamlined the API request and response schemas for clarity and consistency.
- Prompt Formatting: The prompt in `PromptComposer` has been reformatted using `textwrap.dedent` for improved readability.
- Test Configuration: The `pytest.ini` file has been updated to exclude the `tests/evaluation` directory from the default test run.