EVA Agent Release Note
v2.3+a3.0 (Dec 26, 2025)
Highlights
- Large refactor to a LangGraph-based architecture: new `app/` layer (API/Core/Graph/Node/Provider/Schema) + `GraphRegistry`/`GraphExecutor` + YAML ConfigLoader with version resolution (895c9e6, 906205f, 6ed8471, 1d2a6b7).
- Visual Agent v1 overhaul: `basic_graph` runs 1-step and 2-step sub-graphs in parallel → consolidates results → optionally runs translation and image-description generation as a single end-to-end pipeline (6ed8471, 602bd70, c332d29, 9386313).
- Stronger Feedback ↔ RAG loop: add `/api/v1/agents/feedback/record` + extract vector-store helpers/DI + run vector search after visual query to correct alerts using historical feedback (38ecda7, 5b82132, d427f6b, 072f4de).
- Config system redesign: standardized `config/{agent}/{graph}/{node}/{model}/{version}.yaml` layout, add Qwen3-VL (FP8) configs, and keep legacy aliases (`qwen2.5-vl-32b-awq`) for compatibility (d3ff717, 114b51f, 9012cdc, 92aa51c).
- Packaging/deployment defaults updated: docker-compose default model switched to `Qwen3-VL-8B-Instruct-FP8` with additional vLLM performance flags; eva-agent image tag bumped to `2.3-a3.0` (948306b, 2464349, 286b14c).
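The fan-out/consolidate shape of the overhauled `basic_graph` can be sketched with plain `asyncio`; the node names, result fields, and consolidation policy below are illustrative stand-ins, not the actual graph-layer API.

```python
import asyncio

# Hypothetical stand-ins for the 1-step and 2-step sub-graphs; the real
# nodes live in the app graph layer and operate on richer state objects.
async def one_step_inference(image: str) -> dict:
    return {"mode": "1-step", "alert": False}

async def two_step_inference(image: str) -> dict:
    return {"mode": "2-step", "alert": True}

def consolidate(results: list) -> dict:
    # Illustrative policy: raise an alert if any sub-graph raised one.
    return {"alert": any(r["alert"] for r in results),
            "sources": [r["mode"] for r in results]}

async def basic_graph(image: str) -> dict:
    # Fan out both sub-graphs in parallel, then merge their outputs;
    # translation/image description would hang off the consolidated result.
    results = await asyncio.gather(one_step_inference(image),
                                   two_step_inference(image))
    return consolidate(list(results))

print(asyncio.run(basic_graph("frame.jpg")))
```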
Breaking Changes
- API schema/payload changes (important)
  - Visual query moved from the legacy single `vision.schemas.VisualQueryRequest/Response` flow to `app.schemas.visual_agent.VisualAgentRequest/Response` (`scenario_list`-based, with a unified `meta` model) (87a9d88, 8b0d795).
  - Feedback recording request schema is aligned around the unified `meta`, and the endpoint is exposed as `/api/v1/agents/feedback/record` (38ecda7, 5e7909c).
  - The unified `meta` schema is now enforced as Pydantic models in `app/schemas/common.py`, so the client `meta` payload shape is validated strictly (36bd1d7, 528b59d).
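The new request shape can be approximated with stdlib dataclasses; the real models are Pydantic classes under `app/schemas/`, and every field below other than `scenario_list` and `meta` (including the `meta` contents) is an assumption for illustration.

```python
from dataclasses import dataclass

@dataclass
class Meta:
    # Unified metadata envelope; the concrete fields live in
    # app/schemas/common.py — camera_id/timestamp here are assumptions.
    camera_id: str
    timestamp: str

@dataclass
class VisualAgentRequest:
    # scenario_list replaces the old single-scenario request shape.
    scenario_list: list
    meta: Meta
    image: str = ""  # assumption: name of the image payload field

req = VisualAgentRequest(
    scenario_list=["intrusion", "loitering"],
    meta=Meta(camera_id="cam-01", timestamp="2025-12-26T00:00:00Z"),
)
print(req.scenario_list)
```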
- Config path/file layout changes (important)
  - Legacy config trees like `config/vlm/**`, `config/enrich_scenario/**`, and `config/llm/prompts/conversational_agent_system_prompt.md` are removed/replaced; the server expects the new ConfigLoader path convention (dd066f5, de2c21f, eb784d0).
- Environment variable cleanup
  - Deprecated provider/env vars (e.g., `LLM_PROVIDER`, `LLM_MODEL`, `OLLAMA_BASE_URL`) were removed from `.env.eva-agent.sample`; deployments should align with the updated sample (ce29171, 0295587).
- Internal module path changes
  - Legacy code such as `src/messages/**` and `src/vision/detection_scenario/**` was removed and replaced by the new app layer; internal import paths changed significantly (c8c941b, 9590c9e, 983163f).
- Infra endpoint removed
  - The legacy `/infra/models` endpoint was removed (`main.py` refactor).
Features
- API Layer
  - Router modularization + docs: `app/api/v1/{visual,chat,enrich,feedback}` and dependency injection via `app/api/dependency.py` for RAG engines and vector-store config mapping (e103659, d427f6b).
  - Add a 422 validation exception handler and provide `health`/`healthz` endpoints (b86f7e3, 6f93f49).
- Visual Agent
  - Multi-scenario requests: `scenario_list` input and per-scenario results returned in `analysis_list` (64984df, 9a20f36, 87a9d88).
  - Vector-search-based alert correction: search feedback-backed few-shot points to adjust alerts; record embedding/search/correction steps in Langfuse spans (5b82132, 7c67670, 823fbd9).
  - Translation pipeline: when an alert triggers, translate only the candidates via TranslatorGraph, in parallel with image-description generation (eeaca38, a25b859, 9386313).
- Enrich Agent
  - Refactor enrich graphs to async nodes, strengthen output schema/fields, and reflect area/metadata more consistently (8d94d4e, 215e86f, 7cb6e13, 26f1554).
  - (Experimental) multi-scenario enrichment: classifier/decomposer-driven case splitting with reduced outputs (151242e, b2e2c04, 8561b5b).
- Providers/Model Catalog
  - (Internal) improve model listing per provider and consolidate catalog utilities (f49b626).
- Observability
  - Stronger Langfuse tracing for visual-agent/feedback/chat-agent: hierarchical spans, trace metadata updates, scenario context propagation (7fe4e95, 823fbd9, 072f4de, a586e70).
  - Content-aware image redaction: scrub not only base64/data URLs but also payloads that decode to known image-byte signatures (d66c356).
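The content-aware redaction can be illustrated with stdlib-only checks: decode a suspect string as base64 and look for known image magic bytes. The signature list and scrubbing behavior in d66c356 may differ; this is a minimal sketch.

```python
import base64
import binascii

# Common image magic bytes; the actual commit may check a different set.
IMAGE_SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")

def looks_like_image(payload: str) -> bool:
    """Return True if a string base64-decodes to known image bytes."""
    try:
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return False
    return raw.startswith(IMAGE_SIGNATURES)

def scrub(value: str) -> str:
    # Replace image-looking payloads before sending traces to Langfuse.
    return "[redacted image]" if looks_like_image(value) else value

png_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16).decode()
print(scrub(png_b64), scrub("hello"))
```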
- Docs
  - Add `README_DEV.md` and layer-specific guides for App/Core/Graph/Node (ee2e11e, dd066f5, ca1e1f2).
Fixes
- Visual Agent stability/correctness
  - Fix 2-step inference edge cases (exceptions), sorting, and output schema issues; stabilize scenario ordering (52749d9, eb1f359, 28e9322).
  - Fix area handling for `None` and language-related issues (e.g., forcing English for area strings) (2681f9a, fb04c00, 4f0c0cc).
  - Fix translation logic (duplicate generation/update bugs) and output formatting (5811063, d7157a2, 598cb01).
- Chat Agent UX/consistency
  - Improve unsupported request flow and prevent task type from leaking into system answers (c9111a5, ac8ca84).
  - Clarify detection interval units/messages, adjust false detection cutoff range, and fix target extraction/validation (023f53a, 7f87d11, c9acabe, dae25ea).
- RAG/Qdrant
  - Improve Qdrant URL/env handling and related utilities (e.g., base URL support) (`src/rag/vector_stores/qdrant/*` changes).
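The base-URL handling fix can be pictured as a small normalizer; the actual helper names and rules inside `src/rag/vector_stores/qdrant` are not shown in this note, so the default scheme and port below are assumptions.

```python
from urllib.parse import urlparse

def normalize_base_url(raw: str, default_port: int = 6333) -> str:
    """Normalize a Qdrant base URL taken from an env var.

    Assumptions: default to plain HTTP when no scheme is given, and to
    Qdrant's conventional port 6333 when no port is given."""
    url = raw.strip().rstrip("/")
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or default_port
    return f"{parsed.scheme}://{host}:{port}"

print(normalize_base_url("qdrant.local/"))
```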
Config & Deployment
- Docker Compose
  - Switch vLLM default model to Qwen3-VL (FP8) and add flags like KV cache dtype, chunked prefill, and prefix caching (948306b).
  - Update the eva-agent image version to `eva-agent:2.3-a3.0` (286b14c).
- Config
  - Add large sets of versioned YAML configs for Visual/Chat/Enrich/Translation agents under the new path convention (682d331, 9012cdc, 114b51f).
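The new path convention and version resolution can be sketched in a few lines; the ConfigLoader's actual resolution policy is not described in this note, so "highest numeric version wins when none is requested" is an assumption.

```python
from pathlib import PurePosixPath

def config_path(agent, graph, node, model, version):
    # New layout: config/{agent}/{graph}/{node}/{model}/{version}.yaml
    return PurePosixPath("config", agent, graph, node, model, version + ".yaml")

def resolve_version(available, requested=None):
    # Exact match when a version is requested; otherwise pick the highest
    # numeric version. The real ConfigLoader's policy may differ.
    if requested is not None:
        if requested not in available:
            raise KeyError(requested)
        return requested
    return max(available, key=lambda v: tuple(int(p) for p in v.lstrip("v").split(".")))

versions = ["2.2.0", "2.3.0", "2.1.2"]
print(config_path("visual", "basic_graph", "inference", "qwen3-vl-8b",
                  resolve_version(versions)))
```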
Tests & Quality
- Apply pre-commit formatting/style fixes (black/isort, trailing whitespace, etc.) (fdb7008).
Upgrade Notes
- API clients
  - Visual query/feedback/enrich/chat payloads are now based on the unified `meta` model; update clients using `app/schemas/*.py` as the source of truth.
  - Visual query now returns per-scenario results via `analysis_list` (driven by `scenario_list` input); update any legacy single-scenario response parsing.
- Config
  - Migrate any legacy `config/vlm/**` and `config/enrich_scenario/**` setups to the new convention: `config/{agent}/{graph}/{node}/{model}/{version}.yaml`.
- Env/Docker
  - Align environment variables with `.env.eva-agent.sample` (deprecated vars removed) and review vLLM/Agent settings after the docker-compose default model/options change.
  - For local runs (`python main.py`), the default port is `8888`; review any existing port/healthcheck assumptions.
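A quick, self-contained check for the removed variables can help during migration; the variable names come from this note, while the dotenv parsing below is deliberately minimal.

```python
# Env vars removed from .env.eva-agent.sample in this release.
DEPRECATED_VARS = {"LLM_PROVIDER", "LLM_MODEL", "OLLAMA_BASE_URL"}

def find_deprecated(env_text: str) -> list:
    """Return deprecated variable names still assigned in dotenv-style text."""
    found = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name = line.split("=", 1)[0].strip()
        if name in DEPRECATED_VARS:
            found.append(name)
    return found

sample = "LLM_PROVIDER=ollama\nAGENT_INTERNAL_PORT=8888\n# LLM_MODEL=old\n"
print(find_deprecated(sample))
```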
Changelog (selected commits)
- feat(api): add feedback recording endpoint — 38ecda7
- feat(api): inject rag deps for visual agent — d427f6b
- feat(api): add reusable vector-store helpers and visual-agent vector search — 5b82132
- refactor: chat agent to langgraph — 895c9e6
- package: update docker-compose default model & vllm config — 948306b
- feat(o11y): content-aware image redaction — d66c356
- package: update image version(tag) — 286b14c
v2.2+a2.0 (Oct 2, 2025)
Highlights
- Vision API v2 end-to-end wiring with typed envelopes and Langfuse v3 observability across visual, scenario-enrichment, and feedback flows (5c85cf9, 3f0cbd1, 28abd24, c90133a).
- New OpenAI provider and factory integration, alongside existing Ollama support (19dc1fb, ee7799e, 7a9daf9).
- RAG layer refactor: introduce AsyncEmbeddingEngine and AsyncVectorStoreEngine; drop legacy AsyncSearchEngine; add `ensure_space()` and named-vectors flow (a9df66b, 5638948, 4b571e3, d9fce6c).
- Conversation agent fully refactored to class-based tasks with improved intent/category routing and UX-oriented suggestions (f581c53, 6e45298, adf3676, 2acf2e1).
- Detection-scenario enrichment pipeline consolidated and renamed to “enrich_scenario”; camera_id handling relaxed to support pre-registration flows (3d8287f, 941db68, 2625ff8, 9c6595b).
- New, versioned configs for VLM (Qwen2.5-VL:32B v2.2.0) and RAG visual agent (v2.2.0) for consistent, reproducible environments (39aa3cd, 9d2865c, main.py).
Breaking Changes
- Vision API schemas
  - `VisualQueryRequest` requires `vlm_model_name` (f208a24).
  - Few-shot template now derives from `VLMScenarioAnalysis`; field access updated, `feedback_content` → `feedback_message` (1d50d71).
  - `ScenarioEnricher` result now wrapped; `VisualAgent` returns a typed envelope (`LanguageModelAgentResponse[T]`, `ChainBuiltResult`) (c62e2fb4, 60e2fb4, 285b09b, 45f3f25).
- Feedback API v2: request renamed and extended: add `camera_id`, `feedback` → `feedback_binary`, optional `feedback_message` (4c834ce); wiring updated in `main.py`.
- RAG/Vector Store
  - Add `AsyncVectorStoreEngine`; retire `AsyncSearchEngine`; split/rename embed protocols (a9df66b, 5638948, 4b571e3).
  - Named-vector mapping required; `ensure_space()` must be called for per-camera spaces (5c85cf9, d9fce6c).
- Conversation
  - IntentClassifier no longer pulls RAG context; prompt composer updated (477e9ed, 31050ba).
Features
- Observability
  - Langfuse v3 client integration; per-request trace IDs; prompt previews recorded post-ainvoke; image scrubbing (3f0cbd1, 28abd24, 7778b73, 2910db8, 444b936).
- Providers
  - OpenAI provider implementation with tests; provider utils to extract provider/model names (19dc1fb, ee7799e, 3f0cbd1).
- Conversation Tasks
  - New tasks: set alert interval and feedback similarity threshold (6d91649).
  - Task responses suggest next actions: e.g., start monitoring, adjust threshold (2acf2e1).
- Vision & Enrichment
  - Scenario enrichment messages and schemas consolidated under `vision/detection_scenario`; new enrich configs (3d8287f, 941db68).
  - Visual agent decoupled message builder; guard for missing few-shot template (f23f50f, 371fcf6).
- Configuration
  - Add VLM `qwen2.5vl:32b` v2.1.2 and v2.2.0; add RAG config `visual_agent` v2.2.0; Helm improvements incl. `imagePullSecrets` and env pkg CLI (39aa3cd, 9d2865c, 25a701e, f7bad38, env_pkg_cli.sh).
Fixes
- Conversation routing & UX
  - Category detection only on AI messages; quick button text stripped; remove brightness strategy (e6a194e, 4e29e76).
  - Enforce English target in set-target; prompt and validation tweaks (80e2d8b).
  - Alert interval defaults/units clarified; minimums and conversions with tests (5e7fd81, bf64aae).
  - Quick button response inclusion and tests (0bb8dd5).
- Vision/Enrichment correctness
  - Detection scenario language match with query (265bf65).
  - Persist rendered prompt in span metadata and correct trace updates (8df3c0e, ab58ade).
- Build/Tooling
  - Black language_version cleanup; pre-commit/style fixes (90b6cb3, 7a9daf9, ce32292).
Config & Deployment
- Helm
  - Add `imagePullSecrets` for the eva-agent image; add `env_pkg_cli.sh` helper; add k3s values; prune legacy values (25a701e, f7bad38, new k3s values files).
  - Update chart templates and values; secret dockerconfig template added.
- Env & Samples
- `.env.eva-agent.sample` and `.env.sample` updated; `.gitignore` expanded.
- Versioned runtime configs for VLM, RAG, and enrich_scenario aligned to the app wiring in `main.py`.
Tests & Quality
- Extensive new unit tests across conversation tasks, providers (OpenAI), RAG engines, and vision message builders; integration markers for scenario enrichment and visual agent (multiple `tests/**` additions).
- Disabled Langfuse in unit tests; updated fixtures and conftest to match new schemas (b6ae545, tests additions in this release).
Upgrade Notes
- Vision API v2
  - Ensure clients send `vlm_model_name` in `VisualQueryRequest` payloads (f208a24).
  - Few-shot payloads now use fields inherited from `VLMScenarioAnalysis`; rename `feedback_content` → `feedback_message` (1d50d71).
- Feedback API
  - Update the payload to include `camera_id`, `feedback_binary`, and optional `feedback_message` (4c834ce).
- Vector Store & Embeddings
  - Call `ensure_space()` for each camera-namespaced space before `add`/`query` operations (d9fce6c).
  - Migrate from the legacy `AsyncSearchEngine` to `AsyncVectorStoreEngine`, and from `encode()` to `embed()` methods (a9df66b, 4b571e3).
- Conversation
  - Remove any reliance on RAG context within IntentClassifier/PromptComposer (477e9ed, 31050ba).
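The required call order for the new vector-store flow can be sketched with stubs; only the method names (`ensure_space`, `embed` replacing `encode`) come from this note — the signatures, the space-naming scheme, and the stub behavior are assumptions.

```python
import asyncio

class StubEmbeddingEngine:
    # Stands in for AsyncEmbeddingEngine; embed() replaces the old encode().
    async def embed(self, text: str) -> list:
        return [float(len(text))]

class StubVectorStoreEngine:
    # Stands in for AsyncVectorStoreEngine with per-camera spaces.
    def __init__(self):
        self.spaces = {}

    async def ensure_space(self, space: str) -> None:
        self.spaces.setdefault(space, [])

    async def add(self, space: str, vector: list, payload: str) -> None:
        self.spaces[space].append((vector, payload))

async def record_feedback(camera_id: str, message: str) -> int:
    embedder, store = StubEmbeddingEngine(), StubVectorStoreEngine()
    space = f"feedback_{camera_id}"         # assumption: space naming scheme
    await store.ensure_space(space)         # must run before add/query
    vector = await embedder.embed(message)  # embed(), not legacy encode()
    await store.add(space, vector, message)
    return len(store.spaces[space])

print(asyncio.run(record_feedback("cam-01", "false alarm")))
```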
Changelog (selected commits)
- feat(api): Vision API v2 (schemas) — 091140c
- feat(api/schemas): add required `vlm_model_name` — f208a24
- feat(api/feedback): v2 request schema — 4c834ce
- refactor(vector-store,rag)!: add `AsyncVectorStoreEngine`; drop `AsyncSearchEngine` — a9df66b
- feat(rag/vector-stores): expose `ensure_space()`; implement in Qdrant — d9fce6c
- refactor(vision)!: wrap VLM result in typed envelope — 60e2fb4
- refactor(vision)!: wrap ScenarioEnricher result — c62e2fb4
- feat: implement OpenAIProvider and tests — 19dc1fb, ee7799e
- refactor(conversation): remove RAG from intent — 477e9ed
- feat(conversation): new tasks for thresholds/intervals — 6d91649
- fix: category/quick button/brightness updates — e6a194e, 4e29e76
- fix: alert interval defaults/units + tests — 5e7fd81, bf64aae
- fix: detection scenario language match — 265bf65
- o11y: Langfuse v3 wiring, prompt capture, scrubbing — 3f0cbd1, 28abd24, 7778b73
v2.1+a1.2 (Sep 10, 2025)
This release marks a significant step forward in the agent's capabilities, with a major focus on enhancing the Retrieval-Augmented Generation (RAG) system, improving observability and tracing, and streamlining the development and deployment experience.
Key Features
- Named-Vector RAG and Search: The RAG system has been fundamentally upgraded to support named vectors. This allows for more sophisticated, multi-modal search with weighted scoring across different modalities (e.g., image, text, detection scenario), leading to more relevant and accurate retrieval results.
- Feedback-Aware Alert Correction: The system can now intelligently correct alerts based on user feedback, creating a more robust and self-improving feedback loop.
- Dynamic Agent ID for LangFuse: LangFuse tracing now uses dynamic `agent_id`s based on camera information. This greatly improves the traceability of requests and observations, making it easier to debug and monitor the system's behavior in a multi-camera environment.
- Enhanced Development and Deployment:
  - Docker Optimizations: The Docker build process has been optimized for speed, and the `docker-compose` setup has been improved for a better local development experience.
  - Local Development Script: A new `run_agent.sh` script has been added to simplify running the agent locally with hot-reloading.
  - Ollama Model Warm-up: Ollama models are now pre-loaded on container start to reduce the latency of the first request.
  - EKS Deployment: Significant work has been done to enable deployment of the application to Amazon EKS.
- `GENERAL_CONVERSATION` Task Removed: The `GENERAL_CONVERSATION` task has been removed from the list of supported tasks.
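The weighted multi-modal scoring described in the named-vector item can be sketched as a simple fusion function; the modality names match the note's examples, but the weights and normalization policy are illustrative assumptions.

```python
def fuse_scores(modality_scores: dict, weights: dict) -> float:
    """Weighted fusion of per-modality similarity scores.

    Normalizes by the total weight of the modalities that actually
    returned a score, so a missing modality does not drag the result down."""
    total_weight = sum(weights.get(m, 0.0) for m in modality_scores)
    if total_weight == 0.0:
        return 0.0
    return sum(weights.get(m, 0.0) * s
               for m, s in modality_scores.items()) / total_weight

# Illustrative weights over the modalities named in this release note.
scores = {"image": 0.9, "text": 0.6, "detection_scenario": 0.8}
weights = {"image": 0.5, "text": 0.2, "detection_scenario": 0.3}
print(round(fuse_scores(scores, weights), 3))
```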
Fixes
- Qdrant Hybrid Search: Improved the reliability of score fusion in Qdrant's hybrid search by using a large prefetch to avoid missing modalities.
- Prompt Definitions: Corrected the message definition in prompts.
- Test Execution: Ensured the correct Python version is used in the test execution script.
Refactoring
- Observability: The `observability` submodule has been removed, and the tracing logic has been refactored into decorators for a cleaner separation of concerns.
- Environment Variables: Standardized and clarified environment variables, for example, by renaming `VLM_ENDPOINT_URL` to `OLLAMA_BASE_URL`.
- API Naming: Renamed the `VisualAgent`'s execution methods from `run`/`arun` to `invoke`/`ainvoke` for better consistency.
- RAG and Vector Store Codebase: The RAG and vector store codebase has been significantly refactored to support named vectors, improve modularity, and centralize data preprocessing and embedding.
Breaking Changes
- Environment Variables:
  - The `PORT` environment variable has been renamed to `AGENT_INTERNAL_PORT`.
  - The `VLM_ENDPOINT_URL` environment variable has been renamed to `OLLAMA_BASE_URL`.
- `VisualAgent` API: The `run` and `arun` methods have been removed. Use `invoke` and `ainvoke` instead.
- RAG Configuration: The RAG configuration for the visual agent now requires a `score_config` section for named-vector search.
- Search and Feedback APIs: The APIs for the `AsyncSearchEngine` and `AsyncFeedbackAgent` have been changed. Callers must now pre-embed queries and provide more explicit configuration.
v2.1+a1.1 (Aug 29, 2025)
Summary
This release introduces significant internal system optimizations and refinements across the RAG embedding system and Langfuse tracing integration, aiming to enhance performance, maintainability, and consistency.
Key Changes
- RAG Embeddings System Enhancements:
- Refactored the asynchronous embedding provider registry for improved efficiency and clarity, particularly for image embeddings.
- Streamlined embedder instantiation by removing redundant caching and threading locks, leading to a more direct and potentially faster setup.
- Reduced verbose logging during embedding model loading.
- Langfuse Tracing Integration Improvements:
- Optimized image data handling within the Langfuse tracing decorator, externalizing complex processing logic for better modularity and reduced overhead.
- Refined error logging to Langfuse, providing more precise error levels and metadata.
- Environment Variable Loading Standardization:
- Standardized environment variable loading to explicit calls at the application entry point or within test configurations, ensuring consistent and predictable environment setup across the project.
- Internal Code Cleanups:
- Implemented general code cleanups and minor structural adjustments across various utility and embedding-related modules, contributing to overall code health and maintainability.
v2.1.0 (Aug 26, 2025)
Added
- Localization for SET_TARGET task: Implemented localization for responses related to the `SET_TARGET` task.
- Target Management Operations: Introduced `add`, `delete`, and `set` operations for comprehensive target management.
- Centralized Langfuse Tracing: Integrated a centralized Langfuse tracing system for improved observability and debugging.
- Conversation History: Incorporated chat history into the conversational agent's prompt for more context-aware interactions.
- Dynamic LLM Provider: Added a dynamic LLM provider mechanism, allowing for easier switching and management of different LLM backends.
- Langfuse Image Upload & VLM Preprocessing: Enhanced Langfuse integration with image upload capabilities and improved VLM preprocessing.
- Structured Fallback & Rich Context: Implemented a structured fallback mechanism for the conversational agent and enabled passing of richer context to tasks.
- Multilingual Support: Introduced multi-language support and refined task classification for broader applicability.
- New Conversational Tasks:
  - `ANSWER_SYSTEM_QUESTION`: A new task to handle system-related queries.
  - `HANDLE_UNSUPPORTED_REQUEST`: A new task to gracefully manage unsupported user requests.
- Enhanced Task Prompting: Improved the prompting mechanism for each task type with more detailed descriptions and examples.
- Typed Few-Shot Meta & VLM v2.1.0: Added typed few-shot metadata and updated VLM configurations to version 2.1.0.
- Scenario Enhancer Pipeline: Introduced a scenario enhancer pipeline with new configurations (`v0.1.0`, `v1.0.0`).
- Detection Scenario Task: Added a new task type for handling detection scenarios.
- Brightness Control Task: Implemented a brightness control task with support for `bright_param`.
- Message Builders & Schemas: Introduced base, visual, and conversational message builders, along with a new detection schema.
- New VLM Prompt Configurations: Added several new VLM prompt configurations (`v0.1.0`, `v0.1.1`, `v0.1.2`, `v0.1.2a`, `v2.0.0`, `v2.0.1`, `v2.1.0`, `v2.1.0a`, `v2.1.1`, `v2.1.2`).
Fixed
- Scenario Enhancer Prompt: Corrected an issue in the scenario enhancer prompt.
- Search Engine Tests: Updated search engine tests for compatibility with asynchronous operations.
- F-string Formatting: Fixed f-string formatting errors in status messages.
- Docker Base Image Pinning: Pinned the Docker base image to a specific version for improved stability.
- Asynchronous Synchronization: Resolved synchronization issues in asynchronous operations.
- Agent Logic & QueryRequest Alignment: Aligned agent logic and the `QueryRequest` model with updated API formats and test expectations.
- VLM Preprocessing: Modified VLM preprocessing to use base64 pass-through/encode.
Changed
- ANSWER_SYSTEM_QUESTION Task Handling: Streamlined the handling of the `ANSWER_SYSTEM_QUESTION` task.
- QueryRequest Passing: Modified tasks to receive the full `QueryRequest` object.
QueryRequestobject. - Pre-commit Hooks: Updated pre-commit hooks to include
autoflakeand applied them repository-wide for consistent code formatting. - VLM Prompt Variants & Schema: Standardized YAML schema for VLM prompt variants.
- LM Model Requirement: Made the LM `model` a required field, extended few-shot metadata, and clarified analysis documentation.
- Debug Logging: Improved debug logging for built messages with pretty-printing.
- README Updates: Updated the `README.md` file with new information.
- Tag Mechanism & Sample Environment: Updated the tag mechanism and sample environment configurations.
- Language Detection: Replaced `langdetect` with a heuristic-based approach for language detection.
- Tracing Method: Modularized the tracing method for better organization.
- Provider Factory Function: Renamed the provider factory function for clarity.
- Default LLM Model: Changed the default LLM model to `qwen2.5vl`.
- VisualAgent Prompt Configuration: Switched `VisualAgent` to use the `v2.0.0` prompt configuration.
- Ollama Configuration: Added Ollama concurrency environment variables, dropped non-GPU reservations, and updated the default LLM.
- VLM Prompt Language/Tone: Refined language and tone rules for VLM prompts.
- Modular Task Execution: Refactored the conversational agent to implement modular task execution via a task registry.
- Asynchronous ConversationalAgent: Made the `ConversationalAgent` fully asynchronous, utilizing `run_in_threadpool` for blocking operations.
- Docstring & Comment Separation: Separated file paths from docstrings and moved them to comments.
- SearchEngine Asynchronicity: Made `SearchEngine` temporarily asynchronous.
- Bullet Caps: Tightened short-mode bullet caps to 1.
- Externalized System Prompt: Externalized the conversational agent's system prompt.
- Embedding Models Device: Changed the default device for embedding models to CPU.
- Gitignore Update: Added the `docs` directory to `.gitignore`.
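The `langdetect` replacement can be pictured as a character-range heuristic. The actual heuristic is not documented in this note, so the Hangul ranges and threshold below are assumptions chosen for illustration (Korean handling appears elsewhere in these notes).

```python
def looks_korean(text: str, threshold: float = 0.3) -> bool:
    """Heuristic language check: flag text whose Hangul ratio passes a threshold.

    Assumptions: the real heuristic, its Unicode ranges, and the 0.3
    threshold are not specified in the release note."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    hangul = sum(1 for ch in letters
                 if "\uac00" <= ch <= "\ud7a3"   # Hangul syllables
                 or "\u1100" <= ch <= "\u11ff")  # Hangul jamo
    return hangul / len(letters) >= threshold

print(looks_korean("카메라 상태를 알려줘"), looks_korean("show camera status"))
```

A heuristic like this avoids `langdetect`'s nondeterminism and per-call overhead when the only question is "Korean or English?".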
v2.0.0 (Aug 8, 2025)
This is the first official release of the EVA Agent, a conversational and visual AI agent.
Added
- Conversational Agent: Implemented a sophisticated conversational agent (`src/conversation`) with capabilities for intent classification, context building, and dynamic prompt composition.
- Visual Agent: Introduced a visual agent (`src/vision`) for image analysis tasks, leveraging a configurable Visual Language Model (VLM) backend.
- RAG Pipeline: Built a Retrieval-Augmented Generation (RAG) pipeline (`src/rag`) to enrich user queries with context from a knowledge base.
- Vector Stores: Integrated support for FAISS and Qdrant vector stores for efficient similarity search.
- LLM Provider Architecture: Designed a flexible provider model (`src/providers`) to easily switch between different LLM backends, with initial support for Ollama, Azure OpenAI, and Clova.
- FastAPI Application: Developed a robust FastAPI application (`main.py`) to serve the agent's capabilities via a RESTful API.
- Docker-Based Deployment: Created a multi-service Docker environment (`docker-compose.yml`) for easy and consistent deployment of the agent and its dependencies (Ollama, Qdrant).
- Interactive Chat: Added a script (`interactive_chat.py`) for local interactive testing of the conversational agent.
- Testing Framework: Established a comprehensive testing suite using `pytest`, with tests separated into unit and integration groups.
- CI/CD: Integrated `pre-commit` hooks to enforce code formatting standards.
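The provider architecture above can be sketched as a registry-plus-factory; the provider names come from this note, while the decorator mechanics, class names, and `complete()` method are assumptions about how `src/providers` might be shaped.

```python
class BaseProvider:
    """Common interface every LLM backend implements."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

_PROVIDERS = {}

def register(name: str):
    # Decorator-based registry; the real factory in src/providers may differ.
    def wrap(cls):
        _PROVIDERS[name] = cls
        return cls
    return wrap

@register("ollama")
class OllamaProvider(BaseProvider):
    def complete(self, prompt: str) -> str:
        return f"[ollama] {prompt}"  # stub instead of a real backend call

@register("clova")
class ClovaProvider(BaseProvider):
    def complete(self, prompt: str) -> str:
        return f"[clova] {prompt}"

def create_provider(name: str) -> BaseProvider:
    # Factory lookup: swapping backends is a one-string config change.
    return _PROVIDERS[name]()

print(create_provider("ollama").complete("hello"))
```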
Fixed
- Korean Language in Responses: Corrected an issue where the agent would occasionally return Korean text in the `structured_response`. The prompt has been updated with stricter constraints to enforce English-only output.
- Clova Provider Bug: Fixed a bug in the `ClovaProvider` where the `endpoint_url` from the configuration was being ignored.
- Docker Build and Runtime Issues: Resolved various issues to ensure a stable and reliable Docker deployment.
Changed
- Project Structure: Refactored the codebase to improve modularity and class relationships.
- API Schema: Unified and streamlined the API request and response schemas for clarity and consistency.
- Prompt Formatting: The prompt in `PromptComposer` has been reformatted using `textwrap.dedent` for improved readability.
- Test Configuration: The `pytest.ini` file has been updated to exclude the `tests/evaluation` directory from the default test run.