EVA Agent Release Note
v2.2+a2.0 (Oct 2, 2025)
Highlights
- Vision API v2 end-to-end wiring with typed envelopes and Langfuse v3 observability across visual, scenario-enrichment, and feedback flows (5c85cf9, 3f0cbd1, 28abd24, c90133a).
- New OpenAI provider and factory integration, alongside existing Ollama support (19dc1fb, ee7799e, 7a9daf9).
- RAG layer refactor: introduce `AsyncEmbeddingEngine` and `AsyncVectorStoreEngine`; drop legacy `AsyncSearchEngine`; add `ensure_space()` and a named-vectors flow (a9df66b, 5638948, 4b571e3, d9fce6c).
- Conversation agent fully refactored to class-based tasks with improved intent/category routing and UX-oriented suggestions (f581c53, 6e45298, adf3676, 2acf2e1).
- Detection-scenario enrichment pipeline consolidated and renamed to `enrich_scenario`; `camera_id` handling relaxed to support pre-registration flows (3d8287f, 941db68, 2625ff8, 9c6595b).
- New, versioned configs for VLM (Qwen2.5-VL:32B v2.2.0) and the RAG visual agent (v2.2.0) for consistent, reproducible environments (39aa3cd, 9d2865c, main.py).
Breaking Changes
- Vision API schemas
  - `VisualQueryRequest` requires `vlm_model_name` (f208a24).
  - Few-shot template now derives from `VLMScenarioAnalysis`; field access updated, `feedback_content` → `feedback_message` (1d50d71).
  - ScenarioEnricher result now wrapped; VisualAgent returns typed envelopes (`LanguageModelAgentResponse[T]`, `ChainBuiltResult`) (c62e2fb4, 60e2fb4, 285b09b, 45f3f25); see the envelope sketch after this list.
- Feedback API v2: request renamed and extended — add `camera_id`, `feedback` → `feedback_binary`, optional `feedback_message` (4c834ce). Wiring updated in `main.py`.
- RAG/Vector Store
  - Add `AsyncVectorStoreEngine`; retire `AsyncSearchEngine`; split/rename embed protocols (a9df66b, 5638948, 4b571e3).
  - Named-vector mapping required; `ensure_space()` must be called for per-camera spaces (5c85cf9, d9fce6c).
- Conversation
  - IntentClassifier no longer pulls RAG context; prompt composer updated (477e9ed, 31050ba).
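The typed envelope is the shape callers now unwrap instead of the bare VLM payload. A minimal sketch of that shape, assuming Pydantic v2 generic models; every field beyond the class names cited above is an illustrative guess, not the shipped schema:

```python
# Hypothetical sketch of the typed envelope; field names other than the
# class names cited in the release note are assumptions.
from typing import Generic, Optional, TypeVar

from pydantic import BaseModel

T = TypeVar("T")


class VLMScenarioAnalysis(BaseModel):
    # Assumed payload fields, for illustration only.
    scenario: str
    feedback_message: Optional[str] = None


class LanguageModelAgentResponse(BaseModel, Generic[T]):
    # Envelope wrapping the raw model output with routing metadata (assumed).
    result: T
    model_name: str
    trace_id: Optional[str] = None


# Callers now unwrap `.result` instead of consuming the bare payload.
envelope = LanguageModelAgentResponse[VLMScenarioAnalysis](
    result=VLMScenarioAnalysis(scenario="intrusion at night"),
    model_name="qwen2.5vl:32b",
)
print(envelope.result.scenario)
```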
Features
- Observability
  - Langfuse v3 client integration; per-request trace IDs; prompt previews recorded post-`ainvoke`; image scrubbing (3f0cbd1, 28abd24, 7778b73, 2910db8, 444b936). A scrubbing sketch follows this list.
- Providers
  - OpenAI provider implementation with tests; provider utils to extract provider/model names (19dc1fb, ee7799e, 3f0cbd1).
- Conversation Tasks
  - New tasks: set alert interval and feedback similarity threshold (6d91649).
  - Task responses suggest next actions: e.g., start monitoring, adjust threshold (2acf2e1).
- Vision & Enrichment
  - Scenario enrichment messages and schemas consolidated under `vision/detection_scenario`; new enrich configs (3d8287f, 941db68).
  - Visual agent message builder decoupled; guard added for missing few-shot template (f23f50f, 371fcf6).
- Configuration
  - Add VLM `qwen2.5vl:32b` v2.1.2 and v2.2.0; add RAG config `visual_agent` v2.2.0; Helm improvements incl. `imagePullSecrets` and env pkg CLI (39aa3cd, 9d2865c, 25a701e, f7bad38, env_pkg_cli.sh).
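Image scrubbing keeps base64 image payloads out of trace metadata before it is shipped to Langfuse. A minimal sketch of the idea in pure Python, independent of the actual Langfuse wiring; the function names and the size threshold are assumptions:

```python
# Hypothetical scrubber: replaces large base64 image strings in a trace
# payload with short placeholders before the payload is exported.
import base64
import binascii
from typing import Any

_MAX_INLINE_LEN = 256  # assumed threshold; the real cutoff may differ


def _looks_like_base64_image(value: str) -> bool:
    if value.startswith("data:image/"):
        return True
    if len(value) < _MAX_INLINE_LEN:
        return False
    try:
        # Probe a prefix only; non-base64 text fails fast.
        base64.b64decode(value[:1024], validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


def scrub_images(payload: Any) -> Any:
    """Recursively replace probable image blobs with a placeholder."""
    if isinstance(payload, dict):
        return {k: scrub_images(v) for k, v in payload.items()}
    if isinstance(payload, list):
        return [scrub_images(v) for v in payload]
    if isinstance(payload, str) and _looks_like_base64_image(payload):
        return f"<image scrubbed, {len(payload)} chars>"
    return payload
```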
Fixes
- Conversation routing & UX
  - Category detection only on AI messages; quick button text stripped; remove brightness strategy (e6a194e, 4e29e76).
  - Enforce English target in set-target; prompt and validation tweaks (80e2d8b).
  - Alert interval defaults/units clarified; minimums and conversions with tests (5e7fd81, bf64aae).
  - Quick button response inclusion and tests (0bb8dd5).
- Vision/Enrichment correctness
  - Detection-scenario language now matches the query language (265bf65).
  - Persist rendered prompt in span metadata and correct trace updates (8df3c0e, ab58ade).
- Build/Tooling
  - Black `language_version` cleanup; pre-commit/style fixes (90b6cb3, 7a9daf9, ce32292).
Config & Deployment
- Helm
  - Add `imagePullSecrets` for the eva-agent image; add `env_pkg_cli.sh` helper; add k3s values; prune legacy values (25a701e, f7bad38, new k3s values files).
  - Update chart templates and values; secret dockerconfig template added.
- Env & Samples
  - `.env.eva-agent.sample` and `.env.sample` updated; `.gitignore` expanded.
  - Versioned runtime configs for VLM, RAG, and `enrich_scenario` aligned to the app wiring in `main.py`.
Tests & Quality
- Extensive new unit tests across conversation tasks, providers (OpenAI), RAG engines, and vision message builders; integration markers for scenario enrichment and visual agent (multiple `tests/**` additions).
- Disabled Langfuse in unit tests; updated fixtures and conftest to match new schemas (b6ae545, tests additions in this release).
Upgrade Notes
- Vision API v2 (see the combined migration sketch after this list)
  - Ensure clients send `vlm_model_name` in `VisualQueryRequest` payloads (f208a24).
  - Few-shot payloads now use fields inherited from `VLMScenarioAnalysis`; rename `feedback_content` → `feedback_message` (1d50d71).
- Feedback API
  - Update payloads to include `camera_id`, `feedback_binary`, and optional `feedback_message` (4c834ce).
- Vector Store & Embeddings
  - Call `ensure_space()` for each camera-namespaced space before `add`/`query` operations (d9fce6c).
  - Migrate from legacy `AsyncSearchEngine` to `AsyncVectorStoreEngine` and from `encode()` → `embed()` methods (a9df66b, 4b571e3).
- Conversation
  - Remove any reliance on RAG context within IntentClassifier/PromptComposer (477e9ed, 31050ba).
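The upgrade steps above can be tied together in one client-side sketch. Everything here is hedged: the endpoint paths, constructor arguments, and helper names are assumptions; only the fields and methods cited in the notes (`vlm_model_name`, `feedback_binary`, `feedback_message`, `ensure_space()`, `embed()`) come from this release.

```python
# Hypothetical client-side migration sketch; URLs and signatures are
# assumptions, not the agent's documented API.
import httpx

BASE_URL = "http://localhost:8000"  # assumed agent address

# 1) Vision API v2: VisualQueryRequest now requires `vlm_model_name`.
visual_payload = {
    "query": "Is anyone loitering near the gate?",
    "camera_id": "cam-01",
    "vlm_model_name": "qwen2.5vl:32b",
}

# 2) Feedback API v2: `feedback` became `feedback_binary`, plus `camera_id`
#    and an optional `feedback_message` (was `feedback_content`).
feedback_payload = {
    "camera_id": "cam-01",
    "feedback_binary": True,
    "feedback_message": "Alert was correct.",
}

with httpx.Client(base_url=BASE_URL) as client:
    client.post("/v2/visual/query", json=visual_payload)  # assumed path
    client.post("/v2/feedback", json=feedback_payload)    # assumed path

# 3) Vector store: ensure the per-camera space exists before add/query,
#    and embed explicitly (encode() is gone, embed() replaces it).
async def upsert(engine, embedder, camera_id: str, text: str) -> None:
    space = f"camera_{camera_id}"               # assumed naming scheme
    await engine.ensure_space(space)            # required since d9fce6c
    vector = await embedder.embed(text)         # embed() replaces encode()
    await engine.add(space, vectors=[vector])   # assumed signature
```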
Changelog (selected commits)
- feat(api): Vision API v2 (schemas) — 091140c
- feat(api/schemas): add required `vlm_model_name` — f208a24
- feat(api/feedback): v2 request schema — 4c834ce
- refactor(vector-store,rag)!: add `AsyncVectorStoreEngine`; drop `AsyncSearchEngine` — a9df66b
- feat(rag/vector-stores): expose `ensure_space()`; implement in Qdrant — d9fce6c
- refactor(vision)!: wrap VLM result in typed envelope — 60e2fb4
- refactor(vision)!: wrap ScenarioEnricher result — c62e2fb4
- feat: implement OpenAIProvider and tests — 19dc1fb, ee7799e
- refactor(conversation): remove RAG from intent — 477e9ed
- feat(conversation): new tasks for thresholds/intervals — 6d91649
- fix: category/quick button/brightness updates — e6a194e, 4e29e76
- fix: alert interval defaults/units + tests — 5e7fd81, bf64aae
- fix: detection scenario language match — 265bf65
- o11y: Langfuse v3 wiring, prompt capture, scrubbing — 3f0cbd1, 28abd24, 7778b73
v2.1+a1.2 (Sep 10, 2025)
This release marks a significant step forward in the agent's capabilities, with a major focus on enhancing the Retrieval-Augmented Generation (RAG) system, improving observability and tracing, and streamlining the development and deployment experience.
Key Features
- Named-Vector RAG and Search: The RAG system has been fundamentally upgraded to support named vectors. This allows for more sophisticated, multi-modal search with weighted scoring across different modalities (e.g., image, text, detection scenario), leading to more relevant and accurate retrieval results. A weighted-scoring sketch follows this list.
- Feedback-Aware Alert Correction: The system can now intelligently correct alerts based on user feedback, creating a more robust and self-improving feedback loop.
- Dynamic Agent ID for LangFuse: LangFuse tracing now uses dynamic `agent_id`s based on camera information. This greatly improves the traceability of requests and observations, making it easier to debug and monitor the system's behavior in a multi-camera environment.
- Enhanced Development and Deployment:
  - Docker Optimizations: The Docker build process has been optimized for speed, and the `docker-compose` setup has been improved for a better local development experience.
  - Local Development Script: A new `run_agent.sh` script has been added to simplify running the agent locally with hot-reloading.
  - Ollama Model Warm-up: Ollama models are now pre-loaded on container start to reduce the latency of the first request.
  - EKS Deployment: Significant work has been done to enable deployment of the application to Amazon EKS.
- `GENERAL_CONVERSATION` Task Removed: The `GENERAL_CONVERSATION` task has been removed from the list of supported tasks.
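The weighted scoring over named vectors can be pictured as a convex combination of per-modality similarities, renormalized over whichever modalities returned a score. A minimal sketch; the modality names match the examples above, but the weights and function names are assumptions:

```python
# Hypothetical weighted fusion over named-vector scores; weights are
# illustrative, not the shipped configuration.
from typing import Dict

SCORE_WEIGHTS: Dict[str, float] = {
    "image": 0.5,
    "text": 0.3,
    "detection_scenario": 0.2,
}


def fuse_scores(per_vector_scores: Dict[str, float]) -> float:
    """Combine per-modality similarity scores into one ranking score.

    Missing modalities contribute nothing; the result is renormalized
    over the weights that were actually present.
    """
    total, weight_sum = 0.0, 0.0
    for name, weight in SCORE_WEIGHTS.items():
        if name in per_vector_scores:
            total += weight * per_vector_scores[name]
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


print(fuse_scores({"image": 0.91, "text": 0.72}))  # ≈ 0.839
```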
Fixes
- Qdrant Hybrid Search: Improved the reliability of score fusion in Qdrant's hybrid search by using a large prefetch to avoid missing modalities.
- Prompt Definitions: Corrected the message definition in prompts.
- Test Execution: Ensured the correct Python version is used in the test execution script.
Refactoring
- Observability: The `observability` submodule has been removed, and the tracing logic has been refactored into decorators for a cleaner separation of concerns.
- Environment Variables: Standardized and clarified environment variables, for example by renaming `VLM_ENDPOINT_URL` to `OLLAMA_BASE_URL`.
- API Naming: Renamed the `VisualAgent`'s execution methods from `run`/`arun` to `invoke`/`ainvoke` for better consistency.
- RAG and Vector Store Codebase: The RAG and vector store codebase has been significantly refactored to support named vectors, improve modularity, and centralize data preprocessing and embedding.
Breaking Changes
- Environment Variables:
  - The `PORT` environment variable has been renamed to `AGENT_INTERNAL_PORT`.
  - The `VLM_ENDPOINT_URL` environment variable has been renamed to `OLLAMA_BASE_URL`.
- `VisualAgent` API: The `run` and `arun` methods have been removed. Use `invoke` and `ainvoke` instead.
- RAG Configuration: The RAG configuration for the visual agent now requires a `score_config` section for named-vector search.
- Search and Feedback APIs: The APIs for the `AsyncSearchEngine` and `AsyncFeedbackAgent` have been changed. Callers must now pre-embed queries and provide more explicit configuration. A short migration sketch follows this list.
v2.1+a1.1 (Aug 29, 2025)
Summary
This release introduces significant internal system optimizations and refinements across the RAG embedding system and Langfuse tracing integration, aiming to enhance performance, maintainability, and consistency.
Key Changes
- RAG Embeddings System Enhancements:
- Refactored the asynchronous embedding provider registry for improved efficiency and clarity, particularly for image embeddings.
- Streamlined embedder instantiation by removing redundant caching and threading locks, leading to a more direct and potentially faster setup.
- Reduced verbose logging during embedding model loading.
- Langfuse Tracing Integration Improvements:
- Optimized image data handling within the Langfuse tracing decorator, externalizing complex processing logic for better modularity and reduced overhead.
- Refined error logging to Langfuse, providing more precise error levels and metadata.
- Environment Variable Loading Standardization:
- Standardized environment variable loading to explicit calls at the application entry point or within test configurations, ensuring consistent and predictable environment setup across the project.
- Internal Code Cleanups:
- Implemented general code cleanups and minor structural adjustments across various utility and embedding-related modules, contributing to overall code health and maintainability.
v2.1.0 (Aug 26, 2025)
Added
- Localization for SET_TARGET task: Implemented localization for responses related to the `SET_TARGET` task.
- Target Management Operations: Introduced `add`, `delete`, and `set` operations for comprehensive target management.
- Centralized Langfuse Tracing: Integrated a centralized Langfuse tracing system for improved observability and debugging.
- Conversation History: Incorporated chat history into the conversational agent's prompt for more context-aware interactions.
- Dynamic LLM Provider: Added a dynamic LLM provider mechanism, allowing for easier switching and management of different LLM backends.
- Langfuse Image Upload & VLM Preprocessing: Enhanced Langfuse integration with image upload capabilities and improved VLM preprocessing.
- Structured Fallback & Rich Context: Implemented a structured fallback mechanism for the conversational agent and enabled passing of richer context to tasks.
- Multilingual Support: Introduced multi-language support and refined task classification for broader applicability.
- New Conversational Tasks:
  - `ANSWER_SYSTEM_QUESTION`: A new task to handle system-related queries.
  - `HANDLE_UNSUPPORTED_REQUEST`: A new task to gracefully manage unsupported user requests.
- Enhanced Task Prompting: Improved the prompting mechanism for each task type with more detailed descriptions and examples.
- Typed Few-Shot Meta & VLM v2.1.0: Added typed few-shot metadata and updated VLM configurations to version 2.1.0.
- Scenario Enhancer Pipeline: Introduced a scenario enhancer pipeline with new configurations (`v0.1.0`, `v1.0.0`).
- Detection Scenario Task: Added a new task type for handling detection scenarios.
- Brightness Control Task: Implemented a brightness control task with support for `bright_param`.
- Message Builders & Schemas: Introduced base, visual, and conversational message builders, along with a new detection schema. A builder sketch follows this list.
- New VLM Prompt Configurations: Added several new VLM prompt configurations (`v0.1.0`, `v0.1.1`, `v0.1.2`, `v0.1.2a`, `v2.0.0`, `v2.0.1`, `v2.1.0`, `v2.1.0a`, `v2.1.1`, `v2.1.2`).
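The base/visual/conversational builder split can be pictured as a small class hierarchy. Everything below is an assumption about the design (names included), not the shipped code:

```python
# Hypothetical message-builder hierarchy; method and field names are assumed.
from abc import ABC, abstractmethod
from typing import Any, Dict, List

Message = Dict[str, Any]  # e.g. {"role": "user", "content": ...}


class BaseMessageBuilder(ABC):
    """Shared scaffolding: every builder yields a system + user message pair."""

    def __init__(self, system_prompt: str) -> None:
        self.system_prompt = system_prompt

    @abstractmethod
    def build(self, query: str, **kwargs: Any) -> List[Message]:
        ...


class ConversationalMessageBuilder(BaseMessageBuilder):
    def build(self, query: str, **kwargs: Any) -> List[Message]:
        return [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": query},
        ]


class VisualMessageBuilder(BaseMessageBuilder):
    def build(self, query: str, **kwargs: Any) -> List[Message]:
        # Visual builders attach the image alongside the text query.
        image_b64 = kwargs["image_b64"]
        return [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": query, "images": [image_b64]},
        ]
```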
Fixed
- Scenario Enhancer Prompt: Corrected an issue in the scenario enhancer prompt.
- Search Engine Tests: Updated search engine tests for compatibility with asynchronous operations.
- F-string Formatting: Fixed f-string formatting errors in status messages.
- Docker Base Image Pinning: Pinned the Docker base image to a specific version for improved stability.
- Asynchronous Synchronization: Resolved synchronization issues in asynchronous operations.
- Agent Logic & QueryRequest Alignment: Aligned agent logic and the `QueryRequest` model with updated API formats and test expectations.
- VLM Preprocessing: Modified VLM preprocessing to use base64 pass-through/encode.
Changed
- ANSWER_SYSTEM_QUESTION Task Handling: Streamlined the handling of the `ANSWER_SYSTEM_QUESTION` task.
- QueryRequest Passing: Modified tasks to receive the full `QueryRequest` object.
- Pre-commit Hooks: Updated pre-commit hooks to include `autoflake` and applied them repository-wide for consistent code formatting.
- VLM Prompt Variants & Schema: Standardized the YAML schema for VLM prompt variants.
- LM Model Requirement: Made the LM `model` a required field, extended few-shot metadata, and clarified analysis documentation.
- Debug Logging: Improved debug logging for built messages with pretty-printing.
- README Updates: Updated the `README.md` file with new information.
- Tag Mechanism & Sample Environment: Updated the tag mechanism and sample environment configurations.
- Language Detection: Replaced `langdetect` with a heuristic-based approach for language detection (see the sketch after this list).
- Tracing Method: Modularized the tracing method for better organization.
- Provider Factory Function: Renamed the provider factory function for clarity.
- Default LLM Model: Changed the default LLM model to `qwen2.5vl`.
- VisualAgent Prompt Configuration: Switched `VisualAgent` to use the `v2.0.0` prompt configuration.
- Ollama Configuration: Added Ollama concurrency environment variables, dropped non-GPU reservations, and updated the default LLM.
- VLM Prompt Language/Tone: Refined language and tone rules for VLM prompts.
- Modular Task Execution: Refactored the conversational agent to implement modular task execution via a task registry.
- Asynchronous ConversationalAgent: Made the `ConversationalAgent` fully asynchronous, utilizing `run_in_threadpool` for blocking operations.
- Docstring & Comment Separation: Separated file paths from docstrings and moved them to comments.
- SearchEngine Asynchronicity: Made `SearchEngine` temporarily asynchronous.
- Bullet Caps: Tightened short-mode bullet caps to 1.
- Externalized System Prompt: Externalized the conversational agent's system prompt.
- Embedding Models Device: Changed the default device for embedding models to CPU.
- Gitignore Update: Added the `docs` directory to `.gitignore`.
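Given the English-only constraints elsewhere in these notes, the heuristic plausibly keys on script ranges rather than a statistical model. A minimal sketch under that assumption; the cutoff, the language set, and the function name are guesses, not the shipped rules:

```python
# Hypothetical script-range heuristic replacing langdetect.
def detect_language(text: str) -> str:
    """Classify text as 'ko' or 'en' by counting Hangul code points."""
    hangul = sum(
        1
        for ch in text
        if "\uac00" <= ch <= "\ud7a3"  # Hangul syllables block
        or "\u1100" <= ch <= "\u11ff"  # Hangul Jamo block
    )
    letters = sum(1 for ch in text if ch.isalpha())
    if letters and hangul / letters > 0.3:  # assumed cutoff
        return "ko"
    return "en"


print(detect_language("카메라를 확인해 주세요"))   # -> ko
print(detect_language("Please check the camera"))  # -> en
```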
v2.0.0 (Aug 8, 2025)
This is the first official release of the EVA Agent, a conversational and visual AI agent.
Added
- Conversational Agent: Implemented a sophisticated conversational agent (`src/conversation`) with capabilities for intent classification, context building, and dynamic prompt composition.
- Visual Agent: Introduced a visual agent (`src/vision`) for image analysis tasks, leveraging a configurable Visual Language Model (VLM) backend.
- RAG Pipeline: Built a Retrieval-Augmented Generation (RAG) pipeline (`src/rag`) to enrich user queries with context from a knowledge base.
- Vector Stores: Integrated support for FAISS and Qdrant vector stores for efficient similarity search.
- LLM Provider Architecture: Designed a flexible provider model (`src/providers`) to easily switch between different LLM backends, with initial support for Ollama, Azure OpenAI, and Clova. A factory sketch follows this list.
- FastAPI Application: Developed a robust FastAPI application (`main.py`) to serve the agent's capabilities via a RESTful API.
- Docker-Based Deployment: Created a multi-service Docker environment (`docker-compose.yml`) for easy and consistent deployment of the agent and its dependencies (Ollama, Qdrant).
- Interactive Chat: Added a script (`interactive_chat.py`) for local interactive testing of the conversational agent.
- Testing Framework: Established a comprehensive testing suite using `pytest`, with tests separated into unit and integration groups.
- CI/CD: Integrated `pre-commit` hooks to enforce code formatting standards.
Fixed
- Korean Language in Responses: Corrected an issue where the agent would occasionally return Korean text in the `structured_response`. The prompt has been updated with stricter constraints to enforce English-only output.
- Clova Provider Bug: Fixed a bug in the `ClovaProvider` where the `endpoint_url` from the configuration was being ignored.
- Docker Build and Runtime Issues: Resolved various issues to ensure a stable and reliable Docker deployment.
Changed
- Project Structure: Refactored the codebase to improve modularity and class relationships.
- API Schema: Unified and streamlined the API request and response schemas for clarity and consistency.
- Prompt Formatting: The prompt in `PromptComposer` has been reformatted using `textwrap.dedent` for improved readability; a short example follows this list.
- Test Configuration: The `pytest.ini` file has been updated to exclude the `tests/evaluation` directory from the default test run.
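For reference, `textwrap.dedent` strips the common leading indentation from a triple-quoted string, which is what lets prompts be written indented inline yet rendered flush-left. A self-contained example (the prompt text itself is illustrative):

```python
import textwrap

# The trailing backslash after the opening quotes suppresses the leading
# blank line; dedent removes the shared indentation.
PROMPT = textwrap.dedent(
    """\
    You are EVA, a camera-monitoring assistant.
    Answer in English only.
    """
)
print(PROMPT)
```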