Add RT-DETR V2 with VitPose model: Introduce new human detection model combining RT-DETR V2 object detection with VitPose pose estimation for human-only detection scenarios.
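As a rough illustration of the two-stage flow (a hedged sketch using Hugging Face Transformers; the checkpoint names and the person-class index are assumptions, not necessarily what the handler ships):

```python
import torch
from PIL import Image
from transformers import (
    AutoProcessor,
    RTDetrV2ForObjectDetection,
    VitPoseForPoseEstimation,
)

# Stage 1: detect people with RT-DETR V2 (checkpoint name is an assumption).
det_name = "PekingU/rtdetr_v2_r50vd"
det_processor = AutoProcessor.from_pretrained(det_name)
detector = RTDetrV2ForObjectDetection.from_pretrained(det_name)

image = Image.open("people.jpg")
inputs = det_processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = detector(**inputs)
detections = det_processor.post_process_object_detection(
    outputs, target_sizes=[(image.height, image.width)], threshold=0.5
)[0]
# Keep only the COCO "person" class (index 0) for human-only detection.
person_boxes = detections["boxes"][detections["labels"] == 0]

# Stage 2: estimate keypoints for each detected person with VitPose.
pose_name = "usyd-community/vitpose-base-simple"
pose_processor = AutoProcessor.from_pretrained(pose_name)
pose_model = VitPoseForPoseEstimation.from_pretrained(pose_name)

# VitPose expects per-image boxes in COCO (x, y, w, h) format.
boxes_xywh = person_boxes.clone()
boxes_xywh[:, 2:] -= boxes_xywh[:, :2]
pose_inputs = pose_processor(image, boxes=[boxes_xywh.numpy()], return_tensors="pt")
with torch.no_grad():
    pose_outputs = pose_model(**pose_inputs)
keypoints = pose_processor.post_process_pose_estimation(
    pose_outputs, boxes=[boxes_xywh.numpy()]
)[0]
```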
Add new API interface: Implement an updated API interface compatible with EVA App v2.3.0, providing enhanced integration capabilities.
Add Kubernetes deployment support: Implement Helm chart configuration for containerized deployment on Kubernetes clusters, enabling scalable and managed production deployments.
Add foreground/background execution modes for proxy server: Introduce --foreground (-f) flag to run the proxy server in foreground mode with terminal output, while maintaining background mode with nohup as the default behavior for production deployments.
Add granular TorchServe log viewing: Implement individual log file viewing commands (access, errors, metrics, model, warnings) for more targeted debugging and monitoring of specific TorchServe components.
Add enhanced log viewer with file metadata: Display log file sizes, rotation status, and structured output for all TorchServe logs including access, errors, model metrics, model operations, and warnings.
Implement automatic absolute path resolution for log4j configuration: Add update_config_with_absolute_paths() function to dynamically convert relative log4j2.xml paths to absolute paths, ensuring proper logging regardless of execution context or working directory.
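The shipped conversion runs inside the deployment script; a minimal Python sketch of the same idea, assuming a log4j2.xml that uses relative fileName/filePattern attributes:

```python
import os
import re

def update_config_with_absolute_paths(config_path: str, logs_dir: str) -> None:
    """Rewrite relative log paths in log4j2.xml as absolute paths so
    logging works regardless of the working directory TorchServe is
    started from. (Illustrative sketch of the deploy-script logic.)"""
    logs_dir = os.path.abspath(logs_dir)
    with open(config_path) as f:
        xml = f.read()
    # Turn values like fileName="logs/access.log" into absolute paths.
    xml = re.sub(
        r'(fileName|filePattern)="logs/',
        lambda m: f'{m.group(1)}="{logs_dir}/',
        xml,
    )
    with open(config_path, "w") as f:
        f.write(xml)
```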
Enhance deployment script debugging capabilities: Add comprehensive debug output including config file changes, TorchServe startup command, periodic health checks during startup, and automatic error log display on startup failures.
Optimize Docker image for production readiness: Install essential system utilities (lsof, net-tools, procps) required by deployment script for port monitoring, process management, and service health checks.
Implement log retention strategy with automatic rotation: Configure separate retention policies for different log types (errors: 30 days, warnings: 7 days, access/metrics: 3 days) with gzip compression, reducing storage requirements by 90% while maintaining critical operational data.
Add intelligent metric sampling: Cap MODEL_METRICS and TS_METRICS logging at 6 log lines per second, capturing inference latency and performance metrics while reducing log volume by ~80% in high-traffic environments.
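The cap itself is applied in the TorchServe logging configuration; conceptually it behaves like this per-second limiter (a Python illustration of the sampling behavior, not the actual mechanism):

```python
import time

class RateCappedSampler:
    """Allow at most `rate` records per wall-clock second, drop the rest."""

    def __init__(self, rate: int = 6):
        self.rate = rate
        self.window = int(time.time())
        self.count = 0

    def allow(self) -> bool:
        now = int(time.time())
        if now != self.window:          # new second: reset the budget
            self.window, self.count = now, 0
        self.count += 1
        return self.count <= self.rate  # emit the first `rate` records only
```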
Optimize proxy server logging: Implement 1% sampling for INFO-level logs while preserving all ERROR/WARNING logs, reducing proxy log volume by 99% under high load (30+ req/sec).
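A minimal sketch of how such sampling can be attached to the Python proxy's logger, assuming the standard logging module (the filter name and 1% rate wiring are illustrative):

```python
import logging
import random

class SampledInfoFilter(logging.Filter):
    """Keep ~1% of INFO records; always pass WARNING and above."""

    def __init__(self, sample_rate: float = 0.01):
        super().__init__()
        self.sample_rate = sample_rate

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True                      # never drop warnings/errors
        return random.random() < self.sample_rate

logging.getLogger("proxy").addFilter(SampledInfoFilter(0.01))
```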
Update Dockerfile to run services in foreground mode: Modify CMD to use --foreground flag, enabling proper Docker log streaming and preventing container exit while maintaining service availability.
Fix CUDA processor initialization: Modify processor to execute on CUDA devices instead of CPU, significantly reducing CPU usage and improving inference performance.
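Conceptually, the fix moves both the model and the preprocessed tensors onto the GPU when one is available; a hedged sketch using OWLv2 as an example (checkpoint name is illustrative):

```python
import torch
from transformers import Owlv2Processor, Owlv2ForObjectDetection

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

name = "google/owlv2-base-patch16-ensemble"
processor = Owlv2Processor.from_pretrained(name)
model = Owlv2ForObjectDetection.from_pretrained(name).to(device).eval()

def detect(image, texts):
    # Send the preprocessed tensors to the same CUDA device as the model
    # instead of leaving them on the CPU.
    inputs = processor(text=texts, images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        return model(**inputs)
```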
Fix config file handling for multiple path patterns: Update sed patterns to handle various log4j path formats (file:///, file://, relative paths) ensuring robust configuration file processing across different environments.
Improve temp config file cleanup: Add explicit cleanup of temporary .properties.tmp files in clean command and error handling paths to prevent accumulation of temporary configuration files.
Fix log4j2.xml configuration loading: Resolve path resolution issues by using vmargs=-Dlog4j.configurationFile with absolute paths and replacing ${sys:log_location} with direct logs path references.
Add system utilities to Docker image: Install lsof, net-tools, and procps packages to support deployment script's port checking, process management, and service monitoring capabilities in containerized environments.
Add new detection models for enhanced object detection capabilities: Integrate OmDet-Turbo and LLMDet models for improved open-vocabulary zero-shot object detection.
Add flexible model endpoint routing: Update the proxy server to support dynamic model-specific endpoints (changed from predictions/Owl-v2 to predictions/{model_name}) to accommodate the prediction methods of different models.
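A minimal sketch of the dynamic route, assuming TorchServe's default inference port (8080); apart from predictions/{model_name}, the names are illustrative:

```python
import httpx
from fastapi import FastAPI, Request, Response

app = FastAPI()
TS_INFERENCE = "http://localhost:8080"  # TorchServe inference address

@app.post("/predictions/{model_name}")
async def predict(model_name: str, request: Request) -> Response:
    # Forward the body to whichever model the client named, rather than
    # a hard-coded predictions/Owl-v2 route.
    async with httpx.AsyncClient() as client:
        upstream = await client.post(
            f"{TS_INFERENCE}/predictions/{model_name}",
            content=await request.body(),
            headers={"content-type": request.headers.get("content-type", "")},
        )
    return Response(content=upstream.content, status_code=upstream.status_code)
```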
Add comprehensive log management commands: Introduce new logs, status, and clean commands for easier service monitoring and maintenance.
logs command: View service logs with configurable line count (e.g., ./run.sh logs proxy -n 100)
status command: Check running services, port status, and disk usage
clean command: Remove old rotated logs and temporary files (e.g., ./run.sh clean --days 30)
Implement automatic log rotation with retention policy: Configure Log4j2-based log rotation with 15-day retention, daily rotation, 100MB size limit, and automatic .gz compression for all TorchServe logs (access, model, service, metrics).
Enhance service cleanup reliability: Improve stop_services() function to handle partial service states gracefully with specific pattern matching (python.*/proxy/main.py) and health checks before stopping services.
Fix deployment script cleanup behavior: Correct cleanup trap to preserve background services on normal exit and only trigger cleanup on errors, preventing unintended service termination.
Improve log configuration for Docker compatibility: Update eva_ts/config.properties and create eva_ts/log4j2.xml with relative paths to ensure proper operation in containerized environments.
TorchServe Migration: Migrated from ALO ML framework to TorchServe for production-grade model serving with improved reliability and scalability.
Real-Time HTTP-Based Inference: Replaced file-based API communication with real-time HTTP-based inference endpoints for faster and more efficient processing.
Unified API Endpoint: Introduced a FastAPI-based proxy server that consolidates TorchServe's multiple ports (inference, management, and metrics) into a single unified interface while maintaining consistent API endpoint paths.
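A sketch of the consolidation idea, assuming TorchServe's default ports (8080 inference, 8081 management, 8082 metrics); the routing table is illustrative, not the exact proxy code:

```python
import httpx
from fastapi import FastAPI, Request, Response

app = FastAPI()
UPSTREAMS = {
    "predictions": "http://localhost:8080",  # inference
    "models": "http://localhost:8081",       # management
    "metrics": "http://localhost:8082",      # metrics
}

@app.api_route("/{full_path:path}", methods=["GET", "POST", "PUT", "DELETE"])
async def route(full_path: str, request: Request) -> Response:
    # Pick the TorchServe port that owns this path prefix, so clients
    # only ever talk to the proxy's single port.
    upstream = UPSTREAMS.get(full_path.split("/", 1)[0])
    if upstream is None:
        return Response(status_code=404)
    async with httpx.AsyncClient() as client:
        resp = await client.request(
            request.method, f"{upstream}/{full_path}",
            content=await request.body(),
        )
    return Response(content=resp.content, status_code=resp.status_code)
```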
Optimized OWLv2 Model Handlers: Separated OWLv2 serving into two independent handlers, one for zero-shot detection and one for image-guided detection, enabling optimized batch inference per handler.
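The split follows TorchServe's handler pattern; a skeletal sketch of the zero-shot side (class, file, and checkpoint names are assumptions):

```python
# owlv2_zero_shot_handler.py (hypothetical file name)
import torch
from ts.torch_handler.base_handler import BaseHandler
from transformers import Owlv2Processor, Owlv2ForObjectDetection

class Owlv2ZeroShotHandler(BaseHandler):
    """Serves only text-conditioned zero-shot detection, so its batching
    and pre/post-processing can be tuned independently of the
    image-guided (few-shot) handler."""

    def initialize(self, context):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        name = "google/owlv2-base-patch16-ensemble"
        self.processor = Owlv2Processor.from_pretrained(name)
        self.model = (
            Owlv2ForObjectDetection.from_pretrained(name).to(self.device).eval()
        )

    def preprocess(self, data):
        ...  # decode images and text queries from the batched requests

    def inference(self, inputs):
        with torch.no_grad():
            return self.model(**inputs)

    def postprocess(self, outputs):
        ...  # post_process_object_detection, then JSON per request
```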
Few-Shot Learning Support: Added few-shot learning capabilities through the image-guided detection handler.
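For reference, the image-guided path in Transformers looks roughly like this (checkpoint and file names are illustrative):

```python
import torch
from PIL import Image
from transformers import Owlv2Processor, Owlv2ForObjectDetection

name = "google/owlv2-base-patch16-ensemble"
processor = Owlv2Processor.from_pretrained(name)
model = Owlv2ForObjectDetection.from_pretrained(name)

target = Image.open("scene.jpg")          # image to search in
query = Image.open("example_object.jpg")  # visual example of the object

inputs = processor(images=target, query_images=query, return_tensors="pt")
with torch.no_grad():
    outputs = model.image_guided_detection(**inputs)

# Boxes and scores in the target image's coordinate space.
results = processor.post_process_image_guided_detection(
    outputs, target_sizes=torch.tensor([[target.height, target.width]])
)
```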
Project Architecture Restructured: Reorganized project architecture with clear separation between TorchServe handlers and proxy server components for improved maintainability and scalability.
Utility Modules Streamlined: Optimized utility modules for better reusability across handlers and cleaner codebase organization.
Model Workflow Improved: Enhanced model download and packaging workflow with dedicated scripts for more efficient model management.
LLMDet and OmDet-Turbo Temporarily Removed: Temporarily removed the LLMDet and OmDet-Turbo models for the migration to the TorchServe architecture. These models will be reintroduced in version 1.1.0 with TorchServe support.
Add new detection models for enhanced object detection capabilities: Integrate LLMDet and OmDet-Turbo models for open-vocabulary zero-shot object detection.
Remove YOLOE model from the supported model list: Due to licensing issues, YOLOE has been excluded from the supported models.
Python 3.10 Compatibility Fixed: Resolved compatibility issues with Python 3.10 to ensure proper functionality across supported Python versions.
Ultralytics Dependencies Removed: Removed ultralytics dependencies to resolve version conflicts and improve package stability.
Bounding Box Handling Error Fixed: Fixed an error in bbox handling that occurred when processing multiple bounding boxes during few-shot learning operations.
Inference Dataset Folder Created: Added inference_dataset folder to establish a consistent inference workflow.