EVA
EVA is an innovative tool that combines vision models and Multi-modal LLMs to transform ordinary cameras into smart AI cameras.
Control the AI through natural conversation, with no complex coding, and easily build AI services tailored to your site.
Service Scenario
Perfect Harmony of Vision Models and Multi-modal LLMs

EVA combines vision models and Multi-modal LLMs to deliver powerful functionality.
The Multi-modal LLM sets tasks for the vision model through natural user conversations, and the vision model analyzes images from cameras in real time.
For example, tasks like hazard detection, activity monitoring, or quality inspection can be easily configured through conversation alone.

Once the vision model analyzes images, the Multi-modal LLM recognizes the context and executes predefined events.
These services leverage various technologies provided by Mellerikat to quickly and efficiently address on-site challenges.
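The flow described above can be sketched in a few lines: the Multi-modal LLM turns a user's request into a task for the vision model, and the vision model's findings feed back for context-aware event handling. The keyword mapping below is a toy stand-in for the LLM step; all names are illustrative assumptions, not EVA's actual API.

```python
# Toy stand-in for the LLM mapping a user's conversation to a vision task.
TASK_KEYWORDS = {
    "helmet": "hazard_detection",
    "fall": "hazard_detection",
    "scratch": "quality_inspection",
    "loitering": "activity_monitoring",
}

def configure_task(user_request: str) -> str:
    """Map a natural-language request to a vision-model task."""
    for keyword, task in TASK_KEYWORDS.items():
        if keyword in user_request.lower():
            return task
    return "general_monitoring"

print(configure_task("Alert me when someone is not wearing a helmet"))
# hazard_detection
```

In EVA the mapping is performed by the Multi-modal LLM itself, so no keyword table needs to be maintained; the sketch only shows where that step sits in the flow.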

EVA extracts context from user descriptions and automatically configures the optimal model, AI logic, and operational environment to deliver tailored services.
Transform into a Smart AI Camera
Easily Create a Smart AI Camera
EVA can instantly transform an ordinary camera into a smart AI camera by simply entering its network information.
Connect EVA to various scenarios like hazard detection, activity monitoring, or quality inspection for immediate use.
Video demonstrating the camera registration process

Beyond simply capturing images, EVA also supports camera control functions required for the service.
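As a rough illustration of how little is needed to register a camera, the sketch below composes a stream URL from basic network information. The record fields and helper are illustrative assumptions, not EVA's actual registration schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CameraRegistration:
    """Hypothetical registration record; fields are illustrative only."""
    name: str
    stream_url: str
    username: Optional[str] = None
    password: Optional[str] = None

def build_rtsp_url(host: str, port: int = 554, path: str = "stream1") -> str:
    """Compose an RTSP stream URL from basic network information."""
    return f"rtsp://{host}:{port}/{path}"

# Registering an ordinary IP camera needs only its network information.
cam = CameraRegistration(name="dock-entrance",
                         stream_url=build_rtsp_url("192.168.0.42"))
print(cam.stream_url)  # rtsp://192.168.0.42:554/stream1
```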
AI Tailored Through Conversation
EVA enables intuitive AI control through natural conversations.
Users can set desired tasks through dialogue, customizing AI for camera footage and specific scenarios.
Video showing object detection setup
Video showing camera-specific settings
Domain Understanding
AI Solutions Optimized for Your Site
Each site requires different domain knowledge. EVA enables vision models tailored to your site through conversation alone, without complex coding, data collection, or labeling.
Its strength lies in being accessible to anyone, even without specialized expertise.

Normal vs. Defect Example
Video customizing a specific defect detection model
Context-Aware
Smart AI That Understands Context
By combining vision models and Multi-modal LLMs, EVA analyzes images and precisely understands contextual situations.
This supports intelligent, site-optimized decision-making for tasks such as anomaly detection, quality inspection, and safety monitoring.
Video showing context recognition through site-specific agent prompts
Video demonstrating EVA’s AI technology recognizing and analyzing situations in real time across various sites, intuitively describing movements of people, vehicles, objects, and hazardous situations

Notifications about recognized situations are provided through various channels such as Teams, Slack, and email.
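A minimal sketch of that multi-channel fan-out: the channel names come from the text, while the dispatch function is a hypothetical stand-in (a real deployment would call each channel's own API).

```python
def notify(channels, message):
    """Fan a recognized-situation alert out to every configured channel."""
    delivered = []
    for channel in channels:
        # A real implementation would call the Teams/Slack/email API here;
        # this sketch only records the routing decision.
        delivered.append(f"[{channel}] {message}")
    return delivered

alerts = notify(["teams", "slack", "email"],
                "Person detected in restricted zone")
print(alerts[0])  # [teams] Person detected in restricted zone
```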
Service Architecture
Efficient and Powerful System Structure

Users can interact with EVA Agent through the EVA Service App to configure customized logic or models for each camera on EVA Edge.
EVA Edge collects camera frames via the pipeline module and detects objects using Vision Solution in Zero-shot or Few-shot modes.
When an object is detected, EVA Agent is asked to assess the situation and determine whether it matches a predefined scenario;
if so, it triggers notifications or passes data to external systems.
The pipeline module and Vision Solution operate multiple instances simultaneously based on the system environment,
optimizing resource usage and minimizing AI analysis latency for each camera.
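The per-frame flow described above can be summarized in a short sketch: the pipeline module pulls a frame, the vision solution screens it, and only frames with detections are escalated to the agent. All function names below are hypothetical stand-ins, not EVA's actual components.

```python
def detect_objects(frame):
    """Stand-in for the Vision Solution (Zero-shot/Few-shot detector)."""
    return frame.get("objects", [])

def assess_situation(objects, scenarios):
    """Stand-in for EVA Agent: check detections against scenarios."""
    return [s for s in scenarios if s in objects]

def process_frame(frame, scenarios):
    objects = detect_objects(frame)
    if not objects:          # nothing detected: no agent call is made
        return []
    return assess_situation(objects, scenarios)

matched = process_frame({"objects": ["person", "forklift"]},
                        scenarios=["person"])
print(matched)  # ['person']
```

The early return is the key design point: the expensive agent assessment runs only on frames the cheap vision model has already flagged.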

EVA Agent is designed to flexibly configure LLM and VLM according to the user's infrastructure,
allowing the application of models tailored to specific purposes.
The VLM continuously recognizes and evaluates situations from the camera feeds, primarily running as a local model to minimize usage costs.
The LLM, activated during user interactions, can leverage the VLM and supports models
that understand multiple languages well, or Public APIs, as needed.
The VLM can also use Public APIs, so an optimal environment can be built by weighing operational costs against model suitability.
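The mix-and-match configuration described above might look like the sketch below: a local VLM for continuous monitoring and a public-API LLM for user interaction. The keys, values, and routing helper are illustrative assumptions, not EVA Agent's actual configuration format.

```python
# Hypothetical per-deployment model configuration for EVA Agent.
agent_config = {
    "vlm": {
        "provider": "local",       # runs locally to minimize usage costs
        "model": "local-vlm",
        "role": "continuous situation assessment",
    },
    "llm": {
        "provider": "public-api",  # activated during user interaction
        "model": "multilingual-chat-model",
        "role": "conversation and task configuration",
    },
}

def select_model(config, task):
    """Route a task to the appropriate model entry."""
    key = "vlm" if task == "monitor" else "llm"
    return config[key]["model"]

print(select_model(agent_config, "monitor"))  # local-vlm
```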
Optimal Use of Computing Resources
EVA separates Vision Models and Multi-modal LLMs to optimize resource usage.
Running a Multi-modal LLM alone on every camera frame results in high resource consumption and long processing times.
EVA’s separated architecture achieves up to 70% cost savings compared to standalone Multi-modal LLM operation when servicing 50 cameras, with cost efficiency increasing as the number of cameras grows.
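A back-of-envelope model shows why the separation saves money: the cheap vision model screens every frame, and only the small fraction with detections reaches the expensive Multi-modal LLM. The unit costs and detection rate below are made-up assumptions chosen to land near the ~70% figure quoted above; they are not EVA's measured numbers.

```python
def monthly_cost(cameras, frames_per_cam, vision_cost, llm_cost,
                 detection_rate=1.0, use_vision_filter=False):
    """Total cost: either every frame hits the LLM, or a vision model
    filters frames and only detections are escalated to the LLM."""
    frames = cameras * frames_per_cam
    if not use_vision_filter:
        return frames * llm_cost              # every frame hits the LLM
    escalated = frames * detection_rate       # only detections hit the LLM
    return frames * vision_cost + escalated * llm_cost

baseline = monthly_cost(50, 100_000, vision_cost=0.0001, llm_cost=0.001)
separated = monthly_cost(50, 100_000, vision_cost=0.0001, llm_cost=0.001,
                         detection_rate=0.2, use_vision_filter=True)
print(f"savings: {1 - separated / baseline:.0%}")  # savings: 70%
```

Because the vision-model cost grows much more slowly than the LLM cost, the savings widen as more cameras are added, consistent with the claim above.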
Continuous High-Quality AI Services

EVA is built on an architecture combining various products and technologies from the Mellerikat platform.
Install the EVA Service App and EVA Edge in the user environment to connect on-site cameras, and add EVA Agent as needed to provide services.
Once installed on-site, EVA continuously receives updated, high-quality AI logic and models through the Mellerikat platform.

The Mellerikat platform manages the AI model lifecycle through MLOps and LLMOps.
It continuously registers the latest validated models or problem-solving logic, deploys Vision Solutions to EVA Edge via Edge Conductor,
and deploys AI Packs to EVA Agent via Logic Deployer.
Rather than a one-time installation, EVA ensures high-quality service through ongoing performance management.
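The deployment routing described above can be captured in a small table: Edge Conductor pushes Vision Solutions to EVA Edge, and Logic Deployer pushes AI Packs to EVA Agent. The function and artifact keys are illustrative, not the platform's actual API.

```python
# Hypothetical routing table for Mellerikat deployment paths.
DEPLOY_ROUTES = {
    "vision_solution": ("Edge Conductor", "EVA Edge"),
    "ai_pack": ("Logic Deployer", "EVA Agent"),
}

def route_deployment(artifact_type: str) -> str:
    """Return which deployer delivers an artifact, and to which target."""
    deployer, target = DEPLOY_ROUTES[artifact_type]
    return f"{deployer} -> {target}"

print(route_deployment("vision_solution"))  # Edge Conductor -> EVA Edge
print(route_deployment("ai_pack"))          # Logic Deployer -> EVA Agent
```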