
Launching EVA: The World’s First Commercial VLM Service on Rebellions NPU

In close collaboration with Rebellions, the EVA team has continuously advanced the technology stack and successfully built a production-grade NPU-based runtime environment for EVA. We are now moving beyond technical validation and officially entering the phase of commercial service deployment.

1. ATOM-MAX NPU Performance Validation: A New Standard for VLM Inference (As-Is)

EVA recently evaluated operational feasibility with the Qwen3 VL 8B model on Rebellions' latest ATOM-MAX NPU environment. This was not just a benchmark for model accuracy, but a validation of key operational requirements for real industrial services.

Hardware / Model / Accuracy / F1
Rebellions ATOM / Qwen3 VL 8B / 0.7996 / 0.6733
GPU A100 / Qwen3 VL 8B FP8 / 0.7779 / 0.5979

On this evaluation, the NPU configuration matched or exceeded the GPU (A100) baseline on both reported metrics: accuracy of 0.7996 versus 0.7779, and F1 of 0.6733 versus 0.5979. In particular, in fire and smoke detection scenarios, the NPU environment demonstrated stronger processing capability, which supports its applicability to high-complexity industrial safety monitoring.
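To put the two rows side by side, the gap can be expressed as a relative difference. The sketch below uses only the four figures reported above. The helper names `f1_score` and `relative_gain` are illustrative, and the F1 function simply restates the standard definition (harmonic mean of precision and recall). The evaluation set is not described in this post, so this does not reproduce the benchmark itself.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(candidate: float, baseline: float) -> float:
    """Relative difference of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Figures reported in the comparison above.
npu = {"accuracy": 0.7996, "f1": 0.6733}
gpu = {"accuracy": 0.7779, "f1": 0.5979}

gains = {name: relative_gain(npu[name], gpu[name]) for name in npu}
for name, gain in gains.items():
    print(f"{name}: NPU is {gain:+.2f}% relative to GPU")
```

With these inputs the relative difference is about +2.79% for accuracy and about +12.61% for F1.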

2. Optimization and Stability for Commercial Operations (As-Is)

In real-world deployments, AI systems must handle far more than clean benchmark inputs. Mixed text-image requests and multiple simultaneous camera streams can easily create bottlenecks. To address this, EVA has continuously improved optimization at both the NPU compiler and system levels.

It is critical to build a resource orchestration framework that efficiently distributes CPU, memory, and NPU workloads, so multiple AI Agents can run concurrently without performance degradation. It is equally important to resolve unexpected failures and ensure stable, uninterrupted operation when text-only and image-analysis requests arrive at the same time.

Complex data processing stabilization: We resolved the malfunctions that could occur in multi-core environments when text-only and text + image requests are mixed, significantly improving operational reliability.
Resource efficiency optimization: By precisely controlling data processing policies across CPU, memory, and NPU, we achieved a high-efficiency runtime where multiple VLM instances can run simultaneously without inference speed degradation.
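As a rough illustration of the mixed-request scenario described above, the following sketch caps how many requests use the accelerator at the same time while text-only and text + image requests arrive together. It is not EVA's orchestration code: `Request`, `run_inference`, `serve`, and the `max_concurrent` limit are all hypothetical names, and the device call is simulated with a short sleep.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    request_id: int
    text: str
    image: Optional[bytes] = None  # None means a text-only request


async def run_inference(req: Request, slots: asyncio.Semaphore) -> str:
    """Hypothetical stand-in for one VLM call; the sleep simulates device time."""
    async with slots:  # wait until an accelerator slot is free
        kind = "text+image" if req.image is not None else "text-only"
        await asyncio.sleep(0.01)
        return f"{req.request_id}:{kind}"


async def serve(requests: list[Request], max_concurrent: int) -> list[str]:
    # The semaphore caps how many requests occupy the device at once.
    slots = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the submitted requests.
    return await asyncio.gather(*(run_inference(r, slots) for r in requests))


mixed = [
    Request(0, "Summarize the last alert"),
    Request(1, "Is there smoke?", image=b"\x00"),
    Request(2, "Describe the scene", image=b"\x00"),
    Request(3, "List active cameras"),
]
results = asyncio.run(serve(mixed, max_concurrent=2))
print(results)
```

A single shared limit is the simplest policy. A production system would likely also need separate budgets for CPU preprocessing and memory, which this sketch leaves out.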

3. Throughput Optimization Based on Parallel Architecture (To-Be)

EVA is also pursuing full-stack parallelization to make full use of the multi-core architecture of the Rebellions NPU and to further advance end-to-end technology integration.

Parallelization strategy: We are developing techniques to remove VLM inference bottlenecks by applying data parallelism (DP) to the Vision Encoder and tensor parallelism (TP) to the Text Decoder.
Integrated operations strategy: We are defining the optimal number of concurrent instances and core allocation ratios across multiple NPU resources. The goal is GPU-level throughput with significantly better performance-per-watt and a lower TCO (Total Cost of Ownership).
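The difference between the two parallelization schemes can be shown with a toy example. In the sketch below, data parallelism splits the image batch across workers that each hold the full encoder weights, while tensor parallelism gives every worker all inputs but only a column slice of the decoder weights. This is an illustration only: threads stand in for NPU cores, single matrix products stand in for the Vision Encoder and Text Decoder, and all function names are hypothetical. Real tensor parallelism for a decoder also involves communication steps between shards that are omitted here.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(x, w):
    """Plain-Python matrix product: x is (n, k), w is (k, m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split(items, parts):
    """Split a list into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def data_parallel(images, encoder_w, cores):
    """DP: each worker holds the full weights and processes a slice of the batch."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        parts = list(pool.map(lambda chunk: matmul(chunk, encoder_w), split(images, cores)))
    return [row for part in parts for row in part]


def tensor_parallel(hidden, decoder_w, cores):
    """TP: each worker sees all inputs but holds only a column slice of the weights."""
    columns = list(zip(*decoder_w))  # one tuple per weight column
    shards = [[list(r) for r in zip(*c)] for c in split(columns, cores)]
    with ThreadPoolExecutor(max_workers=cores) as pool:
        parts = list(pool.map(lambda w: matmul(hidden, w), shards))
    # Concatenate the per-worker output columns back into full rows.
    return [sum((p[i] for p in parts), []) for i in range(len(hidden))]


images = [[1, 2], [3, 4], [5, 6], [7, 8]]                     # toy batch of 4 inputs
encoder_w = [[1, 0, 2], [0, 1, 3]]                            # toy (2, 3) encoder weights
decoder_w = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]     # toy (3, 4) decoder weights

features = data_parallel(images, encoder_w, cores=2)
logits = tensor_parallel(features, decoder_w, cores=2)
print(logits == matmul(matmul(images, encoder_w), decoder_w))
```

Both schemes return the same result as the unsplit computation. What changes is which dimension is divided: the batch for DP, the weight matrix for TP.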

Closing: The Commercial Era of Efficient Industrial AI

The combination of EVA and Rebellions NPU is not a simple hardware replacement. It represents a full-stack transformation toward always-on AI inference in the field with predictable operations and a strong balance of high performance, high efficiency, and high stability. Based on validated NPU optimization technologies, EVA will accelerate digital transformation in industrial environments with a more cost-efficient operating model.

Launching EVA: The World’s First Commercial VLM Service on Rebellions NPU

May 26, 2026

Daniel Cho, Mellerikat Leader

EVA Release v2.7

April 20, 2026

Danbi Lee, Product Leader

EVA Release v2.6

March 11, 2026

Danbi Lee, Product Leader

In a previous post, I shared our commitment to collaborating with Rebellions NPUs to enable 24/7 “always-on AI” for industrial environments.

https://mellerikat.com/en/blog/News/rebellions

Today, I’m pleased to announce that this commitment has resulted in a tangible technical milestone.

mellerikat’s EVA (Evolved Vision Agent) has successfully completed end-to-end service validation on Rebellions’ latest server-grade NPU, ATOM™-Max, integrating Vision models, LLMs, and VLMs into a unified production pipeline.

🛠️ Beyond Running a Model — Executing the Entire Service Pipeline

Running a single model on an NPU is fundamentally different from operating an entire production service reliably. Through this validation, EVA demonstrated uninterrupted execution of the full pipeline on ATOM™-Max:

Camera Input → Object Detection (Vision) → Scenario Interpretation (VLM) → Situation Assessment (LLM) → Alert & Control Dispatch
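The stages above can be sketched as a chain of functions that each enrich a shared event record. This is a structural illustration, not EVA's implementation: every stage is a stub, and the names `Frame`, `Event`, `detect_objects`, `interpret_scenario`, `assess_situation`, and `dispatch` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    camera_id: str
    pixels: bytes


@dataclass
class Event:
    camera_id: str
    objects: list = field(default_factory=list)
    scenario: str = ""
    severity: str = "none"


def detect_objects(frame: Frame) -> Event:
    # Vision stage (stub): a real detector would return labeled boxes.
    objects = ["smoke"] if b"smoke" in frame.pixels else []
    return Event(camera_id=frame.camera_id, objects=objects)


def interpret_scenario(event: Event) -> Event:
    # VLM stage (stub): turn detections into a short scene description.
    event.scenario = "possible fire" if "smoke" in event.objects else "normal"
    return event


def assess_situation(event: Event) -> Event:
    # LLM stage (stub): decide how serious the described scene is.
    event.severity = "high" if event.scenario == "possible fire" else "none"
    return event


def dispatch(event: Event) -> str:
    # Alert and control stage (stub): emit an alert only when needed.
    if event.severity == "high":
        return f"ALERT {event.camera_id}: {event.scenario}"
    return f"OK {event.camera_id}"


def run_pipeline(frame: Frame) -> str:
    return dispatch(assess_situation(interpret_scenario(detect_objects(frame))))


print(run_pipeline(Frame("cam-1", b"...smoke...")))
print(run_pipeline(Frame("cam-2", b"...")))
```

Keeping each stage behind its own function makes it possible to swap a stub for a model call on the NPU without changing the rest of the chain.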

This result confirms that complex AI pipelines required in real-world operations — beyond isolated model benchmarks — can be fully orchestrated on NPUs.

Rebellions has also recognized this milestone as “the first real-world operation of a VLM-based AI service on a commercial NPU platform,” expressing strong expectations for future adoption.

📈 Next Phase: Quantifying TCO Innovation Through Stress Testing

Following successful end-to-end validation, EVA now enters the stress testing phase, simulating real factory environments.

We will analyze system stability, throughput, and power efficiency under extreme conditions where multiple cameras generate simultaneous input streams. The insights gained will be delivered to customers as actionable guidance, including:

Optimal NPU Configuration Standards: Cost-efficient hardware configuration guidelines based on camera count and required inference performance.
Quantified TCO Reduction vs. GPUs: Practical economic analysis that includes power consumption and operational costs, not just hardware pricing.
Minimized Deployment Risk: Standardized NPU configurations that shorten deployment time and accelerate large-scale adoption.
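The kind of sizing and cost arithmetic such guidance would rest on can be sketched as follows. The stress-test results are not yet available, so every number below is a placeholder rather than a measurement, and the cost model is deliberately simplified (hardware purchase plus electricity only). All function and parameter names are hypothetical.

```python
import math


def npus_needed(cameras: int, fps_per_camera: float, fps_per_npu: float) -> int:
    """Smallest device count whose combined throughput covers all camera streams."""
    return math.ceil(cameras * fps_per_camera / fps_per_npu)


def yearly_energy_cost(devices: int, watts_per_device: float, price_per_kwh: float) -> float:
    """Electricity cost of running `devices` continuously for one year."""
    hours = 24 * 365
    return devices * watts_per_device / 1000.0 * hours * price_per_kwh


def tco(devices: int, unit_price: float, watts: float, price_per_kwh: float, years: int) -> float:
    """Hardware purchase plus energy over the period (other costs omitted)."""
    return devices * unit_price + years * yearly_energy_cost(devices, watts, price_per_kwh)


# Placeholder inputs, NOT measured values.
count = npus_needed(cameras=32, fps_per_camera=2.0, fps_per_npu=10.0)
total = tco(count, unit_price=1000.0, watts=100.0, price_per_kwh=0.10, years=3)
print(count, round(total, 2))
```

With these placeholder inputs, 32 cameras at 2 frames per second need 7 devices at 10 frames per second each. Running the same functions with measured GPU and NPU figures would give the comparison the second item refers to.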

✨ Conclusion: Reducing GPU Dependence and Enabling Sustainable AI

The key takeaway from this validation is clear: Multimodal industrial AI has reached a level where real-world operations are possible using NPUs alone.

For organizations that have hesitated to adopt AI due to high GPU costs, the combination of EVA and Rebellions offers a practical and powerful alternative.

By breaking the high-cost barrier and enabling safer, higher-quality, and more productive operations at lower cost, EVA and Rebellions are working together to establish a new standard for sustainable industrial AI.