22 posts tagged with "Tech"

Sharing Mellerikat's technology, product architecture, and development insights.

Real-Time Streaming Rendering Optimization - Improving EVA Architecture Based on Canvas, Web Worker, OffscreenCanvas

April 6, 2026 · 7 min read

Junhyung Yoo

Product Developer

Hello, I’m Junhyung Yoo, a frontend engineer on the EVA team.

One of the core features of the EVA service is real-time streaming, which allows users to monitor video feeds from dozens of cameras simultaneously. As usage expanded from brief checks to long-term on-site monitoring, unexpected performance bottlenecks began to surface.

"When I leave the screen on for a long time, the browser gradually slows down and eventually the tab crashes."

To address this issue, we’d like to share our journey of improving the rendering architecture using Canvas, Web Worker, and OffscreenCanvas.

1. Background: Why Did Problems Appear Over Time?

In its early days, EVA adopted a very common streaming approach using the <img> tag combined with Blob (Object URL).

Previous Approach (Blob-Based Rendering)

MJPEG stream data is received from the server in Blob form.
A temporary URL is generated using URL.createObjectURL(blob).
The URL is assigned to the src of an <img> tag, allowing the browser to render the image.

While this implementation was simple, two critical issues emerged in the specialized environment of long-term monitoring.

Memory Overhead: A unique URL string is generated for every frame (around 30 frames per second). Even when calling revokeObjectURL, delays in the browser’s internal image cache and garbage collection (GC) caused memory usage to continuously increase, eventually leading to Out of Memory (OOM) errors.
Main Thread Blocking: Image decoding occurs on the main (UI) thread. When processing high-resolution video, the event loop is delayed, resulting in UI lag such as slow clicks or scrolling—commonly known as jank.

2. Network Tab Analysis: Understanding MJPEG

The first step in improving performance was analyzing the network layer. MJPEG streaming behaves differently from typical HTTP requests.

multipart/x-mixed-replace

MJPEG uses the Content-Type: multipart/x-mixed-replace; boundary=... header, which allows the server to continuously push image frames over a single HTTP connection.

Network Tab Characteristics: The request never completes and remains in a 'Pending' state. The browser keeps the connection open and continuously receives binary data.
Binary Data Structure: Each frame consists of JPEG binary data (0xFF 0xD8 ... 0xFF 0xD9) separated by a specific boundary string.

Because the previous approach converted this continuous stream of binary data into Blobs and parsed it on the main thread, browser load kept growing as more data accumulated.
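The frame layout described above can be illustrated with a small sketch. It scans a received chunk for the JPEG start (0xFF 0xD8) and end (0xFF 0xD9) markers and returns the complete frames plus the unconsumed tail. The function names are hypothetical and this is not EVA's actual parser; it assumes frames do not contain nested markers (for example an embedded thumbnail), which a production parser would handle, typically by reading the part headers.

```typescript
// Find the first index >= from where the two-byte marker occurs, or -1.
function findMarker(data: Uint8Array, first: number, second: number, from = 0): number {
  for (let i = from; i + 1 < data.length; i++) {
    if (data[i] === first && data[i + 1] === second) return i;
  }
  return -1;
}

// Split a chunk into complete JPEG frames.
// Returns the frames found and the tail that still needs more data.
function extractFrames(data: Uint8Array): { frames: Uint8Array[]; rest: Uint8Array } {
  const frames: Uint8Array[] = [];
  let cursor = 0;
  for (;;) {
    const start = findMarker(data, 0xff, 0xd8, cursor); // SOI
    if (start === -1) break;
    const end = findMarker(data, 0xff, 0xd9, start + 2); // EOI
    if (end === -1) {
      cursor = start; // incomplete frame: keep it for the next chunk
      break;
    }
    frames.push(data.subarray(start, end + 2)); // view, no copy
    cursor = end + 2;
  }
  return { frames, rest: data.subarray(cursor) };
}
```

The boundary string and part headers between frames are simply skipped here, because the search for the next frame restarts at the following start marker.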

3. First Optimization: Canvas and createImageBitmap

To move away from memory management that relied heavily on the browser’s garbage collector, we introduced the Canvas API and adopted an explicit memory management approach.

Asynchronous Bitmap Rendering

The createImageBitmap API allows images to be decoded asynchronously in the background before being rendered to the screen.

// @src/entities/devices/components/stream/MJPEGStream.tsx
// Immediately release memory after drawing the bitmap on the canvas
const bitmap = await createImageBitmap(blob);
ctx.drawImage(bitmap, 0, 0);
bitmap.close(); // Explicitly release memory

The key point of this approach is bitmap.close(). By explicitly destroying bitmap resources after use, we were able to keep memory usage stable. In addition, by eliminating reflow caused by changing the src of an <img> tag and switching to GPU-accelerated canvas drawing, overall rendering efficiency was significantly improved.

4. Second Optimization: Separating Computation with Web Workers

While rendering became lighter, the task of receiving stream data and extracting JPEG frames from binary data (boundary parsing) was still handled by the main thread. Performing real-time string searches on millions of bytes per second places a heavy burden on the CPU.

To solve this, we introduced Web Workers and applied a clear division of responsibilities:
"Data processing in the background, rendering on the main thread."

Optimizing Data Transfer (Transferable Objects)

When sending large images from a worker to the main thread, copying data results in severe performance degradation. We leveraged Transferable Objects to transfer ownership of memory without copying, enabling a zero-copy data flow.
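A minimal sketch of how a frame could be prepared for transfer, assuming the worker holds each frame as a view into a larger receive buffer. The message shape and the `toTransferable` name are hypothetical. The frame is first copied into its own ArrayBuffer, because transferring the shared receive buffer would detach it inside the worker; the hand-off across threads is then done by ownership transfer rather than by a structured-clone copy.

```typescript
// Hypothetical message shape sent from the worker to the main thread.
interface FrameMessage {
  type: "frame";
  buffer: ArrayBuffer;
}

// Copy the frame bytes into a standalone ArrayBuffer and pair it with a transfer list.
function toTransferable(frame: Uint8Array): { message: FrameMessage; transfer: ArrayBuffer[] } {
  const buffer = new ArrayBuffer(frame.byteLength);
  new Uint8Array(buffer).set(frame);
  return { message: { type: "frame", buffer }, transfer: [buffer] };
}

// Inside a dedicated worker this would be used as:
//   const { message, transfer } = toTransferable(frame);
//   self.postMessage(message, transfer); // buffer ownership moves to the receiver
```

After the `postMessage` call the buffer is detached on the sending side (its `byteLength` becomes 0), so the worker must not touch it again.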

5. Final Optimization: Introducing OffscreenCanvas

Despite these improvements, the final drawing step still occurred on the main thread. The final piece of the puzzle was OffscreenCanvas, which allows control of the canvas itself to be transferred to a worker.

Even when the main thread is blocked (left), image processing running in the worker continues to update in real time without interruption. (Source: Kakao Tech Blog)

Toward 0% Rendering Load on the Main Thread

After transferring control using transferControlToOffscreen(), rendering is performed entirely inside the worker.

// @src/entities/devices/components/stream/mjpeg.worker.ts
const bitmap = await createImageBitmap(blob);
if (ctx && canvas) {
  // Worker directly draws on the canvas (0% main-thread interference)
  ctx.drawImage(bitmap, 0, 0);

  if (config.showArea && config.area) {
    drawPolygonArea(ctx, config.area); // Area overlay logic also runs in the worker
  }
}
bitmap.close();

With this architecture, no matter how heavy the workload on the main thread becomes, streaming video continues to play smoothly and independently on a separate thread.
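The worker snippet above shows only the drawing side. The main-thread hand-off could look roughly like the sketch below. The message fields (`type`, `canvas`, `streamUrl`) and the function name are assumptions for illustration, and small structural interfaces stand in for `HTMLCanvasElement` and `Worker` so the sketch does not depend on DOM typings.

```typescript
// Minimal structural types standing in for HTMLCanvasElement and Worker.
interface TransferableCanvas {
  transferControlToOffscreen(): object;
}
interface WorkerPort {
  postMessage(message: unknown, transfer: object[]): void;
}

// Hypothetical init message; the field names are illustrative.
interface InitMessage {
  type: "init";
  canvas: object;
  streamUrl: string;
}

// Hand the canvas over to the worker. After this call the main thread can no
// longer obtain a rendering context for this canvas.
function startWorkerRendering(
  canvas: TransferableCanvas,
  worker: WorkerPort,
  streamUrl: string,
): InitMessage {
  const offscreen = canvas.transferControlToOffscreen();
  const message: InitMessage = { type: "init", canvas: offscreen, streamUrl };
  // The OffscreenCanvas must appear in the transfer list, not only in the message.
  worker.postMessage(message, [offscreen]);
  return message;
}

// Worker side (not executed here): read the canvas from the init message,
// call getContext("2d") on it, and draw frames as in the snippet above.
```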

🌐 Browser Compatibility and Automatic Fallback

While OffscreenCanvas offers powerful capabilities, browser support varies. In the EVA service, browser features are automatically detected and conditionally handled based on the user’s environment.

Browser	Supported Version	Notes
Chrome	69+	Primary support
Edge	79+	Supported from Chromium-based versions
Firefox	105+	Enabled by default starting from v105
Safari	16.4+	Latest macOS/iOS recommended
Opera	56+	-

EVA’s Adaptive Rendering Strategy:

Modern browsers: Enable OffscreenCanvas to keep main-thread load at 0%.
Older browsers (e.g., Safari 15 or below): Detect feature availability and automatically fall back to the first optimization approach—main-thread Canvas rendering.

This ensures a seamless streaming experience across all browser environments.
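The detection step can be sketched as a small pure function. The `RenderEnv` shape, the mode names, and `pickRenderMode` are hypothetical; the checks themselves (presence of `OffscreenCanvas`, `Worker`, and `transferControlToOffscreen` on the canvas prototype) are one common way to detect support, not necessarily the exact checks EVA performs.

```typescript
type RenderMode = "offscreen-worker" | "main-thread-canvas";

// The subset of the global object the detection looks at.
interface RenderEnv {
  OffscreenCanvas?: unknown;
  Worker?: unknown;
  HTMLCanvasElement?: { prototype: object };
}

// Choose the worker-based path only when every required feature is present.
function pickRenderMode(env: RenderEnv): RenderMode {
  const hasOffscreen = typeof env.OffscreenCanvas === "function";
  const hasWorker = typeof env.Worker === "function";
  const canTransfer =
    env.HTMLCanvasElement !== undefined &&
    "transferControlToOffscreen" in env.HTMLCanvasElement.prototype;
  return hasOffscreen && hasWorker && canTransfer
    ? "offscreen-worker"
    : "main-thread-canvas";
}

// In a browser the global object would be passed in, e.g. pickRenderMode(window).
```

Taking the environment as a parameter keeps the decision testable outside a browser and makes the fallback path easy to exercise.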

6. Additional Optimizations: Buffer Reuse and Faster Parsing

Performance is determined by details. We applied several additional optimizations within the worker logic.

Fixed Buffer Reuse: Instead of creating new Uint8Array instances each time, we reused fixed-size buffers and managed data using copyWithin. This significantly reduced the frequency of garbage collection (GC).
High-Speed Parsing with indexOf: Rather than using simple loops to find matching bytes in binary data, we leveraged the built-in indexOf method to skip unnecessary byte comparisons. Even this simple optimization dramatically reduced frame drops.
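Both ideas can be combined in one small class, sketched below under the assumption of a single fixed-capacity buffer per stream. The class and method names are hypothetical. `append` writes into the preallocated buffer, `find` uses the built-in `Uint8Array.prototype.indexOf` to jump to candidate positions of the first marker byte, and `take` shifts the remaining bytes to the front with `copyWithin` instead of allocating a new buffer.

```typescript
class FrameBuffer {
  private readonly buf: Uint8Array;
  private length = 0; // number of valid bytes currently held

  constructor(capacity: number) {
    this.buf = new Uint8Array(capacity); // allocated once, reused for the whole stream
  }

  get size(): number {
    return this.length;
  }

  // Append a network chunk; returns false if it does not fit.
  append(chunk: Uint8Array): boolean {
    if (this.length + chunk.length > this.buf.length) return false;
    this.buf.set(chunk, this.length);
    this.length += chunk.length;
    return true;
  }

  // Locate a two-byte marker, letting indexOf skip to each candidate first byte.
  find(first: number, second: number, from = 0): number {
    const view = this.buf.subarray(0, this.length);
    let i = view.indexOf(first, from);
    while (i !== -1 && i + 1 < view.length) {
      if (view[i + 1] === second) return i;
      i = view.indexOf(first, i + 1);
    }
    return -1;
  }

  // Copy out bytes [start, end), drop everything before end, and move the
  // remainder to the front in place. Assumes 0 <= start <= end <= size.
  take(start: number, end: number): Uint8Array {
    const out = this.buf.slice(start, end);
    this.buf.copyWithin(0, end, this.length);
    this.length -= end;
    return out;
  }
}
```

A frame is extracted by calling `find(0xff, 0xd8)`, then `find(0xff, 0xd9, start + 2)`, then `take(start, end + 2)`. The only per-frame allocation left is the copy returned by `take`, which is the piece that would be handed to `createImageBitmap`.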

7. Conclusion: A More Robust EVA Monitoring Environment

Through this optimization effort, the EVA service achieved the following results:

Memory Stability: Memory usage remains stable even during long-term operation, eliminating OOM errors.
UI Responsiveness: UI interactions such as menu navigation and button clicks remain smooth, close to a native app, even during high-resolution streaming.
Stable Frame Rates: By separating threads, consistent frame rates are maintained regardless of network latency or main-thread load.

This project reinforced a key principle of frontend performance optimization:
"How free you keep the browser’s main thread makes all the difference."

Thank you for reading!

Technology Summary

Web Workers API: Execute computations on background threads
OffscreenCanvas: Rendering independent of the main thread
createImageBitmap: Asynchronous image decoding with explicit memory management
Transferable Objects: High-speed data transfer without copy overhead


How EVA Evolved Requirement Management with Agent-Based Workflows

March 23, 2026 · 4 min read

Gyulim Gu

Tech Leader

Danbi Lee

Product Leader

Beyond Simple Intake: Agents Handle Requirements

EVA uses AI Agents as core executors not only in development, but also throughout the requirement management process.

When a problem needs to be solved or a new improvement idea emerges in the field, the process begins simply by sending an email to eva-req@mellerikat.com.

What matters here is that writing the email itself does not need to be complicated.

Requesters do not need to spend time formatting their requests. Instead, they can briefly describe the pain points they experienced and the improvements they expect.

From that point forward, automated Agent workflows take over the refinement, analysis, and prioritization process.

1) Even Simple Emails Are Structured by Agents

Requesters focus on context rather than format when sending requirement emails.

The Review Agent transforms these free-form inputs into structured requirements that can be directly reviewed by development teams.

During refinement, the Agent supplements missing context and converts ambiguous expressions into actionable language suitable for technical review.

In other words, this step transforms human-friendly input into a system-readable requirement structure.

At this stage, the requirement is no longer just a note—it becomes an organized unit that supports implementation review and priority assessment.

2) Analysis Based on EVA Manuals and Logic Documentation

Once refined, the requirement is analyzed together with EVA’s internal knowledge base.

This includes:

User Manuals
Technical and Logic Documentation

Using these documents, the Review Agent performs an initial analysis to identify affected areas, possible solution paths, and implementation priorities.

The analysis covers:

Impacted features and logic
Whether the issue can be resolved with existing functionality
Whether new implementation is required
Technical risks and expected impact
Development priority

Through this process, a requirement evolves beyond simply describing what the problem is.

It becomes an executable development unit that explains how it can be solved and why it should be addressed now.

3) The Agent Loop Continues After Release

The workflow does not stop after implementation.

Once a release is completed, the Release Agent updates the manuals and logic documentation based on the latest changes.

These updated documents then become the knowledge base for future requirement analysis, allowing Agents to make increasingly accurate decisions as the product evolves.

The key point in EVA’s requirement operations is that this process is continuous.

Requirement collection, analysis, development, release, and documentation updates are all connected as a single automated loop without fragmentation.

Through this loop, Agents continuously learn from signals such as:

Which requirements were actually implemented
How they were resolved
Which original expressions were unclear or inaccurate

The Review Agent structures inputs, analyzes them, and proposes priorities.

The Release Agent reflects implementation results back into documentation, enabling more accurate analysis for future incoming requirements.

In practice, this creates meaningful improvements such as:

Higher requirement analysis accuracy
Automatic detection and cleanup of duplicate requests
Improved documentation quality

Conclusion: Agents Are Not Features, but Operational Processes

In EVA, Agents are not just assistive features for isolated tasks.

From the moment a requirement is submitted by email, Agents refine the content, analyze it based on manuals and logic documentation, derive priorities, and update documentation again after release.

In other words, EVA’s requirement management is not a manual process where people repeatedly format requests and organize follow-up work.

It is a workflow where AI Agents manage the entire lifecycle, structure information into maintainable forms, and accumulate knowledge for future decision-making.

As a result, requirements are no longer consumed as one-time requests.

They accumulate as operational data that helps define the product’s evolution more accurately.

EVA uses Agents not simply to automate tasks, but to turn requirement management itself into a continuously learning and improving operational process.

Risk Management in Data Centers with EVA

March 17, 2026 · 4 min read

Daniel Cho

Mellerikat Leader

1. Introduction: Data Center Fires — A “Billion-Dollar” Threat to Business Continuity 🥵

Recent data center fires have gone beyond physical damage, leading to massive financial liabilities due to service disruptions.

👉 SK C&C Pangyo Data Center Fire (2022): A fire originating in a lithium-ion UPS battery room caused major service outages, including Kakao, with estimated damages reaching trillions of KRW.

👉 OVHcloud Fire in France (2021): Triggered by UPS power equipment, this incident resulted in approximately €105M in total damages, with €58M covered by insurance—significantly increasing insurer exposure.

These large-scale incidents highlight that modern AI-era data centers carry risks that can no longer be controlled with traditional physical security measures alone.

😲 2. Data Center Insurance Structure and the Surge in AI GPU Center Premiums

Data center insurance is typically structured as a bundled package including:

Property (buildings and servers)
Business Interruption
Cyber Liability
General Liability

👉 Premium Rates Based on Asset Value: Property insurance premiums usually range from 0.2% to 0.5% of total asset value. However, AI data centers are now facing rapidly increasing premiums due to elevated risk classifications.

👉 Risks of High-Density Servers: AI GPU clusters have significantly higher power density compared to traditional servers, directly increasing fire risk.

Server Type	Power Consumption per Rack	Key Risk Factors
Traditional Servers	5 – 10 kW	Standard cooling and power management
High-Performance Computing	15 – 25 kW	Increased thermal management requirements
GPU Clusters	40 – 120 kW	Cable overheating, PDU overload, electrical arcs

😎 3. Key Underwriting Checklist from Insurers

Global insurers and brokers (Aon, Marsh, FM Global, etc.) evaluate risk based not on facility size, but on technical measures that reduce incident probability. EVA provides strong advantages across these evaluation criteria.

👉 Power Infrastructure Risks
Current Status: 40–50% of data center fires originate from electrical systems. Lithium-ion battery thermal runaway is a major driver of rising premiums.
🌈 EVA’s Role: Detects minute temperature variations and thermal anomalies at the battery cell level in real time, significantly reducing lithium battery risks.

👉 Advanced Fire Detection and Suppression Systems
Current Status: Early smoke detection and gas-based suppression systems are top underwriting priorities.
🌈 EVA’s Role: AI-powered visual intelligence enables instant detection of flames and smoke, reducing detection time to seconds.

👉 Operational Risk (Human Error)
Current Status: Insurers heavily assess 24/7 monitoring systems and thermal inspection practices.
🌈 EVA’s Role: Transforms manual, human-dependent inspections into automated AI-driven monitoring.

❣️ 4. Economic Impact of EVA Adoption: A Risk Engineering Approach

The insurance market is shifting from post-incident compensation to proactive risk engineering—reducing the likelihood of future incidents. AI safety solutions like EVA create a win-win structure for both insurers and policyholders.

👉 Direct Insurance Premium Reduction (15%–30%): Well-implemented risk management systems can lead to premium reductions of 15% to 30%. For data centers worth hundreds of billions of KRW, this translates into direct financial benefits that exceed the cost of deploying the solution.
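To make the scale concrete, the figures quoted in this post (property premiums of 0.2% to 0.5% of asset value, reductions of 15% to 30%) can be combined in a few lines. The asset value of 300 billion KRW is a hypothetical example, not a figure from any real facility, and the result is a rough range rather than a quote.

```typescript
interface Range {
  low: number;
  high: number;
}

// Annual property premium range, using the 0.2%–0.5% rates cited in the post.
function premiumRange(assetValue: number): Range {
  return { low: assetValue * 0.002, high: assetValue * 0.005 };
}

// Savings range when a 15%–30% reduction is applied to that premium range.
function savingsRange(premium: Range): Range {
  return { low: premium.low * 0.15, high: premium.high * 0.3 };
}

const assetValue = 300e9; // hypothetical: 300 billion KRW
const premium = premiumRange(assetValue);
const savings = savingsRange(premium);
console.log(`premium: ${premium.low / 1e9} to ${premium.high / 1e9} billion KRW per year`);
console.log(`savings: ${savings.low / 1e9} to ${savings.high / 1e9} billion KRW per year`);
```

Under these assumptions the premium falls between roughly 0.6 and 1.5 billion KRW per year, and the reduction between roughly 0.09 and 0.45 billion KRW per year.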

👉 Five Core Risk Control Points: EVA addresses the key underwriting factors that directly impact insurance premiums:

Continuous UPS battery monitoring: Early detection of thermal runaway
Handling high power density: Focused monitoring of cable and PDU overheating in AI GPU environments
Intelligent fire detection: Ultra-fast alerts based on visual data
24/7 uninterrupted monitoring: Detection of unsafe behavior and human error
Faster incident response: Reduced time from alert to action

🥰 5. AI Safety Systems Define Data Center Business Continuity

From a data center operator’s perspective, EVA is not just a “security CCTV system.”

Financial Value: Reduces insurance premiums and prevents large-scale business interruption losses
Operational Value: Establishes standards for managing power and fire risks in high-density AI environments
Reputational Value: Strengthens brand trust as a “safe data center” validated by strict insurance assessments

Through a virtuous cycle of AI Safety System → Risk Reduction → Insurance Discount, data centers can achieve both maximum safety and economic efficiency.

EVA x Rebellions: Journey of EVA on NPU

March 16, 2026 · 4 min read

Gyulim Gu

Tech Leader

The integration and optimization journey between Mellerikat EVA and Rebellions NPU clearly demonstrates the future direction of next-generation AI infrastructure. Through this project, we verified that NPU-based architectures can address the high cost and power consumption challenges of traditional GPU-centric infrastructures. In particular, in Physical AI environments—where real-time perception and reasoning are critical—we confirmed the potential to achieve both significant TCO (Total Cost of Ownership) reduction and high performance simultaneously.

Today, we would like to share the porting process of moving GPU-based models to NPUs, along with the technical challenges behind it, which many people have been curious about.

1. The NPU Porting Process for GPU Models

Since NPUs are designed to accelerate specific types of computations, newly released models cannot be executed immediately without adaptation. To fully utilize the hardware’s capabilities, several essential steps are required.

Model Conversion

The original models developed in PyTorch or TensorFlow must be converted into an executable format that the NPU can understand. Using the ATOM Compiler from Rebellions, the model’s computational graph is analyzed and converted into the .rbln executable format optimized for the NPU architecture.

NPU-Optimized Compilation

The model is compiled into a hardware-optimized executable using the compiler in the Rebellions SDK (RBLN SDK).
- Graph Optimization: Removes redundant operations and reorganizes the data flow.
- Operator Fusion: Combines multiple small operations into a single large kernel to reduce memory access and execution overhead.
- Data Layout Optimization: Adjusts tensor layouts to match the NPU memory architecture, improving data access efficiency.

Quantization

Computational precision is adjusted to match the NPU architecture, improving both performance and memory efficiency. In the case of EVA, we optimized the model to ensure stable performance under an FP16-based inference environment.

vLLM Integration and Validation

The optimized model is deployed within the vLLM-RBLN serving framework. Key metrics such as TTFT (Time To First Token) and throughput are measured and validated against GPU-based environments.

2. EVA Application Optimization and Technical Challenges

After porting the foundation model, the next step is deploying the actual service layer—the EVA Application. During this stage, we have been implementing the following optimization roadmap.

EVA Vision Optimization (1:1 Mapping & Batching)

We mapped NPU cores and Vision Workers in a 1:1 configuration, eliminating context-switching overhead. In addition, by applying continuous batching techniques, we are building a foundation capable of processing data from hundreds of cameras in real time without latency.

EVA Agent Optimization (Reducing VLM Load)

The input resolution of the Vision-Language Model (VLM) was standardized to 1280×720, and a two-stage reasoning architecture was applied to minimize unnecessary VLM calls. This immediately reduces the computational load on the Vision Encoder, which is one of the most expensive components in the pipeline.

System Memory Management and KV Cache Optimization

In collaboration with Rebellions, we analyzed the memory usage patterns of vLLM-RBLN instances and improved resource utilization using a page-based memory management structure. This optimization allows the system to process a larger volume of visual data reliably within the same hardware environment.

Parallel Processing of the VLM Vision Encoder

We are also improving the parallel execution architecture of the Vision Encoder, which accounts for a large portion of the computation in VLM inference. By optimizing how Vision Encoder operations are distributed across multiple NPU cores, we aim to significantly improve VLM serving throughput.

3. Conclusion: Evolving from PoC to a Production-Ready Solution

We are continuously addressing technical challenges discovered during stress testing while refining optimizations that maximize hardware utilization. From parallel processing of the Vision Encoder through close collaboration with Rebellions to the development of an intelligent scheduler within the EVA platform, every step is part of transforming “EVA on NPU” from a simple proof-of-concept (PoC) into a production-ready solution.

Ultimately, the success of AI services depends on meeting three essential conditions: economic efficiency, scalability, and service quality. EVA will continue to actively adopt the latest NPU technologies and present a global standard for Physical AI platforms—delivering the most competitive TCO and outstanding performance for our customers.

Multi-Frame Based VLM Detection: Moving Beyond Single Image Limits to Temporal Context

March 11, 2026 · 7 min read

Gyulim Gu

Tech Leader

Seongwoo Kong

AI Specialist

Taehoon Park

AI Specialist

Jisu Kang

AI Specialist

Is a Single Frame Enough?

Recently, Vision-Language Models (VLMs) have demonstrated exceptional performance in understanding individual images. Large-scale multimodal models have theoretically expanded the possibilities of multi-frame reasoning by introducing architectures that process multiple images alongside text prompts.

However, real-world industrial detection scenarios are far more complex than controlled research environments. Problems that seem straightforward with a single frame often lead to various false positives and edge cases in production.

Consider a scene where a person is lying on the floor. Looking at that single moment, it is easy to categorize it as a "collapse." But what if the previous frame showed them stretching, or simply changing posture while working?

In nighttime environments, lens flares, light reflections, or glare can mimic the color patterns of fire, leading to false fire detections when based on a single image. When even humans find it difficult to be certain from a single snapshot, providing a model with only one frame inevitably creates structural limitations.

These cases all share a common problem: a "lack of context."

Time is the Most Powerful Context

Many detection scenarios inherently rely on a temporal flow.

For instance, "loitering" can only be defined by observing a pattern of staying in the same space for a certain period. Similarly, "long-term abandonment" requires the condition that an object remains unchanged for a specific duration after being placed.

Attempting to solve these problems with a single frame is structurally difficult because the focus must be on "change," not just "state."

We have categorized this into three levels of context:

Single Image-based Judgment
Short-term Multi-image Contextual Judgment (Momentary context)
Temporal Judgment (Involving long-term flow)

In actual operating environments, these three levels coexist. For some scenarios a single frame is sufficient, some require consecutive frames at intervals of a few seconds, and others require tracking a flow over tens of seconds.

EVA's Multi Frame Manager

In EVA, user-defined scenarios are not treated as simple text conditions. The system analyzes the "level of context" required by each scenario and determines an appropriate frame collection strategy.

For example, "fainting detection" requires multiple images covering a few seconds before and after the event, rather than a single frame. In contrast, "long-term abandonment" requires continuous frame collection over a specific duration based on a sliding window.

The module responsible for this process is the Multi Frame Manager. This module dynamically determines the following based on the scenario characteristics:

Number of frames required
Collection intervals
Retention time
Event trigger expansion

Collected images are not simply listed. They are delivered to the VLM in a clearly sorted chronological order, accompanied by system prompts that guide the model to compare changes between frames.
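A minimal sketch of what such a per-scenario collection plan and sliding window could look like. All names and numeric values here are hypothetical and chosen for illustration only; the post does not disclose the Multi Frame Manager's real parameters or interfaces.

```typescript
type ContextLevel = "single" | "short-term" | "temporal";

interface FramePlan {
  frameCount: number;  // maximum number of frames sent to the VLM
  intervalMs: number;  // minimum spacing between collected frames
  retentionMs: number; // how long a frame stays eligible
}

// Hypothetical defaults per context level; the values are illustrative only.
const PLANS: Record<ContextLevel, FramePlan> = {
  single: { frameCount: 1, intervalMs: 0, retentionMs: 0 },
  "short-term": { frameCount: 4, intervalMs: 1000, retentionMs: 5000 },
  temporal: { frameCount: 8, intervalMs: 5000, retentionMs: 60000 },
};

interface Frame {
  timestampMs: number;
  imageId: string;
}

class SlidingFrameWindow {
  private frames: Frame[] = [];
  private readonly plan: FramePlan;

  constructor(plan: FramePlan) {
    this.plan = plan;
  }

  // Accept a frame if enough time has passed, then drop expired and excess frames.
  push(frame: Frame): void {
    const last = this.frames[this.frames.length - 1];
    if (last && frame.timestampMs - last.timestampMs < this.plan.intervalMs) return;
    this.frames.push(frame);
    const cutoff = frame.timestampMs - this.plan.retentionMs;
    this.frames = this.frames
      .filter((f) => f.timestampMs >= cutoff)
      .slice(-this.plan.frameCount);
  }

  // Frames in chronological order, oldest first.
  snapshot(): Frame[] {
    return this.frames.slice();
  }
}
```

Because frames are only ever appended in arrival order, `snapshot()` already returns them sorted from oldest to newest, which is the order the VLM prompt expects.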

Multi-Image Based VLM Inference Strategy

When multi-frame input is received, the VLM does more than just return independent detection results. In EVA, we designed the inference structure to interpret multiple images as a continuous temporal context rather than an independent set of images.

To achieve this, frames are delivered to the model using the following strategies:

Chronological Frame Alignment: Constructs time-series data from past to present to understand causality.
Comparative System Prompts: Uses instructions like "Identify changes compared to the previous frame" to analyze inter-frame correlations.
Temporal Reasoning: Derives logical conclusions based on state changes over time rather than fragmented snapshot judgments.
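The first two strategies can be sketched as a request builder. This assumes an OpenAI-compatible chat format with `image_url` content parts, which vLLM's server accepts; whether EVA uses this exact format is an assumption, and the function name and prompt wording beyond the quoted instruction are hypothetical.

```typescript
interface CapturedFrame {
  timestampMs: number;
  imageUrl: string;
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "system" | "user";
  content: string | ContentPart[];
}

// Build messages with frames sorted oldest to newest and labeled with time offsets.
function buildTemporalMessages(frames: CapturedFrame[], scenario: string): ChatMessage[] {
  const ordered = frames.slice().sort((a, b) => a.timestampMs - b.timestampMs);
  const t0 = ordered[0]?.timestampMs ?? 0;
  const parts: ContentPart[] = [];
  ordered.forEach((frame, i) => {
    const offset = ((frame.timestampMs - t0) / 1000).toFixed(1);
    parts.push({ type: "text", text: `Frame ${i + 1} (t=+${offset}s)` });
    parts.push({ type: "image_url", image_url: { url: frame.imageUrl } });
  });
  parts.push({ type: "text", text: `Scenario: ${scenario}` });
  return [
    {
      role: "system",
      content:
        "The images are consecutive frames from one camera, ordered from past to present. " +
        "Identify changes compared to the previous frame before deciding.",
    },
    { role: "user", content: parts },
  ];
}
```

Labeling each image with its offset gives the model an explicit time axis, so it does not have to infer the order from image position alone.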

Case Study: The Power of Temporal Context in Reducing False Positives

The following case demonstrates how fragmented information from a single frame is accurately corrected through the "context" of multiple frames.

Single Image: A person is stationary in a low, prone position. A VLM looking only at this moment is highly likely to misinterpret the situation as "Collapse."
Multi-Image: In the subsequent frames, subtle movements are captured—the person moves their arms to operate a phone and tilts their head to look at the screen.
Result: Through Temporal Reasoning, EVA correctly concludes this is "Sitting and using a phone detected".

The core idea is to guide the model to understand the situation by comparing differences between frames, rather than judging each frame individually.

For high-risk detections like fainting, the model undergoes a process of Progressive Situation Refinement:

Initial State Identification: Identifying the target object and initial visual features (e.g., prone posture).
Dynamic Change Detection: Tracking meaningful changes in body angles or voluntary movements compared to previous frames.
Consistency Verification: Determining if the posture is a forced freeze due to impact or involves intentional actions.
Final Context Determination: Distinguishing between visual noise with similar patterns and actual events.

This Temporal Reasoning structure significantly reduces false positives in edge cases that plague single-image systems, providing much more stable results in real-world operations.

Category	Single Image Accuracy	Single Image Precision	Single Image Recall	Multi Image Accuracy	Multi Image Precision	Multi Image Recall
No PPE	0.66	0.87	0.68	0.76	0.87	0.82
No Mask (Working)	0.94	0.69	0.54	0.93	0.76	0.52
Loitering	0.49	0.92	0.33	0.63	0.85	0.64
Fainting	0.87	1.0	0.36	0.96	1.0	0.82
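Since precision and recall move in different directions for some rows, it can help to fold them into a single F1 score, F1 = 2PR / (P + R). The sketch below derives it from the table values; F1 is not reported in the original results and is computed here only as a reading aid.

```typescript
// Harmonic mean of precision and recall.
function f1(precision: number, recall: number): number {
  const sum = precision + recall;
  return sum === 0 ? 0 : (2 * precision * recall) / sum;
}

// Precision (p) and recall (r) copied from the table above.
const rows = [
  { category: "No PPE", single: { p: 0.87, r: 0.68 }, multi: { p: 0.87, r: 0.82 } },
  { category: "No Mask (Working)", single: { p: 0.69, r: 0.54 }, multi: { p: 0.76, r: 0.52 } },
  { category: "Loitering", single: { p: 0.92, r: 0.33 }, multi: { p: 0.85, r: 0.64 } },
  { category: "Fainting", single: { p: 1.0, r: 0.36 }, multi: { p: 1.0, r: 0.82 } },
];

for (const row of rows) {
  const before = f1(row.single.p, row.single.r).toFixed(2);
  const after = f1(row.multi.p, row.multi.r).toFixed(2);
  console.log(`${row.category}: F1 ${before} -> ${after}`);
}
```

By this measure Fainting improves from about 0.53 to 0.90 and Loitering from about 0.49 to 0.73, with most of the gain in these rows coming from higher recall.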

Ultimately, EVA’s multi-frame inference structure is not just about increasing the number of input images—it is an approach that directly integrates temporal change into the model's reasoning process.

The Cost of Multi-Frame: Computational Overload

Improvements in accuracy come with a price.

While multi-frame reasoning allows for more visual information, it also leads to increased computational costs. In multimodal models, image inputs are generally converted into embeddings via a Vision Encoder before being passed to the LLM, a process that is relatively resource-intensive.

Specifically, multi-frame analysis often encounters the following:

Identical or very similar images repeating in a sequence.
Multiple requests referencing the same camera frame.
Multiple queries performed on the same set of images.

In these cases, if the Vision Encoder processes the same image repeatedly, it creates unnecessary overhead.

In EVA, we developed a structure that maximizes the Encoder Cache feature provided by vLLM to solve this. vLLM offers an Encoder Cache Manager that allows the system to cache and reuse Vision Encoder results during multimodal processing.

https://docs.vllm.ai/en/latest/api/vllm/v1/core/encoder_cache_manager/

By leveraging this, we can reuse previously generated encoder embeddings for identical image inputs, eliminating the need to repeat Vision Encoder operations. EVA applies a request management structure at the Agent Layer to effectively utilize this caching.

The Agent coordinates requests in the following ways:

Organizing requests so that identical image inputs can be reused.
Managing requests based on image units to enable cache hits.
Optimizing request flow to prevent redundant encoding.

This allows us to minimize Vision Encoder operations and utilize GPU resources more efficiently, even in a multi-frame analysis environment.
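The coordination idea can be sketched as grouping pending requests by image content, so that requests sharing an image are dispatched back to back. Everything here is hypothetical: the request shape, the function names, and the FNV-1a hash used as a cheap stand-in content key. How vLLM keys its encoder cache internally is not covered in this post; the sketch only assumes that identical image inputs can reuse the same cached embedding, as described above.

```typescript
interface VlmRequest {
  id: string;
  imageBytes: Uint8Array;
  query: string;
}

// FNV-1a 32-bit hash as a stand-in content key (a real system would likely use a stronger hash).
function contentKey(bytes: Uint8Array): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < bytes.length; i++) {
    h ^= bytes[i];
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Group requests that reference byte-identical images.
function groupByImage(requests: VlmRequest[]): Map<string, VlmRequest[]> {
  const groups = new Map<string, VlmRequest[]>();
  for (const req of requests) {
    const key = contentKey(req.imageBytes);
    const group = groups.get(key);
    if (group) group.push(req);
    else groups.set(key, [req]);
  }
  return groups;
}

// Flatten the groups so requests sharing an image are sent consecutively.
function orderForCacheReuse(requests: VlmRequest[]): VlmRequest[] {
  const ordered: VlmRequest[] = [];
  groupByImage(requests).forEach((group) => {
    group.forEach((req) => ordered.push(req));
  });
  return ordered;
}
```

Sending same-image requests consecutively raises the chance that the encoder output is still cached when the second request arrives, instead of having been evicted by unrelated images in between.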

Conclusion

Multi-frame based VLM inference is an approach that significantly improves situational understanding and detection accuracy compared to single-image analysis.

However, as the number of frames increases, the computational load on the Vision Encoder grows significantly. Therefore, it is crucial to design a system that balances performance gains with computational efficiency and infrastructure costs.

EVA addresses this by actively utilizing vLLM's Encoder Cache and managing requests through the Agent Layer. Through this architecture, we maintain high inference performance while reducing unnecessary computations, continuously improving GPU efficiency and infrastructure operating costs.

This feature is available starting from EVA v2.6.0.

The Future of AI Services Shown by OpenClaw

February 24, 2026 · 4 min read

Daniel Cho

Mellerikat Leader

Recently, OpenClaw has been generating significant buzz in the AI community. Running in local environments such as a Mac mini, this service interprets a user’s screen in real time and directly controls various applications — signaling an important shift in how we evaluate AI.

The competitive edge in AI is no longer defined by “how large or powerful a foundation model is,” but rather by “how effectively that model can perform complex tasks in real-world applications.”

A Paradigm Shift: From Performance to Execution

Old Paradigm: “How intelligent is it?” Until now, the AI industry has focused heavily on the scale and performance of foundation models. Large language models such as GPT-4, Claude, and Gemini competed on parameters, dataset size, and benchmark scores. The central question was: “How smart is the AI?”
New Paradigm: “How much work can it actually perform?” OpenClaw introduces a fundamentally different question: “How effectively can the model perform complex tasks in real-world environments?” AI value is no longer measured by raw intelligence alone, but by its ability to execute within real computing environments.

Teaching VLMs to Multitask: Enhancing Situation Awareness through Scenario Decomposition

February 5, 2026 · 8 min read

Hyunchan Moon

AI Specialist

At the core of EVA lies the ability to truly understand critical situations that occur simultaneously within a single scene—such as fires, people falling, or traffic accidents—without missing any of them. However, no matter how capable a Vision-Language Model (VLM) is, asking it to reason about too many things at once leads to a sharp degradation in cognitive performance.[2,3]

In this post, inspired by the recent text-to-video retrieval research Q₂E (Query-to-Event Decomposition)[1], we introduce Scenario Decomposition, a technique that enables VLMs to deeply understand complex, multi-scenario situations within a single frame.

Physical AI Implemented with EVA

January 22, 2026 · 3 min read

Gyulim Gu

Tech Leader

When Can AI Intervene in the Real World?

Accidents in industrial environments happen without warning. Moments such as a worker collapsing, an arm getting caught in machinery, or a fire breaking out usually occur within seconds.

Physical AI should not stop at recognizing these moments. It must be capable of translating perception into physical action on site.

In this post, we walk through a LEGO-based simulation to show how EVA detects incidents and how its decisions are connected to real equipment actions as a single, continuous flow.

Simplifying Industrial Scenarios with LEGO

Instead of replicating complex industrial environments in full detail, we simplified accident scenarios using LEGO.

We designed independent scenarios for:

a worker collapsing,
an arm being caught in equipment,
and a fire breaking out.

Arm caught in equipment – conveyor belt stops and warning light activates

EVA: A New Standard for Safety Management Beyond Physical Sensors

January 15, 2026 · 3 min read

Daniel Cho

Mellerikat Leader

EVA Accelerates the Golden Time for Fire Response

Securing the “golden time” during a fire incident in manufacturing facilities is one of the most critical factors in protecting both human life and physical assets. Traditional fire detection systems have long relied on physical sensors, but camera-based intelligent detection technologies are now rapidly taking over this role.

In this post, we analyze EVA’s smoke detection performance through a real-world validation test conducted at an LG Electronics facility and examine the technical significance of the results.

Field Validation Test: 8 Seconds vs. 38 Seconds

A smoke detection test simulating a real fire scenario was conducted at an LG Electronics production site. The core objective of this test was to compare the detection speed of the existing smoke detectors with that of the newly introduced EVA system.

The results were highly encouraging. Measured from the moment smoke began to rise, the average response times of the two systems were as follows:

EVA: Smoke detected approximately 8 seconds after occurrence

Conventional smoke detector: Smoke detected approximately 38 seconds after occurrence

As a result, EVA identified and propagated the hazardous situation more than four times faster than conventional smoke detectors. This 30-second difference represents a decisive window that can determine the success or failure of initial fire suppression.

The Synergy of EVA and Workflow Builder

January 3, 2026 · 6 min read

Gyulim Gu

Tech Leader

Beyond Observation: AI That Takes Action

The core challenge for AI today is no longer just analyzing data or describing scenes. A truly intelligent system must be able to drive meaningful actions in the physical world or corporate operational systems based on its analysis.

EVA is now moving beyond the role of the 'eyes' and 'brain' that perceive visual information and judge situations, and is joining with the 'hands': the Workflow Builder. This completes an end-to-end automation structure that moves past passive, notification-centric monitoring to independently judging site conditions and resolving problems.
