
9 posts tagged with "Tech"

We share Mellerikat's technology, product architecture, and development insights.


Turning Simple User Requests into AI-Understandable Instructions

· 11 min read
Seongwoo Kong
AI Specialist
Jisu Kang
AI Specialist
Keewon Jeong
Solution Architect

Expanding User Queries So AI Can Clearly Understand Intent

EVA is a system that operates based on user-issued commands. For EVA to make stable and accurate decisions, it is crucial that user requests are delivered in a form that AI can clearly understand.

However, even if the natural language expressions we use daily seem simple and clear to humans, they can be ambiguous from an AI model’s perspective, or they may require excessive implicit reasoning. This gap is exactly what often leads to AI system malfunctions or inaccurate decisions.

To fundamentally address this, EVA uses a Few-Shot prompting technique to automatically expand simple user requests into a structured query representation.

In this post, we focus on:

  • Why simple natural-language requests are difficult for AI
  • How query expansion can improve AI’s understanding
  • How much performance improved in actual field deployments

and share practical methods, along with their impact, for helping AI understand user intent more clearly.
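The expansion idea above can be sketched as a few-shot prompt: a handful of worked examples teach the model to rewrite a terse request into a structured query. The example requests and the JSON fields below are hypothetical illustrations, not EVA's actual schema.

```python
# Minimal sketch of few-shot query expansion. Each example pairs a terse
# user request with its structured expansion; the new request is appended
# so the model completes the final "Query:" line.
FEW_SHOT_EXAMPLES = [
    ("Anyone in the lab?",
     '{"target": "person", "location": "laboratory", "action": "presence_check"}'),
    ("Alert me if the door stays open",
     '{"target": "door", "condition": "open", "action": "alert_on_duration"}'),
]

def build_expansion_prompt(user_request: str) -> str:
    """Assemble a few-shot prompt asking the model to expand the request."""
    lines = ["Expand the user request into a structured query.", ""]
    for request, expansion in FEW_SHOT_EXAMPLES:
        lines.append(f"Request: {request}")
        lines.append(f"Query: {expansion}")
        lines.append("")
    lines.append(f"Request: {user_request}")
    lines.append("Query:")
    return "\n".join(lines)

prompt = build_expansion_prompt("Is there anyone in the meeting room?")
```

The returned string would be sent to the LLM, whose completion is the expanded, structured query.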

Complete Mastery of vLLM: Optimization for EVA

· 17 min read
Taehoon Park
AI Specialist

In this article, we explore how we optimized LLM serving in EVA. We walk through our adoption of vLLM to serve LLMs tailored for EVA, along with explanations of the core serving techniques.

1. Why Efficient GPU Resource Utilization is Necessary

Most people initially interact with cloud-based LLMs such as GPT / Gemini / Claude. They deliver the best performance available without worrying about model operations — you simply need a URL and an API key. But API usage incurs continuous cost and data must be transmitted externally, introducing security risks for personal or internal corporate data. When usage scales up, a natural question arises:

“Wouldn’t it be better to just deploy the model on our own servers…?”

There are many local LLMs available such as Alibaba’s Qwen and Meta’s LLaMA. As the open-source landscape expands, newer high-performance models are being released at a rapid pace, and the choices are diverse. However, applying them to real services introduces several challenges.

Running an LLM as-is results in very slow inference, because modern LLMs generate autoregressively: one token at a time, with each step attending over the entire preceding context. Optimizations such as KV Cache and PagedAttention dramatically reduce this cost, and several open-source serving engines implement them; EVA uses vLLM. Each engine differs in model support and ease of use, so let's explore why EVA chose vLLM.
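The benefit of a KV cache can be shown with a toy accounting exercise: without a cache, every decoding step recomputes keys and values for the whole prefix, so total work grows quadratically; with a cache, each step only computes them for the newest token. The numbers below are stand-ins for real attention tensors, not an actual implementation.

```python
def fake_kv(token):
    """Stand-in for the per-token key/value projection."""
    return (hash(token) % 97, hash(token) % 89)

def decode_without_cache(tokens):
    work = 0
    for step in range(1, len(tokens) + 1):
        # Recompute K/V for the entire prefix at every decoding step.
        _ = [fake_kv(t) for t in tokens[:step]]
        work += step
    return work

def decode_with_cache(tokens):
    cache, work = [], 0
    for t in tokens:
        cache.append(fake_kv(t))  # only the new token's K/V is computed
        work += 1
    return work

tokens = list("hello world, this is EVA")
assert decode_with_cache(tokens) < decode_without_cache(tokens)
```

For n generated tokens the cached version does n units of K/V work versus roughly n²/2 without; PagedAttention then manages that cache in fixed-size blocks so GPU memory is not wasted on padding.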

Eliminating False Positives in Human Detection Using Pose Estimation

· 6 min read
Euisuk Chung
AI Specialist

Introduction

“There’s a person over there!” Our AI vision system confidently reported. Yet all we saw on the screen was an empty chair with a coat draped over it.

Human detection technology has advanced rapidly, but the real world is far more chaotic than polished demo videos. In the environments we focus on, the problem becomes even more noticeable:

  • 🏢 Office: empty chairs with jackets
  • 🔬 Laboratory: lab coats and protective clothing hanging on chairs
  • 💼 Work areas: vacant meeting rooms and lounges

Such false positives aren’t just “slightly wrong” results. They directly degrade system trust and efficiency.

For example:

  • Energy-saving systems may misjudge how many people are present and waste power.
  • Security systems may focus on “phantom personnel” and waste monitoring resources.

Example: an empty chair mistakenly detected as a seated human
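The core idea named in the title can be sketched simply: accept a "person" detection only if a pose estimator finds enough body keypoints with sufficient confidence. The keypoint names, thresholds, and scores below are illustrative assumptions, not the post's actual values.

```python
# Pose-based filtering sketch: a jacket on a chair may trigger the person
# detector, but it yields almost no confident body keypoints.
CORE_KEYPOINTS = {"nose", "left_shoulder", "right_shoulder", "left_hip", "right_hip"}

def is_real_person(keypoints: dict, conf_threshold: float = 0.5,
                   min_visible: int = 3) -> bool:
    """keypoints maps keypoint name -> confidence score in [0, 1]."""
    visible = sum(
        1 for name, conf in keypoints.items()
        if name in CORE_KEYPOINTS and conf >= conf_threshold
    )
    return visible >= min_visible

jacket_on_chair = {"nose": 0.1, "left_shoulder": 0.4, "right_shoulder": 0.3}
seated_person = {"nose": 0.9, "left_shoulder": 0.8, "right_shoulder": 0.85,
                 "left_hip": 0.7, "right_hip": 0.6}
```

In this sketch the jacket produces zero confident keypoints and is rejected, while a seated person easily clears the threshold.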

Advancing the Lightweight Model-Based Scenario Detection Agent

· 9 min read
Seongwoo Kong
AI Specialist
Jisu Kang
AI Specialist
Keewon Jeong
Solution Architect

A production architecture that improved alert quality by separating Detection and Exception

CCTV safety monitoring: why 2-Step Inference was the answer.
Never miss an incident, but cut the needless alarms.

Considering resource efficiency and real-time responsiveness, we switched from Qwen2.5-32B to a lighter yet still competitive model architecture and introduced a 2-Step Inference approach. During the EVA beta test, the Qwen3 8B model showed stronger overall reasoning than the Qwen2.5 32B model, but it struggled to generate multiple answers consistently in a single pass, and it also showed limitations in the task of producing alert responses in the user's language.

For example, even when an alert was actually True, the model's generated alert response would sometimes describe the situation as if the alert were False. We designed 2-Step Inference as a way to make more effective use of an 8B model with these long-context inference limitations.

This post therefore focuses on the 2-Step Inference architecture: the limitations of the original 1-Step Inference, and how the precision / recall trade-off changed once Detection and Exception judgments were separated via 2-Step Inference.
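The separation described above can be sketched as two independent judgments: step 1 decides whether the alert scenario is detected at all, and step 2 runs only on detections to check for exceptions that should suppress the alert. The model calls below are stubbed with string checks; the real system would issue two LLM inferences.

```python
def detect_step(description: str) -> bool:
    """Step 1 (stub): does the scene match the alert scenario?"""
    return "person on the floor" in description

def exception_step(description: str) -> bool:
    """Step 2 (stub): is there an exception, e.g. a scheduled drill?"""
    return "drill" in description

def should_alert(description: str) -> bool:
    if not detect_step(description):         # reject non-matches first
        return False
    return not exception_step(description)   # separate exception judgment

assert should_alert("person on the floor in the hallway")
assert not should_alert("person on the floor during a safety drill")
assert not should_alert("empty hallway")
```

Because each step asks the model one focused question instead of several at once, a smaller model can answer each consistently, which is exactly the trade-off the post analyzes.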

PoV on Physical AI

· 6 min read
Daniel Cho
Mellerikat Leader

Beyond Robot AI...

The concept of Physical AI is often equated with robotic technology. Many envision a future where robots freely navigate spaces and perform tasks on behalf of humans. However, the reality is that it will take considerable time until technology reaches that level. Despite this, much of the current discussion around Physical AI remains robot-centric — which is limiting.

Physical AI does not need to exist solely in the form of a robot. There are already a wide variety of interfaces in our physical world that can interact with AI.

Attention-Based Image-Guided Detection for Domain-Specific Object Recognition

· 5 min read
Hyunchan Moon
AI Specialist

Introduction: Practical Implementation of Image-Guided Detection

In the field of Open-Vocabulary Detection, OWL-v2 (Open-World Localization Vision Transformer v2) is a powerful model that can use both text and images as prompts. Particularly, Image-Guided Detection using "Visual Prompting" is a powerful feature that allows users to find desired objects with just example images.

This post shares three core optimization techniques we applied while adapting OWL-v2's Image-Guided Detection methodology for production environments.
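The core matching step behind image-guided detection can be sketched as scoring each candidate region's embedding against the visual-prompt embedding and keeping regions above a similarity threshold. The toy vectors and cosine threshold below are illustrative; OWL-v2's actual attention-based mechanism is considerably more involved.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_regions(prompt_emb, region_embs, threshold=0.8):
    """Return indices of regions whose embedding matches the visual prompt."""
    return [i for i, emb in enumerate(region_embs)
            if cosine(prompt_emb, emb) >= threshold]

prompt_emb = [1.0, 0.0, 0.2]
regions = [[0.9, 0.1, 0.25],   # similar to the prompt
           [0.0, 1.0, 0.0]]    # dissimilar
```

Here only the first region clears the threshold, mirroring how an example image of the target object selects matching boxes.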

Meta-Intelligence of LLM Observability

· 3 min read
Daniel Cho
Mellerikat Leader

The Evolution of Observability into Meta-Intelligence in LLMOps

To effectively implement LLM services, a robust LLMOps framework is essential. Among its components, observability (o11y) has evolved beyond simple monitoring to become a critical enabler of the system’s meta-intelligence.

The Evolution of o11y into Meta-Intelligence

Early LLM o11y focused on collecting metrics such as token usage, response time, response content, and user feedback to monitor performance. We adopted Langsmith, a commercial tool, to monitor the execution process of AI logic. Later, we integrated Langfuse, an open-source tool, allowing our organization to selectively use either tool based on licensing requirements.

However, as the number of AI Agent service users grew, it became clear that accumulated data could no longer provide meaningful insights through simple log analysis. Consequently, we decided to transform o11y data from mere "observation logs" into a meta-intelligence tool. This system leverages AI Agent outputs and user feedback to automatically reformulate questions or enhance response quality by adjusting model behavior.

In essence, o11y data transcends real-time performance monitoring to become the cornerstone of a feedback loop that enables AI Agents to self-improve.
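The feedback loop described above can be sketched as a simple aggregation over o11y records: (query, feedback) pairs are counted so that frequently down-voted question patterns are flagged for automatic reformulation. The record fields and thresholds are illustrative assumptions, not the schema of LangSmith or Langfuse.

```python
from collections import Counter

def flag_for_reformulation(records, min_negative=2):
    """Return query patterns with at least `min_negative` negative feedbacks."""
    negatives = Counter(r["query"] for r in records if r["feedback"] == "down")
    return {query for query, n in negatives.items() if n >= min_negative}

logs = [
    {"query": "status of line 3", "feedback": "down"},
    {"query": "status of line 3", "feedback": "down"},
    {"query": "weekly report", "feedback": "up"},
]
```

Flagged patterns would then feed the reformulation or model-adjustment step, closing the self-improvement loop.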

Academically, this approach aligns with the growing focus on AgentOps or Agentic AI observation systems. There is a movement to propose comprehensive observation frameworks for AgentOps, tracking various artifacts such as execution paths, internal logic, tool calls, and planning stages. Beyond black-box evaluations, the importance of inferring and optimizing behavioral patterns based on agent execution logs is increasingly emphasized.

Next-Gen Camera - EVA x Meraki

· 6 min read
Daniel Cho
Mellerikat Leader

Background

Meraki’s Cloud-Managed Service already boasts an exceptional infrastructure. If a variety of third-party apps, particularly AI-based services, could seamlessly integrate with this cloud platform, the true potential for enhancing Meraki’s value could be realized.

Currently, Meraki Cloud includes an App Store with some available apps, but it faces clear limitations:

  • Integration with Meraki Cloud Services
    • App installation and deployment are restricted. Only select partners can officially register apps, and the installation process is complex or not automated.
    • Third-party apps are not fully integrated with the Meraki Dashboard, leading to fragmented user experiences or dispersed management points.
    • Limitations in APIs and SDKs hinder sufficient integration and scalability with external services.

This makes the integration of the Meraki Smart Camera with mellerikat EVA a highly significant case study for upgrading Meraki Cloud to the next level and establishing best practices for a third-party app ecosystem.

Gen AI and Domain-Specific AI

· 4 min read
Daniel Cho
Mellerikat Leader

Specialized Intelligence: The Key to Business Innovation Beyond General Intelligence

Since the digital revolution, artificial intelligence (AI) has rapidly advanced, bringing transformative changes to our daily lives and industries. The emergence of Generative AI (Gen AI) has made AI technology accessible to everyone, but it has also introduced various challenges. While a universal AI capable of excelling in all domains is an ideal goal, in reality, specialized intelligence tailored to specific fields often delivers greater value.