
Smart Tech Korea (STK) 2026 Nota AI Booth Preview: Physical AI, Built at the Edge

June 9, 2026 | 4 min read

AI is stepping off the screen. Physical AI is a new wave of AI that moves beyond generating text and images to perceiving and acting in the physical world through robots and smart devices. The key to bringing it into industrial settings is edge optimization: running heavy AI models with minimal latency on the device itself, not in the cloud.

Nota AI has consistently demonstrated the importance of this edge optimization on global stages, including the Embedded Vision Summit (EVS) 2026 in the United States and NVIDIA's APAC Partner Day at Computex 2026, where it was the only Korean company on the panel. At STK 2026, Nota AI brings Physical AI to life through live demos that show how it works on real hardware, from edge-optimizing the large AI models that perceive and control the physical world to the automation platform that supports them.

1. Robotics AI Optimization: Running VLA Models on an Edge NPU

Figure 1: A live VLA robotics demo running SmolVLA on the Qualcomm IQ-9075

A VLA (Vision-Language-Action) model, which carries out commands like "pick up the object and move it" for a robot, works in three main stages.

  • Vision Encoder: extracts features from the camera feed
  • Large Language Model (LLM): jointly reasons over the visual input and the command
  • Action Head: converts the reasoning output into actual motion
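The three stages above can be sketched as a simple sequential pipeline. This is a minimal illustration, not SmolVLA's actual architecture or API: the class `VLAPipeline`, its field names, and the toy stand-in functions are all hypothetical and exist only so the example runs end to end.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass
class VLAPipeline:
    """Hypothetical container: three stages that run strictly in sequence."""
    vision_encoder: Callable[[Vector], Vector]       # camera frame -> visual features
    language_model: Callable[[Vector, str], Vector]  # features + command -> reasoning state
    action_head: Callable[[Vector], Vector]          # reasoning state -> motor command

    def step(self, frame: Vector, command: str) -> Vector:
        features = self.vision_encoder(frame)
        state = self.language_model(features, command)
        return self.action_head(state)


# Toy stand-ins (assumptions for illustration, not real models).
def toy_vision(frame: Vector) -> Vector:
    return [sum(frame) / len(frame)]            # mean pixel value as a "feature"


def toy_llm(features: Vector, command: str) -> Vector:
    return features + [float(len(command.split()))]  # append the word count


def toy_action(state: Vector) -> Vector:
    return [2.0 * value for value in state]     # fixed gain as a "motor command"


pipeline = VLAPipeline(toy_vision, toy_llm, toy_action)
action = pipeline.step([0.0, 1.0, 2.0], "pick up the object")
print(action)
```

Because each stage consumes the previous stage's output, anything that perturbs an early stage is visible to every stage after it.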

Because a VLA model contains a large language model that demands massive computation, it is very heavy to run in an edge environment. To fit it onto a performance-limited Neural Processing Unit (NPU), the common approach is quantization, which compresses the model's weights.

The problem is that a VLA model's three stages run in sequence. If you quantize the weights of the front stages (vision and language) to make the model lighter, the tiny errors introduced there accumulate and amplify down the pipeline, degrading the quality of the final action. The smaller the model, as with SmolVLA (0.45B), the more vulnerable it is to these errors.
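A toy numerical sketch can make the cascade effect concrete. The example below is an assumption-laden illustration, not a model of SmolVLA: the three 2x2 matrices, the input, and the per-tensor symmetric quantizer are all made up, and how much error grows in a real network depends on its weights and nonlinearities.

```python
def quantize(matrix, bits):
    """Symmetric per-tensor quantization: snap each weight to a uniform grid."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(w) for row in matrix for w in row)
    scale = peak / qmax if peak else 1.0
    return [[round(w / scale) * scale for w in row] for row in matrix]


def matvec(matrix, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def run_cascade(stages, x):
    """Feed x through each stage in order and keep every intermediate output."""
    outputs = []
    for matrix in stages:
        x = matvec(matrix, x)
        outputs.append(x)
    return outputs


# Made-up weights standing in for the vision, language, and action stages.
STAGES = [
    [[0.9, -0.37], [0.21, 0.55]],
    [[1.0, 0.3], [-0.3, 1.0]],
    [[1.5, 0.0], [0.0, 1.5]],
]
INPUT = [1.0, 1.0]


def stage_errors(bits):
    """Max absolute output error per stage when only the two front stages are quantized."""
    reference = run_cascade(STAGES, INPUT)
    lossy = [quantize(STAGES[0], bits), quantize(STAGES[1], bits), STAGES[2]]
    degraded = run_cascade(lossy, INPUT)
    return [
        max(abs(r - d) for r, d in zip(ref_out, deg_out))
        for ref_out, deg_out in zip(reference, degraded)
    ]


for bits in (8, 4):
    print(bits, ["%.4f" % err for err in stage_errors(bits)])
```

In this toy setup the final stage is left untouched, yet its output still carries the error introduced upstream, which is the pattern the paragraph above describes.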

So instead of the common approach of shrinking the weights wholesale, Nota AI chose to preserve the front-stage weights entirely and optimize only the inference of the "Action Head," the final stage of the cascade.

With this approach, Nota AI successfully ported the model onto an ultra-compact NPU board (the Dragonwing™ IQ-9075), even though Qualcomm does not officially support the model. The task success rate dropped by only 1 percentage point (86% → 85%), while overall inference speed improved by 1.63x. For the Action Head alone, speed increased by about 7x (218 ms → 31 ms).
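The reported figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the Amdahl's-law step rests on an assumption the article does not state, namely that the entire 1.63x gain comes from the Action Head and that both figures were measured on the same workload. The function names are hypothetical.

```python
def speedup(before_ms, after_ms):
    """Ratio of old latency to new latency."""
    return before_ms / after_ms


def overall_speedup(stage_fraction, stage_speedup):
    """Amdahl's law: overall gain when one stage, taking `stage_fraction`
    of the baseline time, is accelerated by `stage_speedup`."""
    return 1.0 / ((1.0 - stage_fraction) + stage_fraction / stage_speedup)


def implied_stage_fraction(overall, stage_speedup):
    """Invert Amdahl's law to find the stage's share of baseline latency."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / stage_speedup)


action_head = speedup(218, 31)       # latencies quoted in the article
success_drop_pp = 86 - 85            # percentage points, quoted in the article

# Assumption (not from the article): all of the 1.63x comes from the Action Head.
fraction = implied_stage_fraction(1.63, action_head)

print(round(action_head, 2), success_drop_pp, round(fraction, 2))
```

Under that assumption, the Action Head would account for a little under half of the baseline latency, which would explain why a roughly 7x stage speedup translates into a much smaller end-to-end gain.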

2. Intelligent Perception of the Physical World: Nota Vision Agent (NVA)

Figure 2: NVA industrial-safety monitoring demo

For Physical AI to act correctly in the physical world, it must first accurately perceive and understand the scene through a camera. Alongside the robot-control technology above, perception is the other core pillar on show at the booth.

NVA is Korea's first commercialized real-time video monitoring solution that understands the context within video, built on a Vision-Language Model (VLM). It was developed by integrating NVIDIA's Video Search and Summarization (VSS) technology.

Unlike conventional video monitoring, which stops at simple object detection, NVA interprets the relationships between objects, violations of standard operating procedures (SOPs), and compound risk signals in real time. Operators can search for the information they need in natural language and receive summarized reports.

The most important technical point here is that the heavy VLM runs in real time at the site (the edge) rather than in the cloud. Nota AI's optimization, which brings video perception fully into the edge environment, is already proving its value in real industrial settings such as the following.

  • Traffic control: Deployed in the traffic management system of the Daejeon Regional Construction and Management Administration, it detects incidents, fires, and obstacles in road CCTV footage in real time and automatically summarizes and reports the lane-by-lane response status. The solution earned the top grade, 99% accuracy, in the Ministry of Land, Infrastructure and Transport's ITS (Intelligent Transport Systems) performance evaluation.
  • Industrial safety: Applied at Kolon Industries' Gimcheon Plant 2 in partnership with Kolon Benit, it monitors worker safety, entry into hazardous zones, and safety-rule violations in real time to prevent serious accidents.

In recognition of this innovation, NVA recently won the Edge AI and Vision Product of the Year award at the global vision technology conference EVS 2026. At this exhibition, vivid demo footage shows how this powerful VLM-optimized monitoring technology is applied in the field.

3. Model Optimization, Done Through Conversation: The NetsPresso Agent Feature

As the robotics (VLA) and intelligent video monitoring (NVA) cases above show, optimizing a large AI model for an edge environment is extremely demanding. Manually testing which optimization technique to apply, to which layer, and at what strength requires deep expertise and an enormous amount of time.

The third demo, unveiled for the first time at the booth, shows how Nota AI automates this complex, tedious search process by adding a conversational AI agent feature to NetsPresso®, its hardware-aware model optimization platform.

Figure 3: Manual CLI vs. the NetsPresso agent feature

When the user describes their target performance and constraints (target hardware, acceptable accuracy loss, and so on) in natural language, the agent prunes the unnecessary parts of the search space and proposes an optimal compression recipe. The expected benefits are as follows.

  • Lower R&D cost: It narrows the search space and uses the NetsPresso API as an execution tool to skip trial and error, sharply cutting engineering time and cloud compute cost.
  • Lower barrier to entry: It improves usability by turning a CLI-based optimization tool into a conversational UI, so anyone can produce an optimized model without deep AI or hardware expertise.
  • Faster time to market: Backed by Nota AI's technology, it recommends the optimal compression recipe to ensure high-quality results, and the shorter optimization cycle accelerates the deployment of the entire AI service.
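The idea of pruning a search space against stated constraints can be sketched in a few lines. Everything below is hypothetical: the `Constraints` and `Recipe` types, the hardware labels, and the estimated numbers are invented for illustration and do not reflect the NetsPresso API or how the agent actually works. The natural-language parsing step is omitted entirely.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Constraints:
    """What the user asks for (hypothetical fields)."""
    target_hardware: str
    max_accuracy_drop_pp: float


@dataclass(frozen=True)
class Recipe:
    """One candidate compression recipe with made-up estimates."""
    name: str
    supported_hardware: FrozenSet[str]
    est_accuracy_drop_pp: float
    est_speedup: float


def propose(constraints: Constraints, candidates: List[Recipe]) -> Optional[Recipe]:
    """Discard infeasible recipes, then pick the fastest one that remains."""
    feasible = [
        recipe for recipe in candidates
        if constraints.target_hardware in recipe.supported_hardware
        and recipe.est_accuracy_drop_pp <= constraints.max_accuracy_drop_pp
    ]
    return max(feasible, key=lambda recipe: recipe.est_speedup, default=None)


# Invented candidates; none of these numbers come from the article.
CANDIDATES = [
    Recipe("int8-all-layers", frozenset({"npu-a", "cpu-b"}), 3.0, 2.5),
    Recipe("int8-head-only", frozenset({"npu-a"}), 1.0, 1.6),
    Recipe("fp16-baseline", frozenset({"npu-a", "cpu-b"}), 0.0, 1.0),
    Recipe("int4-all-layers", frozenset({"cpu-b"}), 6.0, 3.5),
]

best = propose(Constraints("npu-a", 1.5), CANDIDATES)
print(best.name if best else "no feasible recipe")
```

The point of the sketch is the ordering: filtering on hard constraints first means only the surviving recipes ever need to be run and measured.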

You can experience firsthand at the booth how NetsPresso's conversational UI reduces the optimization resources a company actually spends.

4. Extending to General-Purpose Devices: Running an On-Device LLM on Apple Silicon (M4)

This edge optimization capability is not limited to robot control or dedicated hardware for specialized monitoring. It applies equally to the general-purpose devices we use every day.

The final demo runs the Llama 1B Instruct model smoothly on the CPU alone of a standard Apple Silicon Mac (M4 target), with no separate AI accelerator, demonstrating that the approach extends to general-purpose devices.

Figure 4: Two core techniques applied to the on-device LLM on an Apple Silicon CPU, and their effects

By applying two core optimization techniques, mixed-precision quantization and speculative decoding, Nota AI achieved text generation 1.3x faster than an 8-bit model under the same memory budget and 2.3x faster than the 16-bit (FP16) baseline, all without raising peak memory. At the booth, you can directly compare the real-time generation speed of an on-device LLM that runs on a general-purpose device's CPU alone, with no accelerator.
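For readers unfamiliar with speculative decoding, the following is a minimal sketch of its greedy variant, not Nota AI's implementation. A small draft model guesses several tokens ahead and the larger target model checks them, keeping the longest prefix it agrees with. The toy `target_model` and `draft_model` functions are assumptions made up for the example. A real system verifies all guesses in one batched forward pass, which is where the speedup comes from, whereas this sketch calls the target once per position and so demonstrates only correctness, not speed.

```python
from typing import Callable, List, Sequence

NextToken = Callable[[Sequence[int]], int]


def greedy_decode(model: NextToken, prompt: Sequence[int], n_new: int) -> List[int]:
    """Plain autoregressive decoding: one model call per new token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: Sequence[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: output matches greedy_decode(target, ...)."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1) The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2) The target model checks each proposed position in order.
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if guess != expected:
                # Keep the agreed prefix, then the target's own token.
                tokens.extend(proposal[:i] + [expected])
                break
        else:
            # Every guess was accepted; the target adds one more token.
            tokens.extend(proposal)
            tokens.append(target(tokens))
    return tokens[:goal]


# Toy models (assumptions): the draft agrees with the target most of the time.
def target_model(tokens: Sequence[int]) -> int:
    return (tokens[-1] * 3 + 1) % 11


def draft_model(tokens: Sequence[int]) -> int:
    guess = (tokens[-1] * 3 + 1) % 11
    return guess if len(tokens) % 4 else (guess + 1) % 11


fast = speculative_decode(target_model, draft_model, [2], 12)
slow = greedy_decode(target_model, [2], 12)
print(fast == slow)
```

The design property worth noting is that the draft model can only affect speed, never the result: every token that ends up in the output is one the target model would have chosen itself.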

See Physical AI in Action at the Nota AI Booth, STK 2026

All four demos point to one thing: running heavy AI on real-world devices, each with its own conditions, without giving up performance. When that capability turns toward the physical world, it becomes the action of robots (VLA) and perception in the field (NVA), that is, Physical AI; when it turns toward general-purpose devices, it becomes an on-device LLM on a laptop. Optimizing for each site, where the chip, the power budget, and the acceptable latency all differ, is exactly what Nota AI has been solving with NetsPresso®.

Come to Smart Tech Korea and see for yourself the Physical AI and edge optimization technology that is drawing global attention.

  • Dates: June 10 (Wed) to 12 (Fri), 2026
  • Venue: COEX Hall B, Seoul
  • Booth: B642 (right in front of the cafeteria)

Frequently Asked Questions (FAQ)

Q. What is Physical AI? 

Physical AI refers to intelligent systems that have evolved beyond conventional AI, which merely generates text or images on a screen. Using cameras and sensors, Physical AI perceives the complex variables of the real world in real time and connects its decisions directly to the movement of robots or devices, so it can complete actual physical tasks.

Q. Why is AI optimization essential on edge devices? 

Tasks that demand real-time responsiveness, such as robot control or live monitoring, cannot tolerate cloud communication latency. But a device's own compute, memory, and power are limited, so without optimization that reduces a model's weight and improves computational efficiency, it is hard to run a large model reliably.

Q. What technologies can I see at the Nota AI booth at STK 2026? 

We will demonstrate a robotics VLA model optimized for a Qualcomm NPU, a commercial use case of the VLM-based NVA, NetsPresso's conversational model-optimization agent feature, and an on-device LLM running on an Apple M4 CPU.
