Press Release

Nota AI Has Two MoE Quantization Papers Accepted at ICML 2026 Workshop, Demonstrating Global Competitiveness in Large-Scale AI Optimization

June 11, 2026

▶ Two papers on MoE-specific quantization algorithms accepted at a workshop held in conjunction with ICML 2026

▶ Recognition follows Nota AI’s overall win at the NVIDIA Nemotron Hackathon

▶ Strengthening core optimization technologies to make large-scale AI models smaller and more efficient to run

SEOUL, South Korea, June 11, 2026 – Nota AI, a company specializing in AI model compression and optimization, announced that two of its papers on MoE-specific quantization algorithms have been accepted to the Resource-Adaptive Foundation Model Inference (AdaptFM) Workshop at ICML 2026, one of the world’s leading machine learning conferences.

ICML is widely recognized as one of the premier global conferences in machine learning and artificial intelligence, bringing together the latest research from global technology companies, leading universities, and major research institutions. The AdaptFM Workshop focuses on technologies that enable large-scale AI models to run efficiently under limited computing resources. Researchers from global companies and research institutions, including Amazon and Meta, serve on the organizing committee, while researchers from leading AI companies such as NVIDIA, Qualcomm, OpenAI, Apple, and Microsoft are also participating as members of the program committee.

This achievement is significant as it recognizes Nota AI’s accumulated technical expertise in optimizing Mixture-of-Experts (MoE) models, an architecture increasingly regarded as a core structure for large language models (LLMs). MoE models improve both performance and efficiency by activating only a subset of experts for each input as needed. However, their complex structure requires a different approach to quantization, the process of representing a model at lower numerical precision to make it smaller and more efficient, compared to conventional model architectures.

Nota AI previously won both its track and the overall competition at the NVIDIA Nemotron Hackathon with a data-driven MoE quantization method. With the acceptance of these two papers, Nota AI will once again present research outcomes specifically designed for MoE architectures on a global research stage.

The first accepted paper, “DREAM-MoE,” proposes a method to reduce changes in a model’s decision flow that can occur when large-scale AI models are quantized across multiple segments. The method focuses on the fact that even a small error in an earlier segment can affect expert selection in later segments. DREAM-MoE helps the quantized model select experts in a way that remains closer to the original model.
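
The effect described above can be illustrated with a toy example. The sketch below is not DREAM-MoE or any Nota AI method; it only shows, under stated assumptions, how a small rounding error in an earlier activation can change which expert a router picks. The router weights, the hidden vector, and the helper names (`router_logits`, `top_k_experts`, `fake_quantize`) are all hypothetical values chosen for illustration.

```python
def router_logits(weights, hidden):
    """Score each expert as the dot product of its router row with the hidden vector."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


def top_k_experts(logits, k):
    """Return the indices of the k highest-scoring experts, in ascending index order."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return sorted(ranked[:k])


def fake_quantize(values, step):
    """Round each value to the nearest multiple of `step` (simple uniform rounding)."""
    return [round(v / step) * step for v in values]


# Hypothetical router weights: one row per expert (assumption, for illustration only).
ROUTER_WEIGHTS = [
    [1.0, 1.0, 0.0],   # expert 0
    [0.0, 0.0, 1.9],   # expert 1
    [-1.0, 0.0, 0.0],  # expert 2
]

# Hypothetical activation produced by an earlier segment of the model.
hidden = [0.30, 0.30, 0.33]

# The same activation after coarse rounding, standing in for upstream quantization error.
hidden_q = fake_quantize(hidden, step=0.25)

original_choice = top_k_experts(router_logits(ROUTER_WEIGHTS, hidden), k=1)
quantized_choice = top_k_experts(router_logits(ROUTER_WEIGHTS, hidden_q), k=1)

print("original expert:", original_choice)
print("after rounding: ", quantized_choice)
```

In this toy setup the router scores for experts 0 and 1 are close, so rounding the activation is enough to move the top-1 choice from expert 1 to expert 0. Real MoE routers operate on much larger vectors, but the same sensitivity to near-ties is what makes expert selection fragile under quantization.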

The second paper, “SRA-MoE,” proposes a method that identifies and prioritizes important inputs that have a greater impact on the model’s final output. Rather than treating all inputs equally, SRA-MoE is designed to prevent expert selection from being significantly disrupted for these key inputs, helping maintain model quality more effectively under limited resources.

Both studies demonstrated higher performance than the latest MoE-specific quantization methods. This shows that large-scale AI models can be executed with less memory and fewer computing resources while reducing quality degradation. As the cost, power consumption, and hardware burden of running large AI models continue to increase, MoE-specific quantization technologies are becoming increasingly important.

Nota AI has been proactively focusing its R&D efforts on optimizing large AI models that require substantial memory and computing resources. The company is advancing large-scale model optimization, including Solar MoE, as part of the sovereign foundation model project led by the Upstage consortium. It is also extending its experience in quantizing NVIDIA Nemotron 3 Nano to newer large models such as Nemotron Ultra, further broadening the scope of its optimization technologies.

“This paper acceptance reflects Nota AI’s continued advancement of MoE-specific quantization technologies,” said Myungsu Chae, CEO of Nota AI. “Following our overall win at the NVIDIA Nemotron Hackathon, we are pleased to present our research at the ICML 2026 AdaptFM Workshop. We will continue developing optimization technologies that enable large-scale AI models to be used more efficiently and practically.”

In addition, Nota AI will host “Nota AI - Korea Efficient Days” during ICML 2026 at COEX in Seoul. The event will bring together global researchers, engineers, and business leaders visiting Korea to share research trends and industrial applications of Efficient AI. Through the event, Nota AI plans to introduce its research achievements in large-scale AI model optimization and expand opportunities for technical collaboration and business engagement.
