GOSIM Paris 2026 Has Concluded
Thank you to all attendees, speakers, and sponsors for an incredible event!
Agentic AI on Edge

OminiX: Fully Automated Native C++ Deployment for Diverse Large-Scale Learning Models

Date: May 6 | Time: 11:10 - 11:35 | Location: Central Room
Running deep learning inference in native C++ enables efficient edge deployment, removes the Python/PyTorch dependency, and allows fast, accurate quantization. However, converting a PyTorch model to native C++ requires weeks of labor-intensive development, and only a narrow range of LLMs has been ported by hand. We propose OminiX cpp, an automated pipeline in which an AI agent equipped with structured procedural skills converts arbitrary PyTorch models into optimized C++ inference code targeting the GGML runtime. OminiX cpp generalizes beyond LLMs to diverse model families, including image and video generation, speech recognition, text-to-speech, world models, and Vision-Language-Action (VLA) models. As a case study, we report results on OpenVLA, a 7B-parameter VLA model: a near-lossless task success rate, up to 63% memory reduction, and up to 1.52× speedup.
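
For readers unfamiliar with the kind of quantization the abstract refers to, the following is a minimal sketch of blockwise 8-bit quantization in the style of GGML's Q8_0 layout (32 weights per block, one shared scale per block). It is an illustration only, not the OminiX cpp pipeline, and the function names are hypothetical. The byte count assumes the scale is stored as a 16-bit float, as in Q8_0; the sketch itself keeps the scale as a Python float for simplicity. The figures it produces are unrelated to the 63% memory reduction reported in the talk.

```python
BLOCK = 32  # weights per block, matching GGML's Q8_0 block size


def quantize_q8_0(values):
    """Split values into blocks of 32 and quantize each to int8 with one scale.

    Hypothetical helper for illustration; len(values) must be a multiple of BLOCK.
    Returns a list of (scale, [int, ...]) tuples.
    """
    if len(values) % BLOCK != 0:
        raise ValueError("length must be a multiple of %d" % BLOCK)
    blocks = []
    for start in range(0, len(values), BLOCK):
        chunk = values[start:start + BLOCK]
        amax = max(abs(v) for v in chunk)
        scale = amax / 127.0                 # largest magnitude maps to +/-127
        inv = 1.0 / scale if scale else 0.0  # an all-zero block quantizes to zeros
        quants = [max(-127, min(127, round(v * inv))) for v in chunk]
        blocks.append((scale, quants))
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate floats: value ~= scale * quant."""
    return [scale * q for scale, quants in blocks for q in quants]


def q8_0_bytes(n_weights):
    """Storage assuming a 2-byte (fp16) scale plus 32 one-byte quants per block."""
    return (n_weights // BLOCK) * (2 + BLOCK)


if __name__ == "__main__":
    weights = [(i - 32) / 8.0 for i in range(64)]
    blocks = quantize_q8_0(weights)
    restored = dequantize_q8_0(blocks)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("max abs error:", worst)
    print("bytes as fp16:", 2 * len(weights), "bytes as Q8_0:", q8_0_bytes(len(weights)))
```

Under these assumptions each block of 32 weights occupies 34 bytes instead of the 64 bytes needed at 16-bit precision, and the per-weight rounding error is bounded by half of that block's scale.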