GOSIM Paris 2026 Has Concluded
Thank you to all attendees, speakers, and sponsors for an incredible event!
Agentic AI on Edge

Vulkan for Edge AI: Expanding the Hardware Frontier with llama.cpp

Date: May 6 | Time: 11:35 - 12:00 | Location: Central Room
Agentic AI on the edge requires accessible, low-latency inference, yet hardware fragmentation limits deployment. While CUDA dominates acceleration, it runs only on Nvidia hardware, and that vendor lock-in limits where local inference can be deployed. This talk examines Vulkan as a vendor-neutral alternative, showing how it expanded compatibility and reduced deployment complexity in llama.cpp across Intel, AMD, and Nvidia GPUs.

However, Vulkan is not a silver bullet. I will outline the engineering roadblocks, from driver inconsistencies to compute limitations. Looking ahead, I will explore VK_NV_cooperative_matrix2 as a blueprint for offloading hardware-specific optimizations to the driver. This approach enables peak performance via vendor optimizations while still allowing broad support through generic shader fallbacks, unifying the edge AI ecosystem.
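
To make the fallback idea concrete, here is a minimal sketch of how a backend might pick a matrix-multiply shader from the extensions a device reports. It is not the llama.cpp implementation: the function name, the shader names, and the priority table are hypothetical, and only the two extension strings are real Vulkan identifiers. In a real application the extension list would come from a query such as `vkEnumerateDeviceExtensionProperties`.

```python
# Hypothetical sketch of "vendor-optimized path with generic fallback".
# Only the two extension strings are real Vulkan names; everything else
# (function name, shader names, table layout) is illustrative.

COOPMAT2 = "VK_NV_cooperative_matrix2"     # vendor-specific extension
COOPMAT_KHR = "VK_KHR_cooperative_matrix"  # cross-vendor extension

# Ordered from most specialized to most generic.
# A requirement of None means "runs on any Vulkan compute device".
SHADER_PATHS = [
    (COOPMAT2, "matmul_coopmat2"),
    (COOPMAT_KHR, "matmul_coopmat"),
    (None, "matmul_generic"),
]


def select_matmul_path(device_extensions):
    """Return the first shader path whose required extension is available."""
    available = set(device_extensions)
    for required, shader in SHADER_PATHS:
        if required is None or required in available:
            return shader
    # Unreachable while the table ends with a generic entry.
    raise RuntimeError("no usable shader path")


if __name__ == "__main__":
    print(select_matmul_path([COOPMAT2, COOPMAT_KHR]))
    print(select_matmul_path([COOPMAT_KHR]))
    print(select_matmul_path([]))
```

Keeping the generic entry last means every device gets a working path, while devices that expose a more capable extension are routed to the specialized shader without any change to the calling code.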