GOSIM Paris 2026 Has Concluded
Thank you to all attendees, speakers, and sponsors for an incredible event!
Open Source Models

Building Scalable LLM Inference Infrastructure

Date: May 6 | Time: 16:40 - 17:10 | Location: Open Stage
LLM serving infrastructure has become a key pillar of modern society, but building it to scale remains challenging because of system issues such as load imbalance, stragglers, and a lack of elasticity. In this talk, I will present our recent work on scalable and efficient LLM infrastructure, including a simple but efficient multiplication-based global LLM router and ultra-fast autoscaling mechanisms. Some of this work has been or is being deployed at the world's largest LLM service providers.