SGLang Workshop

Production Image/Video Serving with SGLang Diffusion

Date: May 6
Time: 14:50 - 15:10
Location: Central Room

Diffusion models have become the backbone of modern image and video generation, but serving them efficiently remains challenging. In this talk, we introduce SGLang-Diffusion, a high-performance inference framework designed for scalable diffusion generation. We present its system architecture and key optimizations — including advanced parallelism, distributed VAE, kernel fusion, and serving improvements — that enable efficient and production-ready deployment of diffusion models. We also demonstrate how SGLang-Diffusion accelerates popular open-source models and supports large-scale multimodal generation workloads.

Speakers

Eva Ma, Algorithmic Tamer, Atlas Cloud AI LLC