GOSIM Paris 2026 Has Concluded
Thank you to all attendees, speakers, and sponsors for an incredible event!
vLLM Workshop

Building, Testing and Contributing to vLLM: A Developer's Guide

Date: May 5 | Time: 10:25 - 11:05 | Location: Founders Cafe

Large Language Models (LLMs) have revolutionized the AI landscape, and vLLM has emerged as a leading inference engine that dramatically accelerates LLM serving through innovations like PagedAttention. But how do you actually build, test, and contribute to this rapidly evolving project?

In this talk, we'll take you through vLLM's architecture and explore the practical aspects of working with this complex Python/C++ codebase. We'll start with an overview of vLLM's core optimizations, including PagedAttention, then dive into the build process for different targets as well as for third-party hardware plugins such as Google TPU, AWS Neuron, Intel Gaudi, and more.

You'll learn about testing strategies such as performance benchmarking with GuideLLM and model evaluation with lm-evaluation-harness. We'll also cover best practices for contributing to the vLLM community, and how Red Hat AI Inference Server (RHAIIS) provides a trustworthy, validated platform for running LLM workflows across diverse hardware environments.