vLLM Workshop

Building, Testing and Contributing to vLLM: A Developer's Guide

Date: May 5 · Time: 11:40 – 12:00 · Location: Founders Café

Large Language Models (LLMs) have revolutionized the AI landscape, and vLLM has emerged as a leading inference engine that dramatically accelerates LLM serving through innovations like PagedAttention. But how do you actually build, test, and contribute to this rapidly evolving project?

In this talk, we'll walk through vLLM's architecture and the practical aspects of working with its complex Python/C++ codebase. We'll start with an overview of vLLM's core optimizations, including PagedAttention, then dive into the build process for different targets as well as third-party hardware plugins such as Google TPU, AWS Neuron, and Intel Gaudi.
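As a taste of the build workflow the talk covers, here is a minimal from-source setup sketch. The exact flags and supported `VLLM_TARGET_DEVICE` values vary by release and hardware plugin, so treat this as environment configuration to adapt from the project's installation docs rather than a definitive recipe:

```shell
# Hedged sketch: a typical editable from-source build of vLLM.
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .                         # default (CUDA) build

# Non-default targets are typically selected via an environment variable;
# the value shown (cpu) is an example, and hardware-plugin targets such as
# TPU, Neuron, or Gaudi follow a similar pattern per their own docs.
VLLM_TARGET_DEVICE=cpu pip install -e .
```

An editable install (`-e`) keeps the checkout importable in place, which is convenient when iterating on changes you intend to contribute.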

You'll learn about testing strategies such as performance benchmarking with GuideLLM and model evaluation with lm-evaluation-harness. We'll also cover best practices for contributing to the vLLM community, and how Red Hat AI Inference Server (RHAIIS) provides a trustworthy, validated platform for running LLM workloads across diverse hardware environments.
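The testing workflow above can be sketched against a locally served model. The model name, port, task, and several flags below are illustrative assumptions; both tools ship detailed `--help` output and documentation that should be consulted for real runs:

```shell
# Hedged sketch: benchmark and evaluate a running vLLM server.

# Serve a model over the OpenAI-compatible API (model name is an example):
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000

# Performance benchmarking with GuideLLM against that endpoint
# (flag names are illustrative; check `guidellm --help`):
pip install guidellm
guidellm benchmark --target "http://localhost:8000"

# Accuracy evaluation with lm-evaluation-harness via the same endpoint
# (task choice and model_args are examples):
pip install lm-eval
lm_eval --model local-completions \
  --model_args base_url=http://localhost:8000/v1/completions,model=meta-llama/Llama-3.1-8B-Instruct \
  --tasks gsm8k
```

Driving both tools through the server's OpenAI-compatible API means the same setup works whether the backend is a local CUDA build or one of the hardware-plugin targets.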