Agentic AI Summit

Pie: A Programmable Serving System for Agentic Applications

Date: May 6 | Time: 14:00 - 14:20 | Location: Master Stage
Emerging large language model (LLM) applications involve diverse reasoning strategies and agentic workflows, straining the capabilities of existing serving systems built on a monolithic token generation loop. This talk presents Pie, a programmable LLM serving system designed for flexibility and efficiency.

Pie decomposes the traditional generation loop into fine-grained service handlers exposed via an API and delegates control of the generation process to user-provided programs called inferlets. This lets applications implement new KV cache strategies and bespoke generation logic, and integrate computation with I/O seamlessly, all within the application and without modifying the serving system.
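To make the idea of an application-owned generation loop concrete, here is a minimal Python sketch. It is not Pie's actual API: `ToyHandlers`, `forward`, `truncate_cache`, and `toy_inferlet` are hypothetical names invented for illustration, and the "model" is a toy arithmetic function. The sketch only shows the shape of the design, in which the serving side offers small handlers and the user program decides when to call them, when to perform I/O, and what to keep in the KV cache.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyHandlers:
    """Hypothetical stand-in for fine-grained service handlers (not Pie's real API)."""
    kv_cache: List[int] = field(default_factory=list)  # tokens whose state is cached
    forward_calls: int = 0

    def forward(self, tokens: List[int]) -> int:
        """Append tokens to the cache and return a toy 'next token'."""
        self.kv_cache.extend(tokens)
        self.forward_calls += 1
        return (sum(self.kv_cache) + 1) % 50

    def truncate_cache(self, keep: int) -> None:
        """Application-chosen cache policy: keep only the first `keep` entries."""
        del self.kv_cache[keep:]


def toy_inferlet(h: ToyHandlers, prompt: List[int], max_new: int,
                 tool: Callable[[int], List[int]]) -> List[int]:
    """User program that owns the generation loop instead of the server."""
    out: List[int] = []
    next_tok = h.forward(prompt)  # prefill step
    for _ in range(max_new):
        out.append(next_tok)
        if next_tok % 7 == 0:
            # Arbitrary toy trigger: interleave I/O with generation by calling
            # an external tool and feeding its result back into the context.
            extra = tool(next_tok)
            next_tok = h.forward([next_tok] + extra)
        else:
            next_tok = h.forward([next_tok])  # ordinary decode step
    return out
```

In this sketch, an application that wants to reuse the prompt prefix for a second branch could call `h.truncate_cache(len(prompt))` after the first run, which is the kind of cache decision a monolithic loop would normally hide from the caller.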

Pie executes inferlets using WebAssembly, benefiting from its lightweight sandboxing. The evaluation shows that Pie improves latency and throughput by 1.3× to 3.4× on agentic workflows. Pie is open source and available at pie-project.org.