Khal and Inferi: Towards Single-Language Cross-Platform GPU Inference with Rust and rust-gpu
Date: May 5
Time: 11:10 - 11:35
Location: Central Room
After several explorations of cross-platform GPU programming technologies such as WGSL and Slang, we present our latest experiment: a codebase in which both CPU code and GPU kernels are written in the same language, Rust. By leveraging Cargo and the Rust compiler (through rust-gpu), we implemented common tensor and LLM inference operators that run on all major platforms, including the web. While the project is still at an early stage and its performance is modest, the benefits of writing regular Rust code, strong typing across the GPU/CPU boundary, and the Cargo package manager enable smooth integration and interoperation with codebases across multiple domains, in AI and beyond.