This talk explores building a complete self-hosted LLM stack in Rust: Paddler, a distributed load balancer for serving LLMs at scale, and Poet, a static site generator that consumes those LLMs for AI-powered content features.
We'll dive into the hard problems: async request routing across dynamic agent fleets, integrating with llama.cpp's C++ codebase, managing KV cache in custom slots, and implementing zero-to-N autoscaling with request buffering. You'll see how Rust's ownership model prevented entire classes of bugs in distributed state management, and walk away with concrete patterns for building and consuming LLM infrastructure in production.
This technical talk examines the most prevalent pain points facing Rust web developers today and explores how the community is addressing them.
In this talk, we’ll re-create the core ideas of Karpathy’s micrograd, but entirely in Rust.
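As a rough illustration of the core idea (reverse-mode automatic differentiation over scalars), here is a minimal sketch. It is not the talk's code: it swaps micrograd's graph of `Value` objects for a flat tape indexed by `usize`, which sidesteps shared ownership in Rust, and the names `Tape`, `leaf`, `add`, `mul`, `tanh`, and `backward` are all hypothetical.

```rust
/// One recorded operation: the indices of its (up to two) inputs and the
/// local derivative of the result with respect to each input.
struct Node {
    parents: [usize; 2],
    local_grads: [f64; 2],
}

/// A flat tape of scalar operations (an assumed design, not micrograd's own).
pub struct Tape {
    nodes: Vec<Node>,
    values: Vec<f64>,
}

impl Tape {
    pub fn new() -> Self {
        Tape { nodes: Vec::new(), values: Vec::new() }
    }

    fn push(&mut self, value: f64, parents: [usize; 2], local_grads: [f64; 2]) -> usize {
        self.nodes.push(Node { parents, local_grads });
        self.values.push(value);
        self.values.len() - 1
    }

    /// An input variable. It points at itself with zero local gradient.
    pub fn leaf(&mut self, value: f64) -> usize {
        let id = self.values.len();
        self.push(value, [id, id], [0.0, 0.0])
    }

    pub fn add(&mut self, a: usize, b: usize) -> usize {
        let v = self.values[a] + self.values[b];
        self.push(v, [a, b], [1.0, 1.0])
    }

    pub fn mul(&mut self, a: usize, b: usize) -> usize {
        let (va, vb) = (self.values[a], self.values[b]);
        self.push(va * vb, [a, b], [vb, va])
    }

    pub fn tanh(&mut self, a: usize) -> usize {
        let t = self.values[a].tanh();
        self.push(t, [a, a], [1.0 - t * t, 0.0])
    }

    pub fn value(&self, id: usize) -> f64 {
        self.values[id]
    }

    /// Returns d(output)/d(node) for every node on the tape.
    /// Nodes are recorded in topological order, so walking the tape
    /// backwards applies the chain rule in the right order.
    pub fn backward(&self, output: usize) -> Vec<f64> {
        let mut grads = vec![0.0; self.values.len()];
        grads[output] = 1.0;
        for i in (0..=output).rev() {
            let upstream = grads[i];
            let node = &self.nodes[i];
            for k in 0..2 {
                grads[node.parents[k]] += upstream * node.local_grads[k];
            }
        }
        grads
    }
}
```

With `x = 2` and `y = 3`, building `z = x * y + x` and calling `backward(z)` yields gradients `y + 1` for `x` and `x` for `y`, because the contribution through each path is accumulated with `+=`.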
In this session, we will delve into the sometimes murky world of procedural macros, showing some of the great tooling available for understanding the generated code, such as cargo expand, and the key building blocks we will need for writing our own.
During this talk, we'll build a basic, working async runtime using nothing more than the standard library. The point? To show that it's approachable for mere mortals.
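To give a flavor of how little is needed, here is a minimal sketch of a single-future executor built only on `std`. It is an illustration under assumptions, not the runtime from the talk: `block_on`, `ThreadWaker`, and `YieldOnce` are names chosen here, and a real runtime would also need task spawning, timers, and I/O.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// A waker that unparks the thread blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drives one future to completion on the current thread.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the stack so it can be polled.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Sleep until the waker fires. A spurious wakeup is harmless
            // because the loop simply polls again.
            Poll::Pending => thread::park(),
        }
    }
}

/// A hypothetical future that returns `Pending` once, then completes.
/// It exists only to exercise the wake-and-repoll path.
pub struct YieldOnce(pub bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            // Ask to be polled again right away.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

Calling `block_on(async { YieldOnce(false).await; 42 })` polls twice: the first poll returns `Pending` after scheduling a wakeup, and the second completes with `42`.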
In 2024, I added the `Option::as_slice` and `Option::as_mut_slice` methods to libcore. This talk covers what motivated the addition and looks at the no fewer than four different implementations that made up the methods. It also shows that, even without a deep understanding of all compiler internals, it is possible to contribute changes to both the compiler and the standard library.
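For readers who have not used these methods, the sketch below shows what they do. `as_slice_naive` is a hypothetical, straightforward equivalent written for comparison; it is not claimed to be any of the implementations discussed in the talk. `sum_all` and `increment` are likewise illustrative helpers.

```rust
/// A naive equivalent of `Option::as_slice` (illustrative only):
/// `Some(x)` becomes a one-element slice, `None` becomes an empty slice.
pub fn as_slice_naive<T>(opt: &Option<T>) -> &[T] {
    match opt {
        Some(value) => std::slice::from_ref(value),
        None => &[],
    }
}

/// Treating an `Option` as a slice lets slice-based code accept it directly.
pub fn sum_all(opt: &Option<u32>) -> u32 {
    opt.as_slice().iter().sum()
}

/// `as_mut_slice` gives mutable access to the contained value, if any.
pub fn increment(opt: &mut Option<u32>) {
    for value in opt.as_mut_slice() {
        *value += 1;
    }
}
```

Both standard methods are available on stable Rust, so `Some(7).as_slice()` yields a slice containing `7`, while `None::<u32>.as_slice()` yields an empty slice.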
In this talk, we'll explore the current state of AI development in Rust, highlighting key crates, frameworks, and tools, and covering the essentials from ML and NLP to integrating LLMs and agent-based automation.