LocalFTW
Tagged "inference-pipeline-design"
Local LLMs Work Best When You're Not Loyal to Just One
2 May 2026
Prefill Is Compute-Bound, Decode Is Memory-Bound: Optimizing GPU Utilization for LLM Inference
16 April 2026