LocalFTW
Why Local
All Posts
Guides
Contribute
Clinic
Topic Graph
Bookmarks
Tagged "decode-optimization"
Prefill Is Compute-Bound, Decode Is Memory-Bound: Optimizing GPU Utilization for LLM Inference
16 April 2026