Tagged "inference-performance"

Nvidia Enters Windows Laptop Market, Taking on Intel and AMD (1 June 2026)
Microsoft and Nvidia to Unveil First Windows PCs with Nvidia CPUs and AI Capabilities (31 May 2026)
Tweaking Local Language Model Settings with Ollama (29 May 2026)
Qualcomm's AI-Device Strategy Reflects Growing Market Momentum in On-Device Intelligence (24 May 2026)
Benchmarking a Portable AI Workstation: Lenovo ThinkPad P16 Gen 3, Part 2 (21 May 2026)
Hipfire: A Rust-Native AMD Inference Engine That Outperforms llama.cpp (28 April 2026)
Elastic KV Cache Memory Breakthrough Enables Efficient Bursty LLM Serving and GPU Sharing (26 April 2026)
Users Report Significant Performance Improvements After Migrating from Ollama to llama.cpp (12 April 2026)
AMD Announces Day 0 Support for Google Gemma 4 Across Processors and GPUs (7 April 2026)
Unpaved: Audit Toolkit for AI Developer Tool Bias in Global South Contexts (5 April 2026)
Mistral Small 4 119B Released with NVFP4 Quantisation Support (17 March 2026)
Comprehensive MoE Backend Benchmarks for Qwen3.5-397B: Real Numbers vs Hype (12 March 2026)
HP OMEN MAX 16 Review: Is Local AI on a Laptop Viable in 2026? (10 March 2026)
HP Refreshes Lineup with AI-Focused Workstations (8 March 2026)
HP ZBook Ultra 14 G1a Workstation Reclaims Local AI Workflows for Professionals (2 March 2026)
DeepSeek Paper – DualPath: Breaking the Bandwidth Bottleneck in LLM Inference (26 February 2026)
A Tool to Tell You What LLMs Can Run on Your Machine (23 February 2026)
Switching From Ollama and LM Studio to llama.cpp: Performance Benefits (13 February 2026)