Tagged "inference-efficiency"

Arm China Unveils "Tianxuan" CPU and Xingchen 300 Platform, Targeting Ubiquitous AIoT with On-Device AI Portfolio (22 July 2026)
Tencent Open-Sources Hy3 295B MoE Model Built for STEM Reasoning (9 July 2026)
Compressor V2: Three Compression Layers for 50% LLM Agent Cost Cut (6 July 2026)
Mistral AI Launches Mistral Vibe (28 May 2026)
The Brain vs. Deep Learning Part I: Computational Complexity Analysis (22 May 2026)
Running a Local LLM on a 12-Year-Old Raspberry Pi: Practical Edge Inference (12 May 2026)
Gemma 4 Replaces Entire Local LLM Stack for Many Practitioners (12 May 2026)
DistillFast: AI Cost Optimization Tool for Model Efficiency (10 May 2026)
Bun's Experimental Rust Rewrite Achieves 99.8% Test Compatibility on Linux (9 May 2026)
Google Releases Gemma 4 Multi-Token Prediction Drafters To Accelerate AI Inference (8 May 2026)
A 49-Line Physics Classifier That Beats kNN on 76% of Benchmarks (5 May 2026)
Xmemory: Benchmarking Structured AI Memory Against RAG and Hybrid RAG (1 May 2026)
Economic Implications of AI Adoption: Why Local Deployment Matters for Cost Control (28 April 2026)
Google's Gemma 4 Could Put Powerful AI on Your Phone and Laptop (27 April 2026)
CricketBrain: Neuromorphic Signal Processor in Rust (0.175us/step, 944 bytes) (7 April 2026)
TurboQuant: Understanding the Quantization Breakthrough (29 March 2026)
RotorQuant: 10-19x Faster Quantisation Alternative Using Clifford Algebra (27 March 2026)