Tagged "model-quantisation"

ByteShape Releases Qwen 3.5 9B Quantisations with Hardware-Matched Tuning Guide 1 April 2026
Google's TurboQuant Shows Memory Constraints Remain Critical for Local LLM Inference 29 March 2026
Qwen3 512k Context via TurboQuant on Mac mini 28 March 2026
RotorQuant: 10-19x Faster Quantisation Alternative Using Clifford Algebra 27 March 2026
Google TurboQuant: Extreme Compression for Local LLM Deployment 25 March 2026
Mistral Small 4 119B Released with NVFP4 Quantisation Support 17 March 2026
Alibaba Releases Qwen 3.5 AI Model with On-Device AI Support 6 March 2026
Qwen 3.5-35B Unsloth Dynamic GGUFs Achieve SOTA Quantisation Benchmarks 28 February 2026
Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation 17 February 2026