Tagged "model-quantisation"
- ByteShape Releases Qwen 3.5 9B Quantisations with Hardware-Matched Tuning Guide
- Google's TurboQuant Shows Memory Constraints Remain Critical for Local LLM Inference
- Qwen3 512k Context via TurboQuant on Mac mini
- RotorQuant: 10-19x Faster Quantisation Alternative Using Clifford Algebra
- Google TurboQuant: Extreme Compression for Local LLM Deployment
- Mistral Small 4 119B Released with NVFP4 Quantisation Support
- Alibaba Releases Qwen 3.5 AI Model with On-Device AI Support
- Qwen 3.5-35B Unsloth Dynamic GGUFs Achieve SOTA Quantisation Benchmarks
- Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation