Tagged "model-scaling"
- Qwen 3.5 397B emerges as top-performing local coding model
- Mamba 3: State Space Model Architecture Optimized for Inference
- Sarvam Open-Sources 30B and 105B Reasoning Models
- Running Local AI Models on Mac Studio 128GB: 4B, 20B & 120B Tested
- Qwen 3.5-27B Demonstrates Exceptional Performance with Thoughtful Prompt Engineering
- Krasis: Hybrid CPU/GPU MoE Runtime Achieves 3,324 Tokens/Second Prefill on RTX 5080
- GLM-5 Technical Report: DSA Innovation Reduces Training and Inference Costs
- Context Management Identified as Real Bottleneck in AI-Assisted Coding
- GLM-5 Released: 744B Parameter MoE Model Targeting Complex Tasks