Tagged "hugging-face"

PLLuM: Poland's Ministry of Digital Affairs Releases Open Models on HuggingFace 22 May 2026
An Update on GitHub Availability: Infrastructure Lessons for Hosted LLM Tools 28 April 2026
MiniMax M2.7 GGUF Investigation Reveals NaN Issues Affecting 21-38% of Hugging Face Conversions 15 April 2026
DGX Spark Setup Guide: Running vLLM and PyTorch for Local LLM Inference Backend 15 April 2026
DFlash Doubles Token Generation Speed of Qwen3.5 27B on Mac M5 Max 15 April 2026
MiniMax M2.7 Achieves SOTA Performance Under 64GB on Mac with TQ Quantization 14 April 2026
On-Device AI Inference Emerges as New Security Blind Spot for CISOs 13 April 2026
Unsloth Completes Comprehensive MiniMax M2.7 GGUF Quantization Suite 12 April 2026
MiniMax M2.7 Released: New Model Available for Local Deployment 12 April 2026
VoxCPM2: New Open-Source TTS Model with Voice Cloning and Design 9 April 2026
Hugging Face Moves Safetensors Under PyTorch Foundation 9 April 2026
Ollama is Still the Easiest Way to Start Local LLMs, But It's the Worst Way to Keep Running Them 9 April 2026
Netflix Open-Sources VOID Model for Video Object Deletion 4 April 2026
ByteShape Releases Qwen 3.5 9B Quantisations with Hardware-Matched Tuning Guide 1 April 2026
PrismML Announces 1-Bit Bonsai: First Commercially Viable 1-Bit LLMs 1 April 2026
Mistral AI Releases Voxtral: Open-Source TTS Model Beating ElevenLabs on Local Hardware 27 March 2026
Liquid AI's LFM2-24B Achieves 50 Tokens/Second in Web Browser via WebGPU 26 March 2026
Hugging Face Releases One-Liner for Automatic Hardware Detection and Model Selection 18 March 2026
Mistral Small 4 119B Released with NVFP4 Quantisation Support 17 March 2026
OmniCoder-9B: Efficient Coding Model for 8GB GPUs 16 March 2026
Cicikus v3 Prometheus 4.4B – An Experimental Franken-Merge for Edge Reasoning 15 March 2026
Qwen 3.5 Derestricted Model Available for Local Deployment 9 March 2026
Sarvam AI Releases 30B and 105B Open-Source Models Trained from Scratch 7 March 2026
Local LLM Performance Improvements: A Year of Progress Since DeepSeek R1 Moment 2 March 2026
Open-Source llama.cpp Finds Long-Term Home at Hugging Face 23 February 2026
O-TITANS: Orthogonal LoRA Framework for Gemma 3 with Google TITANS Memory Architecture 22 February 2026
GGML Joins Hugging Face: What This Means for Local Model Optimization 22 February 2026
Open-Source + AI: ggml Joins Hugging Face, llama.cpp Stays Open—Local AI's Long-Term Home 21 February 2026
GGML.AI Acquired by Hugging Face 21 February 2026
Matmul-Free Language Model Trained on CPU in 1.2 Hours 18 February 2026
Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation 17 February 2026
GPT-OSS 20B Now Runs 100% Locally in Browser via WebGPU 14 February 2026
MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace 13 February 2026