Tagged "release"
-
Velr: Embedded Property-Graph Database for Local LLM Applications
-
Self-Hostable AI Agents and Internal Software Framework Released
-
Qt 6.11 Released with Enhanced Cross-Platform Deployment Capabilities
-
MiniMax M2.7 Model to Be Released as Open Weights
-
LM Studio Releases Reworked Plugins with Fully Local Web Research
-
Qwen 3.5 122B Uncensored (Aggressive) Released with New K_P Quantisations
-
Nvidia Nemotron Cascade 2 30B Emerges as Powerful Alternative to Qwen Models
-
BrowserOS 0.44.0 Release: Advances in Local AI Integration for Web-Based Applications
-
Pydantic-Deep: Production Deep Agents for Pydantic AI
-
Atuin v18.13 – Better Search, a PTY Proxy, and AI for Your Shell
-
NVIDIA Nemotron Cascade 2 30B Delivers 120B-Class Performance in Compact Form Factor
-
NVIDIA Nemotron 3 Nano 4B Enables On-Device Inference Directly in Web Browsers via WebGPU
-
Llamafile 0.10 Released with GPU Support and Rebuilt Core
-
Cybersecurity Skills for AI Agents – agentskills.io Standard Implementation
-
ASUS ExpertCenter PN55 Mini PC Combines AMD AI CPU and 55 TOPS NPU
-
Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
-
Multiverse Computing Targets On-Device AI With Compressed Models and New API Portal
-
Dell Pro Max 16 Plus Launches With Enterprise-Grade Discrete NPU for On-Device AI
-
Tether's QVAC Introduces Cross-Platform Bitnet LoRA Framework for On-Device AI Training
-
Unsloth Studio: Open-Source Web UI for Training and Running LLMs Locally
-
On-Device AI: Tether's QVAC Fabric Enables Local Training
-
MiniMax-M2.7: New Compact Model Announced for Local Deployment
-
Mamba 3: State Space Model Architecture Optimized for Inference
-
Hugging Face Releases One-Liner for Automatic Hardware Detection and Model Selection
-
Mistral Small 4 119B Released with NVFP4 Quantisation Support
-
Mistral Releases Small 4 Open-Source Model Under Apache 2.0
-
Mistral Releases Leanstral: First Open-Source Code Agent for Lean 4 Proof Assistant
-
Kimi Introduces Attention Residuals: 1.25x Compute Performance at <2% Overhead
-
OmniCoder-9B: Efficient Coding Model for 8GB GPUs
-
NVIDIA Updates Nemotron 3 122B License, Removes Deployment Restrictions
-
StepFun Releases SFT Dataset Used to Train Step 3.5 Flash for Community Fine-Tuning
-
Nvidia's Nemotron 3 Super: Understanding the Significance for Local LLM Deployment
-
Cicikus v3 Prometheus 4.4B – An Experimental Franken-Merge for Edge Reasoning
-
AMD Launches Agent System Optimized for Local AI Inference With Ryzen and Radeon
-
Intel OpenVINO Backend Support Now Available in llama.cpp
-
Lemonade v10 Brings Linux NPU Support and Multi-Modal Capabilities
-
Intel Updates LLM-Scaler-vLLM With Support For More Qwen3/3.5 Models
-
Sarvam Open-Sources 30B and 105B Reasoning Models
-
Qwodel – An Open-Source Unified Pipeline for LLM Quantization
-
Nvidia Releases Nemotron 3 Super: 120B MoE Model for Local Deployment
-
Llama.cpp Adds True Reasoning Budget Support
-
Texas Instruments Launches NPU-Powered MCUs for Low-Power Edge AI
-
SK Hynix Completes Qualification for LPDDR6 Memory Optimized for AI Inference
-
Qwen 3.5-35B Uncensored GGUF Models Now Available
-
Kali Linux Integrates Local Ollama and MCP for AI-Driven Penetration Testing
-
SK Hynix Develops 1c LPDDR6 DRAM to Boost On-Device AI Performance in Mobile Devices
-
Gloss: Open-Source, Local-First RAG Alternative to NotebookLM Built in Rust
-
FreeBSD 14.4 Released: Implications for Local LLM Deployment
-
Fish Audio Open-Sources S2: Expressive Text-to-Speech with Natural Language Control and 100ms Latency
-
Qwen 3.5 Small Expands On-Device AI to Phones and IoT with Offline Support
-
Qwen 3.5 Derestricted Model Available for Local Deployment
-
Engram – Open-Source Persistent Memory for AI Agents
-
Snapdragon Wear Elite Unveiled at MWC 2026, Advancing Wearable AI Inference
-
HP Refreshes Lineup with AI-Focused Workstations
-
Apple Launches MacBook Neo with A18 Pro Chip for Affordable Local AI Inference
-
Sarvam AI Releases 30B and 105B Open-Source Models Trained from Scratch
-
Open WebUI Adds Native Terminal Tool Calling with Qwen3.5 35B Support
-
Llama.cpp Merges Automatic Parser Generator to Mainline
-
Jse v2.0 AI Output Specification
-
IBM Granite 4.0 1B Speech Model Released for Multilingual Speech Recognition
-
Alibaba Releases Qwen 3.5 AI Model with On-Device AI Support
-
Final Qwen3.5 Unsloth GGUF Update with Improved Size/Quality Tradeoffs
-
Building PyTorch-Native Support for IBM Spyre Accelerator
-
llama.cpp Merges Agentic Loop and MCP Client Support
-
Kakao Launches Kanana AI for On-Device Schedule and Recommendation Management
-
Apple Unveils MacBook Pro with M5 Pro and M5 Max Featuring On-Device AI
-
RunAnywhere Launches Production-Grade On-Device AI Platform for Enterprise Scale
-
Qualcomm Snapdragon Wear Elite Brings On-Device AI to Smartwatches
-
OpenWrt 25.12.0 – Stable Release
-
Apple M5 Pro and M5 Max: 4× Faster LLM Processing
-
AMD Launches Copilot+ Desktop Chips to Compete in On-Device AI Market
-
ÆTHERYA Core – Deterministic Policy Engine for Governing LLM Actions
-
Qwen 3.5 Small Models Released: 0.8B to 9B Parameters Optimized for On-Device Inference
-
Qualcomm Snapdragon Wear Elite: 2B Parameter NPU for Personal AI Wearables
-
Apple M4 iPad Air Targets AI Users with Double M1 Speed Performance
-
AMD Ryzen AI 400 Series Desktop Processors Launch with Integrated 60 TOPS NPU
-
Alibaba's Qwen 3.5 Small Model Runs Directly on iPhone 17
-
Qualcomm Launches Snapdragon Wear Elite for On-Device AI on Wearables
-
Jan Releases Code-Tuned 4B Model for Efficient Local Code Generation and Development Tasks
-
GitDelivr: A Free CDN for Git Clones Built on Cloudflare Workers and R2
-
AMD Expands Ryzen AI 400 Series Portfolio for Consumer and Enterprise AI PC Options
-
Alibaba's Open-Source CoPaw AI Agent Now Compatible with MCP and ClawHub Skills
-
Qwen 3.5-35B-A3B Emerges as Efficient Daily Driver, Replacing 120B Models
-
ParseHive – AI-Powered Invoice Data Extraction for Windows and Mac
-
Huawei's SuperPoD Portfolio Creates New Option for Global Computing at MWC Barcelona 2026
-
DeepSeek V4 Multimodal Model Coming Next Week With Image and Video Generation
-
Unsloth Dynamic 2.0 GGUFs
-
Qwen3.5-35B Unsloth Dynamic GGUFs Achieve SOTA Across Nearly All Quantisation Levels
-
The ML.energy Leaderboard
-
LLmFit: One-Command Hardware-Aware Model Selection Across 497 Models and 133 Providers
-
Krasis: Hybrid CPU/GPU MoE Runtime Achieves 3,324 Tokens/Second Prefill on RTX 5080
-
Seco Launches Edge AI System-on-Module at Embedded World 2026
-
Snapdragon 8 Elite Gen 5 Powers Galaxy S26 Series With Enhanced On-Device AI
-
On-Device Function Calling in Google AI Edge Gallery
-
DeepSeek Releases DualPath: Addressing Storage Bandwidth Bottlenecks in Agentic Inference
-
DeepSeek Paper – DualPath: Breaking the Bandwidth Bottleneck in LLM Inference
-
Apple: Python bindings for access to the on-device Apple Intelligence model
-
Red Hat Launches AI Enterprise for Hybrid AI Deployments
-
Qwen3.5 Thinking Mode Can Be Disabled for Production Inference Optimization
-
Qwen3.5 Series Releases Comprehensive Model Lineup Across All Tiers
-
Qwen3.5-35B-A3B Emerges as Game-Changer for Agentic Coding Tasks
-
Meta's OpenClaw Release Raises Questions About Open-Source Model Safety and Alignment
-
Kioxia Sampling UFS 5.0 Embedded Flash Memory for Next-Generation Mobile Applications
-
Elastic Introduces Best-in-Class Embedding Models for High Performance Semantic Search
-
Making Wolfram Technology Available as Foundation Tool for LLM Systems
-
Ouro 2.6B Thinking Model GGUFs Released with Q8_0 and Q4_K_M Quantization
-
Ollama 0.17 Released With Improved OpenClaw Onboarding
-
Google Open-Sources NPU IP, Synaptics Implements It for Hardware Acceleration
-
DietPi v10.1 Released
-
Asus ExpertBook B3 G2 with 50 TOPS AI Sets New Enterprise Standard
-
Vellium v0.3.5: Major Writing Mode Overhaul and Native KoboldCpp Support
-
[Release] Ouro-2.6B-Thinking: ByteDance's Recurrent Model Now Runnable Locally
-
Claude Code Open – AI Coding Platform with Web IDE and Agents
-
SanityBoard Adds 27 New Model Evaluations Including Qwen 3.5 Plus, GLM 5, and Gemini 3.1 Pro
-
PaddleOCR-VL Now Integrated into llama.cpp for Multilingual OCR
-
NVIDIA Releases Dynamo v0.9.0: Infrastructure Overhaul With FlashIndexer and Multi-Modal Support
-
Kitten TTS V0.8 Released: New State-of-the-Art Super-Tiny TTS Model Under 25 MB
-
Free ASIC-Accelerated Llama 3.1 8B Inference at 16,000 Tokens/Second
-
LayerScale Launches Inference Engine Faster Than vLLM, SGLang, and TRT-LLM
-
Aegis.rs: Open Source Rust-Based LLM Security Proxy Released
-
Tailscale Releases New Tool to Prevent Sensitive Data Leakage to Cloud AI Services
-
Sarvam AI Launches Edge Model to Challenge Major AI Players with Local-First Approach
-
Alibaba's Qwen3.5-397B Achieves #3 Position in Open Weights Model Rankings
-
OpenClaw Refactored in Go, Runs on $10 Hardware
-
GLM-5 Technical Report: DSA Innovation Reduces Training and Inference Costs
-
Cloudflare Releases Agents SDK v0.5.0 with Rust-Powered Infire Engine for Edge Inference
-
AMD Announces Day 0 Support for Qwen 3.5 LLM on Instinct GPUs
-
Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation
-
Cohere Releases Tiny Aya: Efficient 3.3B Multilingual Model for 70+ Languages
-
ASUS Zenbook 14 Launches in India with AI-Capable Hardware, Starting at Rs 1,15,990
-
Asus ExpertBook B3 G2 Laptop Features Ryzen AI 9 HX 470 CPU in 1.41kg Ultraportable Form Factor
-
InitRunner: YAML-Based AI Agent Framework with RAG and Memory
-
GPU-Accelerated DataFrame Library for Local Inference Workloads
-
Alibaba Unveils Major AI Model Upgrade Ahead of DeepSeek Release
-
NVIDIA's Dynamic Memory Sparsification Cuts LLM Inference Costs by 8x
-
MiniMax Releases M2.5 Model with SOTA Coding and Agent Capabilities
-
MiniMax-M2.5 230B MoE Model Released with GGUF Support for Local Deployment
-
LLaDA2.1 Introduces Token Editing for Massive Speed Gains in Local Inference
-
GPT-OSS 20B Now Runs 100% Locally in Browser via WebGPU
-
GPT-OSS 120B Uncensored Model Released in Native MXFP4 Precision
-
GNOME's AI Assistant Newelle Adds llama.cpp Support and Command Execution
-
ByteDance Releases Seed2.0 LLM with Complex Real-World Task Improvements
-
WinClaw: Windows-Native AI Assistant with Office Automation
-
Ring-1T-2.5 Released with SOTA Deep Thinking Performance
-
GitHub Announces Support for Open Source AI Project Maintainers
-
MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace
-
Ming-flash-omni-2.0: 100B MoE Omni-Modal Model Released
-
Student Releases Dhi-5B: Multimodal Model Trained for Just $1,200
-
ByteDance Releases Seedance 2.0 AI Development Platform
-
Samsung's REAM: Alternative Model Compression Technique
-
OpenClaw with vLLM Running for Free on AMD Developer Cloud
-
Microsoft MarkItDown: Document Preprocessing Tool for LLMs
-
Memio Launches AI-Powered Knowledge Hub for Android with Local Processing
-
New Header-Only C++ Benchmark Tool for Predictive Models on Raw Binary Streams
-
GLM-5 Released: 744B Parameter MoE Model Targeting Complex Tasks
-
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
-
Godot MCP Gives AI Assistants Full Access to Game Engine Editor
-
DeepSeek Launches Model Update with 1M Context Window
-
Arm SME2 Technology Expands CPU Capabilities for On-Device AI