Tagged "resource-optimization"
- Hipfire: A Rust-Native AMD Inference Engine That Outperforms llama.cpp
- I Replaced My Local LLM With a Model Half Its Size and Got Better Results
- OpenNebula 7.2 "Dark Horse" Released with Enhanced Infrastructure Support
- Ollama is Still the Easiest Way to Start Local LLMs, But It's the Worst Way to Keep Running Them
- Google's Gemma 4 Brings Powerful On-Device AI to Android and iOS
- Bonsai 1-Bit Models Deliver Exceptional Local Inference Performance
- Qwen 3.5-27B Demonstrates Superior Performance vs Gemini 3.1 Pro and GPT-5.3
- GPU Passthrough to LXCs in Proxmox Simplifies Local Inference Infrastructure
- Local AI Ecosystem Extends Far Beyond Ollama
- NVIDIA Releases GPT-OSS-Puzzle-88B, a Deployment-Optimized Model
- Qwen 3.5 Models: Optimal Settings and Reduced Overthinking Configuration
- LMCache Dramatically Accelerates LLM Inference on Oracle Data Science Platform
- Custom GPU Multiplexer Achieves 0.3ms Model Switching on Legacy Hardware
- Kimi Introduces Attention Residuals: 1.25x Compute Performance at <2% Overhead
- FreeBSD 14.4 Released: Implications for Local LLM Deployment
- Fine-Tuned Qwen SLMs (0.6–8B) Demonstrate Competitive Performance Against Frontier LLMs on Specialized Tasks
- Snapdragon Wear Elite Unveiled at MWC 2026, Advancing Wearable AI Inference
- SynthesisOS – A Local-First, Agentic Desktop Layer Built in Rust
- RunAnywhere Launches Production-Grade On-Device AI Platform for Enterprise Scale
- Qwen 3.5-27B Q4 Quantization Comparison and Analysis
- The ML.energy Leaderboard
- DeepSeek Paper – DualPath: Breaking the Bandwidth Bottleneck in LLM Inference
- Show HN: A Ground Up TLS 1.3 Client Written in C
- O-TITANS: Orthogonal LoRA Framework for Gemma 3 with Google TITANS Memory Architecture
- At India AI Impact Summit, Intel Showcases Its AI PCs and Cost-Efficient Frugal AI
- 24 Simultaneous Claude Code Agents on Local Hardware
- TemplateFlow – Build AI Workflows, Not Prompts
- Mirai Secures $10M to Optimize On-Device AI Amid Cloud Cost Surge
- Local-First RAG: Vector Search in SQLite with Hamming Distance
- Sarvam AI Launches Edge Model to Challenge Major AI Players with Local-First Approach
- OpenClaw Refactored in Go, Runs on $10 Hardware
- Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
- Switching From Ollama and LM Studio to llama.cpp: A Performance Comparison
- MiniMax Releases M2.5 Model with SOTA Coding and Agent Capabilities
- Switching From Ollama and LM Studio to llama.cpp: Performance Benefits
- Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
- Energy-Based Models Compared Against Frontier AI for Sudoku Solving