Tagged "local-llm-deployment"
- 75% of US Health Systems Are Using AI. Only 18% of That Deployment Is Governed
- Critical Security Flaw: Hackers Can Exploit Ollama Model Uploads to Leak Sensitive Server Data
- Seed3D 2.0
- How to Make Sense of AI
- Intel OpenVINO 2026.1 Integrates llama.cpp with Wildcat Lake and Arc Pro B70
- Developer Replaced GPT-4 with a Local SLM and CI/CD Pipeline Stability Improved
- Llama.cpp's Auto Fit Feature Quietly Reshapes Local AI Inference on Consumer Hardware
- Google's Gemma 4 Finally Makes Local LLM Deployment Compelling for Practitioners
- Malicious GGUF Models Could Trigger Remote Code Execution on SGLang Servers
- Intel Extends AI PC Reach With New Core Ultra Series 3 Launch
- Running DeepSeek R1 Locally: Your Complete Setup Guide
- The AI-Ready Product Data Framework for B2B Commerce
- AI Quota Inflation Is No Token Effort. It's Baked In
- Minisforum Launches N5 Max AI NAS with OpenClaw
- Local AI Isn't Just Ollama—Here's the Ecosystem That Actually Makes It Useful
- Gemma 4 Just Replaced My Whole Local LLM Stack
- We Built a Local Model Arena in 30 Minutes — Infrastructure Mattered More Than the App
- Laimark – 8B LLM That Self-Improves on Consumer GPUs
- Project Glasswing and the ASF: Open-Source's Chance to Win the AI Era
- Book Translator: Two-Pass Local Translation with Self-Reflection via Ollama
- Self-Hosted LLMs Transform Personal Knowledge Management Systems
- Minisforum N5 MAX AI NAS Delivers 126 TOPS with 200TB Storage for Local LLM Workloads
- Developer Shares Golden Stack for Local Coding Assistant Integration Directly Inside Code Editors
- Copilot Rate-Limiting Issues Highlight Cloud AI Service Limitations
- Show HN: SkillCompass – Open-Source Quality Evaluator for Your AI Skills
- Defender – Local Prompt Injection Detection for AI Agents
- ASUS Malaysia to Bring UGen300 USB AI Accelerator in Q2 for Portable On-Device AI Inferencing
- Universal Knowledge Store and Grounding Layer for AI Reasoning Engines
- The Best Local AI Model for Home Assistant Isn't Always the Biggest One
- Google's Gemini Nano 4 Offers Faster, Smarter Local Inference Capabilities
- Ollama's Limitations for Production Local LLM Deployments
- LLM Wiki v2: Extended Knowledge Base for LLM Practitioners
- 5 Open-Source Projects Running Transformers on CPUs to GPUs in Pure Java
- Speculative Decoding Made My Local LLM Actually Usable
- Run Qwen3.5 on an Old Laptop: A Lightweight Local Agentic AI Setup Guide
- Ask HN: Local-First Meetings Recorder and Transcriber
- LiteLLM Integrates with Ollama to Simplify Running 100+ Models Locally
- Quantization Strategy Comparison: Balancing Quality and Speed on Consumer Laptops
- Qwen 3.6 Free Model Available via OpenRouter
- Google Previews Gemini Nano 4 for Android AICore with On-Device Capabilities
- Gemma 4 26B MoE Emerges as Optimal All-Around Local Model for Consumer Hardware
- Samsung Launches Galaxy Book6 Series with NVIDIA RTX 5070 and On-Device AI
- NVIDIA and Google Optimize Gemma 4 AI Models for Local RTX Deployment
- GPUs vs. TPUs: Decoding the Powerhouses of AI
- Gemma 4 KV Cache Memory Issues Fixed in llama.cpp
- 5 Useful Docker Containers for Agentic Developers
- Gemma 4 Makes Local AI Agents Practical
- How to Integrate VS Code with Ollama for Local AI Assistance
- Qwen 3.6-Plus Released
- Show HN: Memsearch – Persistent, Cross-Agent, Cross-Session Memory for AI Agents
- Lotte Innovate and DeepX Collaborate on Mass Production of Domestic AI Semiconductors
- git11 Is an AI Workspace for GitHub Engineering Teams
- Satcove – Query 5 AI Models Simultaneously and Get Structured Verdicts
- If Your AI Agent Ran NPM Install During the Axios Attack, You're Compromised
- Local AI Ecosystem Extends Far Beyond Ollama
- Intel's Arc GPU Offers 32GB VRAM for Local AI, But Software Ecosystem Lags Behind
- GPU Passthrough to LXCs in Proxmox Simplifies Local Inference Infrastructure
- ByteShape Releases Qwen 3.5 9B Quantisations with Hardware-Matched Tuning Guide
- PrismML Announces 1-Bit Bonsai: First Commercially Viable 1-Bit LLMs
- I built an O(1) physics engine to stop LLM hallucinations in construction
- Closed Source AI = Neofeudalism
- Select the Right Hardware for Your Local LLM Deployment with This Online Guide
- Dell Technologies Unveils 10 AI PC Models for Business, from Ultralight Laptops to Ultracompact Desktops
- DeepSeek-R1 Chain-of-Thought Debugging: A Developer's Guide
- Google's TurboQuant Shows Memory Constraints Remain Critical for Local LLM Inference
- Samsung Galaxy Book6 Brings Consumer-Grade On-Device AI Hardware to Market
- Samsung Galaxy Book6 Series Brings Intel Core Ultra Chips for On-Device LLM Inference
- Prompt Security Challenges Emerge as Critical Concern for Local LLM Deployments
- Introduction to Nyreth v1.0
- M5 Max Delivers 1.7x Faster Inference Than M3 Max on Qwen 3.5 Models
- GPU Passthrough to LXCs in Proxmox Simplifies Local LLM Deployment
- Acer TravelMate AI Laptops Launch in UAE for Business On-Device Inference
- This Self-Hosted Tool Makes My Local LLMs Feel Exactly Like ChatGPT, but Nothing Leaves My Network
- RotorQuant: 10-19x Faster Quantisation Alternative Using Clifford Algebra
- mlx-Code: Run Claude Code Locally with MLX-LM
- Homelab Consolidation: Replacing 3 Models with Single 122B MoE Model on AMD Ryzen AI MAX+
- Book on AI Agents for the Layman: Understanding Agent-Based Systems
- Google's TurboQuant: The Unsexy AI Breakthrough Worth Watching