Tagged "production-deployment"
- I built Rubric, an open-source Sentry for AI. Looking for beta testers
- Qwen 3.5 Models: Optimal Settings and Reduced Overthinking Configuration
- LM Studio Releases Reworked Plugins with Fully Local Web Research
- How to Build a Self-Hosted AI Server with LM Studio: Step-by-Step Guide
- Nvidia Nemotron Cascade 2 30B Emerges as Powerful Alternative to Qwen Models
- Qwen 3.5 Emerges as Top Performer for Local Deployment with Extensive Quantization Options
- Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
- LucidShark – Local-first, open-source quality and security gate
- Auto-retry Claude Code on subscription rate limits (zero deps, tmux-based)
- Open-Source LLMs Rapidly Displacing Proprietary SOTA Models
- NVIDIA Updates Nemotron 3 122B License, Removes Deployment Restrictions
- Nvidia's Nemotron 3 Super: Understanding the Significance for Local LLM Deployment
- Nvidia Pushes Jetson as Edge Hub for Open AI Models
- MeepaChat – Slack for AI Agents (iOS, macOS, Web / Cloud, Self-Hosted)
- Show HN: Detect When an LLM Silently Changes Behavior for the Same Prompt
- Ex-Manus Backend Lead Shares: Moving Beyond Function Calling in Agent Design
- Qwen 3.5-35B Uncensored GGUF Models Now Available
- NVIDIA Jetson Brings Open Models to Life at the Edge
- Gyro-Claw – Secure Execution Runtime for AI Agents
- OpenSpec: Spec-driven development (SDD) for AI coding assistants
- Continuum – CI Drift Guard for LLM Workflows
- AgentLens – Open-Source Observability for AI Agents
- Qwen3.5-35B Unsloth Dynamic GGUFs Achieve SOTA Across Nearly All Quantisation Levels
- Accuracy vs. Speed in Local LLMs: Finding Your Sweet Spot
- Show HN: MCP Server for AI Compliance Documentation
- The Complete Developer's Guide to Running LLMs Locally: From Ollama to Production
- Elastic Introduces Best-in-Class Embedding Models for High Performance Semantic Search
- Enterprise Infrastructure Guide: Running Local LLMs for 70-150 Developers
- South Korea to Launch $687 Million Project to Develop On-Device AI Semiconductors
- Qwen3-Code-Next Proves Practical for Local Development: Real-World Coding Tasks on Mac Studio
- Open-Source llama.cpp Finds Long-Term Home at Hugging Face
- The Complete Stack for Local Autonomous Agents: From GGML to Orchestration
- Ollama 0.17 Released With Improved OpenClaw Onboarding
- 24 Simultaneous Claude Code Agents on Local Hardware
- Qwen3 Coder Next 8FP Demonstrates Exceptional Long-Context Performance on 128GB System
- Ollama Production Deployment: Docker-Compose Setup Guide
- NVIDIA Releases Dynamo v0.9.0: Infrastructure Overhaul With FlashIndexer and Multi-Modal Support
- Show HN: Forked – A Local Time-Travel Debugger for OpenClaw Agents
- Self-Hosted AI: A Complete Roadmap for Beginners
- I broke into my own AI system in 10 minutes. I built it
- Researchers Find 175,000 Publicly Exposed Ollama AI Servers Across 130 Countries