Tagged "production-deployment"
- I built Rubric, an open-source Sentry for AI. Looking for beta testers
- Qwen 3.5 Models: Optimal Settings and Reduced Overthinking Configuration
- LM Studio Releases Reworked Plugins with Fully Local Web Research
- How to Build a Self-Hosted AI Server with LM Studio: Step-by-Step Guide
- Nvidia Nemotron Cascade 2 30B Emerges as Powerful Alternative to Qwen Models
- Qwen 3.5 Emerges as Top Performer for Local Deployment with Extensive Quantization Options
- Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
- LucidShark – Local-first, open-source quality and security gate
- Auto-retry Claude Code on subscription rate limits (zero deps, tmux-based)
- Open-Source LLMs Rapidly Displacing Proprietary SOTA Models
- NVIDIA Updates Nemotron 3 122B License, Removes Deployment Restrictions
- Nvidia's Nemotron 3 Super: Understanding the Significance for Local LLM Deployment
- Nvidia Pushes Jetson as Edge Hub for Open AI Models
- MeepaChat – Slack for AI Agents (iOS, macOS, Web / Cloud, Self-Hosted)
- Show HN: Detect When an LLM Silently Changes Behavior for the Same Prompt
- Ex-Manus Backend Lead Shares: Moving Beyond Function Calling in Agent Design
- Qwen 3.5-35B Uncensored GGUF Models Now Available
- NVIDIA Jetson Brings Open Models to Life at the Edge
- Gyro-Claw – Secure Execution Runtime for AI Agents
- OpenSpec: Spec-driven development (SDD) for AI coding assistants
- Continuum – CI Drift Guard for LLM Workflows
- AgentLens – Open-Source Observability for AI Agents
- Qwen3.5-35B Unsloth Dynamic GGUFs Achieve SOTA Across Nearly All Quantisation Levels
- Accuracy vs. Speed in Local LLMs: Finding Your Sweet Spot
- Show HN: MCP Server for AI Compliance Documentation
- The Complete Developer's Guide to Running LLMs Locally: From Ollama to Production
- Elastic Introduces Best-in-Class Embedding Models for High Performance Semantic Search
- Enterprise Infrastructure Guide: Running Local LLMs for 70-150 Developers
- South Korea to Launch $687 Million Project to Develop On-Device AI Semiconductors
- Qwen3-Code-Next Proves Practical for Local Development: Real-World Coding Tasks on Mac Studio
- Open-Source llama.cpp Finds Long-Term Home at Hugging Face
- The Complete Stack for Local Autonomous Agents: From GGML to Orchestration
- Ollama 0.17 Released With Improved OpenClaw Onboarding
- 24 Simultaneous Claude Code Agents on Local Hardware
- Qwen3 Coder Next 8FP Demonstrates Exceptional Long-Context Performance on 128GB System
- Ollama Production Deployment: Docker-Compose Setup Guide
- NVIDIA Releases Dynamo v0.9.0: Infrastructure Overhaul With FlashIndexer and Multi-Modal Support
- Show HN: Forked – A Local Time-Travel Debugger for OpenClaw Agents
- Self-Hosted AI: A Complete Roadmap for Beginners
- I broke into my own AI system in 10 minutes. I built it
- Researchers Find 175,000 Publicly Exposed Ollama AI Servers Across 130 Countries