Tagged "showcase"
-
Singapore's Foreign Minister Builds an AI "Second Brain" Using NanoClaw
-
Pluggable's TBT5-AI: First Thunderbolt Dock Explicitly Targeting Local LLM Workstations
-
Show HN: Phonetic Formatter – Offline English Text to IPA on iPhone and iPad
-
SiGit Code: Local-First Coding Agent
-
Rust Open-Source Headless Browser for AI Agents and Web Scraping
-
Run a Local LLM Server on Raspberry Pi with Remote Access Capabilities
-
Show HN: A Karpathy-Style LLM Wiki Your Agents Maintain
-
GPU Passthrough to LXCs in Proxmox Outperforms VMs and Simplifies Local AI Infrastructure
-
I Built a Local AI Stack With 5 Docker Containers, and Now I'll Never Pay for ChatGPT Again
-
Building Real-World On-Device AI with LiteRT and NPU
-
AI Agent Designs a RISC-V CPU Core from Scratch
-
Show HN: We built an OCR server that can process 270 dense images/s on a 5090
-
Cortex Auth – Rust secrets vault for AI agents (exec-based injection)
-
Tesseron: New API Framework for AI Agents with Developer-Defined Configuration
-
Sarvam Edge: India's Offline AI Model Runs on Phones and Laptops Without Internet
-
Developer Turns Phone Into Local LLM Server with Vision, Voice, and Tool Calling Capabilities
-
Cursor-Autoresearch: AI Research Automation Port for Local Workflows
-
ZeusHammer: Built an AI Agent That Thinks Locally
-
Complete Local Coding Assistant Stack Running Inside Your Editor
-
Waterloo's Live AI-Goose Tracker: Real-Time Edge Vision
-
PCMind: Local AI Analysis of Docs, Audio, Video and Images
-
Memjar: Uncompromising Local-First Second Brain
-
Llama.cpp Robot Wars
-
Kilo Is the VS Code Extension That Actually Works With Every Local LLM
-
Show HN: I Can't Write Python. It Works Anyway – Local LLM Automation
-
115 TOPS in 0.67L: CHUWI AuBox X Packs On-Device AI Power Into a Palm-Sized Mini PC
-
Build a More Secure, Always-On Local AI Agent with OpenClaw and NVIDIA NemoClaw
-
BibCrit – LLM Grounded in ETCBC Corpus Data for Biblical Textual Criticism
-
After Two Months of Open WebUI Updates, I'd Pick It Over ChatGPT's Interface for Local LLMs
-
Show HN: An MCP server that lets AI compose music on a hardware synth
-
Community Computer: Collaborative Autoresearch on a Peer-to-Peer Network
-
ChatMCP – Connect your AI browser chats to your coding agents
-
Building a Voice AI Wearable in a Casio F91W with Whisper and BLE
-
Open WebUI Emerges as Superior Interface for Local LLMs After Two Months of Active Development
-
n8n, Dify, and Ollama Emerge as Leading Self-Hosted AI Automation Stack
-
Book Translator: Two-Pass Local Translation with Self-Reflection via Ollama
-
Bonsai 1.7B in the Browser: A 290MB 1-bit LLM on WebGPU
-
Xiaomi 12 Pro Converted Into 24/7 Headless AI Server With Ollama and Gemma4
-
Slop-scan – Detect AI Code Slop Patterns in Your Repo
-
SigMap – Shrink AI Coding Context 97% with Auto-Scaling Token Budget
-
Self-Hosted LLMs Transform Personal Knowledge Management Systems
-
Noi Enables Running ChatGPT and Claude Side-by-Side on Your Desktop
-
Running Gemma 4 on an iPhone 13 Pro
-
GBrain – System to Make Your AI Agent Better Reflect You
-
DotLLM – Building an LLM Inference Engine in C#
-
DGX Spark Setup Guide: Running vLLM and PyTorch for Local LLM Inference Backend
-
DFlash Doubles Token Generation Speed of Qwen3.5 27B on Mac M5 Max
-
Ubiquiti UniFi G6 Turret 4K Camera Features On-Device AI Processing at $199 Price Point
-
Talking to a Local LLM in the Firefox Sidebar
-
Minisforum N5 MAX AI NAS Delivers 126 TOPS with 200TB Storage for Local LLM Workloads
-
MiniMax M2.7 Achieves SOTA Performance Under 64GB on Mac with TQ Quantization
-
Local LLM Connected to Home Assistant via MCP Now Enables Autonomous Smart Home Management
-
Developer Shares Golden Stack for Local Coding Assistant Integration Directly Inside Code Editors
-
Build a Sovereign Local AI Stack: Ollama and Open WebUI and Pgvector 2026
-
Show HN: SkillCompass – Open-Source Quality Evaluator for Your AI Skills
-
Self-Hosted LLM Took Personal Knowledge Management System to the Next Level
-
Defender – Local Prompt Injection Detection for AI Agents
-
Unsloth Completes Comprehensive MiniMax M2.7 GGUF Quantization Suite
-
Universal Knowledge Store and Grounding Layer for AI Reasoning Engines
-
Self-Hosted LLM Elevates Personal Knowledge Management Systems to New Levels
-
MiniMax M2.7 Advances Scalable Agentic Workflows on NVIDIA Platforms for Complex AI Applications
-
Google Gemma 4 Delivers Exceptional Speed and Accuracy for Local Inference
-
DFlash Speculative Decoding Achieves 3.3x Speedup on Apple Silicon
-
I Gave My AI Shell Access and Felt Uneasy – So I Sandboxed It
-
Parakeet Streaming ASR on Apple Silicon via CoreML
-
AIYO Wisper: Local Voice-to-Text for macOS Using WhisperKit
-
Self-Installing Skill Manager for AI Agents
-
Tether Launches QVAC SDK for Cross-Platform Local AI Development
-
5 Open-Source Projects Running Transformers from CPUs to GPUs in Pure Java
-
AI Scans 400k Reddit Posts to Flag Overlooked GLP-1 Side Effects
-
VoxCPM2: New Open-Source TTS Model with Voice Cloning and Design
-
Speculative Decoding Made My Local LLM Actually Usable
-
Running a 1.7B-Parameter LLM on an Apple Watch
-
I Replaced My Local LLM With a Model Half Its Size and Got Better Results — and It Wasn't About the Parameters
-
Gemini-CLI, Llama.cpp, and Qwen3.5 Running on NVIDIA Jetson TK1
-
Google AI Edge Gallery Showcases Offline Inference with Gemma 4
-
Google's Gemma 4 Brings Powerful On-Device AI to Android and iOS
-
Show HN: Willitrun – Check if Any ML Model Runs on Any Device (Benchmark-Backed)
-
StyleSeed – Design Rules That Make AI Coding Tools Produce Professional UI
-
Quansloth Using Google's Turboquant Breaks the VRAM Wall for Local LLMs
-
Octopoda: Open Source Memory Layer for Fully Offline AI Agents
-
MemPalace, the Highest-Scoring AI Memory System Ever Benchmarked
-
Gemma 4 26B Achieves Impressive Local Performance With Proper Configuration
-
CricketBrain: Neuromorphic Signal Processor in Rust (0.175us/step, 944 bytes)
-
METATRON: Open-Source AI Penetration Testing with Local LLMs
-
Show HN: Lightweight LLM Tracing Tool with CLI
-
HunyuanOCR 1B: High-Quality OCR Now Viable on Budget Consumer Hardware
-
Real-time Multimodal AI on Apple Silicon: Gemma E2B Demo Shows Practical Edge Deployment
-
Gemma 4 31B Achieves Exceptional Performance on Local Hardware
-
Show HN: Turn Photos Into Wordle Puzzles with AI That Runs 100% in Your Browser
-
Vektor – Local-First Associative Memory for AI Agents
-
Unpaved: Audit Toolkit for AI Developer Tool Bias in Global South Contexts
-
Satsgate: Monetize AI Agents and APIs with Lightning L402 Protocol
-
Qwen 3.5 397B Reduced to 35% Parameters With Usable Quality on 96GB GPU
-
GMKtec NucBox K17 Launches with 97 TOPS AI Performance for Local Inference
-
Gemma 4 26B MoE Emerges as Optimal All-Around Local Model for Consumer Hardware
-
Nex Life Logger: Local Activity Tracker with AI Agent Integration
-
Mixed Precision Quantization on MLX with TurboQuant Implementation
-
Kokoro TTS Achieves 20× Realtime Speed on CPU-Only On-Device Inference
-
Free AI Video Clipper Using Scene and Speech-Based Segmentation
-
Autonet: Decentralized AI Training with Constitutional Governance
-
SkillCompass – Diagnose and Improve AI Agent Skills Across 6 Dimensions
-
OpenUMA – Apple-Style Unified Memory for x86 AI Inference
-
Gemma 4 Shows Strong Reasoning Performance with Thinking Tokens
-
Gemma 4 2B Successfully Runs on Raspberry Pi 5
-
Apfel – The Free AI Already on Your Mac
-
TurboQuant Enables Qwen 3.5-27B on 16GB Consumer GPUs
-
SmolLM2-360M Running on Samsung Galaxy Watch 4 with 74% Memory Reduction
-
Show HN: Memsearch – Persistent, Cross-Agent, Cross-Session Memory for AI Agents
-
TinyGPU Adds Mac Support for External Nvidia GPU Acceleration
-
git11 Is an AI Workspace for GitHub Engineering Teams
-
Show HN: Extra-Platforms, Python Library to Detect OS, Arch, Shell, CI, AI
-
Satcove – Query 5 AI Models Simultaneously and Get Structured Verdicts
-
Qwen 3.5-27B Demonstrates Superior Performance vs Gemini 3.1 Pro and GPT-5.3
-
Claw64 – Full Agentic Loop in <4KB on Commodore 64
-
Orca – Executable skills and capabilities for AI agent workflows
-
I built an O(1) physics engine to stop LLM hallucinations in construction
-
Miasma: A Tool to Protect Data from AI Web Scrapers
-
Lat.md: Agent Lattice – A Knowledge Graph for Your Codebase in Markdown
-
IBM Granite 4.0 3B Vision: Compact Enterprise-Grade Document AI
-
DaVinci-MagiHuman: Open-Source AI Model for Realistic Video Generation
-
Qwen3 512k Context via TurboQuant on Mac mini
-
M5 Max Delivers 1.7x Faster Inference Than M3 Max on Qwen 3.5 Models
-
Reverse-Engineering the Apollo 11 Code with AI
-
This Wearable Runs an On-Device AI With 2-Week Battery Life
-
This Self-Hosted Tool Makes My Local LLMs Feel Exactly Like ChatGPT, but Nothing Leaves My Network
-
RotorQuant: 10-19x Faster Quantisation Alternative Using Clifford Algebra
-
Coding Implementation to Run Qwen3.5 Reasoning Models Distilled With Claude-Style Thinking Using GGUF and 4-Bit Quantization
-
mlx-Code: Run Claude Code Locally with MLX-LM
-
Homelab Consolidation: Replacing 3 Models with Single 122B MoE Model on AMD Ryzen AI MAX+
-
See What Your AI Agents Are Doing: Multi-Agent Observability Tool
-
RF-DETR Nano and YOLO26 Enable On-Device Object Detection on Smartphones
-
NVIDIA Releases GPT-OSS-Puzzle-88B, a Deployment-Optimized Model
-
MCP-Manticore: Let Your AI Assistant Write Manticore Queries for You
-
Show HN: Beforeyouship – Pre-Build Tool to Estimate LLM Cost
-
Liquid AI's LFM2-24B Achieves 50 Tokens/Second in Web Browser via WebGPU
-
Operating Systems. One USB. ZFS on Root. AI-Powered. Free
-
Running an Open-Weight LLM Locally on an Apple Watch
-
Show HN: Open Agent Spec – Treat AI Agents Like Typed Functions, Not Prompt Chains
-
OmniCoder v2 Released: Improved Code Generation for Local Deployment
-
Private Brain LLM Setup on Windows PC Eliminates Need for Paid Cloud Services
-
Researcher Successfully Runs Local LLMs on Legacy "Dead" GPU With Surprising Results
-
Council: A Structured Deliberation Protocol Across Diverse AI Models
-
Ultra-Large 400B-Class LLM Runs on iPhone in Test
-
Velr: Embedded Property-Graph Database for Local LLM Applications
-
Self-Hostable AI Agents and Internal Software Framework Released
-
Running a Private AI Brain on Windows PC as Alternative to Cloud Services
-
Claude Usage Monitor: Track API Usage with macOS Menu Bar App
-
Powerful AI Search Engine Built on Single GeForce RTX 5090
-
Developer Builds Fully Local Multi-Agent System Using vLLM and Parallel Inference
-
ik_llama.cpp Fork Delivers 26x Faster Prompt Processing on Qwen 3.5 27B
-
Careless Whisper – Personal Local Speech to Text
-
Brezn – Decentralized Local Communication
-
AI Playground for Developers Built in Vite and Python
-
Running an AI Agent on a 448KB RAM Microcontroller
-
MacinAI Local brings functional LLM inference to classic Macintosh hardware
-
Atuin v18.13 – Better Search, a PTY Proxy, and AI for Your Shell
-
SwarmHawk – Open-Source CLI for Vulnerability Scanning with AI Synthesis
-
Ultra-Compact 28M Parameter Models Show Promise for Specialized Domain Tasks
-
Qwen 3.5 Emerges as Top Performer for Local Deployment with Extensive Quantization Options
-
Llamafile 0.10 Released with GPU Support and Rebuilt Core
-
Claude Code Permissions Hook – Delegate Permission Approval to LLM
-
Unsloth Studio: Open-Source Web UI for Training and Running LLMs Locally
-
Skills Manager – manage AI agent skills across Claude, Cursor, Copilot
-
LucidShark – Local-first, open-source quality and security gate
-
Custom GPU Multiplexer Achieves 0.3ms Model Switching on Legacy Hardware
-
Auto-retry Claude Code on subscription rate limits (zero deps, tmux-based)
-
Browser-Based Transcription Tools
-
Show HN: Process Mining for AI Agent Systems
-
OpenJarvis: Local-First AI Agents That Run Entirely On-Device
-
Mistral Small 4 119B Released with NVFP4 Quantisation Support
-
Local Qwen Models Master Browser Automation Through Iterative Replanning
-
KAIST Develops World's First Hyper-Personalized On-Device AI Chip
-
OpenClaw Isn't the Only Raspberry Pi AI Tool—Here Are 4 Others You Can Try This Week
-
Qwen 3.5 122B Demonstrates Exceptional Reasoning for Local Deployment
-
OmniCoder-9B: Efficient Coding Model for 8GB GPUs
-
Show HN: Merrilin.ai – Code Blocks in Your Books, Finally
-
LoKI – Local AI Assistant for Linux and WSL
-
This External GPU Enclosure Tries to Break Cloud Dependence for Local AI Inference
-
Dictare – Open-source Voice Layer for AI Coding Agents (100% Local)
-
Show HN: Generate, Clean, and Prepare LLM Training Data, All-in-One
-
Custom AI Smart Speaker
-
VS Code Agent Kanban – Task Management for AI-Assisted Development
-
Nota AI to Showcase End-to-End On-Device AI Optimization at Embedded World 2026
-
Nemotron 9B Powers Large-Scale Local Inference: Patent Classification and Real-Time Applications
-
Gyro-Claw – Secure Execution Runtime for AI Agents
-
Engram – Open-Source Persistent Memory for AI Agents
-
commitgen-cc – Generate Conventional Commit Messages Locally with Ollama
-
VoiceShelf: Fully Offline Android Audiobook Reader Using Kokoro TTS
-
IBM Granite 4.0 1B Speech Model Released for Multilingual Speech Recognition
-
Qwen3.5 122B Achieves 25 tok/s on 72GB VRAM Setup
-
Researchers Develop Persistent Memory System for Local LLMs—No RAG Required
-
Show HN: Anonymize LLM traffic to dodge API fingerprinting and rate-limiting
-
Agent System – 7 specialized AI agents that plan, build, verify, and ship code
-
VaultAI – 42 AI Models on a Portable SSD, Works Offline for $399
-
TemplateFlow – Build AI Workflows, Not Prompts
-
SanityBoard Adds 27 New Model Evaluations Including Qwen 3.5 Plus, GLM 5, and Gemini 3.1 Pro
-
Qwen3 Coder Next 8FP Demonstrates Exceptional Long-Context Performance on 128GB System
-
I Stopped Paying for ChatGPT and Built a Private AI Setup That Anyone Can Run
-
Using Local LLMs With Self-Hosted Tools to Manage Documents in Paperless-ngx
-
Kitten TTS V0.8 Released: New State-of-the-Art Super-Tiny TTS Model Under 25 MB
-
Show HN: Forked – A Local Time-Travel Debugger for OpenClaw Agents
-
Self-Hosted Local LLMs for Document Management with Paperless-ngx
-
GPT-OSS 20B Now Runs 100% Locally in Browser via WebGPU
-
GNOME's AI Assistant Newelle Adds llama.cpp Support and Command Execution
-
Ring-1T-2.5 Released with SOTA Deep Thinking Performance
-
Godot MCP Gives AI Assistants Full Access to Game Engine Editor
-
Developer Creates Custom Local AI Headshot Generator After Commercial Solutions Fail