Tagged "local-inference"
- Show HN: Pluckr – LLM-Powered HTML Scraper That Caches Selectors and Auto-Heals
- How AI Is Redefining Price and Performance in Modern Laptops
- Mirai Tech Raises $10 Million for On-Device AI Innovation
- Faster Interface Speeds Enable High-Performance On-Device AI Features in Smartphones
- Anthropic Has Never Open-Sourced an LLM: Implications for Local Deployment Strategy
- Qwen3 Demonstrates Advanced Voice Cloning via Embeddings
- Open-Source Framework Achieves Gemini 3 Deep Think Level Performance Through Local Model Scaffolding
- Local GPT-OSS 20B Model Demonstrates Practical Agentic Capabilities
- Open-Source llama.cpp Finds Long-Term Home at Hugging Face
- Yet Another Fix Coming for Older AMD GPUs on Linux – Thanks to Valve Developer
- Show HN: Horizon – My AI-Powered Personal News Aggregator and Summarizer
- GGML Joins Hugging Face: What This Means for Local Model Optimization
- AI PCs Explained: 7 Critical Truths About NPUs and Privacy
- [Release] Ouro-2.6B-Thinking: ByteDance's Recurrent Model Now Runnable Locally
- Apple Researchers Develop On-Device AI Agent That Interacts With Apps for You
- Kitten TTS V0.8 Released: State-of-the-Art Super-Tiny Text-to-Speech Model Under 25MB
- Tailscale Releases New Tool to Prevent Sensitive Data Leakage to Cloud AI Services
- GLM-5 Technical Report: DSA Innovation Reduces Training and Inference Costs
- Can We Leverage AI/LLMs for Self-Learning?
- Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
- Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation
- Show HN: PgCortex – AI enrichment per Postgres row, zero transaction blocking
- High Bandwidth Flash Memory Could Alleviate VRAM Constraints in Local LLM Inference
- Asus ExpertBook B3 G2 Laptop Features Ryzen AI 9 HX 470 CPU in 1.41kg Ultraportable Form Factor
- GPU-Accelerated DataFrame Library for Local Inference Workloads
- Alibaba Unveils Major AI Model Upgrade Ahead of DeepSeek Release
- First Vibe-Coded AI Operating System for Local Deployment
- Optimal llama.cpp Settings Found for Qwen3 Coder Next Looping Issues
- Ming-flash-omni-2.0: 100B MoE Omni-Modal Model Released
- Running Mistral-7B on Intel NPU Achieves 12.6 Tokens/Second