Tagged "latency-reduction"
- Money Printer Pro – Open-source AI Content Generator
- Show HN: Interactive and Stylized AI Chat Chrome Extension
- Adobe Photoshop Update Brings On-Device AI Processing
- Open-Source Local LLM Emerges as Viable Cloud AI Competitor
- Chrome Automatically Downloads 4GB AI Model for Local Processing
- What If AI Systems Weren't Chatbots?
- I Built My Second Brain for Meetings. No Monthly Subscription
- Small On-Device AI Model Beats Claude Sonnet 4.5 and GPT-5
- Ask HN: Real life autonomous AI Agents
- Supercharging LLM Inference on Google TPUs: Achieving 3X Speedups With Diffusion-Style Speculative Decoding
- NordVPN Adds On-Device AI Voice Detector to Chrome Extension to Identify Synthetic Audio
- The Tooling Problem in Local AI Is Finally Getting Solved and That Matters as Much as the Models
- llama.cpp Merges Speculative Checkpointing for Major Inference Speed Boost
- Speculative Decoding Achieves 29% Speed Boost for Gemma-4 31B
- DFlash Speculative Decoding Achieves 3.3x Speedup on Apple Silicon
- Tether Launches QVAC SDK for Cross-Platform Local AI Development
- Apple Brings Enhanced On-Device AI Features to iPhone
- Apfel – The Free AI Already on Your Mac
- HP Launches Copilot+ PCs in India with On-Device AI Capabilities for Local Inference
- RF-DETR Nano and YOLO26 Enable On-Device Object Detection on Smartphones