Tagged "latency-reduction"

On-Device AI vs Cloud AI: Which One Should Power Your Next Phone? 20 July 2026
Google's LiteRT.js Enables On-Device AI Inference in Web Browsers 14 July 2026
Amazon Confirms On-Device AI Capabilities in New AZ3 Chip for Alexa 5 July 2026
Run Llama.cpp In-Process from Java with Project Panama FFM 5 June 2026
Microsoft Expands On-Device AI Models in Edge Browser with New APIs for Local Inference 3 June 2026
Money Printer Pro – Open-source AI Content Generator 28 May 2026
Show HN: Interactive and Stylized AI Chat Chrome Extension 22 May 2026
Adobe Photoshop Update Brings On-Device AI Processing 21 May 2026
Open-Source Local LLM Emerges as Viable Cloud AI Competitor 15 May 2026
Chrome Automatically Downloads 4GB AI Model for Local Processing 14 May 2026
What If AI Systems Weren't Chatbots? 13 May 2026
I Built My Second Brain for Meetings. No Monthly Subscription 11 May 2026
Small On-Device AI Model Beats Claude Sonnet 4.5 and GPT-5 10 May 2026
Ask HN: Real life autonomous AI Agents 7 May 2026
Supercharging LLM Inference on Google TPUs: Achieving 3X Speedups With Diffusion-Style Speculative Decoding 5 May 2026
NordVPN Adds On-Device AI Voice Detector to Chrome Extension to Identify Synthetic Audio 4 May 2026
The Tooling Problem in Local AI Is Finally Getting Solved and That Matters as Much as the Models 3 May 2026
llama.cpp Merges Speculative Checkpointing for Major Inference Speed Boost 20 April 2026
Speculative Decoding Achieves 29% Speed Boost for Gemma-4 31B 13 April 2026
DFlash Speculative Decoding Achieves 3.3x Speedup on Apple Silicon 12 April 2026
Tether Launches QVAC SDK for Cross-Platform Local AI Development 10 April 2026
Apple Brings Enhanced On-Device AI Features to iPhone 6 April 2026
Apfel – The Free AI Already on Your Mac 3 April 2026
HP Launches Copilot+ PCs in India with On-Device AI Capabilities for Local Inference 28 March 2026
RF-DETR Nano and YOLO26 Enable On-Device Object Detection on Smartphones 26 March 2026