Tagged "offline-deployment"
-
Sarvam Brings AI to Feature Phones, Cars, and Smart Glasses
-
Running Local LLMs and VLMs on Arduino UNO Q with yzma
-
Mihup and Qualcomm Collaborate to Advance Secure On-Device Voice AI for BFSI
-
Complete Offline AI System: Voice Control and Smart Home via Local LLM and Radio Without Internet
-
Local Vision-Language Models for Document OCR and PII Detection in Privacy-Critical Workflows
-
LayerScale Launches Inference Engine Faster Than vLLM, SGLang, and TRT-LLM
-
Kitten TTS V0.8 Released: State-of-the-Art Super-Tiny Text-to-Speech Model Under 25MB
-
GPT4All Replaces Ollama On Mac After Quick Trial
-
Clipthesis: Free Local App for Video Tagging and Search Across Drives
-
Why My Country's AI Scene Is Built on Sand
-
Tailscale Releases New Tool to Prevent Sensitive Data Leakage to Cloud AI Services
-
Show HN: Shiro.computer Static Page, Unix/NPM Shimmed to Host Claude Code
-
Sarvam AI Launches Edge Model to Challenge Major AI Players with Local-First Approach
-
Alibaba's Qwen3.5-397B Achieves #3 Position in Open Weights Model Rankings
-
Qualcomm Ventures Positions India as Blueprint for Affordable On-Device AI Infrastructure
-
OpenClaw Refactored in Go, Runs on $10 Hardware
-
Same INT8 Model Shows 93% to 71% Accuracy Variance Across Snapdragon Chipsets
-
GLM-5 Technical Report: DSA Innovation Reduces Training and Inference Costs
-
Matmul-Free Language Model Trained on CPU in 1.2 Hours
-
Cloudflare Releases Agents SDK v0.5.0 with Rust-Powered Infire Engine for Edge Inference
-
Can We Leverage AI/LLMs for Self-Learning?
-
AMD Announces Day 0 Support for Qwen 3.5 LLM on Instinct GPUs
-
Self-Hosted AI: A Complete Roadmap for Beginners
-
Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
-
Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation
-
Open-Source Models Now Comprise 4 of Top 5 Most-Used Endpoints on OpenRouter
-
I attacked my own LangGraph agent system. All 6 attacks worked
-
High Bandwidth Flash Memory Could Alleviate VRAM Constraints in Local LLM Inference
-
Cohere Releases Tiny Aya: Efficient 3.3B Multilingual Model for 70+ Languages
-
Chinese AI Chipmaker Axera Semiconductor Plans $379 Million Hong Kong IPO for Edge Inference Hardware
-
ASUS Zenbook 14 Launches in India with AI-Capable Hardware, Starting at Rs 1,15,990
-
Asus ExpertBook B3 G2 Laptop Features Ryzen AI 9 HX 470 CPU in 1.41kg Ultraportable Form Factor
-
Ask HN: What is the best bang for buck budget AI coding?
-
Sourdine: Open-Source macOS App for 100% Local AI Transcription
-
Security Alert: OpenClaw Designed for Self-Hosting, Stop Sharing Credentials
-
InitRunner: YAML-Based AI Agent Framework with RAG and Memory
-
GPU-Accelerated DataFrame Library for Local Inference Workloads
-
Alibaba Unveils Major AI Model Upgrade Ahead of DeepSeek Release
-
WinClaw: Windows-Native AI Assistant with Office Automation
-
First Vibecoded AI Operating System for Local Deployment
-
Simile AI Raises $100M Series A for Local AI Infrastructure
-
MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace
-
Ming-flash-omni-2.0: 100B MoE Omni-Modal Model Released
-
Scaling llama.cpp On Neoverse N2: Solving Cross-NUMA Performance Issues
-
Running Your Own AI Assistant for €19/Month: Complete Self-Hosting Guide
-
Samsung's REAM: Alternative Model Compression Technique
-
Running Mistral-7B on Intel NPU Achieves 12.6 Tokens/Second
-
OpenClaw with vLLM Running for Free on AMD Developer Cloud
-
Researchers Find 175,000 Publicly Exposed Ollama AI Servers Across 130 Countries
-
Memio Launches AI-Powered Knowledge Hub for Android with Local Processing
-
GLM-5 Released: 744B Parameter MoE Model Targeting Complex Tasks
-
I Tried a Claude Code Rival That's Local, Open Source, and Completely Free
-
NAS System Achieves 18 tok/s with 80B LLM Using Only Integrated Graphics
-
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
-
5 Practical Ways to Use Local LLMs with MCP Tools
-
Building a RAG Pipeline on 2M+ Pages: EpsteinFiles-RAG Project
-
Energy-Based Models Compared Against Frontier AI for Sudoku Solving
-
DeepSeek Launches Model Update with 1M Context Window
-
Carmack Proposes Using Long Fiber Lines as L2 Cache for Streaming AI Data
-
Arm SME2 Technology Expands CPU Capabilities for On-Device AI
-
Anthropic Releases Claude Opus 4.6 Sabotage Risk Assessment
-
Community Member Builds 144GB VRAM Local LLM Powerhouse