Tagged "optimization"
- MacinAI Local brings functional LLM inference to classic Macintosh hardware
- AI's Impact on Mathematics Analogous to Car's Impact on Cities
- You're Using Your Local LLM Wrong If You're Prompting It Like a Cloud LLM
- Auto-retry Claude Code on subscription rate limits (zero deps, tmux-based)
- India's Mobile-First AI Strategy Could Accelerate Local Inference Adoption in Emerging Markets
- Linux 7.0 AMDGPU Fixing Idle Power Issue For RDNA4 GPUs After Compute Workloads
- Show HN: VmExit – An Experiment in AI-Native Computing
- Quantization Explained: Q4_K_M vs AWQ vs FP16 for Local LLMs
- SK Hynix Completes Qualification for LPDDR6 Memory Optimized for AI Inference
- Llama.cpp Prompt Processing Optimization: Ubatch Size Configuration Guide
- ETH Zurich Research Challenges Context-Length Assumptions in LLM Agents
- OpenWrt 25.12.0 – Stable Release
- Building a Dependency-Free GPT on a Custom OS
- Critical: Qwen 3.5 Requires BF16 KV Cache, Not FP16 for Accurate Inference
- How to Run High-Performance LLMs Locally on the Arduino UNO Q
- Bare-Metal LLM Inference: UEFI Application Boots Directly Into LLM Chat
- Unsloth Dynamic 2.0 GGUFs
- Accuracy vs. Speed in Local LLMs: Finding Your Sweet Spot
- Snapdragon 8 Elite Gen 5 for Galaxy Official: 5 Key Improvements that Push the Boundaries
- On-Device AI in Mobile Apps: What Should Run on the Phone vs the Cloud (A 2026 Decision Guide)
- Extracting 100K Concepts from an 8B LLM
- Every agent framework has the same bug – prompt decay. Here's a fix
- DeepSeek Releases DualPath: Addressing Storage Bandwidth Bottlenecks in Agentic Inference
- Mirai Announces $10M to Advance On-Device AI Performance for Consumer Devices
- Advanced Quantization Techniques Show Surprising Performance Gains Over Standard Methods
- Show HN: A Ground Up TLS 1.3 Client Written in C
- Which Web Frameworks Are Most Token-Efficient for AI Agents?
- Wave Field LLM Achieves O(n log n) Scaling: 825M Model Trained to 1B Parameters in 13 Hours
- Custom Portable Workstation Optimized for Local AI Inference Builds
- A Tool to Tell You What LLMs Can Run on Your Machine
- Elastic Introduces Best-in-Class Embedding Models for High Performance Semantic Search
- Breaking the Speed Limit: Strategies for 17k Tokens/Sec Local Inference
- Yet Another Fix Coming for Older AMD GPUs on Linux – Thanks to Valve Developer
- AI-Powered Reverse-Engineering of Rosetta 2 for Linux
- GGML Joins Hugging Face: What This Means for Local Model Optimization
- DietPi v10.1 Released