Tagged "optimization"
- MacinAI Local brings functional LLM inference to classic Macintosh hardware
- AI's Impact on Mathematics Analogous to Car's Impact on Cities
- You're Using Your Local LLM Wrong If You're Prompting It Like a Cloud LLM
- Auto-retry Claude Code on subscription rate limits (zero deps, tmux-based)
- India's Mobile-First AI Strategy Could Accelerate Local Inference Adoption in Emerging Markets
- Linux 7.0 AMDGPU Fixing Idle Power Issue For RDNA4 GPUs After Compute Workloads
- Show HN: VmExit – An Experiment in AI-Native Computing
- Quantization Explained: Q4_K_M vs AWQ vs FP16 for Local LLMs
- SK Hynix Completes Qualification for LPDDR6 Memory Optimized for AI Inference
- Llama.cpp Prompt Processing Optimization: Ubatch Size Configuration Guide
- ETH Zurich Research Challenges Context-Length Assumptions in LLM Agents
- OpenWrt 25.12.0 – Stable Release
- Building a Dependency-Free GPT on a Custom OS
- Critical: Qwen 3.5 Requires BF16 KV Cache, Not FP16 for Accurate Inference
- How to Run High-Performance LLMs Locally on the Arduino UNO Q
- Bare-Metal LLM Inference: UEFI Application Boots Directly Into LLM Chat
- Unsloth Dynamic 2.0 GGUFs
- Accuracy vs. Speed in Local LLMs: Finding Your Sweet Spot
- Snapdragon 8 Elite Gen 5 for Galaxy Official: 5 Key Improvements that Push the Boundaries
- On-Device AI in Mobile Apps: What Should Run on the Phone vs the Cloud (A 2026 Decision Guide)
- Extracting 100K Concepts from an 8B LLM
- Every agent framework has the same bug – prompt decay. Here's a fix
- DeepSeek Releases DualPath: Addressing Storage Bandwidth Bottlenecks in Agentic Inference
- Mirai Announces $10M to Advance On-Device AI Performance for Consumer Devices
- Advanced Quantization Techniques Show Surprising Performance Gains Over Standard Methods
- Show HN: A Ground Up TLS 1.3 Client Written in C
- Which Web Frameworks Are Most Token-Efficient for AI Agents?
- Wave Field LLM Achieves O(n log n) Scaling: 825M Model Trained to 1B Parameters in 13 Hours
- Custom Portable Workstation Optimized for Local AI Inference Builds
- A Tool to Tell You What LLMs Can Run on Your Machine
- Elastic Introduces Best-in-Class Embedding Models for High Performance Semantic Search
- Breaking the Speed Limit: Strategies for 17k Tokens/Sec Local Inference
- Yet Another Fix Coming for Older AMD GPUs on Linux – Thanks to Valve Developer
- AI-Powered Reverse-Engineering of Rosetta 2 for Linux
- GGML Joins Hugging Face: What This Means for Local Model Optimization
- DietPi v10.1 Released