Tagged "nvidia"
- Nvidia Nemotron Cascade 2 30B Emerges as Powerful Alternative to Qwen Models
- DeepSeek R1 RTX 4090 vs Apple M3 Max: Benchmark & Performance Guide
- Build a $1,500 AI Server with DeepSeek-R1 on RTX 4090
- Repurpose Old GPUs as Dedicated AI Inference Accelerators
- NVIDIA Nemotron Cascade 2 30B Delivers 120B-Class Performance in Compact Form Factor
- NVIDIA Nemotron 3 Nano 4B Enables On-Device Inference Directly in Web Browsers via WebGPU
- Llamafile 0.10 Released with GPU Support and Rebuilt Core
- I Ran Local LLMs on a 'Dead' GPU, and the Results Surprised Me
- Qwen 3.5 4B Outperforms Nvidia Nemotron 3 4B in Local Benchmarks
- Mistral Small 4 119B Released with NVFP4 Quantisation Support
- NVIDIA Updates Nemotron 3 122B License, Removes Deployment Restrictions
- Qwen3.5-397B Achieves 282 tok/s on 4x RTX PRO 6000 Blackwell Through Custom CUTLASS Kernel
- Nvidia's Nemotron 3 Super: Understanding the Significance for Local LLM Deployment
- Running Qwen3.5-27B Across Multiple GPUs Over LAN Achieves Practical Speed for Local Inference
- Startup Transforms Mac Mini Into Full-Powered AI Inference System With External GPU
- Open-Source GreenBoost Driver Augments NVIDIA GPU VRAM With System RAM and NVMe Storage
- AMD Launches Agent System Optimized for Local AI Inference With Ryzen and Radeon
- Intel OpenVINO Backend Support Now Available in llama.cpp
- Linux 7.0 AMDGPU Fixes Idle Power Issue for RDNA4 GPUs After Compute Workloads
- How to Install OpenClaw with Ollama (Step-by-Step Tutorial)
- Nvidia Pushes Jetson as Edge Hub for Open AI Models
- Nvidia Releases Nemotron 3 Super: 120B MoE Model for Local Deployment
- Comprehensive MoE Backend Benchmarks for Qwen3.5-397B: Real Numbers vs Hype
- Cutile.jl Brings Nvidia CUDA Tile-Based Programming to Julia
- NVIDIA Jetson Brings Open Models to Life at the Edge
- Intel Arc Pro B70 Workstation GPU Confirmed via vLLM AI Release Notes
- Qwen3.5-27B Identified as Sweet Spot for Mid-Range Local Deployment
- Nvidia Could Launch Its First Laptops With Its Own Processors
- Google Is Exploring Ways to Use Its Financial Might to Take on Nvidia
- NVIDIA Releases Dynamo v0.9.0: Infrastructure Overhaul With FlashIndexer and Multi-Modal Support
- LayerScale Launches Inference Engine Faster Than vLLM, SGLang, and TRT-LLM
- AMD Announces Day 0 Support for Qwen 3.5 LLM on Instinct GPUs
- NVIDIA's Dynamic Memory Sparsification Cuts LLM Inference Costs by 8x
- Mistral AI Fixes Critical Memory Leak in vLLM Inference Engine
- Community Member Builds 144GB VRAM Local LLM Powerhouse