A great tool for estimating how much VRAM your LLMs actually need. Adjust the hardware config, quantization, and other settings, and it reports:
- Generation speed (tokens/sec)
- Detailed memory allocation
- System throughput, and more

No more VRAM guessing!
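For a feel of what such an estimate involves, here is a rough back-of-envelope sketch. It is not the tool's actual method, which is not described here. It assumes the common approximation that inference memory is roughly the quantized weights plus the KV cache plus some runtime overhead. The function name `estimate_vram_gib`, its parameters, and the 10% overhead default are all hypothetical, chosen only for illustration.

```python
def estimate_vram_gib(
    n_params_billion: float,   # model size in billions of parameters
    bits_per_weight: float,    # e.g. 16 for FP16, 4 for 4-bit quantization
    n_layers: int,             # number of transformer layers
    n_kv_heads: int,           # number of key/value heads (fewer than query heads with GQA)
    head_dim: int,             # dimension of each attention head
    context_len: int,          # tokens held in the KV cache
    batch_size: int = 1,
    kv_bytes: int = 2,         # bytes per KV element (2 assumes an FP16 cache)
    overhead_fraction: float = 0.10,  # assumed allowance for activations, buffers, fragmentation
) -> dict:
    """Rough inference VRAM estimate in GiB. An approximation, not a measurement."""
    # Weights: parameter count times bits per weight, converted to bytes.
    weights_bytes = n_params_billion * 1e9 * bits_per_weight / 8

    # KV cache: one key and one value tensor (factor of 2) per layer, per token.
    kv_cache_bytes = (
        2 * n_layers * n_kv_heads * head_dim * context_len * batch_size * kv_bytes
    )

    # Overhead: a flat assumed fraction on top of weights and cache.
    overhead_bytes = overhead_fraction * (weights_bytes + kv_cache_bytes)

    gib = 1024 ** 3
    return {
        "weights_gib": weights_bytes / gib,
        "kv_cache_gib": kv_cache_bytes / gib,
        "overhead_gib": overhead_bytes / gib,
        "total_gib": (weights_bytes + kv_cache_bytes + overhead_bytes) / gib,
    }


if __name__ == "__main__":
    # Illustrative configuration: a 7B model at 4 bits per weight with a 4096-token context.
    result = estimate_vram_gib(
        n_params_billion=7, bits_per_weight=4,
        n_layers=32, n_kv_heads=32, head_dim=128, context_len=4096,
    )
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Real numbers depend on the runtime, the quantization format's per-block metadata, and the attention implementation, so treat the output as a starting point and not a guarantee.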