🚀 Day 0 support: Kimi K2 Thinking now running on vLLM!

In partnership with @Kimi_Moonshot, we're proud to deliver official support for the state-of-the-art open thinking model with 1T total params, 32B active. Easy to deploy on vLLM (nightly build) with an OpenAI-compatible API.

What makes it special:
⚡ Native INT4 quantization → 2× faster inference
💾 Half the memory footprint, no accuracy loss
🎯 256K context, stable across 200-300 tool calls
🎯 Official recipe & deployment guide included

World-class reasoning, now accessible to everyone.

📦 Model:
📚 Recipes:

#vLLM #KimiK2 #LLMInference
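A minimal deployment sketch, assuming the model is published as `moonshotai/Kimi-K2-Thinking` on Hugging Face and that a recent vLLM nightly wheel is installed; the GPU count and parallelism flags below are illustrative and must match your hardware:

```shell
# Install a vLLM nightly build (assumption: nightly wheels are the
# recommended install path for day-0 model support)
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly

# Launch an OpenAI-compatible server (defaults to port 8000);
# --tensor-parallel-size is an example value, set it to your GPU count
vllm serve moonshotai/Kimi-K2-Thinking \
  --tensor-parallel-size 8 \
  --trust-remote-code

# Query it with any OpenAI-compatible client
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "moonshotai/Kimi-K2-Thinking",
       "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the server speaks the OpenAI API, existing SDKs and tools work by pointing their base URL at `http://localhost:8000/v1`.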