vLLM
Free · 81.0k GitHub stars
Platform & Framework · Agnostic · File System
Overview
vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs). It suits developers and researchers who need to deploy and optimize LLMs across different environments.
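As a rough illustration of the "serving" side, the sketch below builds a request for a vLLM server's OpenAI-compatible completions endpoint using only the Python standard library. The base URL, the model name `facebook/opt-125m`, and the helper names `build_completion_request` and `complete` are assumptions made for this example, not part of the listing; adjust them to match your own deployment.

```python
import json
import urllib.request

# Assumed values for illustration only; change them to match your setup.
BASE_URL = "http://localhost:8000"   # assumed address of a locally running server
MODEL_NAME = "facebook/opt-125m"     # placeholder; use the model the server was started with


def build_completion_request(prompt, max_tokens=32, temperature=0.0):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt):
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With a server started separately (for example via `vllm serve facebook/opt-125m`), calling `complete("The capital of France is")` would return the model's continuation. Keeping request construction apart from the network call makes the payload easy to inspect without a running server.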