vLLM

Free

81.0k GitHub stars

Platform & Framework · Agnostic · File System

Overview

vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs). It suits developers and researchers who need to deploy and optimize LLMs across a range of environments.
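
To make the "serving" part concrete, here is a minimal sketch of a client for vLLM's OpenAI-compatible HTTP server. It assumes a server was already started locally (for example with `vllm serve <model>`) and is listening on port 8000. The base URL, the model name, and the helper names `build_completion_request` and `complete` are illustrative placeholders, not part of vLLM itself.

```python
import json
import urllib.request

# Assumption: a vLLM OpenAI-compatible server is running locally.
# Host and port are placeholders; adjust them to your deployment.
BASE_URL = "http://localhost:8000/v1"


def build_completion_request(prompt, model, max_tokens=64, temperature=0.0):
    """Build (but do not send) a POST request for the /completions endpoint."""
    payload = {
        "model": model,            # must match the model the server was started with
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, model):
    """Send the request and return the text of the first completion choice."""
    with urllib.request.urlopen(build_completion_request(prompt, model)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With a server running, a call such as `complete("The capital of France is", "facebook/opt-125m")` would return the generated continuation; the model name here is only an example. Splitting request construction from sending keeps the payload easy to inspect without a live server.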
