AgentStack

vLLM

Free
81.0k GitHub stars
Platform & Framework, Agnostic, File System

Overview

vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs). It is aimed at developers and researchers who need to deploy LLMs and tune their serving performance across different environments.
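To make "serving engine" concrete, here is a minimal sketch of a client talking to a running vLLM server over its OpenAI-compatible HTTP API. It assumes, beyond what this listing states, that a server was started locally (for example with `vllm serve <model>`) and is reachable at `http://localhost:8000`. The model name `my-model` and the helper names are hypothetical placeholders, not part of vLLM.

```python
import json
import urllib.request

# Assumptions (not from the listing): a vLLM server is running locally and
# exposes its OpenAI-compatible API under this base URL.
BASE_URL = "http://localhost:8000/v1"
MODEL_NAME = "my-model"  # hypothetical placeholder; use the model the server loaded


def build_completion_request(prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a POST request for the completions route."""
    payload = {"model": MODEL_NAME, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request to the server and return the first generated text."""
    request = build_completion_request(prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("vLLM is")` would return the generated continuation as a string. Building the request separately from sending it keeps the payload easy to inspect without a live server.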
