OpenAI Compatible API for TensorRT LLM Triton Backend
Free220 GitHub stars
Platform & FrameworkOpenAI AssistantsFile System
Overview
This tool provides an OpenAI compatible API for integrating TensorRT with Triton Inference Server, enabling efficient deployment of large language models. It is designed for developers looking to optimize AI model performance in production environments.