Infinity
Free · 2.8k GitHub stars
Platform & Framework: Agnostic · File System
Overview
Infinity is a high-throughput, low-latency serving engine for text embedding and reranking models. It is aimed at developers who need to serve these models efficiently from within their own applications.
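To make "serving" concrete, here is a minimal sketch of how a client might request embeddings from a locally running instance. It assumes the server exposes an OpenAI-style `/embeddings` route over HTTP; the base URL, port, and model name are placeholders chosen for illustration and do not come from this page, so check them against your own deployment.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust to your deployment.
BASE_URL = "http://localhost:7997"
MODEL_ID = "example-embedding-model"  # hypothetical model name


def build_embedding_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for an OpenAI-style embeddings route."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(texts, **kwargs):
    """Send the request and return one embedding vector per input text."""
    req = build_embedding_request(texts, **kwargs)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    # OpenAI-style responses list results under "data", each with an "embedding".
    return [item["embedding"] for item in body["data"]]
```

With a server running, `fetch_embeddings(["first sentence", "second sentence"])` would return a list of two vectors. Sending several texts in one request gives the server a chance to batch them, which is where a throughput-oriented engine tends to help most.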