Guidance for Scalable Model Inference and Agentic AI on Amazon EKS
Free · 20 GitHub stars
Platform & Framework: Agnostic, AWS
Overview
This repository provides an architecture for scalable ML inference on Amazon EKS, using Graviton processors and GPU instances. It is intended for developers who want to deploy large language models with agentic AI capabilities.