LLM Judge PSAI
Free | 2 GitHub stars
Agent Tool | Agnostic | Web Scraper
Overview
LLM Judge PSAI is an evaluation system that assesses how well computer-use agents perform web browsing and interaction tasks. It scores each run and provides detailed feedback based on the agent's trajectory and final result, which makes it useful for developers and researchers evaluating AI agents.