WildClaw Bench
Free | 407 GitHub stars
Learning Resource, OpenAI Assistants, File System
Overview
WildClaw Bench is an in-the-wild benchmark for evaluating AI agents in the OpenClaw environment. It is aimed at researchers and developers who want to assess how their agents perform in real-world scenarios.