WildClaw Bench

Free

407 GitHub stars

Learning Resource, OpenAI Assistants, File System

Overview

WildClaw Bench is an in-the-wild benchmark for evaluating AI agents in the OpenClaw environment. It is aimed at researchers and developers who want to assess how their agents perform in real-world scenarios.
