liu-2023-agentbench - Skill Dossier

Comprehensive benchmark suite for evaluating LLM agents across diverse interactive environments
Research & Academic
#benchmarks #llm-agents #evaluation #agent-testing #capabilities
Coming in Spring 2026 Beta
WinDAGs will match this skill automatically. Then ask:
"Use liu-2023-agentbench to help me build..."