Legal Agent Benchmark
The Legal Agent Benchmark evaluates agentic legal work: reviewing contracts, producing redlines, and answering questions that require sustained reasoning over long legal documents. Scores are low across all models, making it one of the least saturated evals reported.
Official source: Anthropic — Fable 5 / Mythos 5 announcement