FrontierCode (Diamond)
Cognition’s FrontierCode evaluation tests whether models can complete difficult coding tasks while meeting the standards of high-quality production codebases: the code must be correct, maintainable, and reviewable, not merely pass its tests. Diamond is the hardest tier.
Official source: Anthropic — Fable 5 / Mythos 5 announcement