About LLM Boss

LLM Boss is an independent resource that tracks how frontier large language models perform on the benchmarks the field actually uses: agentic coding, terminal and computer use, tool orchestration, web search, long-context and graduate-level reasoning, visual understanding, and multilingual knowledge.

Our goal is simple: cut through launch-day marketing and let you compare models the way an engineer or buyer would — side by side, on the axes that match your workload. Choose any two models on the live comparison table, browse every model and matchup, or read the blog for benchmark explainers and buying guides.

How we work

Every number is sourced from a model's published system card or an independent leaderboard, with the original source linked on each benchmark page. Our methodology page explains exactly how we collect and verify data.