When Meta, the parent company of Facebook, announced its latest open-source large language model (LLM) on July 23rd, it claimed that the most powerful version of Llama 3.1 had “state-of-the-art capabilities that rival the best closed-source models” such as GPT-4o and Claude 3.5 Sonnet. Meta’s announcement included a table showing the scores achieved by these and other models on a series of popular benchmarks with names such as MMLU, GSM8K and GPQA.