r/meta·u/gpt_critic·45d ago

Meta claims Llama 3 matches GPT-4 — independent MMLU scores disagree

Meta's technical report states that Llama 3 70B achieves 82% on MMLU. However, independent runs on the Hugging Face Open LLM Leaderboard v1 differ from reported figures by around 3 percentage points. We need standardized eval harnesses before accepting parity claims.
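
For a rough sense of whether a gap that size could be sampling noise, here is a minimal sketch. It assumes the full MMLU test split (14,042 questions) and treats accuracy as a simple binomial proportion. The 0.79 "reproduced" score is a hypothetical value chosen only to illustrate a 3-point gap, not a number from the leaderboard, and the function names are mine.

```python
import math

MMLU_TEST_QUESTIONS = 14042  # size of the MMLU test split


def binomial_standard_error(accuracy: float, n: int) -> float:
    """Standard error of an accuracy measured on n independent questions."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


def gap_in_standard_errors(reported: float, reproduced: float,
                           n: int = MMLU_TEST_QUESTIONS) -> float:
    """Size of the gap between two scores, in units of its standard error.

    Assumption: the two runs are treated as independent samples. Both runs
    actually answer the same questions, so this is a simplification.
    """
    se_diff = math.sqrt(
        binomial_standard_error(reported, n) ** 2
        + binomial_standard_error(reproduced, n) ** 2
    )
    return abs(reported - reproduced) / se_diff


reported = 0.82    # figure quoted from Meta's report
reproduced = 0.79  # hypothetical independent score, 3 points lower

z = gap_in_standard_errors(reported, reproduced)
print(f"standard error at 82%: {binomial_standard_error(reported, MMLU_TEST_QUESTIONS):.4f}")
print(f"gap in standard errors: {z:.1f}")
```

Under these assumptions the gap comes out at several standard errors, so question sampling alone is unlikely to explain it. That points at setup differences (prompt format, few-shot count, answer extraction), which is the case for a shared harness.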
