Meta’s benchmarks for its new AI models are a bit misleading

One of the new flagship AI models Meta released on Saturday, Maverick, ranks second on LM Arena, a test that has human raters compare the outputs of models and choose which they prefer. But it seems the version of Maverick that Meta deployed to LM Arena differs from the version that’s widely available to developers. […]

In short, Meta appears to have used an unreleased, custom version of its new flagship model, Maverick, to boost its benchmark ranking.