Plenty of benchmarks measure how AI models perform in specific fields such as math and programming, but a new benchmark by ...