Benchmarks - Search News

Etna Capital Management Releases “Beyond Benchmarks” Framework on Research-Led AI Investing in a Global Automation Era

Hong Kong, China - February 26, 2026 - Etna Capital Management today released a new research framework, “Beyond ...

How to build a better AI benchmark

To fix the way we test and measure models, AI is learning tricks from social science. It’s not easy being one of Silicon Valley’s favorite benchmarks. SWE-Bench (pronounced “swee bench”) launched in ...

Backboard.io Becomes First AI Platform to Lead Both Major Memory Benchmarks, Accelerating the Era of Agentic AI

Backboard.io announced it has achieved state-of-the-art performance across both leading AI memory benchmarks, a first ...

Galaxy S26 vs S26 Ultra: Early benchmarks tell an interesting story

Early benchmarks suggest Qualcomm’s chip in the Galaxy S26 Ultra may outpace the Exynos 2600 in the standard model.

XDA Developers on MSN

Best GPU benchmarks: 5 options to test your graphics card performance

3DMark and Superposition are considered two of the most reliable GPU benchmarking tools out there. Cinebench 2024 is also a ...

TechCrunch

Why most AI benchmarks tell us so little

On Tuesday, startup Anthropic released a family of generative AI models that it claims achieve best-in-class performance. Just a few days later, rival Inflection AI unveiled a model that it asserts ...

Android Authority

Qualcomm benchmarks the Snapdragon 888, and it's fast

The first smartphones powered by Qualcomm’s cutting edge Snapdragon 888 mobile application processor aren’t set to arrive until the new year, but we now have a better picture of how these phones might ...

NextBigFuture

XAI Grok 4 Has Leading Benchmarks

XAI Grok 4 benchmarks are showing it is the leading model. Humanity's Last Exam at 35 and 45 for reasoning is a big improvement from about 21 for other top models. If these leaked Grok 4 benchmarks are ...

VentureBeat

Self-invoking code benchmarks help you decide which LLMs to use for your programming tasks

As large language models (LLMs) continue to improve at coding, the benchmarks used to evaluate their performance are steadily becoming less useful. That's because though many LLMs have similar high ...

Wall Street Journal

Why We Need New Benchmarks for AI

How do you measure artificial intelligence? Since the idea first took hold in the 1950s, researchers have gauged the progress of AI by establishing benchmarks, such as the ability to recognize images, ...
