News

This gain is made possible by TNG’s Assembly-of-Experts (AoE) method — a technique for building LLMs by selectively merging the weight tensors ...
German firm TNG has released DeepSeek-TNG R1T2 Chimera, an open-source variant twice as fast as its parent model thanks to a ...
The updated version of DeepSeek-R1 tied for first place with Google’s Gemini-2.5 and Anthropic’s Claude Opus 4 on the WebDev Arena leaderboard, which evaluates large language models (LLMs) on ...
Say hello to DeepSeek-TNG R1T2 Chimera, a large language model built by German firm TNG Consulting, using three different ...
DeepSeek quietly updates its R1 reasoning model Chinese AI startup DeepSeek has released an update to its R1 reasoning model. The new version, named R1-0528, was published on developer platform ...
Chinese AI upstart MiniMax released a new large language model, joining a slew of domestic peers inspired to surpass DeepSeek in the field of reasoning AI.
A new report says that Huawei's CloudMatrix 384 outperforms Nvidia processors running DeepSeek R1, which is to be expected given the energy use involved.