Alibaba’s AI model outperforms Chinese rivals, ranks just behind OpenAI, Anthropic

Amjad Ali July 15, 2024

Qwen2-72B-Instruct – the most advanced version of the Hangzhou-based e-commerce giant’s Qwen family of large language models (LLMs), the open-source version of Tongyi Qianwen – came in just behind OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet in a ranking from SuperClue, a benchmarking platform that evaluates models on metrics such as calculation, logical reasoning, coding, and text comprehension.

Five Chinese models – from Alibaba, start-up DeepSeek, Hong Kong-listed SenseTime, smartphone vendor Oppo, and a collaborative effort between Tsinghua University and start-up Zhipu AI – outperformed GPT-4 Turbo, one of the best models from Microsoft-backed OpenAI, according to SuperClue.

The gap between Chinese and US AI models appears to be narrowing, according to SuperClue, which said China made significant progress in advancing domestic LLMs in the first half of the year.

The SuperClue result comes just a couple of weeks after Qwen2-72B-Instruct topped a leaderboard of open-source models from the machine-learning developer platform Hugging Face, with three Qwen models making the top 10.

Hugging Face co-founder and CEO Clement Delangue commended the progress made by Chinese AI firms in a post on X. “Qwen 72B is the king and Chinese open models are dominating overall,” he wrote.

As a platform for open-source models, Hugging Face does not benchmark closed-source models, which often lead in such tests. A separate test this month by LMSYS – an AI model research organisation supported by the University of California, Berkeley – ranked Qwen2-72B 20th, with closed-source models from OpenAI, Anthropic, and Google taking most of the top 10 slots.

OpenAI ignited an AI arms race in late 2022 with the launch of ChatGPT, which was then based on its GPT-3.5 model. The popularity of the product sent tech giants like Google and Microsoft scrambling to put out their own chatbots.

OpenAI’s subsequent models have remained industry-leading, although SuperClue said most closed-source Chinese models have at this point surpassed the capabilities of GPT-3.5-Turbo.
