Baidu and Zhipu AI’s large language models top Chinese generative AI rankings, but OpenAI, Anthropic remain ahead in overall performance

April 22, 2024

Baidu’s Ernie Bot 4.0 and start-up Zhipu AI’s GLM-4 rank highest among Chinese large language models (LLMs), but their foreign rivals still lead in overall capabilities, according to a new test by Tsinghua University in Beijing.

The SuperBench assessment report examined 14 representative LLMs – the technology underpinning generative artificial intelligence (AI) chatbots – and found that overseas models, such as OpenAI’s GPT-4 and Anthropic’s Claude-3, came out on top in multiple capabilities, including semantic comprehension, coding abilities and alignment with human commands.

Researchers found “obvious gaps” between domestic models and first-class foreign models in code writing and in the ability to operate in real-world environments.

The report aims to “provide objective and scientific evaluation criteria” to examine a growing number of LLMs that have emerged recently, according to a WeChat post published by Tsinghua’s Basic Model Research Centre, which conducted the assessment with the state-backed Zhongguancun Laboratory.

Chinese tech giants and start-ups have been racing to improve their LLMs since OpenAI, a US start-up backed by Microsoft, launched a series of innovative tools powered by generative AI, including ChatGPT and text-to-video service Sora.

Around 200 LLMs have been introduced in China, where OpenAI’s services are officially unavailable, according to government figures.

The Tsinghua report echoes a recent comment by Alibaba Group Holding co-founder and chairman Joe Tsai, who said China is about two years behind US companies in the global AI race, citing how OpenAI has leapfrogged the rest of the tech industry in AI innovation. Alibaba is the owner of the South China Morning Post.

Despite the challenges faced by Chinese LLM developers, Tsinghua’s report showed that Ernie Bot 4.0, the latest version of the generative AI chatbot launched by web search giant Baidu, and GLM-4 from Zhipu AI, a start-up founded by a Tsinghua graduate, have gradually narrowed their respective gaps with the world’s best models in overall performance.

One area where China’s LLMs performed better is Chinese text-language tasks, the test found. Start-up Moonshot AI’s Kimi chatbot, Alibaba’s Tongyi Qianwen 2.1, GLM-4 and Ernie Bot 4.0 ranked in the top four in that category, although GPT-4 still came first in Chinese text-language reasoning.

Moonshot AI and Zhipu AI, along with Baichuan and MiniMax, are locally known as the “four new AI tigers” of China for being some of the country’s most promising generative AI start-ups.

Established in 2019, Zhipu AI has raised 2.5 billion yuan (US$347 million) since last year, according to its founder, from backers that include state-affiliated investors, venture capitalists and Big Tech companies such as Alibaba, Tencent Holdings and Meituan.

Moonshot AI, also based in Beijing, raised US$1 billion in a funding round in February, according to multiple Chinese media reports.
