Alibaba said the model's performance across 19 major benchmarks was comparable with top international reasoning models, including OpenAI's GPT-5.2-Thinking and Google's Gemini 3 Pro.
by SONG Jianan
Alibaba Group late on Monday released Qwen3-Max-Thinking, the flagship reasoning model in its Qwen large language model series, as Chinese developers seek to narrow the gap with leading global AI systems in advanced reasoning.
The company said Qwen3-Max-Thinking was trained on more than 36 trillion tokens and refined through large-scale reinforcement learning. The upgrade focuses on adaptive tool use—allowing the model to autonomously call search, memory or code tools during conversations—and test-time scaling, which allocates additional computing resources during inference to improve accuracy.
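To make the idea of test-time scaling concrete, here is a minimal Python sketch of one common form of it: sampling several answers to the same question and taking a majority vote. This is an illustration only and not Alibaba's method, which the company has not detailed here. The names `make_noisy_solver`, `answer_with_budget` and `majority_vote` are hypothetical, and the "model" is a toy stand-in that answers correctly with a fixed probability.

```python
import random
from collections import Counter
from typing import Callable


def majority_vote(answers: list[str]) -> str:
    """Return the answer that appears most often in the list."""
    return Counter(answers).most_common(1)[0][0]


def answer_with_budget(sample: Callable[[], str], n_samples: int) -> str:
    """Spend more inference compute by drawing n_samples answers and voting."""
    return majority_vote([sample() for _ in range(n_samples)])


def make_noisy_solver(correct: str, p_correct: float, seed: int) -> Callable[[], str]:
    """Hypothetical stand-in for a model: right with probability p_correct."""
    rng = random.Random(seed)
    wrong_answers = ["41", "43", "44"]  # assumed distractors, for illustration

    def sample() -> str:
        if rng.random() < p_correct:
            return correct
        return rng.choice(wrong_answers)

    return sample


if __name__ == "__main__":
    # Compare a small and a large sampling budget on the same toy solver.
    for budget in (1, 5, 25):
        solver = make_noisy_solver(correct="42", p_correct=0.6, seed=0)
        print(budget, answer_with_budget(solver, budget))
```

In this toy setup a larger sampling budget tends to make the voted answer more reliable, at the cost of proportionally more computation per question, which is the trade-off the term describes.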
Alibaba said the model scored 58.3 on a tool-use benchmark, compared with 45.5 for GPT-5.2-Thinking and 45.8 for Gemini 3 Pro. It also achieved 91.5 on an IMO-level mathematics test.
The model is available to consumers via Qwen's platforms and to enterprise users through Alibaba Cloud.
The launch comes amid rapid growth of Alibaba's open-source AI ecosystem. Data from Hugging Face earlier this month showed that the number of derivative models built on Qwen had surpassed 200,000, with total downloads exceeding 1 billion, putting Qwen ahead of Meta's Llama as the top-ranked open-source large model family globally.
Alibaba Chief Executive Eddie Wu has said the company is accelerating a three-year, 380-billion-yuan investment plan in AI infrastructure, with spending broadly in line with capital outlays by U.S. technology groups including Google, Meta and Amazon.