Ant Group’s use of China-made GPUs, not Nvidia, cuts AI model training costs by 20%

Ant Group, the fintech affiliate of Alibaba Group Holding, is able to train large language models (LLMs) using locally produced graphics processing units (GPUs), reducing reliance on Nvidia’s advanced chips and cutting training costs by 20 per cent, according to a research paper and media reports.


Ant’s Ling team, responsible for LLM development, revealed that its Ling-Plus-Base model, a Mixture-of-Experts (MoE) model with 300 billion parameters, can be “effectively trained on lower-performance devices”. The finding was published in a recent paper on arXiv, an open-access repository for scientific preprints.

By avoiding high-performance GPUs, the model reduces computing costs by a fifth in the pre-training process, while still achieving performance comparable to other models such as Qwen2.5-72B-Instruct and DeepSeek-V2.5-1210-Chat, according to the paper.

The development positions the Hangzhou-based fintech giant alongside domestic peers like DeepSeek and ByteDance in reducing reliance on advanced Nvidia chips, which are subject to strict US export controls.

“These results demonstrate the feasibility of training state-of-the-art large-scale MoE models on less powerful hardware, enabling a more flexible and cost-effective approach to foundational model development with respect to computing resource selection,” the team wrote in the paper.


MoE is a machine learning technique in which a model is split into multiple specialised sub-networks, or “experts”, with a gating network routing each input to only the most relevant ones. The technique has been widely adopted by leading artificial intelligence (AI) models – Grok, DeepSeek and Alibaba’s Qwen included – to scale LLMs to trillion-plus parameters while keeping the computing cost per token roughly constant, since only a fraction of the parameters is activated for any given input. Alibaba owns the South China Morning Post.
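To illustrate the general idea only – this is not Ant’s architecture, and the class and parameter names below are hypothetical – a minimal sketch of a top-k routed MoE layer in PyTorch might look like this:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal top-k mixture-of-experts feed-forward layer (illustrative sketch)."""
    def __init__(self, d_model=64, d_hidden=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Gating network scores each token against every expert.
        self.gate = nn.Linear(d_model, num_experts)
        # Each expert is an independent feed-forward sub-network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.gate(x)                              # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)     # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, so compute per token
        # stays roughly constant even as the total number of experts grows.
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

layer = MoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])

Because each token is processed by only two of the eight experts in this sketch, total parameter count can grow with the number of experts while per-token computation stays near fixed – the property that lets MoE models scale on less powerful hardware.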
