Huawei claims better AI training method than DeepSeek using own chips

Researchers working on Huawei Technologies’ large language model (LLM) Pangu claimed they have improved on DeepSeek’s original approach to training artificial intelligence (AI) by leveraging the US-sanctioned company’s proprietary hardware.

A paper – published last week by Huawei’s Pangu team, which comprises 22 core contributors and 56 additional researchers – introduced the concept of Mixture of Grouped Experts (MoGE). It is an upgraded version of the Mixture of Experts (MoE) technique that has been instrumental in DeepSeek’s cost-effective AI models.

While MoE offers low execution costs for models with very large parameter counts, along with enhanced learning capacity, it often results in inefficiencies, according to the paper. This is because so-called experts are activated unevenly, which can hinder performance when the model runs across multiple devices in parallel.
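For readers unfamiliar with the mechanics, a minimal sketch of conventional top-k MoE routing illustrates the problem. The dimensions, gating scheme and variable names below are illustrative assumptions, not Huawei's or DeepSeek's code; the point is simply that an unconstrained gating network can leave some experts far busier than others.

```python
import numpy as np

rng = np.random.default_rng(0)

num_tokens, hidden_dim = 16, 8
num_experts, top_k = 8, 2

# Gating network: each token gets a score for every expert.
tokens = rng.normal(size=(num_tokens, hidden_dim))
gate_weights = rng.normal(size=(hidden_dim, num_experts))
scores = tokens @ gate_weights

# Standard top-k routing: pick the k highest-scoring experts per token,
# with no constraint on how the choices are distributed.
chosen = np.argsort(scores, axis=1)[:, -top_k:]

# Count how many tokens land on each expert; "popular" experts (and the
# devices hosting them) become bottlenecks during parallel execution.
load = np.bincount(chosen.ravel(), minlength=num_experts)
print("tokens per expert:", load)
```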

In contrast, the improved MoGE “groups the experts during selection and better balances the expert workload”, researchers said.

In AI training, “experts” refer to specialised sub-models or components within a larger model, each designed to handle specific tasks or types of data. This allows the overall system to take advantage of diverse expertise to enhance performance.
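The grouping idea described in the paper can be sketched in the same illustrative style: partition the experts into equal-sized groups and select the same number from every group, so each group, and the device hosting it, handles a comparable share of the work. The sizes, the one-pick-per-group rule and all names below are assumptions made for illustration, not the Pangu implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

num_tokens, hidden_dim = 16, 8
num_experts, num_groups = 8, 4            # 2 experts per group (illustrative sizes)
experts_per_group = num_experts // num_groups
picks_per_group = 1                        # choose the same number from every group

tokens = rng.normal(size=(num_tokens, hidden_dim))
gate_weights = rng.normal(size=(hidden_dim, num_experts))
scores = tokens @ gate_weights

# Grouped selection: reshape scores to (tokens, groups, experts per group)
# and take the top experts *within each group*, so every group receives
# the same number of activations for every token by construction.
grouped = scores.reshape(num_tokens, num_groups, experts_per_group)
local_choice = np.argsort(grouped, axis=2)[:, :, -picks_per_group:]

# Map group-local indices back to global expert ids.
offsets = (np.arange(num_groups) * experts_per_group).reshape(1, num_groups, 1)
chosen = (local_choice + offsets).reshape(num_tokens, -1)

load = np.bincount(chosen.ravel(), minlength=num_experts)
group_load = load.reshape(num_groups, experts_per_group).sum(axis=1)
print("tokens per expert:", load)
print("tokens per group: ", group_load)   # balanced across groups
```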

[Video: China a ‘key market’, says Nvidia CEO Huang during Beijing visit as US bans AI chips]

The advancement comes at a crucial time, as Chinese AI companies are focused on enhancing model training and inference efficiency through algorithmic improvements and tighter integration of hardware and software, despite US restrictions on the export of advanced AI chips like those from Nvidia.
