ByteDance says new AI technology boosts model training efficiency by 1.7 times

Published: 9:00pm, 11 Mar 2025 | Updated: 9:22pm, 11 Mar 2025

TikTok owner ByteDance said it has achieved a 1.71 times efficiency improvement in large language model (LLM) training, the latest Chinese tech company to achieve a breakthrough that could potentially reduce demand for Nvidia’s high-end graphics processing units (GPUs).


The company’s Doubao development team said it managed to “speed up” LLM training efficiency by “1.71 times” through COMET, an optimised Mixture-of-Experts (MoE) system, according to a recent paper published on arXiv, an open-access repository for scientific preprints.

MoE is a machine-learning technique in which multiple expert networks divide a problem space into regions, with a gating network routing each input to the most relevant experts.

The technique has been extensively used to scale LLMs to trillion-plus parameters while keeping the compute cost per token roughly fixed, since only a few experts run for any given input. It is widely adopted by leading artificial intelligence (AI) models such as Grok and DeepSeek.
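As a rough illustration of the routing idea described above (this is a generic sketch, not ByteDance’s COMET system, and all names in it are hypothetical), a top-k gated MoE layer can be written as: score every expert, keep the k highest-scoring ones, softmax their scores, and run only those experts.

```python
import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    """Minimal top-k Mixture-of-Experts forward pass (illustrative only).

    x:       input vector, shape (d,)
    experts: list of expert weight matrices, each shape (d, d)
    gate_w:  gating weights, shape (num_experts, d)
    """
    logits = gate_w @ x                       # gating score for each expert
    top = np.argsort(logits)[-top_k:]         # indices of the top-k experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                              # softmax over the selected experts
    # Only the chosen experts actually run, so per-token compute stays
    # roughly fixed no matter how many experts (parameters) the model has.
    return sum(wi * (experts[i] @ x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
d, n_experts = 4, 8
x = rng.standard_normal(d)
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
y = moe_layer(x, experts, gate_w)
print(y.shape)
```

In a real MoE transformer the same routing happens per token inside each MoE layer; the communication needed to shuttle tokens between experts spread across GPUs is exactly the overhead that systems like COMET aim to overlap with computation.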

The headquarters of ByteDance is seen in Beijing on September 16, 2020. Photo: AFP

The new system has already been adopted in the company’s production environment of clusters using over 10,000 GPUs, achieving “savings of millions of GPU hours”, according to the Doubao team.

