DeepSeek ends week-long marathon to disclose AI model details

Chinese artificial intelligence (AI) start-up DeepSeek has wrapped up a week of disclosing technical details about how it developed its ChatGPT competitor at a fraction of the typical cost, a move poised to accelerate global advances in the field.

Over the past few days, DeepSeek published eight open-source projects on GitHub, the world’s largest open-source community. It was the first time that the firm revealed in detail how it squeezed the best performance from chips in compute, communication and storage, which are the key pillars of model training.

DeepSeek’s team of young scientists said they disclosed the company’s “battle-tested building blocks” to share “our small-but-sincere progress with full transparency”.

DeepSeek has been cheered by global developers, who praised the Chinese company for revealing the techniques it used in building its low-cost, high-performance AI models. Some developers, including the founder of AI development platform Hyperbolic, called DeepSeek “the real OpenAI”.

Despite its name, ChatGPT maker OpenAI has pivoted to a closed-source approach, keeping the specific training methods and compute costs of its models tightly guarded. OpenAI co-founder and CEO Sam Altman said earlier in February that the company “has been on the wrong side of history” and “needs to figure out a different open-source strategy”.
