DeepSeek’s upgraded foundational model excels in coding and maths

Chinese artificial intelligence (AI) star DeepSeek has upgraded its open-source V3 large language model, adding parameters and improving its coding and mathematical problem-solving capabilities.

The DeepSeek-V3-0324, named after its predecessor and the launch date, has “enhanced reasoning capabilities, optimised front-end web development and upgraded Chinese writing proficiency”, according to a notice on the company’s website.

The new version and DeepSeek V3 are both foundation models: trained on vast data sets, they can be adapted to different use cases, including powering a chatbot. DeepSeek R1, the company's reasoning model, is built on DeepSeek V3.

The updated foundation model improved on several benchmarks, most notably the American Invitational Mathematics Examination (AIME), where it scored 59.4 against 39.6 for its predecessor, while its LiveCodeBench score rose 10 points to 49.2, DeepSeek data showed.

This illustration photograph taken on January 29, 2025 shows screens displaying the logos of DeepSeek and OpenAI’s AI chatbot ChatGPT. Photo: AFP

Compared with DeepSeek V3, which has 671 billion parameters and is released under the company’s own commercial licence, the new 685-billion-parameter model uses the MIT licence, the most popular software licence on developer platform GitHub.

Released on the AI community platform Hugging Face as well as the company’s own website, DeepSeek-V3-0324 is now the top trending model on Hugging Face, drawing positive comments on its performance.
