DeepSeek pitches new route to scale AI, but researchers call for more testing

DeepSeek’s proposed “mHC” architecture could transform the training of large language models (LLMs) – the technology behind artificial intelligence chatbots – as developers look for ways to scale models without simply adding more computing power.

However, experts cautioned that while the approach could be far-reaching, it might still prove difficult to put into practice.

In a technical paper released last week, co-authored by DeepSeek founder and CEO Liang Wenfeng, the company proposed…

