DeepSeek – Chinese AI in open-source mode. Does China stand a chance against OpenAI?

11 February 2025   /  AI

DeepSeek is a series of large language models from China that impresses with its performance and low training costs. Thanks to their open-source approach, DeepSeek-R1 and DeepSeek-V3 are causing quite a stir in the AI industry.

Source: www.deepseek.com

DeepSeek: a revolution in the world of AI from China

DeepSeek is increasingly being mentioned in discussions about the future of artificial intelligence. The project, developed by the Chinese firm High-Flyer (based in Hangzhou), provides open-source large language models (LLMs) that combine high performance with, crucially, significantly lower training costs than competing solutions from OpenAI or Meta.

In this article, we take a closer look at DeepSeek-R1 and DeepSeek-V3 and provide an update on the development and distribution of these models, based on the official materials available on the Hugging Face platform as well as publications from Spider’s Web and chiny24.com.

Table of contents

  1. How was DeepSeek created?
  2. DeepSeek-R1 and DeepSeek-V3: a brief technical introduction
  3. Training costs and performance: what’s the secret?
  4. Open source and licensing
  5. DeepSeek-R1, R1-Zero and Distill models: what are the differences?
  6. The rivalry between China and the USA: sanctions, semiconductors and innovation
  7. Will DeepSeek threaten OpenAI’s dominance?
  8. Summary
  9. Sources

How was DeepSeek created?

Press reports indicate that High-Flyer Capital Management, a quantitative fund that until recently was almost unknown in the IT industry outside Asia, was founded in Hangzhou in 2015. This changed dramatically with DeepSeek, the fund’s AI lab, whose series of large language models has taken Silicon Valley experts by storm.

However, DeepSeek is not only a commercial project – it is also a breath of fresh air in a world where closed solutions with huge budgets, such as models from OpenAI (including GPT-4 and OpenAI o1), usually dominate.

DeepSeek-R1 and DeepSeek-V3: a brief technical introduction

According to information from the official project page on Hugging Face, DeepSeek is currently publishing several variants of its models:

  1. DeepSeek-R1-Zero: trained via reinforcement learning without an initial SFT (Supervised Fine-Tuning) stage, focusing on strengthening reasoning skills (so-called chain-of-thought).
  2. DeepSeek-R1: in which the authors added a preliminary fine-tuning (SFT) stage before the reinforcement learning phase, improving the readability and consistency of the generated text.
  3. DeepSeek-V3: the base model from which the R1-Zero and R1 variants described above are derived. DeepSeek-V3 is a Mixture-of-Experts model with 671 billion total parameters (only a fraction of which are activated per token) and was trained in around two months at a cost of approximately $5.58 million (data: chiny24.com).
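A practical detail of working with the R1-series models: they emit their chain-of-thought between `<think>` and `</think>` tags, followed by the final answer. A minimal sketch of separating the two (the sample response text here is invented for illustration):

```python
import re

def split_reasoning(response: str):
    """Split an R1-style response into (chain_of_thought, final_answer)."""
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        return "", response.strip()          # no reasoning block present
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()  # everything after </think>
    return reasoning, answer

sample = "<think>2 + 2 equals 4.</think>The answer is 4."
cot, answer = split_reasoning(sample)
```

Applications that only want the final answer typically strip the reasoning block exactly like this, while evaluation harnesses keep it to inspect the model’s reasoning.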

Technical background

  • The large parameter count (671 billion in total, with only a fraction active per token thanks to the Mixture-of-Experts design) allows very complex statements and analyses to be generated.
  • Thanks to the optimised training process, even such a large architecture does not require a budget comparable to OpenAI’s.
  • The main goal: independently developing multi-step reasoning and minimising the ‘hallucinations’ so common in other models.
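The reason a Mixture-of-Experts model can carry hundreds of billions of parameters yet stay affordable to run is that a router activates only the top-k “experts” for each token. A toy, pure-Python illustration of top-k routing (the expert count and scores below are made up; this is not DeepSeek’s actual router):

```python
def top_k_experts(scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:k]
    total = sum(scores[i] for i in chosen)
    return {i: scores[i] / total for i in chosen}

# 8 hypothetical experts; only 2 of them run for this token
router_scores = [0.05, 0.30, 0.02, 0.25, 0.10, 0.08, 0.15, 0.05]
active = top_k_experts(router_scores, k=2)
```

In a real model the scores come from a learned softmax and auxiliary losses keep the experts evenly loaded, but the principle is the same: only a fraction of the weights is touched per token, so compute cost scales with active, not total, parameters.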

Training costs and performance: what’s the secret?

Both Spider’s Web and chiny24.com emphasise that DeepSeek’s training cost (approximately $5.58 million for DeepSeek-V3) is many times lower than the figures cited for GPT-4 and other closed OpenAI models, where budgets are reported in the hundreds of millions, or even billions, of dollars.
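The ~$5.58 million figure is easy to sanity-check against the numbers DeepSeek itself published in the V3 technical report: roughly 2.788 million H800 GPU-hours, priced at an assumed rental rate of $2 per GPU-hour:

```python
gpu_hours = 2_788_000      # H800 GPU-hours reported for DeepSeek-V3 training
price_per_hour = 2.00      # assumed rental rate, USD per GPU-hour
cost = gpu_hours * price_per_hour

print(f"${cost / 1e6:.2f}M")  # → $5.58M
```

Note this covers the final training run only, not research, ablations or infrastructure, which is one reason direct comparisons with competitors’ headline budgets should be made cautiously.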

Where does the recipe for success lie?

  • Proprietary methods of optimising the learning process,
  • An efficient architecture that lets the model learn effectively on fewer GPUs,
  • Economical management of training data (avoiding unnecessary repetition and selecting the data set precisely).

Open source and licensing

DeepSeek, unlike most of its Western competitors, relies on open source. As stated in the official documentation of the model on Hugging Face:

‘DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation…’

This means that the community is not only free to use these models, but also to modify and develop them. In addition, several variants have already been developed within the DeepSeek-R1-Distill line, optimised for lower resource requirements.
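Why the Distill variants matter for lower resource requirements comes down to simple arithmetic: the model weights alone need roughly (parameter count × bytes per parameter) of memory. A back-of-envelope estimate, ignoring activation memory and the KV cache:

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone (fp16 = 2 bytes/param)."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

for size in (1.5, 7, 14, 70):
    print(f"{size}B params -> ~{weight_memory_gb(size):.1f} GB at fp16")
```

By this estimate a 7B distilled model fits on a single consumer GPU, while the full 671B base model requires a multi-GPU server even before activations are counted.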

Important:

  • The DeepSeek-R1-Distill models are based on the publicly available Qwen2.5 and Llama 3 models, and therefore remain subject to the corresponding Apache 2.0 and Llama licences.
  • Nevertheless, the whole is made available to the community on very liberal terms, which encourages experimentation and further innovation.

DeepSeek-R1, R1-Zero and Distill models: what are the differences?

From the documentation published on Hugging Face, a three-tier division emerges:

1. DeepSeek-R1-Zero

  • Training only with RL (reinforcement learning), without prior SFT,
  • The model can generate very complex chains of thought (chain-of-thought),
  • However, it can suffer from repetition, poor readability and language mixing.

2. DeepSeek-R1

  • Adding an SFT phase before RL resolved the problems observed in R1-Zero,
  • Better consistency and a lower tendency to hallucinate,
  • According to benchmarks, it is comparable to OpenAI o1 in maths, programming and analytical tasks.

3. DeepSeek-R1-Distill

  • ‘Slimmed-down’ versions of the model (1.5B, 7B, 8B, 14B, 32B, 70B parameters),
  • Enable easier implementation on weaker hardware,
  • Created by distillation (transferring knowledge from the full R1 model to smaller architectures).
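Distillation itself can be sketched in a few lines: the small “student” model is trained to match the temperature-softened output distribution of the large “teacher”. A pure-Python toy of the objective involved (real training runs this over the full vocabulary with gradient descent; the logits below are hypothetical):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature = softer targets."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher_logits = [2.0, 1.0, 0.1]   # hypothetical next-token logits
student_logits = [1.8, 1.1, 0.3]
T = 2.0                            # temperature softens both distributions
loss = kl_divergence(softmax(teacher_logits, T), softmax(student_logits, T))
```

Minimising this loss pushes the student’s predictions towards the teacher’s, which is how reasoning behaviour learned by the full R1 model can be transferred into the much smaller Qwen- and Llama-based architectures.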

The rivalry between China and the USA: sanctions, semiconductors and innovation

As noted by the ‘South China Morning Post’ (cited by chiny24.com), the development of Chinese AI models is taking place under conditions of limited access to advanced semiconductors due to US sanctions.

Meanwhile, Chinese companies – including DeepSeek and ByteDance (Doubao) – are showing that even in such an unfavourable climate, they are able to create models:

  • that are not inferior to Western solutions,
  • and often much cheaper to maintain.

As Jim Fan (researcher at Nvidia) points out, the DeepSeek project may be proof that innovation and restrictive conditions (less funding, sanctions) do not have to be mutually exclusive.

Will DeepSeek threaten OpenAI’s dominance?

High-Flyer Capital Management and other Chinese companies are entering the market with a model that:

  • performs better than Western competitors in some tests,
  • is cheaper to develop and maintain,
  • makes open repositories available, allowing for the rapid development of a community-based ecosystem.

If OpenAI (and other giants) do not develop a strategy to compete with cheaper and equally good models, Chinese solutions – such as DeepSeek or Doubao – could capture a significant share of the market.

Summary: is the era of expensive AI models coming to an end?

DeepSeek is a prime example of how the era of gigantic and ultra-expensive AI models may be coming to an end. Open source, low training costs and very good benchmark results mean that ambitious start-ups from China could shake up the current balance of power in the artificial intelligence industry.

Due to the growing technological tensions between China and the USA, the further development of DeepSeek and similar projects will probably become one of the main themes in the global rivalry for the title of AI leader.

Sources

  1. ‘Chinese DeepSeek beats all OpenAI models. The West has a big problem’ – Spider’s Web
  2. ‘DeepSeek. Chinese startup builds open-source AI’ – chiny24.com
  3. Official DeepSeek-R1 model card on Hugging Face

Author: own work based on the indicated publications.

Text intended for information and journalistic purposes.
