DeepSeek explained: Everything you need to know

TechTarget.com/whatis

https://www.techtarget.com/whatis/feature/DeepSeek-explained-Everything-you-need-to-know

By Sean Michael Kerner

In the world of AI, there has been a prevailing notion that developing leading-edge large language models requires significant technical and financial resources. That's one of the main reasons why the U.S. government pledged to support the $500 billion Stargate Project announced by President Donald Trump.

But Chinese AI development firm DeepSeek has disrupted that notion. On Jan. 20, 2025, DeepSeek released its R1 LLM, which the company says it developed at a fraction of the cost that other vendors incurred for their own models. DeepSeek is also providing its R1 models under an open source license, enabling free use.

Within days of its release, the DeepSeek AI assistant -- a mobile app that provides a chatbot interface for DeepSeek-R1 -- hit the top of Apple's App Store chart, outranking OpenAI's ChatGPT mobile app. The meteoric rise of DeepSeek in terms of usage and popularity triggered a stock market sell-off on Jan. 27, 2025, as investors cast doubt on the value of large AI vendors based in the U.S., including Nvidia. Microsoft, Meta Platforms, Oracle, Broadcom and other tech giants also saw significant drops as investors reassessed AI valuations.

What is DeepSeek?

DeepSeek is an AI development firm based in Hangzhou, China. The company was founded by Liang Wenfeng, a graduate of Zhejiang University, in May 2023. Liang also co-founded High-Flyer, a China-based quantitative hedge fund that owns DeepSeek. Currently, DeepSeek operates as an independent AI research lab under the umbrella of High-Flyer. The full amount of funding and the valuation of DeepSeek have not been publicly disclosed.

DeepSeek focuses on developing open source LLMs. The company's first model was released in November 2023. The company has iterated multiple times on its core LLM and has built out several different variations. However, it wasn't until January 2025 after the release of its R1 reasoning model that the company became globally famous.

The company provides multiple services for its models, including a web interface, mobile application and API access.

OpenAI vs. DeepSeek

DeepSeek represents the latest challenge to OpenAI, which established itself as an industry leader with the debut of ChatGPT in 2022. OpenAI has helped push the generative AI industry forward with its GPT family of models, as well as its o-series of reasoning models, which includes o1, o3 and o4-mini.

While the two companies are both developing generative AI LLMs, they have different approaches.

	OpenAI	DeepSeek
Founding year	2015	2023
Headquarters	San Francisco, Calif.	Hangzhou, China
Development focus	Broad AI capabilities	Efficient, open source models
Key models	GPT-4o, o1	DeepSeek-V3, DeepSeek-R1
Specialized models	Dall-E (image generation), Whisper (speech recognition)	DeepSeek Coder (coding), Janus Pro (vision model)
API pricing (per million tokens)	o1: $15 (input), $60 (output)	DeepSeek-R1: $0.55 (input), $2.19 (output)
Open source policy	Limited	Mostly open source
Training approach	Supervised and instruction-based fine-tuning	Reinforcement learning
Development cost	Hundreds of millions of dollars for o1 (estimated)	Less than $6 million for DeepSeek-R1, according to the company
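
To make the API pricing row concrete, the following minimal Python sketch computes what one workload would cost at the per-million-token rates listed in the table. The prices come from the table; the workload size and the function name request_cost are hypothetical, chosen only for illustration.

```python
# Per-million-token API prices copied from the comparison table above (USD).
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "DeepSeek-R1": {"input": 0.55, "output": 2.19},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-million-token rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2 million input tokens and 1 million output tokens.
workload = {"input_tokens": 2_000_000, "output_tokens": 1_000_000}

o1_cost = request_cost("o1", **workload)
r1_cost = request_cost("DeepSeek-R1", **workload)

print(f"o1:          ${o1_cost:.2f}")            # $90.00
print(f"DeepSeek-R1: ${r1_cost:.2f}")            # $3.29
print(f"ratio:       {o1_cost / r1_cost:.1f}x")  # 27.4x
```

At these list prices, the same workload costs roughly 27 times more on o1 than on DeepSeek-R1. The exact ratio depends on the mix of input and output tokens.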

Training innovations in DeepSeek

DeepSeek trains its R1 models differently than OpenAI trains its own models. The training involved less time, fewer AI accelerators and less cost to develop. DeepSeek's aim is to achieve artificial general intelligence, and the company's advancements in reasoning capabilities represent significant progress in AI development.

In a research paper, DeepSeek outlines the multiple innovations it developed as part of the R1 model, including the following:

Reinforcement learning. DeepSeek used a large-scale reinforcement learning approach focused on reasoning tasks.
Reward engineering. Researchers developed a rule-based reward system for the model that outperforms neural reward models that are more commonly used. Reward engineering is the process of designing the incentive system that guides an AI model's learning during training.
Distillation. Using efficient knowledge transfer techniques, DeepSeek researchers successfully compressed capabilities into models as small as 1.5 billion parameters.
Emergent behavior network. DeepSeek's emergent behavior innovation is the discovery that complex reasoning patterns can develop naturally through reinforcement learning without explicitly programming them.
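
The idea behind a rule-based reward, as opposed to a learned neural reward model, can be sketched in a few lines of Python. This is an illustration only, not DeepSeek's implementation: the tag names, the exact-match check and the reward weights below are all assumptions made for the example.

```python
import re

# Hypothetical output format: reasoning inside <think> tags, followed by a
# final answer inside <answer> tags. The tag names are assumptions.
FORMAT_PATTERN = re.compile(
    r"^<think>(?P<reasoning>.+?)</think>\s*<answer>(?P<answer>.+?)</answer>$",
    re.DOTALL,
)


def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Score a completion with fixed rules instead of a learned reward model.

    +0.5 if the completion follows the expected format,
    +1.0 more if the extracted answer matches the reference exactly.
    The weights are arbitrary illustration values.
    """
    match = FORMAT_PATTERN.match(completion.strip())
    if match is None:
        return 0.0
    reward = 0.5
    if match.group("answer").strip() == reference_answer.strip():
        reward += 1.0
    return reward


good = "<think>6 * 7 = 42</think><answer>42</answer>"
wrong = "<think>6 * 7 = 48</think><answer>48</answer>"
unformatted = "The answer is 42"

print(rule_based_reward(good, "42"))         # 1.5
print(rule_based_reward(wrong, "42"))        # 0.5
print(rule_based_reward(unformatted, "42"))  # 0.0
```

Because the score comes from deterministic checks, it is cheap to compute and cannot drift the way a trained reward model can. The tradeoff is that it only works for tasks with verifiable answers.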

DeepSeek large language models

Since the company was created in 2023, DeepSeek has released a series of generative AI models. With each new generation, the company has worked to advance both the capabilities and performance of its models:

DeepSeek Coder. Released in November 2023, this is the company's first open source model designed specifically for coding-related tasks.
DeepSeek LLM. Released in December 2023, this is the first version of the company's general-purpose model.
DeepSeek-V2. Released in May 2024, this is the second version of the company's LLM, focusing on strong performance and lower training costs.
DeepSeek-Coder-V2. Released in July 2024, this is a 236 billion-parameter model offering a context window of 128,000 tokens, designed for complex coding challenges.
DeepSeek-V3. Released in December 2024, DeepSeek-V3 uses a mixture-of-experts architecture, capable of handling a range of tasks. The model has 671 billion parameters with a context length of 128,000.
DeepSeek-R1. Released in January 2025, this model is based on DeepSeek-V3 and is focused on advanced reasoning tasks. It competes directly with OpenAI's o1 model in performance while maintaining a significantly lower cost structure. Like DeepSeek-V3, the model has 671 billion parameters with a context length of 128,000.
DeepSeek-R1-0528. Released in May 2025, the R1-0528 model is an updated version of the original R1 model. The model now supports system prompts, JSON output and function calling, making it more suitable for agentic AI use cases. DeepSeek also claims it's more accurate, with reduced hallucination rates compared to the prior release. R1-0528 also benefits from greater reasoning depth, averaging 23,000 tokens per question vs. 12,000 in the previous version.
DeepSeek-R1-0528-Qwen3-8B. A smaller, distilled version based on Alibaba's Qwen3 model that is intended for systems with limited computational resources. According to DeepSeek, this 8 billion parameter model matches the performance of the larger Qwen3-235B model.
Janus-Pro-7B. Released in January 2025, Janus-Pro-7B is a vision model that can understand and generate images.
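
The mixture-of-experts architecture mentioned for DeepSeek-V3 routes each input to a small subset of specialized sub-networks instead of running the whole model. The toy Python sketch below shows the general top-k routing idea using scalar inputs. It is not DeepSeek's design: the expert functions, gate scores and the choice of k=2 are hypothetical, and in a real model a learned router produces the gate scores.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    weights = route_top_k(gate_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Four toy "experts"; real experts are neural network layers.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]

# Hypothetical gate scores for one input.
gate_scores = [0.1, 2.0, 2.0, -1.0]

weights = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)

print(weights)  # {1: 0.5, 2: 0.5}
print(output)   # 7.5
```

Only two of the four experts run for this input, which is why a mixture-of-experts model can have a very large total parameter count while using a fraction of those parameters per token.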

DeepSeek-R1 LLM sees competition from other vendors

Alibaba and Ai2 released their own updated LLMs within days of the R1 release: Qwen2.5 Max and Tülu 3 405B, respectively.

Why it is raising alarms in the U.S.

While there was much hype around the DeepSeek-R1 release, it has also raised alarms in the U.S. and triggered a sell-off in tech stocks. On Monday, Jan. 27, 2025, the Nasdaq Composite dropped by 3.4% at market opening, with Nvidia declining by 17% and losing approximately $600 billion in market capitalization.

DeepSeek is raising alarms in the U.S. for several reasons, including the following:

Cost disruption. DeepSeek claims to have developed its R1 model for less than $6 million. The low-cost development threatens the business model of U.S. tech companies that have invested billions in AI. DeepSeek is also cheaper for users than OpenAI.
Technical achievement despite restrictions. The U.S. restricts the export of its highest-performance AI accelerators and GPU chips to China. Despite that, DeepSeek has demonstrated that leading-edge AI development is possible without access to the most advanced U.S. technology.
Business model threat. In contrast with OpenAI, whose technology is proprietary, DeepSeek is open source and free, challenging the revenue model of U.S. companies that charge monthly fees for AI services.
Geopolitical concerns. Being based in China, DeepSeek challenges U.S. technological dominance in AI. Tech investor Marc Andreessen called it AI's "Sputnik moment," comparing it to the Soviet Union's space race breakthrough in the 1950s.

DeepSeek bans

Countries and organizations around the world have already banned DeepSeek, citing ethics, privacy and security issues within the company. Because all user data is stored in China, the biggest concern is the potential for a data leak to the Chinese government. The LLM was also trained with a Chinese worldview -- a potential problem due to the country's authoritarian government.

Places where DeepSeek is banned include the following:

Australian government agencies.
India central government.
Italy.
NASA.
South Korea industry ministry.
Taiwan government agencies.
Texas state government.
U.S. Congress.
U.S. Navy.
U.S. Pentagon.

DeepSeek cyberattack

DeepSeek's popularity has not gone unnoticed by cyberattackers.

On Jan. 27, 2025, DeepSeek reported large-scale malicious attacks on its services, forcing the company to temporarily limit new user registrations. The timing of the attack coincided with DeepSeek's AI assistant app overtaking ChatGPT as the top downloaded app on the Apple App Store.

Despite the attack, DeepSeek maintained service for existing users. The disruption extended into Jan. 28, when the company reported it had identified the problem and deployed a fix.

DeepSeek has not specified the exact nature of the attack, though widespread speculation from public reports indicated it was some form of DDoS attack targeting its API and web chat platform.

DeepSeek data exposed

Wiz Research -- a team within cloud security vendor Wiz Inc. -- published findings on Jan. 29, 2025, about a publicly accessible back-end database spilling sensitive information onto the web -- a "rookie" cybersecurity mistake. The exposed information included DeepSeek chat history, back-end data, log streams, API keys and operational details. DeepSeek took the database offline shortly after being informed. It's unclear how long the database was exposed.

Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.