DeepSeek is breaking the internet: an open-source AI model that’s not only rivaling ChatGPT but also shaking up Silicon Valley, triggering a stock market plunge, and sending a whole lot of memes viral across social media.
The AI world, long dominated by U.S. tech giants like OpenAI, Google, and Meta, now has a new player from China: DeepSeek. This low-cost artificial intelligence startup founded in 2023 has sent shockwaves through the global tech world with its groundbreaking AI models.
In just over a year, the company has achieved what many thought impossible: developing AI technology that rivals the best in the world at a fraction of the cost. The DeepSeek app has already rattled stock markets, sparked discussions in Washington, and raised eyebrows across Silicon Valley. With its app topping Apple’s App Store and its models outperforming industry leaders, DeepSeek is not just a competitor; it’s a disruptor.
Key Points
- DeepSeek, a China-based startup, unveiled its R1 large language model, which it claims matches or outperforms existing AI models at a fraction of the cost.
- DeepSeek’s app became the No. 1 downloaded free app on Apple’s App Store, sparking a frenzy in tech markets.
- DeepSeek’s AI models, including V3 and R1, rival OpenAI’s GPT-4, Google’s Gemini 2.0, and Anthropic’s Claude 3.5 while being built on a much smaller budget.
- The company says it trained its models on mid-range Nvidia H800 chips for about $5.6 million, compared to the billions spent by U.S. competitors.
What is DeepSeek?
DeepSeek is a Chinese AI startup founded in 2023. The company specializes in large language models (LLMs) that power its AI chatbot, which offers services similar to ChatGPT.
DeepSeek’s models are bilingual, capable of understanding and generating responses in both Chinese and English. What sets DeepSeek apart is its ability to deliver high-performance AI at a dramatically lower cost. For instance, its V3 model, released in December 2024, performs on par with OpenAI’s GPT-4 but was reportedly trained for just $5.58 million, a fraction of the $100 million reportedly spent on GPT-4.
Who is behind DeepSeek?
DeepSeek was founded by Liang Wenfeng, who also established High-Flyer Quant, a hedge fund managing over $10 billion in assets by 2019. Liang’s expertise in leveraging AI for financial predictions laid the groundwork for DeepSeek’s innovative approach.
The company, headquartered in Hangzhou, China, employs just 200 people, compared to OpenAI’s 3,500-strong workforce. Despite its small size, DeepSeek has attracted a global user base of 5 to 6 million, driven by its efficient and cost-effective AI solutions.
How does DeepSeek work?
DeepSeek’s AI models are built using a combination of advanced algorithms and reinforcement learning techniques. DeepSeek AI models, such as the V3 and R1, can perform a wide range of tasks, from answering complex questions and generating detailed essays to writing and debugging computer code.
Its R1 model, released in January, is a “reasoning” model designed to tackle complex, multi-step problems like reading comprehension, strategic planning, and mathematical problem-solving. DeepSeek-R1 can be used to build anything from games to websites.
The app’s interface is user-friendly, allowing users to type questions or requests and receive conversational responses. While it doesn’t yet generate images, its text-based capabilities have already drawn comparisons to ChatGPT and Google’s Gemini.
Meanwhile, people experimenting with DeepSeek have reported that the chatbot’s replies are restricted around topics deemed sensitive in China, such as the Tiananmen Square protests and the sovereignty of Taiwan. When asked about controversial topics, DeepSeek usually replies in this manner: “Sorry, that’s beyond my current scope. Let’s talk about something else.”
What makes DeepSeek different from ChatGPT & other AI chatbots?
DeepSeek’s most significant advantage lies in its efficiency. The company claims to have developed techniques that make training and running its models far cheaper than its competitors’. For example, DeepSeek says it trained its V3 model using just 2,000 NVIDIA H800 chips, which are considered less powerful than H100s, compared to the 16,000 H100 chips used by some U.S. firms.
Additionally, DeepSeek has made its models freely available for others to download and modify, following a strategy similar to Meta’s Llama. This open approach contrasts with OpenAI’s proprietary model, which restricts access to its technology.
Already making waves, R1 ranks second only to OpenAI’s o1 model in the Artificial Analysis Quality Index, a widely followed independent ranking of AI models. On that index, R1 has outperformed several leading models, including Google’s Gemini 2.0 Flash, Anthropic’s Claude 3.5 Sonnet, Meta’s Llama 3.3-70B, and OpenAI’s GPT-4o.
How DeepSeek achieved its breakthrough
While giants like OpenAI and Google’s DeepMind rely on thousands of cutting-edge chips to train their models, DeepSeek has built its system on just 2,000 NVIDIA H800 chips, a far more affordable option than the more commonly used H100 chips. The result? DeepSeek’s V3 model, which rivals OpenAI’s GPT-4, was trained at a reported cost of just $5.58 million—pennies compared to the hundreds of millions spent by other companies.
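To put the figures quoted above side by side, here is a minimal Python sketch of the arithmetic. It uses only the numbers reported in this article (none are verified here), and the variable and function names are purely illustrative.

```python
# Figures as reported in this article; they are not independently verified here.
DEEPSEEK_V3_COST_USD = 5.58e6  # reported training cost of DeepSeek V3
GPT4_COST_USD = 100e6          # amount reportedly spent on GPT-4
DEEPSEEK_CHIPS = 2_000         # NVIDIA H800 chips cited for V3
US_FIRM_CHIPS = 16_000         # H100 chips cited for some U.S. firms


def ratio(larger: float, smaller: float) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    if smaller <= 0:
        raise ValueError("smaller must be positive")
    return larger / smaller


cost_ratio = ratio(GPT4_COST_USD, DEEPSEEK_V3_COST_USD)
chip_ratio = ratio(US_FIRM_CHIPS, DEEPSEEK_CHIPS)
cost_share = DEEPSEEK_V3_COST_USD / GPT4_COST_USD

print(f"GPT-4's reported cost is about {cost_ratio:.1f}x that of V3")
print(f"V3's reported cost is about {cost_share:.1%} of GPT-4's")
print(f"The cited chip counts differ by a factor of {chip_ratio:.0f}")
```

On these reported numbers, V3 cost roughly one-eighteenth of GPT-4 and used one-eighth as many chips. Because the H800 and H100 are different chips, the chip-count ratio is only a rough indicator and not a like-for-like measure of computing power.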
Last week, DeepSeek published the technical infrastructure behind its DeepSeek-R1 model, highlighting cost savings by using fewer and less advanced chips compared to typical AI projects. Despite these modest resources, DeepSeek’s models are already outperforming other major AI systems in key areas. The R1 reasoning model builds on the V3 model and promises more sophisticated problem-solving abilities.
DeepSeek’s effect on Silicon Valley and Wall Street
DeepSeek’s rise has sent ripples through global markets. The stock market, particularly shares of chipmakers like NVIDIA, took a hit after DeepSeek’s revelations. The Nasdaq fell 3% following the release of DeepSeek’s R1 model, with NVIDIA losing $589 billion in market value. DeepSeek’s ability to deliver sophisticated AI with such low overhead costs has stunned investors and executives.
Marc Andreessen, venture capitalist and co-founder of Netscape, called DeepSeek’s R1 “one of the most amazing and impressive breakthroughs” he had ever seen. Silicon Valley’s reaction has been a mixture of awe and concern, with some wondering whether DeepSeek’s efficiency could fundamentally change how AI is developed and distributed.
How have tech figures and competitors reacted?
As DeepSeek dominates headlines and social media, industry giants are paying attention.
OpenAI CEO Sam Altman acknowledged DeepSeek’s achievements, calling its R1 model “impressive” and adding that it was “invigorating to have a new competitor.” However, he also hinted at OpenAI’s plans to release even more advanced models in the future.
At the World Economic Forum in Switzerland, Microsoft CEO Satya Nadella — whose company is one of OpenAI’s biggest investors — called DeepSeek’s new model “super impressive.” In a post on X, he said: “As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.”
Silicon Valley venture capitalist Marc Andreessen called DeepSeek’s R1 model AI’s “Sputnik moment,” likening it to the space race between the U.S. and the Soviet Union and the event that forced the U.S. to realize that its technological abilities were not unassailable.
Nvidia spokesperson John Rizzo in a statement Monday called DeepSeek an “excellent AI advancement,” adding that deploying advanced AI models requires “significant numbers” of the company’s chips.
Nat Friedman, the former CEO of GitHub, similarly posted: “The deepseek team is obviously really good. China is full of talented engineers. Every other take is cope. Sorry.”
Global implications and ethical concerns
DeepSeek’s rapid success isn’t just a business story—it’s also a geopolitical one. The company’s rise is seen by many as a direct challenge to U.S. dominance in the AI sector. With its lower-cost model and access to a massive user base in China, DeepSeek’s breakthrough could have significant implications for the balance of global technological power.
However, this breakthrough has also sparked intense scrutiny over privacy, data security, and national security. DeepSeek’s privacy and data collection policies, while transparent, have raised eyebrows globally. The company collects a range of user data, including keystroke patterns, device information, and usage behavior, all of which it says is stored securely in China.
Despite these assurances, there are serious national security and data privacy concerns, particularly in the U.S. President Donald Trump weighed in, stating that DeepSeek’s success should serve as a “wake-up call” for U.S. tech companies to focus on innovation and cost efficiency. The U.S. Navy has banned the use of DeepSeek among its ranks, citing potential security risks.
The geopolitical tension surrounding DeepSeek has even fueled unsubstantiated claims that DeepSeek’s success is a Chinese government “psyop,” or psychological operation, casting suspicion on the small team’s ability to “beat all of the top researchers in the world as a side project.” Responding to these claims, Soumith Chintala, a co-founder of PyTorch, the machine learning library developed by Meta AI, said on X: “I’m comically impressed that people are coping on deepseek by spewing bizarre conspiracy theories — despite deepseek open-sourcing and writing some of the most detail oriented papers ever.” He added: “read. replicate. compete. don’t be salty, just makes you look incompetent.”
However, some privacy experts argue that DeepSeek’s data collection policies are no worse than those of its American counterparts. One thing is clear: DeepSeek has shown that China’s AI has developed much faster than many had believed, despite efforts from American policymakers to slow its progress.
Key Takeaways
- Cost Efficiency: DeepSeek’s AI models, such as R1, offer performance on par with industry leaders like OpenAI’s GPT-4 at a fraction of the cost, potentially redefining the cost of AI development.
- Open-Source Advantage: DeepSeek’s decision to make its AI models available for free download and modification promotes innovation in the AI community, enabling developers to build on its technology.
- Smaller, More Efficient Hardware: By using fewer and less powerful chips, DeepSeek’s models challenge the prevailing notion that cutting-edge AI requires massive computational power, signaling a shift toward more efficient AI infrastructure.
- Silicon Valley Shake-Up: DeepSeek’s rise is sending ripples through Silicon Valley, putting pressure on established players like OpenAI, Google, and Nvidia, as investors and companies reassess the viability of expensive AI systems.
- Geopolitical Tensions: With DeepSeek’s success and its Chinese origin, the company has sparked concerns over data security and national security, particularly in the U.S., highlighting the growing geopolitical stakes in the AI race.