Chinese AI startup DeepSeek last week put a bold marker down, claiming it trained its R1 AI model for just $294,000 using a cluster of 512 Nvidia H800 chips. The figure — disclosed in a peer-reviewed Nature paper co-authored by founder Liang Wenfeng — is a fraction of what American AI giants have spent, and it is already raising eyebrows across the tech world.
The company said training took 80 hours on the China-specific H800s, designed by Nvidia to comply with U.S. export controls. In supplementary notes, DeepSeek also acknowledged using Nvidia A100 GPUs in early experiments before migrating its work to H800s.
That price tag, however, immediately invited scrutiny. Analysts and U.S. rivals say the number is less a reflection of miraculous efficiency than a carefully chosen sliver of a much larger story.
A stark contrast with U.S. spending
The comparison with American AI firms underlines a huge gap. OpenAI CEO Sam Altman has previously said that training his company’s foundation models costs “much more” than $100 million, while industry peers such as Anthropic and Google DeepMind are also thought to have budgets in the hundreds of millions per model.
Research firm SemiAnalysis contends that DeepSeek’s operations are far from bargain-basement. It reported that the company has access to about 50,000 Nvidia Hopper GPUs — including 10,000 H800s and 10,000 H100s — and that its real investment includes $1.6 billion in servers, $944 million in operating costs, and more than $500 million spent directly on GPUs. By that estimate, the highly publicized sub-$300,000 training cost represents only a narrow portion of the overall effort.
Distillation versus proprietary data
The disparity between DeepSeek and U.S. firms extends beyond accounting. American companies like OpenAI, Google DeepMind, and Anthropic emphasize building models on massive proprietary datasets, relying on expensive scaling and dedicated supercomputing infrastructure. DeepSeek, by contrast, has leaned heavily on distillation, a technique in which one AI model learns by training on the outputs of another.
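For readers unfamiliar with the term, the sketch below shows the classic soft-label form of distillation on toy data. It is illustrative only: the model sizes, temperature, and training loop are invented for the example, and it says nothing about DeepSeek's actual pipeline, which the Nature paper describes as training on generated text responses rather than on another model's raw output probabilities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy "teacher": a frozen, larger network standing in for a capable model.
# Toy "student": a smaller network that learns to mimic the teacher's
# output distribution. All sizes and hyperparameters are illustrative.

torch.manual_seed(0)
NUM_CLASSES, DIM, TEMPERATURE = 10, 32, 2.0

teacher = nn.Sequential(nn.Linear(DIM, 128), nn.ReLU(), nn.Linear(128, NUM_CLASSES))
student = nn.Sequential(nn.Linear(DIM, 16), nn.ReLU(), nn.Linear(16, NUM_CLASSES))
teacher.eval()  # the teacher is frozen; only the student is trained

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(64, DIM)  # stand-in for real training inputs

    with torch.no_grad():
        # Soft targets: the teacher's output distribution at temperature T.
        teacher_probs = F.softmax(teacher(x) / TEMPERATURE, dim=-1)

    student_log_probs = F.log_softmax(student(x) / TEMPERATURE, dim=-1)

    # KL divergence pulls the student's distribution toward the teacher's;
    # the T**2 factor is the standard scaling for temperature-softened targets.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * TEMPERATURE**2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final distillation loss: {loss.item():.4f}")
```

The appeal is obvious from the sketch: the expensive part of the work lives in the teacher, while the student trains cheaply against the teacher's outputs rather than from scratch on raw data.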
The practice, while cost-efficient, has stirred controversy. DeepSeek previously admitted to incorporating Meta’s open-source Llama into some of its distilled systems. In its new paper in Nature, the company went further, acknowledging that training data for its V3 model contained “a significant number” of responses generated by OpenAI systems — a byproduct, it said, of web-crawled data rather than deliberate replication.
U.S. rivals argue that their models’ competitive edge rests on painstakingly curated datasets and rigorous infrastructure. DeepSeek counters that its approach makes AI deployment cheaper and more scalable — a particularly powerful narrative in China, where state support favors affordability and accessibility.
Global implications
The January release of R1 jolted global markets, wiping billions off the valuations of Western AI leaders as investors recalibrated expectations about competitive dynamics. Evidence of DeepSeek's alternative methods, whether genuinely revolutionary or not, underscores the broader divide between Beijing's push for cost-efficient domestic champions and Washington's focus on capital-intensive proprietary models.
For Washington, DeepSeek’s claims also carry geopolitical weight. U.S. export controls were designed to restrict China’s access to the most advanced GPUs, yet DeepSeek has managed to train frontier models on Nvidia’s restricted-market H800s. That success, even if partly overstated, suggests Beijing-backed firms are adapting faster than expected.
Since January, DeepSeek has kept a low public profile, rolling out incremental product updates while leaving rivals and regulators guessing about its true scale. But the combination of claimed low training costs, heavy reliance on distillation, and disputed financial disclosures ensures that its progress will remain a flashpoint in the broader contest between the United States and China for AI supremacy.



