DistantNews
China's DeepSeek Releases New AI Model with Ultra-Long Context Capability

From Vanguard · English

Translated from English, summarized and contextualized by DistantNews.

TLDR

  • Chinese AI startup DeepSeek has launched a new artificial intelligence model, DeepSeek-V4, featuring an "ultra-long context" of one million words.
  • The new model aims to significantly reduce compute and memory costs associated with processing long texts.
  • This release intensifies the AI competition between China and the United States, with DeepSeek's previous models having challenged US dominance.

In a move that further escalates the global artificial intelligence race, Chinese startup DeepSeek has unveiled its latest AI model, DeepSeek-V4. The release, which comes more than a year after the company's previous low-cost chatbot stunned the world, promises "drastically reduced" costs and introduces an "ultra-long context" capability of one million words.

"features an ultra-long context of one million words"

- DeepSeek statement, describing the key capability of the new DeepSeek-V4 AI model.

This development is particularly significant given the intensifying rivalry between China and the United States in the strategic AI sector. The White House's recent accusations that Chinese entities are engaging in large-scale efforts to steal AI technology underscore the high stakes involved. DeepSeek's emergence in January last year, with its R1 reasoning model, challenged the prevailing assumption of US supremacy in generative AI.

"world-leading… with drastically reduced compute (and) memory costs"

- DeepSeek statement, highlighting the performance and cost-efficiency of the new model.

DeepSeek describes the new model's ability to absorb and process an unprecedented amount of input as "world-leading." The company also emphasizes the drastic reduction in compute and memory costs, a critical factor for widespread adoption. Experts view this as an "inflection point" for the industry, one that could move long-text processing out of high-end research labs and into mainstream commercial applications, making its benefits accessible to end users.

"achieves leadership in both domestic and open-source fields across agent capabilities, world knowledge, and reasoning performance"

- DeepSeek statement, detailing the model's performance across various AI benchmarks.

Released in two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, the model offers different parameter configurations to cater to various needs. The V4-Pro boasts 1.6 trillion parameters, while the V4-Flash provides a more efficient option with 284 billion parameters. DeepSeek-V4-Pro demonstrates remarkable performance, closely rivaling top-tier closed-source models like Google's Gemini-Pro-3.1 in world knowledge benchmarks, while significantly outperforming other open-source alternatives. This release solidifies China's position in the global AI landscape and signals continued innovation from its burgeoning tech sector.

"This addresses the long-standing issues of slower performance and higher costs associated with long context lengths, marking a genuine inflection point for the industry."

- Zhang Yi, an expert, assessing the significance of DeepSeek-V4's long context capabilities.