DeepSeek unveils 1M-context V4 model, open-sources system

Flash pricing starts at 0.2 yuan per million tokens for cached input.

Photo from Jiemian News

by SONG Jianan

DeepSeek on April 24 released and open-sourced a preview of its DeepSeek-V4 model, featuring a 1 million-token context window and stronger reasoning and agent capabilities.

The model comes in two versions — V4-Pro and V4-Flash — both supporting the 1M-token context, well above the 128K–256K range typical of most Chinese models.

V4-Pro activates 49 billion parameters and was trained on 33 trillion tokens, while V4-Flash uses 13 billion parameters and 32 trillion training tokens, targeting faster, lower-cost deployment.

The API supports OpenAI- and Anthropic-style interfaces. The existing deepseek-chat and deepseek-reasoner endpoints will be phased out within three months.
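Because the interface is OpenAI-compatible, a request is just the familiar chat-completions JSON body pointed at DeepSeek's endpoint. A minimal sketch follows; the model identifier "deepseek-v4-flash" and the exact route are assumptions, so check DeepSeek's API documentation for the real names:

```python
import json

# OpenAI-style chat-completions route; path is an assumption based on
# DeepSeek's stated OpenAI compatibility, not confirmed by the announcement.
BASE_URL = "https://api.deepseek.com/chat/completions"

def build_chat_request(prompt: str, model: str = "deepseek-v4-flash") -> dict:
    """Build the JSON body for an OpenAI-style chat completion request.
    The model name here is hypothetical."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("Summarize this 500-page report.")
body = json.dumps(payload)
# Send with any HTTP client, e.g.:
#   requests.post(BASE_URL, json=payload,
#                 headers={"Authorization": f"Bearer {API_KEY}"})
```

Anthropic-style callers would instead target a messages-style route with the same payload shape, which is why existing client code can migrate with little more than a base-URL change.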

Flash pricing starts at 0.2 yuan per million tokens for cached input. Pro is more expensive and currently limited by high-end compute capacity, though costs may fall as Huawei's Ascend systems scale.
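At the quoted Flash rate, cached-input cost scales linearly with token count; a back-of-envelope sketch (output-token and uncached rates were not quoted, so they are not modeled here):

```python
# Quoted Flash rate: 0.2 yuan per million tokens of cached input.
CACHED_INPUT_YUAN_PER_MTOK = 0.2

def flash_cached_input_cost(tokens: int) -> float:
    """Return the cached-input cost in yuan for `tokens` input tokens."""
    return tokens / 1_000_000 * CACHED_INPUT_YUAN_PER_MTOK

# Reading a full 1M-token context from cache costs 0.2 yuan.
print(flash_cached_input_cost(1_000_000))
```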

DeepSeek said V4-Pro ranks among the top open-source models in coding and agent benchmarks, approaching leading proprietary systems in some tasks.

The model uses a sparse attention architecture to extend context while reducing compute costs.
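DeepSeek's announcement does not detail V4's attention design. As a generic illustration of the sparse-attention idea — each query attends to only a subset of keys, cutting cost from O(n²) toward O(n·w) — here is a toy sliding-window variant with scalar per-position features:

```python
import math

def sliding_window_attention(q, k, v, window=2):
    """Toy single-head attention where position i attends only to the
    `window` most recent positions (including itself). Inputs are lists
    of floats, one scalar feature per position, for simplicity. This is
    an illustrative sketch of sparse attention generally, not DeepSeek's
    actual architecture."""
    n = len(q)
    out = []
    for i in range(n):
        lo = max(0, i - window + 1)          # left edge of the window
        idx = range(lo, i + 1)
        scores = [q[i] * k[j] for j in idx]  # dot-product scores (dim = 1)
        m = max(scores)                      # stabilize the softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]      # softmax over the window only
        out.append(sum(w * v[j] for w, j in zip(weights, idx)))
    return out
```

With uniform scores the output is just the mean of the windowed values, which makes the compute saving easy to see: each position touches `window` keys instead of all preceding ones.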

The launch lifted China's semiconductor index. Media reports also said Tencent and Alibaba are in talks to invest in DeepSeek at a valuation above $20 billion. The company did not comment.