AI Models//DeepSeek: DeepSeek V4 Flash
Chat

DeepSeek: DeepSeek V4 Flash

deepseek/deepseek-v4-flash
1049KContext Window
4KMax Output
Normal

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads, while maintaining strong reasoning and coding performance. The model includes hybrid attention for efficient long-context processing. Reasoning efforts `high` and `xhigh` are supported; `xhigh` maps to max reasoning. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important.

Capabilities

Text GenerationCode GenerationAnalysis & Reasoning

Technical Specs

Input Modality
Text
Output Modality
Text
Arch

Pricing

Pay per use, no monthly fees
Billing TypeUnitPrice
Text Input$0.1400/M tokens
Text Output$0.2800/M tokens
Cache Read$0.0028/M tokens

Quick Start

from openai import OpenAI

client = OpenAI(
    base_url="https://api.uniontoken.ai/v1",
    api_key="YOUR_UNIONTOKEN_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(response.choices[0].message.content)

FAQ

Ready to get started?

Get 1M free tokens on registration, no monthly fees or minimum spend

Register Now →