NVIDIA: Nemotron Nano 9B V2

nvidia/nemotron-nano-9b-v2

131K Context Window

Supported Protocols: reasoning, include_reasoning, max_tokens, temperature, top_p, stop, frequency_penalty, presence_penalty, repetition_penalty, top_k, seed, min_p, response_format, tools, tool_choice

Normal

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA and designed as a unified model for both reasoning and non-reasoning tasks. By default, it responds to a query by first generating a reasoning trace and then concluding with a final response. This behavior is controlled via the system prompt: if the user prefers a final answer without the intermediate reasoning trace, the model can be configured to skip it.
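The system-prompt control described above can be sketched as a small helper that builds the message list. This is a minimal sketch under stated assumptions: the `/think` and `/no_think` control strings are taken from NVIDIA's upstream model card as best recalled here and are not confirmed by this page, so verify them against this provider before relying on them. `build_messages` is a hypothetical helper name.

```python
def build_messages(user_prompt: str, reasoning: bool = True) -> list[dict]:
    """Build a chat message list that toggles the reasoning trace.

    Assumption: the model reads "/think" or "/no_think" from the system
    prompt, as described in NVIDIA's upstream model card.
    """
    control = "/think" if reasoning else "/no_think"
    return [
        {"role": "system", "content": control},
        {"role": "user", "content": user_prompt},
    ]


# Ask for a direct answer with no intermediate reasoning trace.
messages = build_messages("What is 17 * 23?", reasoning=False)
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create(...)` in the Quick Start.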

Capabilities

Reasoning, Function Calling, Text Generation, Code Generation, Analysis & Reasoning

Technical Specs

Input Modality

Text

Output Modality

Text

Arch

—

Default Temperature

0.7

Default Top_P

Pricing

Pay per use, no monthly fees

Billing Type	Unit	Price
Text Input	—	$0.0400/M tokens
Text Output	—	$0.1600/M tokens
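As a rough illustration of how the per-token rates above turn into a per-request cost, the arithmetic can be sketched as follows. The token counts are made-up example values, and `estimate_cost` is a hypothetical helper, not part of any SDK.

```python
# Rates from the pricing table, in USD per one million tokens.
INPUT_PRICE_PER_M = 0.04
OUTPUT_PRICE_PER_M = 0.16


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens (illustrative values).
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.6f}")
```

Actual token counts for a completed request are reported in the `usage` field of the response.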

Quick Start

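from openai import OpenAI

# Point the OpenAI-compatible client at the UnionToken endpoint.
client = OpenAI(
    base_url="https://api.uniontoken.ai/v1",
    api_key="YOUR_UNIONTOKEN_API_KEY",  # replace with your own key
)

# Send a single-turn chat request to the model.
response = client.chat.completions.create(
    model="nvidia/nemotron-nano-9b-v2",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)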
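Several of the supported parameters listed at the top of this page can be set on the same call. The sketch below only builds the keyword arguments, so it runs without a network connection. All values are illustrative rather than recommended settings, and `build_request` is a hypothetical helper. It assumes that parameters outside the standard OpenAI signature (`top_k`, `min_p`, `repetition_penalty`) are forwarded through the client's `extra_body` argument. How this provider handles them is not documented here.

```python
def build_request(prompt: str) -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": "nvidia/nemotron-nano-9b-v2",
        "messages": [{"role": "user", "content": prompt}],
        # Standard OpenAI-style sampling parameters (illustrative values).
        "temperature": 0.7,
        "top_p": 0.95,
        "max_tokens": 512,
        "seed": 42,
        "stop": ["###"],
        # Assumption: non-standard parameters are passed via extra_body.
        "extra_body": {
            "top_k": 40,
            "min_p": 0.05,
            "repetition_penalty": 1.05,
        },
    }


request = build_request("Summarize the benefits of unit tests in two sentences.")
# With the client from the Quick Start:
# response = client.chat.completions.create(**request)
```

Fixing `seed` together with the sampling parameters is intended to make outputs more repeatable, though exact determinism is generally not guaranteed by hosted endpoints.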