Model catalog profile

NVIDIA: Llama 3.1 Nemotron… | pricing profile

Start with NVIDIA: Llama 3.1 Nemotron… input $0.60, output $1.8, context 131,072, and provider NVIDIA, then decide whether this pricing path is worth exploring further.

Provider

NVIDIA

11 models in catalog

Context

131,072

global model route

Input

$0.60

USD / 1M tokens

Output

$1.8

USD / 1M tokens

What this page answers

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-

NVIDIA · nvidia/llama-3.1-nemotron-ultra-253b-v1
text->text · global model route
131,072 context · $0.60 input

Before connecting

Do not stop at the model name. Before integration, verify base URL, protocol, visible models, parameters, and limits together.

supports frequency_penalty
supports include_reasoning
supports max_tokens
supports presence_penalty
supports reasoning

Next action

After this page, the usual next step is to open the model profile, the provider profile, or run a key check to confirm the real callable range.

Check whether the model fits the use case
Then verify key permission and callable models

Continue

nvidia/llama-3.1-nemotron-ultra-253b-v1