Model catalog profile

NVIDIA: Llama 3.1 Nemotron… | model profile

Use this NVIDIA: Llama 3.1 Nemotron… profile to review provider, region, modality, pricing, and the next route before deciding whether to keep exploring the model.

Provider

NVIDIA

11 models in catalog

Context

131,072

global model route

Input

$0.60

USD / 1M tokens

Output

$1.8

USD / 1M tokens

What this page answers

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-

NVIDIA · nvidia/llama-3.1-nemotron-ultra-253b-v1
text->text · global model route
131,072 context · $0.60 input

Before connecting

Do not stop at the model name. Before integration, verify base URL, protocol, visible models, parameters, and limits together.

supports frequency_penalty
supports include_reasoning
supports max_tokens
supports presence_penalty
supports reasoning

Next action

After this page, the usual next step is to open the model profile, the provider profile, or run a key check to confirm the real callable range.

Check whether the model fits the use case
Then verify key permission and callable models

Continue

nvidia/llama-3.1-nemotron-ultra-253b-v1