Model catalog profile

NVIDIA: Llama 3.3 Nemotron… | API profile

Start with NVIDIA: Llama 3.3 Nemotron… model ID, modality, and parameter surface, then decide whether to continue to a key check or integration test.

Provider

NVIDIA

11 models in catalog

Context

131,072

global model route

Input

$0.10

USD / 1M tokens

Output

$0.40

USD / 1M tokens

What this page answers

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via S

NVIDIA · nvidia/llama-3.3-nemotron-super-49b-v1.5
text->text · global model route
131,072 context · $0.10 input

Before connecting

Do not stop at the model name. Before integration, verify base URL, protocol, visible models, parameters, and limits together.

supports frequency_penalty
supports include_reasoning
supports max_tokens
supports min_p
supports presence_penalty

Next action

After this page, the usual next step is to open the model profile, the provider profile, or run a key check to confirm the real callable range.

Check whether the model fits the use case
Then verify key permission and callable models

Continue

nvidia/llama-3.3-nemotron-super-49b-v1.5