What this page answers
Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via S
- NVIDIA · nvidia/llama-3.3-nemotron-super-49b-v1.5
- text->text · global model route
- 131,072 context · $0.10 input
Before connecting
Do not stop at the model name. Before integration, verify base URL, protocol, visible models, parameters, and limits together.
- supports frequency_penalty
- supports include_reasoning
- supports max_tokens
- supports min_p
- supports presence_penalty
Next action
The goal is to catch search demand, then move users into model profiles, provider profiles, and key checking.
- Check whether the model fits the use case
- Then verify key permission and callable models