o10

Last updated 2026-06-14

nvidia models

5 models from nvidia in the unified inference model catalog.

Browse nvidia models with gateway pricing, provider endpoints, and eval-gated routing guidance.

Dashboards observe.
o10 enforces.

Every page in this index follows the same structure as the home site — answer-first, passage blocks, operational steps, and expanded FAQs. o10 State of Inference Spend 2026 found up to 638× compliant price spread across venues for identical workloads.

Start here

Quick overview

How to use this index

What nvidia models are in the catalog?

5 models from nvidia are listed with per-model pricing and endpoint data.

nvidia

5 models

Nemotron 3 Nano 30B A3B

text · nvidia/nemotron-3-nano-30b-a3b

NVIDIA Nemotron 3 Super 120B A12B

text · nvidia/nemotron-3-super-120b-a12b

Nemotron 3 Ultra

text · nvidia/nemotron-3-ultra-550b-a55b

Nvidia Nemotron Nano 12B V2 VL

text · nvidia/nemotron-nano-12b-v2-vl

Nvidia Nemotron Nano 9B V2

text · nvidia/nemotron-nano-9b-v2

FAQ

Frequently asked questions

Common questions

How many nvidia models are listed?

5 models in the o10 gateway catalog snapshot.

Where does the pricing come from?

Pricing comes from the gateway catalog snapshot. Verify it against your gateway provider's published pricing.

How does o10 route nvidia models?

o10 selects the cheapest nvidia tier that clears your per-use-case eval floor, starting in shadow mode.
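The selection rule in this answer can be sketched in a few lines of Python. This is a minimal illustration, not o10's implementation: the `Candidate` type, both functions, and every price and eval score are hypothetical placeholders, and only the model identifiers come from the catalog list above. It also assumes "shadow mode" means the pick is recorded while traffic stays on the current model.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    model_id: str          # catalog identifier from this page
    price_per_mtok: float  # PLACEHOLDER price, not real catalog pricing
    eval_score: float      # PLACEHOLDER score from your own eval, in [0, 1]


def cheapest_clearing_floor(
    candidates: Sequence[Candidate], eval_floor: float
) -> Optional[Candidate]:
    """Return the cheapest candidate whose eval score meets the floor, or None."""
    eligible = [c for c in candidates if c.eval_score >= eval_floor]
    return min(eligible, key=lambda c: c.price_per_mtok, default=None)


def route(
    candidates: Sequence[Candidate],
    eval_floor: float,
    current_model_id: str,
    shadow: bool = True,
) -> Tuple[str, Optional[Candidate]]:
    """Return (model to serve, recommended candidate).

    Assumption: in shadow mode the recommendation is only reported,
    and traffic keeps going to the current model.
    """
    pick = cheapest_clearing_floor(candidates, eval_floor)
    if pick is None or shadow:
        return current_model_id, pick
    return pick.model_id, pick


# Illustrative data only: the numbers below are made up for the sketch.
EXAMPLE_CANDIDATES = [
    Candidate("nvidia/nemotron-nano-9b-v2", 0.10, 0.70),
    Candidate("nvidia/nemotron-3-nano-30b-a3b", 0.20, 0.82),
    Candidate("nvidia/nemotron-3-super-120b-a12b", 0.60, 0.90),
]

if __name__ == "__main__":
    served, recommended = route(
        EXAMPLE_CANDIDATES,
        eval_floor=0.80,
        current_model_id="nvidia/nemotron-3-super-120b-a12b",
    )
    print("serving:", served)
    print("recommended:", recommended.model_id if recommended else None)
```

Keeping the floor per use case means the same candidate list can yield different picks for, say, summarization and code review; when nothing clears the floor, the sketch leaves the current model in place.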

o10

Set the envelope. o10 holds it.

See what you're overpaying.

Paste a week of traffic. Get the number that books the audit.
