Active filters: modelopt
nvidia/Qwen3-Next-80B-A3B-Instruct-NVFP4 • Text Generation • Updated • 15.8k • 21
nvidia/Qwen3-Next-80B-A3B-Thinking-NVFP4 • Text Generation • Updated • 29k • 26
lukealonso/MiniMax-M2.1-NVFP4 • 115B • Updated • 27.1k • 21
nvidia/Qwen3-30B-A3B-NVFP4 • Text Generation • 16B • Updated • 25.2k • 23
Text Generation • 177B • Updated • 4.58k • 13
Text Generation • 177B • Updated • 1.95k • 6
nvidia/Llama-4-Scout-17B-16E-Instruct-FP8 • 109B • Updated • 200k • 11
NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4 • Text Generation • 16B • Updated • 7.16k • 7
Text Generation • 5B • Updated • 8.15k • 13
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4 • Text Generation • 5B • Updated • 4.5k • 13
nvidia/Kimi-K2-Thinking-NVFP4 • Text Generation • Updated • 31.2k • 20
nvidia/Qwen3-235B-A22B-Thinking-2507-NVFP4 • Text Generation • 120B • Updated • 107 • 1
nvidia/Qwen3-235B-A22B-Instruct-2507-NVFP4 • Text Generation • 120B • Updated • 235 • 1
soundsgoodai/GLM-4.7-NVFP4-KV-cache-FP8 • Text Generation • 177B • Updated • 799 • 1
nvidia/Qwen3-VL-235B-A22B-Instruct-NVFP4-MLPerf-Inference-Closed-V6.0 • 119B • Updated • 4.93k • 1
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4 • 56B • Updated • 35.4k • 19
nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8 • 402B • Updated • 182 • 12
ishan24/test_modelopt_quant
nvidia/Llama-4-Maverick-17B-128E-Eagle3 • Updated • 22 • 9
jiangchengchengNLP/L3.3-MS-Nevoria-70b-FP8 • Text Generation • 71B • Updated • 4
NVFP4/Qwen3-30B-A3B-Instruct-2507-FP4 • Text Generation • 16B • Updated • 1.18k • 11
gesong2077/Qwen3-32B-NVFP4 • 19B • Updated • 1
54B • Updated
nvidia/Phi-4-multimodal-instruct-NVFP4 • 4B • Updated • 2.64k • 6
nvidia/Phi-4-multimodal-instruct-FP8 • 6B • Updated • 31.4k • 4
nvidia/Phi-4-reasoning-plus-FP8 • 15B • Updated • 540 • 3
nvidia/Phi-4-reasoning-plus-NVFP4 • 8B • Updated • 7.03k • 6
nvidia/Llama-3.1-8B-Instruct-NVFP4 • 5B • Updated • 87.6k • 6
Text Generation • 8B • Updated • 4.6k • 3
Text Generation • 8B • Updated • 17k • 5