Models: 12

Active filters: meta-llama/Llama-2-70b-chat-hf

TheBloke/Llama-2-70B-Chat-GPTQ

Text Generation • 69B • Updated Sep 27, 2023 • 1.17k • 258

TheBloke/Llama-2-70B-Chat-GGUF

Text Generation • 69B • Updated Nov 21, 2023 • 1.6k • 121

TheBloke/Llama-2-70B-Chat-AWQ

Text Generation • 69B • Updated Nov 9, 2023 • 314k • 23

jamesdborin/llama2-70b-chat-4bit-AWQ

Text Generation • 69B • Updated Dec 11, 2023 • 3 • 1

mlc-ai/Llama-2-70b-chat-hf-q4f16_1-MLC

Updated Jul 11, 2024 • 1 • 2

softmax/Llama-2-70b-chat-hf-marlin

Text Generation • 10B • Updated Mar 17, 2024 • 1

strangervb/Llama-2-70B-Chat-GPTQ-2

Text Generation • 69B • Updated Mar 25, 2024 • 4

TitanML/llama2-70b-chat-4bit-AWQ

Text Generation • 69B • Updated Apr 24, 2024

espressor/meta-llama.Llama-2-70b-chat-hf_W8A8_FP8

Text Generation • 69B • Updated Nov 28, 2024 • 1

amd/Llama-2-70b-chat-hf_FP8_MLPerf_V2

69B • Updated Mar 27, 2025 • 5

amd/Llama-2-70b-chat-hf-WMXFP4-AMXFP4-KVFP8-Scale-UINT8-MLPerf-GPTQ

37B • Updated Aug 5, 2025 • 43

amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8

55B • Updated Sep 26, 2025 • 29