amd/Gemma-3-4b-it-mm-onnx-ryzenai-npu

  • Introduction

    This model was created by quantizing the decoder with Quark, building the ONNX package with the OGA (ONNX Runtime GenAI) Model Builder, and finalizing it with post-processing for NPU deployment.
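
    For reference, a minimal text-only generation sketch with onnxruntime-genai (OGA) is shown below. The local model path, search options, and the exact Generator calls are assumptions; the OGA Python API differs between releases, and the Ryzen AI NPU execution path may need additional provider configuration.

```python
# Minimal text-only generation sketch with onnxruntime-genai (OGA).
# Assumptions: the model folder is available locally and the Generator API
# matches recent onnxruntime-genai releases (older releases set input ids on
# GeneratorParams instead of calling append_tokens()).
import onnxruntime_genai as og

model = og.Model("./Gemma-3-4b-it-mm-onnx-ryzenai-npu")   # hypothetical local path
tokenizer = og.Tokenizer(model)

prompt = "Explain 4-bit weight quantization in one sentence."
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(input_tokens)

output_tokens = []
while not generator.is_done():
    generator.generate_next_token()
    output_tokens.append(int(generator.get_next_tokens()[0]))

print(tokenizer.decode(output_tokens))
```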

  • Quantization Strategy

    • AWQ / group size 128 / asymmetric / BFP16 activations / UINT4 weights (see the sketch after this list)
  • Base model info

    • Please refer to Gemma-3-4b-it for base model info.
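
The NumPy sketch below illustrates what group-wise (group size 128) asymmetric UINT4 weight quantization means numerically. It is only an illustration of the number format, not the Quark/AWQ implementation, which additionally applies activation-aware scaling before quantizing (and packs two 4-bit values per byte).

```python
# Group-wise asymmetric UINT4 weight quantization, illustrated with NumPy.
# Each group of 128 weights gets its own scale and zero point; values are
# rounded to the 0..15 range and can be dequantized back approximately.
import numpy as np

def quantize_uint4_groupwise(w, group_size=128):
    """Quantize a 1-D float weight vector to UINT4 with per-group scale/zero-point."""
    groups = w.reshape(-1, group_size)                 # one row per group of 128 weights
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0                     # 4-bit unsigned range: 0..15
    zero_point = np.round(-w_min / scale)              # asymmetric: group minimum maps to 0
    q = np.clip(np.round(groups / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(4096).astype(np.float32)
q, s, z = quantize_uint4_groupwise(w)
w_hat = dequantize(q, s, z).reshape(-1)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```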

Evaluation scores

  • MMMU scores by subject: Music 33.33, Marketing 40, Math 30, Clinical Medicine 20, History 40, Electronics 13.3.
  • Perplexity is measured on the wikitext-2-raw-v1 (raw data) dataset provided by Hugging Face; the score at a prompt length of 2k tokens is 16.825.
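
A sketch of this kind of perplexity measurement is shown below: wikitext-2-raw-v1 is chunked into 2048-token windows and the average negative log-likelihood is exponentiated. It uses the upstream Hugging Face base model via transformers as a stand-in; AMD's actual harness for the ONNX/NPU build, and the exact windowing, may differ.

```python
# Sketch: windowed perplexity on wikitext-2-raw-v1 at a 2k context length.
# Assumptions: the base-model id and loading class are illustrative; the
# multimodal Gemma 3 checkpoint may need Gemma3ForConditionalGeneration instead.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-4b-it"   # base model as a stand-in for the quantized build
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids[0]

seq_len = 2048                      # "prompt length 2k"
nlls, n_tokens = [], 0
with torch.no_grad():
    for start in range(0, ids.size(0) - seq_len, seq_len):
        chunk = ids[start:start + seq_len].unsqueeze(0)
        out = model(chunk, labels=chunk)        # loss = mean NLL over the window
        nlls.append(out.loss * (seq_len - 1))   # undo the mean to sum per-token NLL
        n_tokens += seq_len - 1

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"perplexity @ {seq_len}: {ppl.item():.3f}")
```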

License

Modifications copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
