amd/Gemma-3-4b-it-mm-onnx-ryzenai-npu

  • Introduction

    This model was created by quantizing the decoder with Quark, building the ONNX package with the OGA (ONNX Runtime GenAI) Model Builder, and finalizing it with post-processing for NPU deployment.
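
    For reference, a minimal text-only generation sketch with onnxruntime-genai (OGA) is shown below. The local model path, search options, and the exact Generator calls are assumptions; the OGA Python API differs between releases, and the Ryzen AI NPU execution path may need additional provider configuration.

```python
# Minimal text-only generation sketch with onnxruntime-genai (OGA).
# Assumptions: the model folder is available locally and the Generator API
# matches recent onnxruntime-genai releases (older releases set input ids on
# GeneratorParams instead of calling append_tokens()).
import onnxruntime_genai as og

model = og.Model("./Gemma-3-4b-it-mm-onnx-ryzenai-npu")   # hypothetical local path
tokenizer = og.Tokenizer(model)

prompt = "Explain 4-bit weight quantization in one sentence."
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(input_tokens)

output_tokens = []
while not generator.is_done():
    generator.generate_next_token()
    output_tokens.append(int(generator.get_next_tokens()[0]))

print(tokenizer.decode(output_tokens))
```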

  • Quantization Strategy

    • AWQ / group size 128 / asymmetric / BFP16 activations / UINT4 weights (see the sketch after this list)
  • Base model info

    • Please refer to Gemma-3-4b-it for base model info.
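
The NumPy sketch below illustrates what group-wise (group size 128) asymmetric UINT4 weight quantization means numerically. It is only an illustration of the number format, not the Quark/AWQ implementation, which additionally applies activation-aware scaling before quantizing (and packs two 4-bit values per byte).

```python
# Group-wise asymmetric UINT4 weight quantization, illustrated with NumPy.
# Each group of 128 weights gets its own scale and zero point; values are
# rounded to the 0..15 range and can be dequantized back approximately.
import numpy as np

def quantize_uint4_groupwise(w, group_size=128):
    """Quantize a 1-D float weight vector to UINT4 with per-group scale/zero-point."""
    groups = w.reshape(-1, group_size)                 # one row per group of 128 weights
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0                     # 4-bit unsigned range: 0..15
    zero_point = np.round(-w_min / scale)              # asymmetric: group minimum maps to 0
    q = np.clip(np.round(groups / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(4096).astype(np.float32)
q, s, z = quantize_uint4_groupwise(w)
w_hat = dequantize(q, s, z).reshape(-1)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```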

Evaluation scores

  • MMMU scores by subject: Music 33.33, Marketing 40, Math 30, Clinical Medicine 20, History 40, Electronics 13.3.
  • Perplexity is measured on the wikitext-2-raw-v1 (raw data) dataset provided by Hugging Face; the score at a prompt length of 2k tokens is 16.825.
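
A sketch of this kind of perplexity measurement is shown below: wikitext-2-raw-v1 is chunked into 2048-token windows and the average negative log-likelihood is exponentiated. It uses the upstream Hugging Face base model via transformers as a stand-in; AMD's actual harness for the ONNX/NPU build, and the exact windowing, may differ.

```python
# Sketch: windowed perplexity on wikitext-2-raw-v1 at a 2k context length.
# Assumptions: the base-model id and loading class are illustrative; the
# multimodal Gemma 3 checkpoint may need Gemma3ForConditionalGeneration instead.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-4b-it"   # base model as a stand-in for the quantized build
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids[0]

seq_len = 2048                      # "prompt length 2k"
nlls, n_tokens = [], 0
with torch.no_grad():
    for start in range(0, ids.size(0) - seq_len, seq_len):
        chunk = ids[start:start + seq_len].unsqueeze(0)
        out = model(chunk, labels=chunk)        # loss = mean NLL over the window
        nlls.append(out.loss * (seq_len - 1))   # undo the mean to sum per-token NLL
        n_tokens += seq_len - 1

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"perplexity @ {seq_len}: {ppl.item():.3f}")
```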

License

Modifications copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
