DISCLAIMER: Not thoroughly tested; may not work at all yet.

Original model link: https://huggingface.co/inclusionAI/Ling-flash-2.0-GGUF:Q4_K_M

name: Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4
base_model: inclusionAI/Ling-flash-base-2.0
license: mit
pipeline_tag: text-generation
tasks: text-generation
language:
- en
- zh
get_started_code: uvx --from mlx-lm mlx_lm.generate --model exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 --prompt 'Test Prompt'

Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4

Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 is a Mixture-of-Experts (MoE) LLM based on the bailing_moe architecture.

[Image: architecture of the model]

MLX is an array framework for machine learning on Apple silicon. It runs on Apple computers with ARM M-series processors (M1/M2/M3/M4) and uses Metal for GPU acceleration.

Generation using uv (https://docs.astral.sh/uv/):

uvx --from mlx-lm mlx_lm.generate --model exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 --prompt 'Test Prompt'
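The command above passes each flag exactly once; repeating `--model` or `--prompt` is an easy copy-paste mistake. As a minimal sketch, the same invocation can be assembled from Python so the argument list is built in one place. The helper name `build_generate_command` and the `runner` parameter are hypothetical, introduced only for illustration; `--model`, `--prompt`, and `--max-tokens` are flags of `mlx_lm.generate`.

```python
import shlex


def build_generate_command(model, prompt, max_tokens=None,
                           runner=("uvx", "--from", "mlx-lm")):
    """Return the argv list for one mlx_lm.generate call.

    `runner` is the launcher prefix (hypothetical parameter); the default
    matches the uvx invocation shown in this README.
    """
    argv = [*runner, "mlx_lm.generate", "--model", model, "--prompt", prompt]
    if max_tokens is not None:
        # Optional cap on the number of generated tokens.
        argv += ["--max-tokens", str(max_tokens)]
    return argv


# Model identifier copied verbatim from this README.
MODEL_ID = "exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4"

command = build_generate_command(MODEL_ID, "Test Prompt", max_tokens=64)
# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
```

To actually run it, the list can be handed to `subprocess.run(command, check=True)` on a machine where uv is installed; this is left out of the sketch because it would download the model.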

Generation using pipx:

pipx run --spec mlx-lm mlx_lm.generate --model exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 --prompt 'Test Prompt'

Safetensors

Model size: 103B params

Tensor type: U32, BF16, F32
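For a rough sense of whether the weights fit in unified memory, the listed parameter count can be turned into a back-of-the-envelope size estimate. The sketch below takes the 103B figure at face value and assumes 4-bit group quantization with a group size of 64 and one 16-bit scale plus one 16-bit bias per group (32 overhead bits per group). Those settings are assumptions, not values confirmed by this model card, and the BF16 and F32 tensors listed above are not accounted for. The function name `estimate_weight_gb` is hypothetical.

```python
def estimate_weight_gb(n_params, bits=4, group_size=64,
                       overhead_bits_per_group=32):
    """Estimate quantized weight size in GB (1 GB = 1e9 bytes).

    Assumes every parameter is quantized to `bits` bits and that each
    group of `group_size` weights carries `overhead_bits_per_group`
    extra bits for its scale and bias (assumed settings).
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8 / 1e9


# Parameter count as listed in this README.
N_PARAMS = 103e9

print(f"approx. weight size: {estimate_weight_gb(N_PARAMS):.1f} GB")
```

The result covers weights only; the KV cache and activations need additional memory at generation time.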

Model tree for exdysa/Ling-flash-2.0-MLX-MXFP4

Base model: inclusionAI/Ling-flash-base-2.0

Quantized (4): this model