DISCLAIMER: Not thoroughly tested, may not yet work at all!!

Original model: https://huggingface.co/inclusionAI/Ling-flash-2.0-GGUF:Q4_K_M

---
name: Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4
base_model: inclusionAI/Ling-flash-base-2.0
license: mit
pipeline_tag: text-generation
tasks: text-generation
language:
- en
- zh
get_started_code: uvx --from mlx-lm mlx_lm.generate --model exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 --prompt 'Test Prompt'
---

Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4

Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 is a ~103B-parameter MoE (Mixture-of-Experts) LLM based on the bailing_moe architecture.

(image: diagram outlining the architecture of the model)

MLX is a machine-learning framework that uses the Metal graphics API on Apple computers with ARM M-series processors (M1/M2/M3/M4).

Generation using uv (https://docs.astral.sh/uv/):

uvx --from mlx-lm mlx_lm.generate --model exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 --prompt 'Test Prompt'

Generation using pip:

pip install mlx-lm
mlx_lm.generate --model exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4 --prompt 'Test Prompt'
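Beyond the CLI, mlx-lm also exposes a small Python API for loading and sampling from a model. A minimal sketch (assumes `pip install mlx-lm`, Apple-silicon hardware, and enough memory for the ~103B weights; the prompt text is just a placeholder):

```python
from mlx_lm import load, generate

# Downloads (or reuses cached) weights and tokenizer for this repo
model, tokenizer = load("exdysa/Ling-flash-2.0-GGUF:Q4_K_M-MLX-Q4")

# Generate a completion for a plain string prompt
text = generate(model, tokenizer, prompt="Test Prompt", max_tokens=256)
print(text)
```

For chat-style use, format the prompt with `tokenizer.apply_chat_template` before calling `generate` so the model sees its expected turn markers.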
