Sarvam-1-VL-4B-Instruct - VLLM (Merged)

Model Description

This is the recommended version for inference: the fully merged 16-bit model combining Qwen/Qwen3-VL-4B-Instruct with the trained LoRA weights.

Training Details

  • Base Model: Qwen/Qwen3-VL-4B-Instruct
  • Training Method: LoRA fine-tuning
  • Format: Merged 16-bit weights
  • Training Steps: 2,000
  • Final Loss: 6.25

Datasets

Trained on 4 datasets covering:

  • Translation (40%): BPCC - 22 Indic languages ↔ English
  • Instruction Following (20%): Pralekha - 11 language pairs
  • Document Layout (30%): IndicDLP - Document understanding
  • Visual QA (10%): DocVQA - Question answering

Supported Languages

Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Marathi, Manipuri, Nepali, Odia, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu, Urdu, English

Usage

from transformers import Qwen3VLForConditionalGeneration, Qwen3VLProcessor
from PIL import Image

# Load model
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "mashriram/Sarvam-1-VL-4B-Instruct-VLLM",
    torch_dtype="auto",
    device_map="auto"
)
processor = Qwen3VLProcessor.from_pretrained("mashriram/Sarvam-1-VL-4B-Instruct-VLLM")

# Prepare input
image = Image.open("document.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Translate this document from English to Hindi."}
        ]
    }
]

# Generate
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the echoed prompt
generated = outputs[0][inputs.input_ids.shape[1]:]
print(processor.decode(generated, skip_special_tokens=True))

Performance

  • Inference Speed: Optimized for vLLM serving
  • Memory: ~8-9GB VRAM (fp16)
  • Quality: Balanced accuracy and speed
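Since the merged 16-bit weights are intended for vLLM serving, a minimal serving sketch is shown below. This assumes a recent vLLM build with Qwen3-VL support; the flag values are illustrative, not tuned recommendations.

```shell
# Serve the merged model behind vLLM's OpenAI-compatible API
# (illustrative flags; adjust dtype and context length to your hardware)
vllm serve mashriram/Sarvam-1-VL-4B-Instruct-VLLM \
    --dtype bfloat16 \
    --max-model-len 8192
```

Once the server is up, the model can be queried through any OpenAI-compatible client pointed at the server's `/v1` endpoint.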

License

Apache 2.0
