# SmolLM3-Instruct Discharge Sentences SFT
A fine-tuned version of SmolLM3-3B for multi-label clinical sentence classification in hospital discharge summaries.
## Model Description
This model classifies individual sentences from discharge summaries into one or more of the following follow-up action categories:
- instructions: Case-specific instructions for the patient
- appointment: Appointment-related follow-up
- medication: Medication-related follow-up
- lab: Lab-related follow-up
- procedure: Procedure-related follow-up
- imaging: Imaging-related follow-up
- other: Other helpful contextual information
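Because the task is multi-label, a single sentence can receive several labels at once; the model returns all applicable categories as a JSON array (the same schema shown in the Usage section below). For example, "Please take aspirin 81mg daily." is both a medication follow-up and a case-specific instruction:

```json
{"categories": ["medication", "instructions"]}
```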
## Performance
Evaluated on 5,313 test samples:
| Metric | Score |
|---|---|
| JSON Validity | 100.0% |
| Exact Match Accuracy | 85.6% |
| Micro F1 | 0.796 |
| Macro F1 | 0.647 |
| Micro Precision | 0.854 |
| Micro Recall | 0.745 |
### Per-Category Performance
| Category | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| instructions | 0.873 | 0.745 | 0.804 | 1153 |
| appointment | 0.848 | 0.902 | 0.874 | 660 |
| medication | 0.797 | 0.674 | 0.730 | 239 |
| lab | 0.909 | 0.606 | 0.727 | 132 |
| procedure | 0.700 | 0.400 | 0.509 | 35 |
| imaging | 0.909 | 0.541 | 0.678 | 37 |
| other | 0.526 | 0.127 | 0.204 | 79 |
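The aggregate and per-category figures can be reproduced with scikit-learn once gold labels and predictions are binarised over the seven categories. A minimal sketch follows; the variable names and example data are illustrative, not the actual evaluation script:

```python
from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer

CATEGORIES = ["instructions", "appointment", "medication", "lab", "procedure", "imaging", "other"]

# Illustrative gold labels and predictions: one list of categories per sentence
y_true = [["medication", "instructions"], ["appointment"], ["lab"]]
y_pred = [["medication", "instructions"], ["appointment"], []]

# Binarise the multi-label annotations into (n_samples, n_categories) indicator matrices
mlb = MultiLabelBinarizer(classes=CATEGORIES)
t, p = mlb.fit_transform(y_true), mlb.transform(y_pred)

print("Micro F1:", f1_score(t, p, average="micro", zero_division=0))
print("Macro F1:", f1_score(t, p, average="macro", zero_division=0))
print("Per-category F1:", dict(zip(CATEGORIES, f1_score(t, p, average=None, zero_division=0))))
```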
## Comparison with Base SFT Model
Compared with an SFT model trained from the non-instruction-tuned SmolLM3-3B base (F1 and recall improvements are relative, not percentage points):
| Metric | Base SFT | Instruct SFT | Improvement |
|---|---|---|---|
| JSON Validity | 97.5% | 100.0% | +2.5 pp |
| Exact Match | 76.3% | 85.6% | +9.3 pp |
| Micro F1 | 0.631 | 0.796 | +26% |
| Macro F1 | 0.568 | 0.647 | +14% |
| Micro Recall | 0.553 | 0.745 | +35% |
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and apply the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM3-3B",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "chrisvoncsefalvay/smollm3-instruct-discharge-sentences-sft")
tokenizer = AutoTokenizer.from_pretrained("chrisvoncsefalvay/smollm3-instruct-discharge-sentences-sft")

# Prepare input with the standard chat template
messages = [
    {"role": "system", "content": "You are a clinical action item classifier..."},
    {"role": "user", "content": "Classify this sentence:\n\nPlease take aspirin 81mg daily."},
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate; do_sample=True is needed for the low temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.1)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
# {"categories": ["medication", "instructions"]}
```
## Training Details
- Base Model: HuggingFaceTB/SmolLM3-3B (instruction-tuned)
- Method: LoRA (r=64, alpha=128, dropout=0.1)
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Dataset: chrisvoncsefalvay/smol-discharge-sentences-sft
- Training Samples: 25,782
- Epochs: 3
- Learning Rate: 5e-5
- Effective Batch Size: 16
- Precision: bf16
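The LoRA hyperparameters above map onto a peft `LoraConfig` roughly as follows. This is a sketch for orientation, not the exact training script:

```python
from peft import LoraConfig

# LoRA configuration mirroring the hyperparameters listed above
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```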
## Advantages Over Base Model
Using the instruction-tuned SmolLM3-3B as a base provides:
- Perfect JSON validity (100%): the built-in chat template ensures reliable structured output
- Significantly higher recall (+35% relative): better at identifying follow-up action items
- No custom template needed: uses the standard chat format
## Citation
If you use this model, please cite:
```bibtex
@misc{smollm3-instruct-discharge-sft,
  author    = {von Csefalvay, Chris},
  title     = {SmolLM3-Instruct Discharge Sentences SFT},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/chrisvoncsefalvay/smollm3-instruct-discharge-sentences-sft}
}
```
## License
Apache 2.0