Requirements:

pip install opencv-python
pip install albumentations
pip install accelerate
pip install torch==2.2.1
pip install transformers==4.39.0  # may work with more recent versions
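
A quick sanity check like the sketch below (nothing model-specific, just the standard torch/transformers version attributes) confirms the pinned versions are installed and that a CUDA device is visible:

import torch
import transformers

# print installed versions and check for a GPU
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())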

Adapted sample script for SRRG (Structured Radiology Report Generation):

import tempfile

import requests
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# step 1: Set up constants
model_name = "StanfordAIMI/CheXagent-2-3b-srrg-findings"
dtype = torch.bfloat16
device = "cuda"
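# note: bfloat16 needs a fairly recent GPU; if your hardware does not support it,
# swap dtype for torch.float16 (or torch.float32 when running on CPU)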

# step 2: Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)
model = model.to(dtype)
model.eval()

# step 3: Download image from URL, save to a local file, and prepare path list
url = "https://huggingface.co/IAMJB/interpret-cxr-impression-baseline/resolve/main/effusions-bibasal.jpg"
resp = requests.get(url)
resp.raise_for_status()

# Use a NamedTemporaryFile so it lives on disk
with tempfile.NamedTemporaryFile(delete=False, suffix=".jpg") as tmpfile:
    tmpfile.write(resp.content)
    local_path = tmpfile.name  # this is a real file path on disk

paths = [local_path]

prompt = "Structured Radiology Report Generation for Findings Section"
# build the multimodal input
query = tokenizer.from_list_format(
    [{"image": img} for img in paths] + [{"text": prompt}]
)

# format as a chat conversation
conv = [
    {"from": "system", "value": "You are a helpful assistant."},
    {"from": "human", "value": query},
]

# tokenize and generate
input_ids = tokenizer.apply_chat_template(
    conv, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(
    input_ids.to(device),
    do_sample=False,
    num_beams=1,
    temperature=1.0,
    top_p=1.0,
    use_cache=True,
    max_new_tokens=512,
)[0]

# decode only the newly generated findings text (the slice drops the prompt tokens and the trailing EOS token)
response = tokenizer.decode(output[input_ids.size(1) : -1])
print(response)
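
Because the query builder simply iterates over a list of paths, the same pipeline handles studies with several views. The sketch below uses placeholder local file names (frontal_view.jpg and lateral_view.jpg are assumptions, not files shipped with the model) and also removes the temporary download, which is not cleaned up automatically because NamedTemporaryFile was opened with delete=False:

import os

# remove the temporary file downloaded above
os.unlink(local_path)

# list every view of the study; these file names are placeholders
paths = ["frontal_view.jpg", "lateral_view.jpg"]
query = tokenizer.from_list_format(
    [{"image": img} for img in paths] + [{"text": prompt}]
)

The chat formatting, apply_chat_template call, and generate call then stay exactly as in the script above.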

Response:

Lungs and Airways:
- No evidence of pneumothorax.

Pleura:
- Bilateral pleural effusions.

Cardiovascular:
- Cardiomegaly.

Other:
- Bibasilar opacities.
- Mild pulmonary edema.
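
Since the generated findings follow a fixed layout (a section header ending in a colon, followed by "- " bullet lines), the decoded response string is straightforward to post-process. The sketch below is a minimal parser written under that assumption; parse_findings is a hypothetical helper, not part of the model's API:

def parse_findings(report: str) -> dict:
    """Split a structured findings report into {section: [observations]}."""
    sections, current = {}, None
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):            # section header, e.g. "Pleura:"
            current = line[:-1]
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:])
    return sections

print(parse_findings(response))
# e.g. {'Lungs and Airways': ['No evidence of pneumothorax.'], 'Pleura': ['Bilateral pleural effusions.'], ...}
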
Citation:

@inproceedings{delbrouck-etal-2025-automated,
    title = "Automated Structured Radiology Report Generation",
    author = "Delbrouck, Jean-Benoit  and
      Xu, Justin  and
      Moll, Johannes  and
      Thomas, Alois  and
      Chen, Zhihong  and
      Ostmeier, Sophie  and
      Azhar, Asfandyar  and
      Li, Kelvin Zhenghao  and
      Johnston, Andrew  and
      Bluethgen, Christian  and
      Reis, Eduardo Pontes  and
      Muneer, Mohamed S  and
      Varma, Maya  and
      Langlotz, Curtis",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.1301/",
    doi = "10.18653/v1/2025.acl-long.1301",
    pages = "26813--26829",
    ISBN = "979-8-89176-251-0",
    abstract = "Automated radiology report generation from chest X-ray (CXR) images has the potential to improve clinical efficiency and reduce radiologists' workload. However, most datasets, including the publicly available MIMIC-CXR and CheXpert Plus, consist entirely of free-form reports, which are inherently variable and unstructured. This variability poses challenges for both generation and evaluation: existing models struggle to produce consistent, clinically meaningful reports, and standard evaluation metrics fail to capture the nuances of radiological interpretation. To address this, we introduce Structured Radiology Report Generation (SRRG), a new task that reformulates free-text radiology reports into a standardized format, ensuring clarity, consistency, and structured clinical reporting. We create a novel dataset by restructuring reports using large language models (LLMs) following strict structured reporting desiderata. Additionally, we introduce SRR-BERT, a fine-grained disease classification model trained on 55 labels, enabling more precise and clinically informed evaluation of structured reports. To assess report quality, we propose F1-SRR-BERT, a metric that leverages SRR-BERT{'}s hierarchical disease taxonomy to bridge the gap between free-text variability and structured clinical reporting. We validate our dataset through a reader study conducted by five board-certified radiologists and extensive benchmarking experiments."
}