CheXagent
Requirements:

```shell
pip install opencv-python
pip install albumentations
pip install accelerate
pip install torch==2.2.1
pip install transformers==4.39.0  # may work with a more recent version
```
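Before running the sample script, it can help to verify that all required packages are importable. A minimal sketch (not part of the official release; note the import names differ from the pip package names, e.g. opencv-python imports as cv2):

```python
import importlib.util

# map pip package name -> import name
required = {
    "opencv-python": "cv2",
    "albumentations": "albumentations",
    "accelerate": "accelerate",
    "torch": "torch",
    "transformers": "transformers",
}

# find_spec returns None when a top-level module is not installed
missing = [pkg for pkg, mod in required.items()
           if importlib.util.find_spec(mod) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All requirements found.")
```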
Adapted sample script for SRRG
```python
import io
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer
import tempfile

# step 1: Setup constants
model_name = "StanfordAIMI/CheXagent-2-3b-srrg-findings"
dtype = torch.bfloat16
device = "cuda"

# step 2: Load Processor and Model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)
model = model.to(dtype)
model.eval()

# step 3: Download image from URL, save to a local file, and prepare path list
url = "https://huggingface.co/IAMJB/interpret-cxr-impression-baseline/resolve/main/effusions-bibasal.jpg"
resp = requests.get(url)
resp.raise_for_status()

# use a NamedTemporaryFile so the image lives on disk
with tempfile.NamedTemporaryFile(delete=False, suffix=".jpg") as tmpfile:
    tmpfile.write(resp.content)
    local_path = tmpfile.name  # this is a real file path on disk

paths = [local_path]
prompt = "Structured Radiology Report Generation for Findings Section"

# build the multimodal input
query = tokenizer.from_list_format(
    [*([{"image": img} for img in paths]), {"text": prompt}]
)

# format as a chat conversation
conv = [
    {"from": "system", "value": "You are a helpful assistant."},
    {"from": "human", "value": query},
]

# tokenize and generate (greedy decoding)
input_ids = tokenizer.apply_chat_template(
    conv, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(
    input_ids.to(device),
    do_sample=False,
    num_beams=1,
    temperature=1.0,
    top_p=1.0,
    use_cache=True,
    max_new_tokens=512,
)[0]

# decode the "findings" text (skip the prompt tokens and the final end token)
response = tokenizer.decode(output[input_ids.size(1) : -1])
print(response)
```
Response:

```
Lungs and Airways:
- No evidence of pneumothorax.

Pleura:
- Bilateral pleural effusions.

Cardiovascular:
- Cardiomegaly.

Other:
- Bibasilar opacities.
- Mild pulmonary edema.
```
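Because the SRRG output follows a fixed section-header/bullet layout, it is straightforward to post-process. A minimal parsing sketch (not part of the official release; `parse_srrg_findings` is a hypothetical helper) that turns a findings report into a section-to-findings mapping:

```python
def parse_srrg_findings(report: str) -> dict:
    """Split a structured findings report into {section: [findings]}."""
    sections = {}
    current = None
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):               # section header, e.g. "Pleura:"
            current = line[:-1]
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:])  # drop the "- " bullet prefix
    return sections

sample = """Lungs and Airways:
- No evidence of pneumothorax.
Pleura:
- Bilateral pleural effusions.
Cardiovascular:
- Cardiomegaly."""

print(parse_srrg_findings(sample))
```

This kind of mapping is convenient for downstream evaluation or for filtering reports to specific anatomical sections.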
```bibtex
@inproceedings{delbrouck-etal-2025-automated,
    title = "Automated Structured Radiology Report Generation",
    author = "Delbrouck, Jean-Benoit and
      Xu, Justin and
      Moll, Johannes and
      Thomas, Alois and
      Chen, Zhihong and
      Ostmeier, Sophie and
      Azhar, Asfandyar and
      Li, Kelvin Zhenghao and
      Johnston, Andrew and
      Bluethgen, Christian and
      Reis, Eduardo Pontes and
      Muneer, Mohamed S and
      Varma, Maya and
      Langlotz, Curtis",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.1301/",
    doi = "10.18653/v1/2025.acl-long.1301",
    pages = "26813--26829",
    ISBN = "979-8-89176-251-0",
    abstract = "Automated radiology report generation from chest X-ray (CXR) images has the potential to improve clinical efficiency and reduce radiologists' workload. However, most datasets, including the publicly available MIMIC-CXR and CheXpert Plus, consist entirely of free-form reports, which are inherently variable and unstructured. This variability poses challenges for both generation and evaluation: existing models struggle to produce consistent, clinically meaningful reports, and standard evaluation metrics fail to capture the nuances of radiological interpretation. To address this, we introduce Structured Radiology Report Generation (SRRG), a new task that reformulates free-text radiology reports into a standardized format, ensuring clarity, consistency, and structured clinical reporting. We create a novel dataset by restructuring reports using large language models (LLMs) following strict structured reporting desiderata. Additionally, we introduce SRR-BERT, a fine-grained disease classification model trained on 55 labels, enabling more precise and clinically informed evaluation of structured reports. To assess report quality, we propose F1-SRR-BERT, a metric that leverages SRR-BERT{'}s hierarchical disease taxonomy to bridge the gap between free-text variability and structured clinical reporting. We validate our dataset through a reader study conducted by five board-certified radiologists and extensive benchmarking experiments."
}
```