Domain-adapted RoBERTa (DA RoBERTa) for Deauville score classification from PET/CT reports
Type: model
Date: 2025-12-03
DOI: https://doi.org/10.1148/atlas.1764775935721

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

Domain-adapted RoBERTa (DA RoBERTa) for Deauville score classification from PET/CT reports

Link

https://github.com/zhuemann/Nuclear_Medicine_Domain_Adaptation

Indexing

Keywords: Lymphoma, PET, PET/CT, Deauville, Natural Language Processing, Multimodal Learning, Artificial Intelligence, Machine Learning, Language Modeling, Transfer Learning, Unsupervised Learning
Content: MI, NM
RadLex: RID35514, RID10341, RID12782

Author(s)

Zachary Huemann
Changhee Lee
Junjie Hu
Steve Y. Cho
Tyler J. Bradshaw

Organization(s)

University of Wisconsin–Madison, Departments of Radiology, Biostatistics, and Computer Science
University of Wisconsin Carbone Cancer Center

Version

1.0

Contact

Zachary Huemann, zhuemann@wisc.edu

Funding

Supported by GE HealthCare (equipment provided through a master research agreement). NVIDIA provided an RTX A6000 GPU to the authors' institution.

Ethical review

Institutional review board–approved, retrospective, HIPAA-compliant protocol with waiver of informed consent.

Date

Published: 2023-09-27

References

[1] Huemann Z, Lee C, Hu J, Cho SY, Bradshaw TJ. Domain-adapted Large Language Models for Classifying Nuclear Medicine Reports. Radiology: Artificial Intelligence. 2023;5(6):e220281. doi:10.1148/ryai.220281. PMID: 38074793. PMCID: PMC10698610.

Model

Architecture

Transformer-based large language model (RoBERTa) further pretrained via masked language modeling on 4542 PET/CT nuclear medicine reports (≈2 million words), followed by a three-layer classification head (two 1024-node fully connected layers and a final softmax output layer) for five-class classification.
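
A minimal PyTorch sketch of this architecture, assuming a HuggingFace RoBERTa encoder and the <s> token embedding as the report representation; the base checkpoint, ReLU activations, and head wiring are illustrative assumptions, not the authors' exact code:

import torch
import torch.nn as nn
from transformers import RobertaModel

class DeauvilleClassifier(nn.Module):
    """RoBERTa encoder plus a three-layer classification head."""

    def __init__(self, encoder_name: str = "roberta-base", num_classes: int = 5):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Two 1024-node fully connected layers, then a softmax output layer
        # (ReLU activations are an assumption).
        self.head = nn.Sequential(
            nn.Linear(hidden, 1024),
            nn.ReLU(),
            nn.Linear(1024, 1024),
            nn.ReLU(),
            nn.Linear(1024, num_classes),
        )

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # <s> token embedding summarizes the report
        return self.head(cls)              # logits; softmax yields the 5 class probabilities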

Availability

Trained language models are available on HuggingFace and in the public GitHub repository: https://github.com/zhuemann/Nuclear_Medicine_Domain_Adaptation

Clinical benefit

Automates Deauville score classification from PET/CT reports and demonstrates that domain adaptation to nuclear medicine text improves performance on downstream NLP tasks.

Input

Redacted PET/CT text reports (impression prioritized, findings included up to 512-token limit); text preprocessed with punctuation removal, date stripping, numerical rounding, and synonym replacement.
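
A hedged preprocessing sketch in the spirit of this description; the regular expressions, synonym map, and rounding rule are illustrative assumptions, not the authors' exact pipeline:

import re

# Hypothetical synonym map; the actual replacement list is not reproduced here.
SYNONYMS = {"hypermetabolic": "avid", "fdg avid": "avid"}

def preprocess_report(text: str) -> str:
    text = text.lower()
    text = re.sub(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b", " ", text)  # strip dates
    text = re.sub(r"(\d+)\.\d+", r"\1", text)  # truncate decimals (stand-in for numerical rounding)
    text = re.sub(r"[^\w\s]", " ", text)       # remove punctuation
    for term, repl in SYNONYMS.items():
        text = text.replace(term, repl)
    return re.sub(r"\s+", " ", text).strip()

def build_input(impression: str, findings: str, tokenizer, max_len: int = 512):
    # Impression first so it survives truncation; findings fill the remaining budget.
    text = preprocess_report(impression) + " " + preprocess_report(findings)
    return tokenizer(text, truncation=True, max_length=max_len, return_tensors="pt")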

Instructions

Further pretrain the RoBERTa encoder with masked language modeling on the PET/CT report corpus (15% token masking, learning rate 1e-6, three epochs to reduce overfitting). Fine-tune for classification with cross-entropy loss and the Adam optimizer, using seven iterations of random-split cross-validation (80% train, 10% validation, 10% test). Tokenize with subword tokenization and append the three-layer classification head (two 1024-node fully connected layers plus softmax).
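
A sketch of the masked language modeling step with the HuggingFace Trainer, using the hyperparameters stated above; the report corpus, batch size, and roberta-base starting checkpoint are placeholders and assumptions:

import torch
from transformers import (DataCollatorForLanguageModeling, RobertaForMaskedLM,
                          RobertaTokenizerFast, Trainer, TrainingArguments)

class ReportDataset(torch.utils.data.Dataset):
    """Tokenized PET/CT reports for MLM pretraining."""
    def __init__(self, texts, tokenizer):
        self.enc = tokenizer(texts, truncation=True, max_length=512)
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        return {k: v[i] for k, v in self.enc.items()}

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")
reports = ["..."]  # placeholder: the preprocessed report corpus

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="da_roberta_mlm",
        learning_rate=1e-6,             # low learning rate, as described
        num_train_epochs=3,             # three epochs to limit overfitting
        per_device_train_batch_size=8,  # assumption; batch size is not specified
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),  # 15% masking
    train_dataset=ReportDataset(reports, tokenizer),
)
trainer.train()

Fine-tuning then loads the adapted encoder weights into the classifier (see the Architecture sketch); torch.nn.CrossEntropyLoss with torch.optim.Adam covers the described loss and optimizer.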

Limitations

Single-institution dataset; single prediction task (Deauville scoring); possible variability in physician-assigned labels; the 512-token limit of the language models requires truncation of the findings section; the human-versus-AI comparison used a single reader and a small subset of cases.

Output

CDEs: RDE2673, RDE2688, RDE2678
Description: Five-class classification of Deauville score (1–5) for each PET/CT examination based on report text.
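
A hedged inference sketch, reusing the DeauvilleClassifier outline from the Architecture section; the mapping of class indices 0-4 to Deauville scores 1-5 is an assumption:

import torch

@torch.no_grad()
def predict_deauville(model, tokenizer, report_text: str) -> int:
    inputs = tokenizer(report_text, truncation=True, max_length=512, return_tensors="pt")
    logits = model(inputs["input_ids"], inputs["attention_mask"])
    probs = torch.softmax(logits, dim=-1)        # probabilities over the 5 classes
    return int(probs.argmax(dim=-1).item()) + 1  # class index 0-4 -> Deauville score 1-5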

Recommendation

Domain adaptation via masked language modeling on nuclear medicine text improved performance; the authors recommend adapting language models to nuclear medicine data before training on downstream tasks.

Reproducibility

Seven repeated runs, with identical data splits used across all methods within each run; accuracies compared with paired t tests and repeated-measures ANOVA; models implemented with HuggingFace Transformers in PyTorch.
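
An illustrative version of the described statistical comparison; the accuracy values below are random stand-ins, not reported results:

import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
acc_baseline = rng.uniform(0.6, 0.8, size=7)  # dummy per-run test accuracies
acc_adapted = rng.uniform(0.6, 0.8, size=7)   # dummy per-run test accuracies

# Paired t test: the same seven splits are used by both methods.
t_stat, p_value = stats.ttest_rel(acc_adapted, acc_baseline)
print(f"paired t = {t_stat:.3f}, p = {p_value:.3f}")

# Repeated-measures ANOVA generalizes the comparison beyond two methods.
df = pd.DataFrame({
    "run": np.tile(np.arange(7), 2),
    "method": np.repeat(["baseline", "adapted"], 7),
    "accuracy": np.concatenate([acc_baseline, acc_adapted]),
})
print(AnovaRM(df, depvar="accuracy", subject="run", within=["method"]).fit())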