BERT models for device labeling in chest radiograph reports
2026-01-24
https://doi.org/10.1148/atlas.1769275471339
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/model.json
Name
BERT models for device labeling in chest radiograph reports
Link
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344209/
Indexing
Keywords: BERT, PubMedBERT, RoBERTa, DeBERTa, DistilBERT, natural language processing, radiology reports, chest radiograph, endotracheal tube, nasogastric tube, central venous catheter, Swan-Ganz catheter, transfer learning, dataset annotation, automation
Content: IN, CH
RadLex: RID5557, RID5578, RID5566, RID5584, RID10321
Author(s)
Ali S. Tejani
Yee S. Ng
Yin Xi
Julia R. Fielding
Travis G. Browning
Jesse C. Rayan
Organization(s)
Department of Radiology, University of Texas Southwestern Medical Center
Version
1.0
License
Text: © 2022 by the Radiological Society of North America, Inc.
URL: https://pubs.rsna.org/doi/10.1148/ryai.220007
Contact
Jesse C. Rayan; email: jesse.rayan@utsouthwestern.edu
Funding
Authors declared no funding for this work.
Ethical review
IRB approved; designated exempt status with waived informed consent; HIPAA-compliant.
Date
Published: 2022-06-29
Created: 2022-01-12
References
[1] Tejani AS, Ng YS, Xi Y, Fielding JR, Browning TG, Rayan JC. "Performance of Multiple Pretrained BERT Models to Automate and Accelerate Data Annotation for Large Datasets". Radiology: Artificial Intelligence. 2022;4(4):e220007. 2022-07-01. doi:10.1148/ryai.220007. PMID: 35923377. PMCID: PMC9344209.
Model
Architecture
Transformer-based language models (BERT, PubMedBERT, DistilBERT, RoBERTa, DeBERTa) fine-tuned for multi-label binary text classification (presence/absence of devices) using PyTorch Lightning/PyTorch.
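As a minimal sketch of the multi-label binary setup described above (an illustration, not the authors' code): the classification head emits one logit per device category, and each logit is passed through an independent sigmoid and thresholded, so a report can be positive for any subset of the four devices. The device names match the study; the threshold of 0.5 and the function names are assumptions.

```python
import math

# The four device categories labeled in the study.
DEVICES = ["endotracheal tube", "enterogastric tube",
           "central venous catheter", "Swan-Ganz catheter"]

def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(logits, threshold=0.5):
    """Map one report's four logits to independent present/absent labels.

    Each label is decided separately (multi-label, not multi-class),
    so any combination of devices can be flagged in a single report.
    """
    return {dev: ("present" if sigmoid(z) >= threshold else "absent")
            for dev, z in zip(DEVICES, logits)}

labels = classify([2.1, -1.3, 0.4, -3.0])
```

Because the labels are independent, the head is trained with a per-label binary cross-entropy loss rather than a softmax over categories.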
Availability
Not stated.
Clinical benefit
Rapid, accurate automated annotation of large radiology report datasets for the presence or absence of support devices, facilitating downstream computer vision model development and improving workflow efficiency.
Clinical workflow phase
Workflow optimization; data curation/annotation for AI development.
Degree of automation
Fully automated inference for report labeling; training requires labeled examples.
Indications for use
Annotating adult chest radiograph text reports to indicate presence or absence of endotracheal tubes, enterogastric tubes, central venous catheters, and Swan-Ganz catheters in an academic radiology setting.
Input
Free-text adult chest radiograph reports (April 2020–March 2021) from a single academic center; 1004 reports manually labeled for training/validation/testing; full corpus 69,095 reports for inference timing.
Instructions
Fine-tune pretrained transformer model on institution-specific labeled reports (12 epochs; AdamW; initial LR 5e-6 with warmup to 5e-5 by epoch 2 and cosine annealing to 1e-6 by epoch 12). Use fivefold cross-validation with stratification; select best epoch by lowest validation loss; apply fixed weights for test and large-scale inference.
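The learning-rate regimen above can be sketched as follows (a hedged reconstruction, not the authors' code): linear warmup from 5e-6 to 5e-5 over the first two epochs, then cosine annealing from the peak down to 1e-6 by epoch 12. Per-epoch (rather than per-step) updates and 1-indexed epoch numbering are assumptions, since the report does not state the schedule's granularity.

```python
import math

LR_INIT, LR_PEAK, LR_FINAL = 5e-6, 5e-5, 1e-6
WARMUP_EPOCHS, TOTAL_EPOCHS = 2, 12

def lr_at(epoch: int) -> float:
    """Learning rate at a given epoch (1-indexed), per the stated regimen."""
    if epoch <= WARMUP_EPOCHS:
        # Linear warmup: 5e-6 at epoch 1 rising to 5e-5 at epoch 2.
        frac = (epoch - 1) / (WARMUP_EPOCHS - 1)
        return LR_INIT + frac * (LR_PEAK - LR_INIT)
    # Cosine annealing: decays from the peak to 1e-6 at epoch 12.
    frac = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return LR_FINAL + 0.5 * (LR_PEAK - LR_FINAL) * (1 + math.cos(math.pi * frac))

schedule = [lr_at(e) for e in range(1, TOTAL_EPOCHS + 1)]
```

In a PyTorch training loop this schedule would typically be handed to AdamW via a scheduler rather than computed by hand; the sketch simply makes the stated endpoints (5e-6, 5e-5, 1e-6) explicit.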
Limitations
Single-center dataset; mix of structured and unstructured reports; potential negative-set bias (64.4% of reports without devices); relatively few SGC-positive cases; known failure modes include misreading context (e.g., not recognizing "removal" phrasing) and misinterpretation of tube-related terms; unknown generalizability to other institutions and demographics; terminology limited to terms observed in the 1004 annotated reports.
Output
CDEs: RDE1533, RDE1531, RDE1532, RDE1530
Description: For each radiology report, binary classifications (present/absent) for four device categories: endotracheal tube, enterogastric (nasogastric) tube, central venous catheter, Swan-Ganz catheter.
Recommendation
Use pretrained, domain-specific or newer transformer models (e.g., PubMedBERT, RoBERTa, DeBERTa) to achieve high performance with small training sets for report annotation.
Regulatory information
Comment: No regulatory submission reported.
Authorization status: Not a regulated medical device (research study).
Reproducibility
Fivefold cross-validation with stratified folds; best-epoch selection by validation loss; detailed training regimen and hardware specified. External validation not performed.
Sustainability
Training/validation times ranged from 3m39s (DistilBERT) to 22m48s (DeBERTa) on a single Nvidia GTX 1080 Ti GPU; large-scale inference on 69,095 reports completed in as little as 6m15s.
Use
Intended: Report data extraction
Out-of-scope: Detection
Excluded: Other
User
Intended: Radiologist, Other
Out-of-scope: Layperson
Excluded: Patient