BERT models for device labeling in chest radiograph reports
Type: model
Date: 2026-01-24
DOI: https://doi.org/10.1148/atlas.1769275471339

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

BERT models for device labeling in chest radiograph reports

Link

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344209/

Indexing

Keywords: BERT, PubMedBERT, RoBERTa, DeBERTa, DistilBERT, natural language processing, radiology reports, chest radiograph, endotracheal tube, nasogastric tube, central venous catheter, Swan-Ganz catheter, transfer learning, dataset annotation, automation
Content: IN, CH
RadLex: RID5557, RID5578, RID5566, RID5584, RID10321

Author(s)

Ali S. Tejani
Yee S. Ng
Yin Xi
Julia R. Fielding
Travis G. Browning
Jesse C. Rayan

Organization(s)

Department of Radiology, University of Texas Southwestern Medical Center

Version

1.0

License

Text: © 2022 by the Radiological Society of North America, Inc.
URL: https://pubs.rsna.org/doi/10.1148/ryai.220007

Contact

Jesse C. Rayan; email: jesse.rayan@utsouthwestern.edu

Funding

Authors declared no funding for this work.

Ethical review

IRB approved; designated exempt status with waived informed consent; HIPAA-compliant.

Date

Published: 2022-06-29
Created: 2022-01-12

References

[1] Tejani AS, Ng YS, Xi Y, Fielding JR, Browning TG, Rayan JC. "Performance of Multiple Pretrained BERT Models to Automate and Accelerate Data Annotation for Large Datasets". Radiology: Artificial Intelligence. 2022 Jul;4(4):e220007. doi:10.1148/ryai.220007. PMID: 35923377. PMCID: PMC9344209.

Model

Architecture

Transformer-based language models (BERT, PubMedBERT, DistilBERT, RoBERTa, DeBERTa) fine-tuned for multi-label binary text classification (presence/absence of devices) using PyTorch Lightning/PyTorch.
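The multi-label setup means each device category receives its own independent sigmoid-plus-threshold decision, so any subset of devices can be flagged as present in a single report. A minimal sketch of that decision step (the `DEVICES` names and 0.5 threshold are illustrative assumptions, not taken from the paper):

```python
import math

# Hypothetical names for the study's four device categories.
DEVICES = ["endotracheal_tube", "enterogastric_tube",
           "central_venous_catheter", "swan_ganz_catheter"]

def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logits_to_labels(logits, threshold=0.5):
    """Multi-label decision: each device gets an independent
    sigmoid + threshold, so any subset may be marked present."""
    return {name: sigmoid(z) >= threshold
            for name, z in zip(DEVICES, logits)}

# Example: strong positive logit for ETT, negatives for the rest.
print(logits_to_labels([3.2, -1.5, -4.0, -0.1]))
```

This is what distinguishes the task from single-label (softmax) classification, where exactly one class would be chosen per report.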

Availability

Not stated.

Clinical benefit

Rapid, accurate automated annotation of large radiology report datasets for the presence or absence of support devices, facilitating downstream computer vision model development and improving workflow efficiency.

Clinical workflow phase

Workflow optimization; data curation/annotation for AI development.

Degree of automation

Fully automated inference for report labeling; training requires labeled examples.

Indications for use

Annotating adult chest radiograph text reports to indicate presence or absence of endotracheal tubes, enterogastric tubes, central venous catheters, and Swan-Ganz catheters in an academic radiology setting.

Input

Free-text adult chest radiograph reports (April 2020–March 2021) from a single academic center; 1004 reports manually labeled for training/validation/testing; full corpus 69,095 reports for inference timing.

Instructions

Fine-tune the pretrained transformer model on institution-specific labeled reports (12 epochs; AdamW optimizer; initial learning rate 5e-6 with warmup to 5e-5 by epoch 2 and cosine annealing to 1e-6 by epoch 12). Use fivefold cross-validation with stratified folds; select the best epoch by lowest validation loss; apply the fixed weights for testing and large-scale inference.
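The learning-rate regimen above (warmup from 5e-6 to 5e-5 over the first two epochs, then cosine annealing down to 1e-6 by epoch 12) can be sketched as a standalone schedule function. This is a plain-Python illustration of the stated schedule, not the authors' implementation; epoch numbering from 1 is an assumption:

```python
import math

def lr_at_epoch(epoch: int,
                lr_start: float = 5e-6,
                lr_peak: float = 5e-5,
                lr_end: float = 1e-6,
                warmup_end: int = 2,
                total_epochs: int = 12) -> float:
    """Piecewise schedule: linear warmup from lr_start to lr_peak
    over epochs 1..warmup_end, then cosine annealing from lr_peak
    to lr_end by total_epochs. Epochs are numbered from 1."""
    if epoch <= warmup_end:
        frac = (epoch - 1) / (warmup_end - 1) if warmup_end > 1 else 1.0
        return lr_start + frac * (lr_peak - lr_start)
    frac = (epoch - warmup_end) / (total_epochs - warmup_end)
    return lr_end + 0.5 * (lr_peak - lr_end) * (1 + math.cos(math.pi * frac))

for e in (1, 2, 7, 12):
    print(e, lr_at_epoch(e))
```

In a PyTorch training loop this would typically be expressed with a warmup wrapper around `torch.optim.lr_scheduler.CosineAnnealingLR` rather than computed by hand.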

Limitations

Single-center dataset; mix of structured/unstructured reports; potential negative set bias (64.4% without devices); relatively few SGC-positive cases; known failure modes include confusion of context (e.g., not recognizing “removal” phrasing) and misinterpretation of tube-related terms; unknown generalizability to other institutions and demographics; terminology limited to those observed in 1004 annotated reports.

Output

CDEs: RDE1533, RDE1531, RDE1532, RDE1530
Description: For each radiology report, binary classifications (present/absent) for four device categories: endotracheal tube, enterogastric (nasogastric) tube, central venous catheter, Swan-Ganz catheter.

Recommendation

Use pretrained, domain-specific or newer transformer models (e.g., PubMedBERT, RoBERTa, DeBERTa) to achieve high performance with small training sets for report annotation.

Regulatory information

Comment: No regulatory submission reported.
Authorization status: Not a regulated medical device (research study).

Reproducibility

Fivefold cross-validation with stratified folds; best-epoch selection by validation loss; detailed training regimen and hardware specified. External validation not performed.
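The stratified fivefold split can be sketched as follows; this is a simple round-robin form of stratification assuming each report is represented by a hashable label tuple, not the authors' exact splitting code:

```python
from collections import defaultdict

def stratified_kfold(labels, k=5):
    """Assign each sample index to one of k folds so that every
    label combination is spread as evenly as possible across folds.
    `labels` is a list of hashable label tuples, one per report."""
    by_combo = defaultdict(list)
    for idx, combo in enumerate(labels):
        by_combo[combo].append(idx)
    folds = [[] for _ in range(k)]
    cursor = 0
    # Deal each combination's samples round-robin across the folds.
    for combo in sorted(by_combo):
        for idx in by_combo[combo]:
            folds[cursor % k].append(idx)
            cursor += 1
    return folds
```

In practice a library routine such as scikit-learn's `StratifiedKFold` (or a multi-label variant) would be used; the point is that stratification keeps rare label combinations, like SGC-positive reports, represented in every fold.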

Sustainability

Training/validation times ranged from 3m39s (DistilBERT) to 22m48s (DeBERTa) on a single Nvidia GTX 1080 Ti GPU; large-scale inference on 69,095 reports completed in as little as 6m15s.

Use

Intended: Report data extraction
Out-of-scope: Detection
Excluded: Other

User

Intended: Radiologist, Other
Out-of-scope: Layperson
Excluded: Patient