Radiology BERT
2026-01-24
https://doi.org/10.1148/atlas.1769275373753
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/model.json
Name
Radiology BERT
Link
https://doi.org/10.1148/ryai.210185
Indexing
Keywords: radiology reports, speech recognition errors, BERT, natural language processing, error detection, token classification
Content: IN
RadLex: RID1034, RID5678, RID1245
Author(s)
Gunvant R. Chaudhari
Tengxiao Liu
Timothy L. Chen
Gabby B. Joseph
Maya Vella
Yoo Jin Lee
Thienkhai H. Vu
Youngho Seo
Andreas M. Rauschecker
Charles E. McCulloch
Jae Ho Sohn
Organization(s)
Department of Radiology and Biomedical Imaging, University of California San Francisco
Department of Epidemiology and Statistics, University of California San Francisco
Version
1.0
License
Text: © 2022 by the Radiological Society of North America, Inc.
Contact
Jae Ho Sohn
Funding
Authors declared no funding for this work.
Ethical review
Retrospective model development approved by the institutional human ethics board and conducted in accordance with the Helsinki Declaration (consent waived).
Date
Updated: 2022-05-10
Published: 2022-05-25
Created: 2021-07-05
References
[1] Chaudhari GR, Liu T, Chen TL, Joseph GB, Vella M, Lee YJ, Vu TH, Seo Y, Rauschecker AM, McCulloch CE, Sohn JH. "Application of a Domain-specific BERT for Detection of Speech Recognition Errors in Radiology Reports". Radiology: Artificial Intelligence. 2022 Jul;4(4):e210185. doi:10.1148/ryai.210185. PMID: 35923373. PMCID: PMC9344210.
Model
Architecture
Bidirectional Encoder Representations from Transformers (BERT) initialized from Clinical BioBERT and further pretrained on a radiology report corpus (masked language modeling and next-sentence prediction), then fine-tuned for token-level classification with a fully connected linear layer and softmax that labels each token as normal, insertion, deletion, substitution, or padding.
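The token-classification head described above can be sketched in PyTorch. This is a minimal illustration, not the authors' implementation: the encoder is omitted (a random tensor stands in for BERT hidden states), and the class and variable names are assumptions; only the five label categories and the linear-plus-softmax head come from the description above.

```python
import torch
import torch.nn as nn

LABELS = ["normal", "insertion", "deletion", "substitution", "padding"]

class TokenErrorClassifier(nn.Module):
    """Linear token-classification head over encoder hidden states
    (hypothetical sketch; the real model places this head on the
    radiology-pretrained BERT encoder)."""
    def __init__(self, hidden_size=768, num_labels=len(LABELS)):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):           # (batch, seq_len, hidden)
        logits = self.classifier(hidden_states)
        return torch.softmax(logits, dim=-1)    # per-token label probabilities

head = TokenErrorClassifier()
dummy_hidden = torch.randn(1, 12, 768)          # one 12-token sentence
probs = head(dummy_hidden)                      # (1, 12, 5)
predicted = [LABELS[i] for i in probs.argmax(-1)[0].tolist()]
```

Each token receives a probability over the five categories; the argmax gives the flagged label shown to the radiologist.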
Availability
Not provided.
Clinical benefit
Flags potential speech recognition errors and suggests corrections in radiology report impression sentences to reduce proofreading burden and improve report quality.
Clinical workflow phase
Clinical decision support systems; workflow optimization at report proofreading/signing.
Decision threshold
For sentence-level analyses, the optimal ROC threshold (the point closest to [0,1]) was selected on the signed-reports test set; the same threshold was applied to the prospective dataset.
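The "point closest to [0,1]" criterion above can be computed as follows (a minimal sketch with made-up ROC points and a hypothetical function name; the paper does not specify this code):

```python
import math

def closest_to_corner(fprs, tprs, thresholds):
    """Pick the ROC operating point nearest the ideal corner (FPR=0, TPR=1),
    i.e. minimize the Euclidean distance sqrt(FPR^2 + (1 - TPR)^2)."""
    best = min(range(len(thresholds)),
               key=lambda i: math.hypot(fprs[i], 1.0 - tprs[i]))
    return thresholds[best]

# illustrative ROC points (invented numbers, not from the study)
fprs = [0.0, 0.1, 0.3, 1.0]
tprs = [0.0, 0.8, 0.9, 1.0]
ths  = [0.9, 0.6, 0.4, 0.1]
print(closest_to_corner(fprs, tprs, ths))  # → 0.6
```

The selected threshold is then frozen and reused on the prospective data, as stated above.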
Degree of automation
Decision support—assists radiologists by automatically flagging suspected errors; final decisions remain with the user.
Indications for use
Detection of insertion, deletion, and substitution speech recognition errors in impression sentences of radiology reports across multiple imaging modalities in hospital radiology departments using dictation-based workflows.
Input
Impression section sentences from dictated radiology reports (PowerScribe).
Instructions
Use on impression sentences prior to report signing to flag unusual or out-of-context tokens for radiologist review; corrections can be suggested by the companion correction model.
Limitations
Developed using reports from two institutions and a single speech recognition (SR) software package (PowerScribe); syntax variability across sites and radiologists can cause false positives; trained on imperfect reports, so some true errors may be present in the training data, leading to false negatives; the sentence-level model cannot leverage full-report, imaging, or EMR context, limiting detection of certain errors (e.g., negation or laterality changes).
Output
CDEs: RDE2267, RDE397, RDE341
Description: Token-level classification indicating normal token or suspected insertion, deletion, or substitution error; sentence-level likelihood of containing an error. A separate model provides top candidate word suggestions for detected deletion/substitution errors.
Recommendation
Use as an assistive tool to highlight potential SR errors for radiologist verification prior to signing reports.
Reproducibility
Implemented with PyTorch (v1.6.0) and HuggingFace Transformers (v3.4.0); fivefold cross-validation was used during model selection; decision thresholds and bootstrapped confidence intervals are described in the paper.
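The fivefold cross-validation mentioned above can be sketched in plain Python (a hypothetical helper for illustration; the paper does not publish its splitting code):

```python
def fivefold_indices(n, k=5):
    """Yield (train, val) index lists for k-fold cross-validation,
    distributing any remainder across the first folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

# each sentence index appears in exactly one validation fold
folds = list(fivefold_indices(10))
```

Each candidate model configuration would be trained on four folds and scored on the held-out fold, with the reported thresholds chosen afterward on the separate signed-reports test set.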
Use
Intended: Detection, Mitigation
Out-of-scope: Artifact reduction, Report processing
Excluded: Other
User
Intended: Radiologist, Other
Out-of-scope: Patient
Excluded: Layperson