BERT-based Transfer Learning in Sentence-level Anatomic Classification of Free-Text Radiology Reports
Type: model
Date: 2026-01-24
DOI: https://doi.org/10.1148/atlas.1769272111217

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

BERT-based Transfer Learning in Sentence-level Anatomic Classification of Free-Text Radiology Reports

Link

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077075/

Indexing

Keywords: BERT, transfer learning, radiology reports, sentence-level classification, anatomy, Japanese, PET/CT, precision-recall, AUC
Content: IN, CT, MI

Author(s)

Daiki Nishigaki
Yuki Suzuki
Tomohiro Wataya
Kosuke Kita
Kazuki Yamagata
Junya Sato
Shoji Kido
Noriyuki Tomiyama

Organization(s)

Osaka University Graduate School of Medicine, Department of Artificial Intelligence Diagnostic Radiology
Osaka University Graduate School of Medicine, Department of Radiology

Version

1.0

License

Text: © 2023 by the Radiological Society of North America, Inc.
URL: https://pubs.rsna.org/doi/10.1148/ryai.220097

Contact

Corresponding author: Shoji Kido, email: kido@radiol.med.osaka-u.ac.jp (printed in obfuscated form in the source as pj.ca.u-akaso.dem.loidar@odik)

Funding

Supported by Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (KAKENHI) grant no. JP21H03840.

Ethical review

Retrospective study approved by the institutional review boards of Osaka University Hospital and Medical Imaging Clinic; informed consent was waived.

Date

Updated: 2023-01-26
Published: 2023-02-15
Created: 2022-05-11

References

[1] Nishigaki D, Suzuki Y, Wataya T, et al. BERT-based Transfer Learning in Sentence-level Anatomic Classification of Free-Text Radiology Reports. Radiology: Artificial Intelligence. 2023;5(2):e220097. Published 2023-02-15. doi:10.1148/ryai.220097. PMID: 37035437; PMCID: PMC10077075.

Model

Architecture

Transformer-based BERT model (UTH-BERT, a BERT-base configuration: 12 encoder layers, 768-dimensional embeddings) fine-tuned with a linear classification layer and 7-way softmax output.
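A minimal sketch of the classification head described above: a single linear layer mapping the 768-dimensional [CLS] embedding to 7 class logits, followed by a softmax. The weights and the [CLS] vector below are random placeholders, not the fine-tuned UTH-BERT parameters.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.02, size=(768, 7))  # linear head weights (placeholder)
b = np.zeros(7)                            # linear head bias (placeholder)

cls_embedding = rng.normal(size=(1, 768))  # stands in for the BERT [CLS] vector
probs = softmax(cls_embedding @ W + b)     # per-sentence 7-class probabilities
predicted_class = int(probs.argmax(axis=-1)[0])
```

In the paper's setup the [CLS] vector would come from the fine-tuned UTH-BERT encoder; everything upstream of the linear layer is omitted here.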

Clinical benefit

Automatically organizes free-text radiology reports at the sentence level by anatomic region to help users extract information efficiently, reduce labeling costs for developing clinical support systems, and potentially prevent clinical communication errors.

Clinical workflow phase

Workflow optimization; clinical decision support (information extraction/organization).

Decision threshold

Default: argmax over the 7-class softmax output; per-class probability thresholds can be tuned (e.g., lowering the brain threshold to 2.5% increased recall to 91.5% at 85.7% precision).
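The thresholding behavior can be sketched as follows. The class names and all thresholds except the reported 2.5% brain cutoff are hypothetical placeholders; the paper specifies seven anatomic classes but this sketch does not reproduce its exact label set.

```python
import numpy as np

# Hypothetical class order; the paper uses seven anatomic regions.
CLASSES = ["brain", "head/neck", "chest", "abdomen", "pelvis", "spine", "limbs"]
# Only the 2.5% brain threshold is from the paper; 0.5 is an assumed default.
THRESHOLDS = {"brain": 0.025}

def label_sentence(probs):
    """Return every class whose probability clears its threshold;
    fall back to single-label argmax when none does."""
    hits = [c for i, c in enumerate(CLASSES)
            if probs[i] >= THRESHOLDS.get(c, 0.5)]
    return hits if hits else [CLASSES[int(np.argmax(probs))]]

probs = np.array([0.03, 0.05, 0.70, 0.10, 0.05, 0.04, 0.03])
print(label_sentence(probs))  # → ['brain', 'chest']
```

Lowering a class threshold below the default trades precision for recall on that class, which is the mechanism behind the reported brain-class numbers.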

Degree of automation

Fully automated anatomic labeling of free-text report sentences after model deployment.

Indications for use

Sentence-level anatomic classification of the findings section of free-text whole-body PET/CT radiology reports; dataset comprised Japanese-language reports from a single institution.

Input

Tokenized sentences (Japanese) from the findings sections of PET/CT radiology reports.

Instructions

Sentences are tokenized with the UTH-BERT tokenizer; sequences are padded per batch and the special tokens [CLS], [SEP], and [PAD] are added; all layers are fine-tuned with the Adam optimizer and categorical cross-entropy loss; the final model is selected by macro-averaged F1 score on the validation set.
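The model-selection criterion above, macro-averaged F1, can be sketched in plain Python. Macro averaging takes the unweighted mean of per-class F1 scores, so minority classes (e.g., brain, limbs, spine in this dataset) weigh as much as majority ones; the toy labels below are illustrative only.

```python
def macro_f1(y_true, y_pred, n_classes=7):
    # Unweighted mean of per-class F1 scores.
    f1s = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / n_classes

# Toy 3-class check: one class-2 sentence misclassified as class 1.
print(round(macro_f1([0, 1, 2, 2], [0, 1, 2, 1], n_classes=3), 3))  # → 0.778
```

The reported pipeline computes the same quantity with scikit-learn (e.g., `sklearn.metrics.f1_score` with `average="macro"`); this stdlib version just makes the averaging explicit.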

Limitations

Single-institution dataset; Japanese language only; single modality (PET/CT) used for report source; class imbalance with fewer samples for brain/limbs/spine; lower recall for minority classes under default single-label argmax; generalizability to other institutions, input devices (e.g., voice-to-text), languages, and modalities is untested.

Output

CDEs: RDE2507, RDE2519.2, RDE2509, RDE2514, RDE2516, RDE2510, RDE2508
Description: For each sentence, a single anatomic class label among seven categories with associated class probabilities.

Recommendation

Adjust per-class probability thresholds according to intended use (e.g., lower thresholds to increase sensitivity for target classes or higher thresholds to increase precision for weak supervision).
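One way to choose such an operating point is to sweep candidate thresholds for a target class and inspect the precision/recall trade-off. The data below is synthetic; only the sweep mechanics are illustrated, not the paper's evaluation.

```python
import numpy as np

def pr_at_threshold(probs, labels, cls, thr):
    """Precision and recall for one class at a given probability threshold.
    probs: (n, 7) softmax outputs; labels: true class indices."""
    pred_pos = probs[:, cls] >= thr
    true_pos = labels == cls
    tp = int(np.sum(pred_pos & true_pos))
    prec = tp / pred_pos.sum() if pred_pos.sum() else 1.0
    rec = tp / true_pos.sum() if true_pos.sum() else 0.0
    return prec, rec

rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(7), size=200)   # synthetic softmax outputs
labels = rng.integers(0, 7, size=200)         # synthetic ground-truth labels
for thr in (0.5, 0.25, 0.1):                  # lowering thr raises recall
    p, r = pr_at_threshold(probs, labels, cls=0, thr=thr)
    print(f"thr={thr:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Lowering the threshold only adds predicted positives, so recall is non-decreasing as the threshold falls, while precision typically drops; the intended use (high-sensitivity extraction vs. high-precision weak supervision) determines which end of the sweep to pick.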

Regulatory information

Authorization status: Not a regulated clinical device; research study.

Reproducibility

Implementation details reported: Python 3.8.5; PyTorch 1.8.1; Transformers 4.13.0; scikit-learn 0.24.2; NumPy 1.21.2; R 4.1.2 with pROC for ROC statistics; trained on NVIDIA Titan RTX GPU (CUDA 10.1).

Sustainability

Training/inference performed on a single NVIDIA Titan RTX GPU; no run-time or energy metrics reported.

Use

Intended: Report data extraction
Out-of-scope: Report processing
Excluded: Decision support

User

Intended: Referring provider, Radiologist, Researcher
Out-of-scope: Layperson