External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review
model 2026-01-24 https://doi.org/10.1148/atlas.1769275925198
90

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review

Link

https://dx.doi.org/10.1148/ryai.210064

Indexing

Keywords: Systematic review, external validation, deep learning, radiologic diagnosis, generalizability, AUC, sensitivity, specificity
Content: IN, RS

Author(s)

Alice C. Yu
Bahram Mohajer
John Eng

Organization(s)

Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine

Version

1.0

Funding

Authors declared no funding for this work.

Ethical review

Systematic review; stated by the authors to be exempt from institutional review board review.

Date

Published: 2022-05-04
Created: 2021-02-25

Model

Architecture

Various deep convolutional neural networks. The included studies used multiple architecture types, with ResNet the most common.

Clinical benefit

Assesses the generalizability of published DL diagnostic imaging algorithms by comparing each algorithm's internal (development) performance with its performance on external datasets.

Clinical workflow phase

Research synthesis and evidence assessment; not a deployable model.

Input

Radiologic images across multiple modalities and body parts from included studies.

Limitations

Substantial heterogeneity across included studies (body parts, modalities, diseases, performance measures); limited methodological and clinical reporting; few studies adhered to reporting guidelines; external datasets typically smaller; inability to pool quantitatively; potential publication bias.

Output

CDEs: RDE1661, RDE1665, RDE1660
Description: Synthesis of external validation results: differences between development (internal) and external performance of DL algorithms for radiologic diagnosis.
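
The internal-versus-external comparison described above can be illustrated with a minimal Python sketch. It is not the review's actual analysis code. The `StudyResult` type, the example AUC values, and the 0.05 and 0.10 cut-offs are all hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StudyResult:
    """One study's reported performance (hypothetical structure)."""
    name: str
    internal_auc: float  # AUC on the development (internal) dataset
    external_auc: float  # AUC on the external validation dataset


def performance_change(result: StudyResult) -> float:
    """External minus internal AUC; negative means performance dropped."""
    # Round to suppress floating-point noise at category boundaries.
    return round(result.external_auc - result.internal_auc, 3)


def categorize(delta: float, modest: float = 0.05, substantial: float = 0.10) -> str:
    """Label a change in AUC. The thresholds are illustrative assumptions."""
    if delta >= 0:
        return "no decrease"
    drop = -delta
    if drop >= substantial:
        return "substantial decrease"
    if drop >= modest:
        return "modest decrease"
    return "slight decrease"


# Made-up example values, not data from the review.
studies = [
    StudyResult("study_a", internal_auc=0.95, external_auc=0.80),
    StudyResult("study_b", internal_auc=0.90, external_auc=0.84),
    StudyResult("study_c", internal_auc=0.88, external_auc=0.91),
]

labels = {s.name: categorize(performance_change(s)) for s in studies}
for name, label in labels.items():
    print(f"{name}: {label}")
```

A result such as the hypothetical `study_c`, where external performance is higher than internal performance, is the kind of case the Recommendation section asks readers to treat with caution.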

Recommendation

Future studies should include external validation and improved reporting so that generalizability can be assessed. Results in which external performance exceeds internal performance should be interpreted with caution, because they may reflect data leakage or unrepresentative datasets.