Chest Radiography Foundation Model (as studied for bias)
2025-12-03
https://doi.org/10.1148/atlas.1764793213103
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/model.json
Name
Chest Radiography Foundation Model (as studied for bias)
Link
https://github.com/biomedia-mira/cxr-foundation-bias
Indexing
Keywords: Conventional Radiography, Computer Application-Detection/Diagnosis, Chest Radiography, Bias, Foundation Models
Content: CH
RadLex: RID11550, RID35300, RID35486, RID5352
Author(s)
Ben Glocker
Charles Jones
Mélanie Roschewitz
Stefan Winzeck
Organization(s)
Imperial College London
Version
1.0
License
Text: CC BY 4.0
URL: https://creativecommons.org/licenses/by/4.0/
Funding
B.G.: ERC grant 757173 (Project MIRA); Innovate UK/UKRI (UKRI London Medical Imaging & AI Centre for Value Based Healthcare); Royal Academy of Engineering; employment/stock options disclosures as stated.
C.J.: Microsoft Research–EPSRC iCASE PhD Scholarship.
M.R.: Imperial College London President's PhD Scholarship.
S.W.: Innovate UK/UKRI (UKRI London Medical Imaging & AI Centre for Value Based Healthcare).
Ethical review
Retrospective study using publicly available secondary data; HIPAA-compliant; exempt from ethical approval.
Date
Updated: 2023-11-01
Published: 2023-09-27
Created: 2023-02-28
References
[1] Glocker B, Jones C, Roschewitz M, Winzeck S. Risk of Bias in Chest Radiography Deep Learning Foundation Models. Radiology: Artificial Intelligence. 2023;5(6):e230060. Published 2023-09-27. doi:10.1148/ryai.230060. PMID: 38074789. PMCID: PMC10698597.
Model
Architecture
Proprietary chest radiography foundation model used as a fixed feature extractor (pretrained first on natural images and then on >800,000 chest radiographs using supervised contrastive learning). Downstream classifiers tested: a single linear layer (CXR-linear) and multilayer perceptrons with 3 or 5 hidden layers (CXR-MLP-3, CXR-MLP-5). Baseline comparator: a DenseNet-121 fine-tuned on CheXpert.
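A minimal sketch of this downstream setup, assuming precomputed features from a frozen backbone. The feature dimension, hidden widths, and synthetic data are illustrative only and do not come from the study:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for embeddings the frozen backbone would produce;
# in the study these come from the vendor's feature-extraction API.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 64))          # illustrative feature dimension
w = rng.normal(size=64)
y = (X @ w + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, y_tr, X_te, y_te = X[:300], y[:300], X[300:], y[300:]

# CXR-linear: a single linear layer on the frozen features.
cxr_linear = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# CXR-MLP-3: a multilayer perceptron with 3 hidden layers (widths illustrative).
cxr_mlp_3 = MLPClassifier(hidden_layer_sizes=(128, 128, 128),
                          max_iter=500, random_state=0).fit(X_tr, y_tr)

print(cxr_linear.score(X_te, y_te), cxr_mlp_3.score(X_te, y_te))
```

Because the backbone is frozen, only these small heads are trained; any bias encoded in the features propagates unchanged into every downstream classifier.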
Availability
The foundation model is accessible only via a programming interface that returns feature embeddings; the network weights are not publicly available.
Clinical benefit
Not established; study concludes the investigated foundation model may be unsafe for clinical applications due to subgroup performance disparities.
Clinical workflow phase
Research evaluation/auditing of AI models for bias and subgroup performance.
Decision threshold
Thresholds were selected per model to achieve a false-positive rate (FPR) of 0.20 on the whole sample; Youden's J statistic (J = TPR − FPR) is reported at this operating point.
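This operating-point selection can be sketched with two hypothetical helpers (assuming per-model score and label arrays; this is not the study's code):

```python
import numpy as np

def threshold_at_fpr(scores, labels, target_fpr=0.20):
    # Choose the threshold so that a fraction `target_fpr` of the
    # negative samples score at or above it (FPR = 0.20 on the whole sample).
    neg = scores[labels == 0]
    return float(np.quantile(neg, 1.0 - target_fpr))

def youden_j(scores, labels, thr):
    # Youden J = TPR - FPR at the chosen operating point.
    preds = scores >= thr
    tpr = float(preds[labels == 1].mean())
    fpr = float(preds[labels == 0].mean())
    return tpr - fpr
```

Fixing the FPR on the whole sample (rather than per subgroup) is what makes subgroup TPR differences directly comparable at a single threshold.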
Degree of automation
Feature extraction automated; downstream decision models trained by researchers; not autonomous clinical decision support.
Indications for use
Investigated as a feature extractor for downstream chest radiograph disease detection (labels: no finding, pleural effusion, cardiomegaly, pneumothorax) on adult patients in the CheXpert dataset; not intended for clinical use in this study.
Input
Chest radiographs (posteroanterior and other views) from the CheXpert dataset; the downstream classifiers take as input the feature vectors produced by the foundation model’s backbone.
Instructions
The backbone is kept frozen; features are extracted via the provided API, and a downstream classifier (linear or MLP) is trained on the CheXpert training set, with a validation set used for model selection; evaluation uses resampled, balanced CheXpert test sets with subgroup analyses.
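The subgroup analyses can be illustrated with a small helper that compares true-positive rates across subgroups at a fixed operating point (a hypothetical sketch; the group labels and disparity measure shown here are illustrative):

```python
import numpy as np

def subgroup_tpr(scores, labels, groups, thr):
    # Per-subgroup TPR at a fixed operating point, plus the largest
    # TPR gap between subgroups (a simple disparity measure).
    tprs = {}
    for g in np.unique(groups):
        pos = (groups == g) & (labels == 1)
        tprs[g] = float((scores[pos] >= thr).mean())
    gap = max(tprs.values()) - min(tprs.values())
    return tprs, gap
```

A large gap at a shared threshold means one subgroup's positive cases are missed more often than another's, which is the kind of disparity the study reports.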
Limitations
Only one proprietary foundation model analyzed; backbone weights unavailable (cannot update feature extractor or apply debiasing that requires unfreezing); training data for the foundation model primarily from India and the US (>800k CXRs, >700k from India), potentially contributing to observed biases; analysis limited to selected labels; cannot determine exact origin of biases due to limited insight into pretraining data/process.
Output
CDEs: RDE397.1, RDE2721, RDE2927, RDE2722
Description: Backbone outputs high-dimensional feature vectors; downstream classifiers output multilabel probabilistic predictions for chest radiograph findings.
Recommendation
Comprehensive bias and subgroup performance analysis should be performed before any clinical use; the studied foundation model may be unsafe for clinical applications as it could amplify health disparities.
Reproducibility
All code and instructions to recreate data splits and analyses are available under Apache 2.0 at https://github.com/biomedia-mira/cxr-foundation-bias.
Use
Intended: Detection and diagnosis
Out-of-scope: Other
Excluded: Decision support
User
Intended: Researcher
Out-of-scope: Referring provider
Excluded: Other