Chest Radiography Foundation Model (as studied for bias)
model
2025-12-03
https://doi.org/10.1148/atlas.1764793213103
141

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

Chest Radiography Foundation Model (as studied for bias)

Link

https://github.com/biomedia-mira/cxr-foundation-bias

Indexing

Keywords: Conventional Radiography, Computer Application-Detection/Diagnosis, Chest Radiography, Bias, Foundation Models
Content: CH
RadLex: RID11550, RID35300, RID35486, RID5352

Author(s)

Ben Glocker
Charles Jones
Mélanie Roschewitz
Stefan Winzeck

Organization(s)

Imperial College London

Version

1.0

License

Text: CC BY 4.0
URL: https://creativecommons.org/licenses/by/4.0/

Funding

B.G.: ERC grant 757173 (Project MIRA), Innovate UK/UKRI (UKRI London Medical Imaging & AI Centre for Value Based Healthcare), Royal Academy of Engineering; employment/stock options disclosures as stated. C.J.: Microsoft Research–EPSRC iCASE PhD Scholarship. M.R.: Imperial College London President's PhD Scholarship. S.W.: Innovate UK/UKRI (UKRI London Medical Imaging & AI Centre for Value Based Healthcare).

Ethical review

Retrospective study using publicly available secondary data; HIPAA-compliant; exempt from ethical approval.

Date

Updated: 2023-11-01
Published: 2023-09-27
Created: 2023-02-28

References

[1] Glocker B, Jones C, Roschewitz M, Winzeck S. "Risk of Bias in Chest Radiography Deep Learning Foundation Models". Radiology: Artificial Intelligence. 2023 Nov;5(6):e230060. doi:10.1148/ryai.230060. PMID: 38074789. PMCID: PMC10698597.

Model

Architecture

Proprietary chest radiography foundation model used as a fixed feature extractor (pretrained on natural images, then on >800,000 chest radiographs using supervised contrastive learning). Downstream classifiers tested: a single linear layer (CXR-linear) and multilayer perceptrons with 3 or 5 hidden layers (CXR-MLP-3, CXR-MLP-5). Baseline comparator: DenseNet-121 fine-tuned on CheXpert.

Availability

The foundation model is accessible only via a programming interface that outputs features; the network weights are not publicly available.

Clinical benefit

Not established; study concludes the investigated foundation model may be unsafe for clinical applications due to subgroup performance disparities.

Clinical workflow phase

Research evaluation/auditing of AI models for bias and subgroup performance.

Decision threshold

Thresholds selected per model to achieve a false-positive rate (FPR) of 0.20 on the whole sample; Youden's J statistic (J = TPR − FPR) reported at this operating point.
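The operating-point selection above can be sketched as follows. This is an illustrative reconstruction, not the study's code: the scores and labels are made up, and `threshold_for_fpr` is a hypothetical helper that picks the score threshold whose whole-sample FPR is closest to the 0.20 target.

```python
# Sketch: choose a per-model threshold targeting FPR = 0.20, then report
# Youden's J = TPR - FPR at that operating point. Toy data, illustrative only.

def threshold_for_fpr(scores, labels, target_fpr=0.20):
    """Return the score threshold whose FPR is closest to target_fpr."""
    neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    # FPR = fraction of negatives scoring at or above the threshold.
    k = round(target_fpr * len(neg))
    k = max(1, min(k, len(neg)))
    return neg[k - 1]

def tpr_fpr(scores, labels, thr):
    """True-positive and false-positive rates at threshold thr."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr)
    p = sum(labels)
    n = len(labels) - p
    return tp / p, fp / n

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1, 0.05]
labels = [1,   1,   1,   0,   1,   0,    0,   0,   0,   0]
thr = threshold_for_fpr(scores, labels, 0.20)
tpr, fpr = tpr_fpr(scores, labels, thr)
youden_j = tpr - fpr  # operating-point summary reported in the study
```

With a finite sample the achieved FPR only approximates the 0.20 target; the study's exact tie-breaking rule is not specified here.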

Degree of automation

Feature extraction automated; downstream decision models trained by researchers; not autonomous clinical decision support.

Indications for use

Investigated as a feature extractor for downstream chest radiograph disease detection (labels: no finding, pleural effusion, cardiomegaly, pneumothorax) on adult patients in the CheXpert dataset; not intended for clinical use in this study.

Input

Posteroanterior and other-view chest radiographs from the CheXpert dataset; features produced by the foundation model's backbone.

Instructions

The backbone is kept frozen; features are extracted via the provided API, and a downstream classifier (linear or MLP) is trained on the CheXpert training set with a validation split for model selection; evaluation uses resampled, balanced CheXpert test sets with subgroup analyses.
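The frozen-backbone workflow can be sketched as below. Everything here is a hypothetical stand-in: `extract_features` imitates the proprietary feature API (it is not the real interface), the features are random, and the head is a plain logistic-regression probe analogous to the CXR-linear setup.

```python
import math
import random

def extract_features(index):
    # Stand-in for the proprietary API: the frozen backbone would return an
    # embedding per radiograph; here we fabricate an 8-dim vector per index.
    rng = random.Random(index)
    return [rng.random() for _ in range(8)]

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_linear_head(feats, labels, lr=0.5, epochs=300):
    """Logistic-regression probe on frozen features (CXR-linear analogue)."""
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

feats = [extract_features(i) for i in range(20)]
labels = [1 if f[0] > 0.5 else 0 for f in feats]  # toy, separable target
w, b = train_linear_head(feats, labels)
preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, f)) + b) >= 0.5 for f in feats]
accuracy = sum(int(p) == y for p, y in zip(preds, labels)) / len(labels)
```

The key property the sketch demonstrates is that only the head's parameters (`w`, `b`) are ever updated; the feature extractor never changes, which is also why debiasing methods that require unfreezing the backbone are unavailable in this setting.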

Limitations

Only one proprietary foundation model was analyzed; the backbone weights are unavailable, so the feature extractor cannot be updated and debiasing methods that require unfreezing cannot be applied; the foundation model's training data come primarily from India and the US (>800,000 CXRs, >700,000 from India), which may contribute to the observed biases; the analysis is limited to selected labels; the exact origin of the biases cannot be determined given limited insight into the pretraining data and process.

Output

CDEs: RDE397.1, RDE2721, RDE2927, RDE2722
Description: Backbone outputs high-dimensional feature vectors; downstream classifiers output multilabel probabilistic predictions for chest radiograph findings.

Recommendation

Comprehensive bias and subgroup performance analysis should be performed before any clinical use; the studied foundation model may be unsafe for clinical applications as it could amplify health disparities.
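A subgroup audit of the kind recommended above can be sketched as follows. The records, group names, and the fixed 0.5 threshold are all invented for illustration; the study compares metrics such as per-subgroup TPR at a shared operating point and inspects the disparity between groups.

```python
from collections import defaultdict

def subgroup_tpr(records, threshold=0.5):
    """Per-subgroup true-positive rate; records are (group, score, label)."""
    hits, pos = defaultdict(int), defaultdict(int)
    for group, score, label in records:
        if label == 1:
            pos[group] += 1
            if score >= threshold:
                hits[group] += 1
    return {g: hits[g] / pos[g] for g in pos}

# Hypothetical predictions for two subgroups, A and B.
records = [
    ("A", 0.9, 1), ("A", 0.7, 1), ("A", 0.6, 1), ("A", 0.2, 0),
    ("B", 0.8, 1), ("B", 0.4, 1), ("B", 0.3, 1), ("B", 0.1, 0),
]
tprs = subgroup_tpr(records)
gap = max(tprs.values()) - min(tprs.values())  # TPR disparity across groups
```

A large `gap` at a shared threshold is exactly the kind of subgroup performance disparity the study flags: the same model, at the same operating point, detects disease far less often in one group than another.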

Reproducibility

All code and instructions to recreate data splits and analyses are available under Apache 2.0 at https://github.com/biomedia-mira/cxr-foundation-bias.

Use

Intended: Detection and diagnosis
Out-of-scope: Other
Excluded: Decision support

User

Intended: Researcher
Out-of-scope: Referring provider
Excluded: Other