Radiograph Classification Study Dataset and Splits (NIH Chest X-ray 14, CheXpert, PadChest, MIMIC-CXR, MURA)
dataset | 2026-01-24 | https://doi.org/10.1148/atlas.1769273341930

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

Radiograph Classification Study Dataset and Splits (NIH Chest X-ray 14, CheXpert, PadChest, MIMIC-CXR, MURA)

Link

https://github.com/zachmurphy1/transformer-radiographs

Indexing

Keywords: visual transformer, DeiT, DenseNet121, ResNet152, EfficientNetB7, transfer learning, hidden stratification, chest radiograph, upper extremity radiograph, NIH Chest X-ray 14, CheXpert, PadChest, MIMIC-CXR, MURA
Content: CH, IN, MK, RS
SNOMED: 3723001, 233604007, 95436008, 39839004, 125605004, 36118008, 8186001, 72704001

Author(s)

Zachary R. Murphy
Kesavan Venkatesh
Jeremias Sulam
Paul H. Yi

Organization(s)

Department of Anesthesiology, University of Michigan
Department of Biomedical Engineering, Johns Hopkins University Whiting School of Engineering
University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine

Funding

Authors declared no funding for this work.

Ethical review

HIPAA-compliant retrospective study using public datasets; no IRB approval required because all data were in the public domain and no human patients were involved.

Comments

Study used five public, de-identified radiograph datasets for training, validation, and testing; code and data splits provided in the linked repository.

Date

Published: 2022-09-21

References

[1] Murphy ZR, Venkatesh K, Sulam J, Yi PH. "Visual Transformers and Convolutional Neural Networks for Disease Classification on Radiographs: A Comparison of Performance, Sample Efficiency, and Hidden Stratification". Radiology: Artificial Intelligence. 2022-11-01. doi:10.1148/ryai.220012. PMID: 36523640. PMCID: PMC9745440.
[2] Irvin J, Rajpurkar P, Ko M, et al. "CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison". arXiv 1901.07031. 2019-01-21. Available from: https://arxiv.org/abs/1901.07031
[3] Bustos A, Pertusa A, Salinas JM, de la Iglesia-Vayá M. "PadChest: a large chest x-ray image dataset with multi-label annotated reports". Med Image Anal. 2020-01-01. PMID: 32877839. Available from: http://europepmc.org/abstract/MED/32877839
[4] Johnson AEW, Pollard TJ, Berkowitz SJ, et al. "MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports". Sci Data. 2019-01-01. PMID: 31831740. PMCID: PMC6908718. Available from: https://www.nature.com/articles/s41597-019-0322-0
[5] Rajpurkar P, Irvin J, Bagul A, et al. "MURA: large dataset for abnormality detection in musculoskeletal radiographs". arXiv 1712.06957. 2017-12-11. Available from: https://arxiv.org/abs/1712.06957
[6] Wang X, Peng Y, Lu L, et al. "ChestX-Ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases". 2017 IEEE CVPR. 2017-01-01. Available from: https://openaccess.thecvf.com/content_cvpr_2017/html/Wang_ChestX-Ray8_Hospital-Scale_Chest_CVPR_2017_paper.html

Dataset

Motivation

To compare performance, sample efficiency, and hidden stratification of ViT vs CNN architectures for disease classification on radiographs using transfer learning.

Sampling

External chest radiograph test sets were created by randomly selecting 25,000 images from each of CheXpert, PadChest, and MIMIC-CXR.
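The random selection described above could be reproduced along the following lines. This is a minimal sketch, not the authors' code: the function name `sample_external_test_set`, the fixed seed, and the assumption that each dataset is represented as a list of image identifiers are all hypothetical. The article does not state which seed was used or whether sampling was stratified, so the linked repository remains the authoritative source for the actual splits.

```python
import random


def sample_external_test_set(image_ids, n=25_000, seed=0):
    """Draw a reproducible random subset of image IDs without replacement.

    `seed` is a hypothetical choice; the source does not report one.
    """
    if n > len(image_ids):
        raise ValueError(f"requested {n} images but only {len(image_ids)} available")
    # Sort first so the result does not depend on the input ordering.
    ordered = sorted(image_ids)
    rng = random.Random(seed)
    return rng.sample(ordered, n)


if __name__ == "__main__":
    # Hypothetical identifiers standing in for one external dataset.
    ids = [f"img_{i:06d}" for i in range(100_000)]
    subset = sample_external_test_set(ids)
    print(len(subset))
```

Sorting before sampling is a design choice that makes the subset depend only on the set of identifiers and the seed, which helps when file listings come back in a different order across machines.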

Partitioning scheme

Held-out internal test sets for NIH Chest X-ray 14 and MURA; external chest radiograph test sets drawn as random 25,000-image subsets from CheXpert, PadChest, and MIMIC-CXR.

Missing information

Exact train/validation/test split sizes for NIH Chest X-ray 14 and MURA not specified in the article text; preprocessing details are in supplemental Appendix E1.

Relationships between instances

MURA images are grouped into studies, and labels are assigned at the study level. For training, each image inherited the label of its study; for evaluation, image-level predictions were averaged within each study to give one study-level prediction.
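The study-level handling described above can be illustrated with a short sketch. It assumes a hypothetical mapping from image ID to study ID and per-image predicted probabilities held in plain dictionaries; the helper names `expand_study_labels` and `aggregate_by_study` are illustrative and do not come from the linked repository.

```python
from collections import defaultdict
from statistics import fmean


def expand_study_labels(study_labels, image_to_study):
    """Training side: give every image the label of the study it belongs to."""
    return {image: study_labels[study] for image, study in image_to_study.items()}


def aggregate_by_study(image_probs, image_to_study):
    """Evaluation side: average image-level predictions within each study."""
    grouped = defaultdict(list)
    for image, prob in image_probs.items():
        grouped[image_to_study[image]].append(prob)
    return {study: fmean(probs) for study, probs in grouped.items()}


if __name__ == "__main__":
    # Hypothetical toy data: two studies, three images.
    image_to_study = {"a.png": "study1", "b.png": "study1", "c.png": "study2"}
    study_labels = {"study1": 1, "study2": 0}
    image_probs = {"a.png": 0.25, "b.png": 0.75, "c.png": 0.5}

    print(expand_study_labels(study_labels, image_to_study))
    print(aggregate_by_study(image_probs, image_to_study))
```

Because labels exist only per study, metrics computed on the averaged predictions are comparable to the study-level ground truth, whereas image-level metrics would count multi-view studies more than once.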

Noise

Some labels were derived from radiology reports using NLP (NIH Chest X-ray 14 and PadChest; CheXpert and MIMIC-CXR have seven NLP-derived labels), which may introduce label noise.

External data

Training on NIH Chest X-ray 14 and MURA with internal held-out test sets; external testing on CheXpert, PadChest, and MIMIC-CXR (random 25,000-image subsets each).

Confidentiality

Public, de-identified datasets; HIPAA-compliant use.

Re-identification

Datasets are described as de-identified/public; no direct identifiers reported.

Sensitive data

No PHI; public de-identified radiograph datasets.