Batch x-ray (constructed from RSNA Pneumonia Detection Challenge normals and Kermany et al pediatric pneumonia)
2026-01-24
https://doi.org/10.1148/atlas.1769272494681
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/dataset.json
Name
Batch x-ray (constructed from RSNA Pneumonia Detection Challenge normals and Kermany et al pediatric pneumonia)
Link
https://pmc.ncbi.nlm.nih.gov/articles/PMC9885377
Indexing
Keywords: batch effect, chest radiograph, pneumonia, adult, pediatric, generalizability
Content: CH
RadLex: RID50374, RID11284, RID5350, RID10345, RID34492, RID5648
Author(s)
Farhad Maleki
Katie Ovens
Rajiv Gupta
Caroline Reinhold
Alan Spatz
Reza Forghani
Organization(s)
Radiological Society of North America (RSNA)
Kaggle
Ethical review
Retrospective, institutional review board–exempt study as stated for the overall work.
Comments
Constructed by the study authors to simulate a batch effect: pneumonia cases come from Kermany et al (pediatric), and normal (no findings) cases come from the RSNA Pneumonia Detection Challenge (adult). The study uses it to demonstrate how batch effects impair generalizability.
Date
Published: 2022-11-16
References
[1] Kermany DS, Goldbaum M, Cai W, et al. "Identifying medical diagnoses and treatable diseases by image-based deep learning". Cell. 2018. PMID: 29474911.
[2] "RSNA Pneumonia Detection Challenge (dataset)". Kaggle. Available from: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge
[3] Maleki F, Ovens K, Gupta R, Reinhold C, Spatz A, Forghani R. "Generalizability of Machine Learning Models: Quantitative Evaluation of Three Methodological Pitfalls". Radiology: Artificial Intelligence. 2022. doi:10.1148/ryai.220028. PMID: 36721408. PMCID: PMC9885377.
Dataset
Motivation
To simulate and quantify the impact of batch effect on model generalizability.
Sampling
All pneumonia samples taken from Kermany et al; all normal (no findings) samples taken from RSNA Pneumonia Detection Challenge.
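The sampling rule above can be sketched as a small manifest builder. This is a minimal illustration and not the authors' code: the `Record` fields, the function name `build_batch_manifest`, and the file paths in the usage lines are all hypothetical, and the article does not report file formats or directory layouts.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    """One image entry in the constructed dataset (hypothetical schema)."""
    path: str
    label: str      # "pneumonia" or "normal"
    source: str     # "kermany" or "rsna"
    age_group: str  # "pediatric" or "adult"


def build_batch_manifest(
    kermany_pneumonia_paths: Iterable[str],
    rsna_normal_paths: Iterable[str],
) -> List[Record]:
    """Label every Kermany image as pneumonia and every RSNA image as normal."""
    records = [
        Record(p, "pneumonia", "kermany", "pediatric")
        for p in kermany_pneumonia_paths
    ]
    records += [
        Record(p, "normal", "rsna", "adult")
        for p in rsna_normal_paths
    ]
    return records


# Usage with placeholder file names (assumed, not taken from either dataset).
manifest = build_batch_manifest(
    ["kermany/pneumonia_0001.jpeg", "kermany/pneumonia_0002.jpeg"],
    ["rsna/normal_0001.dcm"],
)
```

Keeping `source` and `age_group` alongside `label` makes the confounding explicit in the manifest itself, so it can be audited before training.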
Missing information
The article reports no per-patient counts, ages, or file formats for this constructed dataset.
Relationships between instances
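Class membership is confounded with data source and age group: all pneumonia images come from pediatric patients and all normal images come from adults. A classifier can therefore separate the classes using source- or age-related cues rather than signs of pneumonia.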
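A simple way to detect this kind of confounding in any manifest is to check whether the label can be recovered from the source alone. The sketch below is an illustration under assumptions, not a method from the article: the function names and the toy `(label, source)` rows are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Row = Tuple[str, str]  # (label, source)


def label_source_table(rows: Iterable[Row]) -> Counter:
    """Count how many instances fall into each (label, source) cell."""
    return Counter(rows)


def is_fully_confounded(rows: Iterable[Row]) -> bool:
    """Return True when label and source determine each other one-to-one.

    Requires at least two labels, since a single-class dataset is
    trivially one-to-one and tells us nothing about confounding.
    """
    sources_per_label = {}
    labels_per_source = {}
    for label, source in rows:
        sources_per_label.setdefault(label, set()).add(source)
        labels_per_source.setdefault(source, set()).add(label)
    if len(sources_per_label) < 2:
        return False
    return all(len(s) == 1 for s in sources_per_label.values()) and all(
        len(l) == 1 for l in labels_per_source.values()
    )


# Toy rows mirroring the constructed dataset (counts are illustrative only).
batch_rows = [("pneumonia", "kermany")] * 3 + [("normal", "rsna")] * 2
# A hypothetical mixed design where each source contributes both classes.
mixed_rows = batch_rows + [("normal", "kermany"), ("pneumonia", "rsna")]
```

Here `is_fully_confounded(batch_rows)` holds, while the mixed design does not trigger the check, because every source contributes both classes.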
External data
Composed of two external public datasets as sources: RSNA Pneumonia Detection Challenge normals (adults) and Kermany et al pediatric pneumonia.
Sensitive data
Includes pediatric chest radiographs.