Batch x-ray (constructed from RSNA Pneumonia Detection Challenge normals and Kermany et al pediatric pneumonia)
dataset, 2026-01-24, https://doi.org/10.1148/atlas.1769272494681

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

Batch x-ray (constructed from RSNA Pneumonia Detection Challenge normals and Kermany et al pediatric pneumonia)

Link

https://pmc.ncbi.nlm.nih.gov/articles/PMC9885377

Indexing

Keywords: batch effect, chest radiograph, pneumonia, adult, pediatric, generalizability
Content: CH
RadLex: RID50374, RID11284, RID5350, RID10345, RID34492, RID5648

Author(s)

Farhad Maleki
Katie Ovens
Rajiv Gupta
Caroline Reinhold
Alan Spatz
Reza Forghani

Organization(s)

Radiological Society of North America (RSNA)
Kaggle

Ethical review

Retrospective study, exempt from institutional review board review, as stated for the overall work.

Comments

Constructed by the study authors to simulate a batch effect: pneumonia cases are sourced from Kermany et al. (pediatric), and normal (no findings) cases are sourced from the RSNA Pneumonia Detection Challenge (adult). The dataset is used to demonstrate how batch effects impair generalizability.

Date

Published: 2022-11-16

References

[1] Kermany DS, Goldbaum M, Cai W, et al. "Identifying medical diagnoses and treatable diseases by image-based deep learning". Cell. 2018. PMID: 29474911.
[2] "RSNA Pneumonia Detection Challenge (dataset)". Kaggle. Available from: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge
[3] Maleki F, Ovens K, Gupta R, Reinhold C, Spatz A, Forghani R. "Generalizability of Machine Learning Models: Quantitative Evaluation of Three Methodological Pitfalls". Radiology: Artificial Intelligence. 2022. doi:10.1148/ryai.220028. PMID: 36721408. PMCID: PMC9885377.

Dataset

Motivation

To simulate and quantify the impact of batch effect on model generalizability.

Sampling

All pneumonia samples were taken from Kermany et al.; all normal (no findings) samples were taken from the RSNA Pneumonia Detection Challenge.
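
As an illustration of this sampling rule, the following is a minimal sketch of how such a manifest could be assembled. It is not the study's code: the `Sample` type, the `build_manifest` function, the source tags, and the file identifiers are all hypothetical, and no file format is implied because the article does not report one.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Sample:
    image_id: str  # hypothetical identifier; real names and formats are not reported
    label: str     # "pneumonia" or "normal"
    source: str    # public dataset the image was drawn from


def build_manifest(kermany_pneumonia: Iterable[str],
                   rsna_normal: Iterable[str]) -> List[Sample]:
    """Label every Kermany image as pneumonia and every RSNA image as normal."""
    manifest = [Sample(i, "pneumonia", "kermany") for i in kermany_pneumonia]
    manifest += [Sample(i, "normal", "rsna") for i in rsna_normal]
    return manifest


# Made-up identifiers, for illustration only.
manifest = build_manifest(["kermany_0001", "kermany_0002"], ["rsna_0001"])
```

Because each class is drawn from exactly one source, the `source` field carries the same information as `label`, which is the property the dataset was built to exhibit.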

Missing information

The article does not report per-patient counts, ages, or file formats for this constructed dataset.

Relationships between instances

Class membership is confounded with data source and age group: pneumonia images come from pediatric patients, and normal images come from adults.
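
A confound of this kind can be checked by cross-tabulating label against source. The sketch below assumes per-image records with hypothetical `source`, `age_group`, and `label` fields; the records are made up for illustration and are not counts from the article.

```python
from collections import Counter


def crosstab(records):
    """Count how many records fall into each (source, label) cell."""
    return Counter((r["source"], r["label"]) for r in records)


def label_determined_by_source(records):
    """Return True if every source contributes images of exactly one label."""
    labels_per_source = {}
    for r in records:
        labels_per_source.setdefault(r["source"], set()).add(r["label"])
    return all(len(labels) == 1 for labels in labels_per_source.values())


# Made-up records mirroring the described construction.
records = [
    {"source": "kermany", "age_group": "pediatric", "label": "pneumonia"},
    {"source": "kermany", "age_group": "pediatric", "label": "pneumonia"},
    {"source": "rsna", "age_group": "adult", "label": "normal"},
]

counts = crosstab(records)
confounded = label_determined_by_source(records)
```

When the off-diagonal cells of the cross-tabulation are empty, as here, a model can separate the classes using source-specific or age-specific cues alone, without learning anything about pneumonia.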

External data

Composed of two external public datasets: RSNA Pneumonia Detection Challenge normals (adults) and Kermany et al. pediatric pneumonia.

Sensitive data

Includes pediatric chest radiographs.