COVID-19 Study Datasets (M Health Fairview, IU, Emory, and public sources)
dataset, 2026-01-24, https://doi.org/10.1148/atlas.1769275226676

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

COVID-19 Study Datasets (M Health Fairview, IU, Emory, and public sources)

Link

https://pubmed.ncbi.nlm.nih.gov/35923381/

Indexing

Keywords: COVID-19, Chest radiograph, AI diagnosis, Prospective validation, External validation, Emergency department
Content: CH, IN, RS, SQ
SNOMED: 75570004, 67782005, 840539006

Author(s)

Ju Sun
Le Peng
Taihui Li
Dyah Adila
Zach Zaiman
Genevieve B. Melton-Meaux
Nicholas E. Ingraham
Eric Murray
Daniel Boley
Sean Switzer
John L. Burns
Kun Huang
Tadashi Allen
Scott D. Steenburg
Judy Wawira Gichoya
Erich Kummerfeld
Christopher J. Tignanelli

Organization(s)

University of Minnesota
M Health Fairview
Emory University
Indiana University
North Memorial Health Hospital
Minnesota Supercomputing Institute
Epic Systems (Cognitive Computing)

License

Text: © 2022 RSNA. The article is available via the PMC Open Access Subset, with unrestricted re-use permitted during the COVID-19 pandemic per the PMC notice; the dataset itself is not publicly available.
URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344211/

Contact

Corresponding author: Christopher J. Tignanelli, University of Minnesota, 420 Delaware St SE, Minneapolis, MN 55455

Funding

Microsoft AI for Health COVID-19 grant (GPU support); AHRQ/PCORI K12HS026379; NIH NCATS KL2TR002492 (C.J.T.) and UL1TR002494 (E.K.); NIH NHLBI T32HL07741 (N.E.I.); NIBIB 75N92020D00018/75N92020F00001 (J.W.G.); NIBIB MIDRC 75N92020C00008 and 75N92020C00021 (Z.Z., J.W.G.); U.S. NSF #1928481 (J.W.G.); University of Minnesota OVPR COVID-19 Impact Grant (J.S., E.K., C.J.T.).

Ethical review

IRB approvals: University of Minnesota (STUDY 00011158; consent waived); Indiana University (STUDY 2010169012; exempt, de-identified data remained within IU); Emory University (STUDY 00000506).

Comments

Prospective, multi-site evaluation of an interpretable AI model for COVID-19 detection on chest radiographs using internal, temporal, external, and real-time datasets.

Date

Published: 2022-06-01

References

[1] Sun J, Peng L, Li T, et al. "Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study". Radiology: Artificial Intelligence. 2022-07-01. doi:10.1148/ryai.210217. PMID: 35923381. PMCID: PMC9344211.
[2] de la Iglesia Vayá M, Saborit JM, Montell JA, et al. "BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients". arXiv. 2020-01-01. Available from: https://arxiv.org/abs/2006.01174
[3] Johnson AEW, Pollard TJ, Berkowitz SJ, et al. "MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports". Scientific Data. 2019-01-01. doi:10.1038/s41597-019-0322-0. PMID: 31831740. PMCID: PMC6908718.
[4] Open-i Open Access Biomedical Image Search Engine. "Indiana University Chest Radiograph Collection". U.S. National Library of Medicine. 2013-01-01. Available from: https://openi.nlm.nih.gov/
[5] Cohen JP, et al. "COVID-19 Chest X-ray Dataset (GitHub)". GitHub. 2020-01-01. Available from: https://github.com/ieee8023/covid-chestxray-dataset

Dataset

Motivation

Evaluate real-time performance, equity, and drift of an interpretable AI model for COVID-19 detection on chest radiographs across 12 hospitals.

Sampling

Adult ED and inpatient chest radiographs from specified date ranges; controls sampled to reflect the pre-COVID ED CXR distribution; from the public datasets, negatives were randomly sampled and positives were selected COVID-19-positive frontal images.

Partitioning scheme

Training on M Health Fairview CXRs plus a subset of public images; separate hold-out for tuning; temporal validation (M Health Fairview, July 2020); external validation (Indiana University; Emory University); prospective real-time validation at M Health Fairview (week 1 pilot; weeks 8–19); separate dataset for radiologist comparison.

Missing information

Image resolution and per-source file formats (beyond DICOM/PNG) are not fully detailed; demographic details are absent for the public datasets because of de-identification.

Relationships between instances

In several validation partitions, one radiograph per participant was used; the training set included multiple images per participant for M Health Fairview positives (2220 images from 837 participants).
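As a rough illustration of these instance relationships, the sketch below computes the average number of training images per positive participant from the counts stated above, and shows one possible way to reduce a cohort to a single radiograph per participant. The function names and the (participant_id, image_id) record layout are hypothetical, and keeping the first-listed image is an assumption; the source does not say how the single radiograph was chosen.

```python
def images_per_participant(n_images, n_participants):
    """Average number of images contributed by each participant."""
    return n_images / n_participants


def one_image_per_participant(records):
    """Keep one image per participant.

    `records` is an iterable of (participant_id, image_id) pairs.
    Assumption: the first-listed image is kept; the actual selection
    rule used in the study is not stated in this record.
    """
    kept = {}
    for participant_id, image_id in records:
        # setdefault stores the image only if the participant is new
        kept.setdefault(participant_id, image_id)
    return kept


# Counts reported for M Health Fairview training positives
ratio = images_per_participant(2220, 837)
print(f"{ratio:.2f} images per participant")

# Toy records with made-up identifiers, for illustration only
toy = [("p1", "img_a"), ("p1", "img_b"), ("p2", "img_c")]
print(one_image_per_participant(toy))
```

With the reported counts, the training positives average roughly 2.65 images per participant, which is why participant-level grouping matters when the same participants must not appear in more than one partition.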

External data

Public datasets used: BIMCV COVID-19+, COVID Chest X-ray GitHub, MIMIC-CXR, Open-i.

Confidentiality

Clinical data from M Health Fairview, IU, and Emory; model not publicly available; external IU data fully de-identified for validation.

Re-identification

External IU validation deemed exempt as data were fully de-identified and remained within IU; public datasets de-identified; internal operational data handled within health systems.

Sensitive data

Health information related to COVID-19 status and clinical imaging.