fastMRI: A Publicly Available Raw k-Space and DICOM Dataset of Knee Images for Accelerated MR Image Reconstruction Using Machine Learning
dataset 2025-11-29 https://doi.org/10.1148/atlas.1764457735058

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

fastMRI: A Publicly Available Raw k-Space and DICOM Dataset of Knee Images for Accelerated MR Image Reconstruction Using Machine Learning

Link

https://doi.org/10.1148/ryai.2020190007

Indexing

Keywords: fastMRI dataset, MR image reconstruction, k-space data, DICOM image data, knee MRI examinations, accelerated MRI, machine learning, reproducibility, scan time reduction, image quality
Content: MR, MK

Author(s)

Florian Knoll
Jure Zbontar
Anuroop Sriram
Matthew J. Muckley
Mary Bruno
Aaron Defazio
Marc Parente
Krzysztof J. Geras
Joe Katsnelson
Hersh Chandarana
Zizhao Zhang
Michal Drozdzal
Adriana Romero
Michael Rabbat
Pascal Vincent
James Pinkerton
Duo Wang
Nafissa Yakubova
Erich Owens
C. Lawrence Zitnick
Michael P. Recht
Daniel K. Sodickson
Yvonne W. Lui

Organization(s)

NYU School of Medicine
Facebook Artificial Intelligence Research
New York University Center for Data Science
University of Florida

License

Text: Non-commercial use for research and educational purposes only; a data sharing agreement is required.

Contact

Yvonne.Lui@nyulangone.org

Funding

National Institutes of Health grants R01EB024532 and P41EB017183.

Ethical review

Curation of the dataset was part of a study approved by our local institutional review board.

Comments

The fastMRI dataset is the first large-scale public dataset that includes raw k-space data and DICOM data from a clinical population, tailored for MR image reconstruction research using machine learning. It aims to accelerate research, enable large-scale validation, and enhance reproducibility in the field.

Date

Published: 2020-01-01

Dataset

Motivation

The purpose of the fastMRI dataset is to provide the first step toward addressing the lack of a large-scale public dataset that includes raw k-space data for MR image reconstruction. It aims to provide a resource to improve image acquisition and reconstruction itself using machine learning techniques.

Sampling

The k-space dataset consists of fully sampled data from 1594 consecutive clinical MRI proton density–weighted acquisitions of the knee. The DICOM dataset includes 10012 consecutive clinical knee MRI examinations from 9290 patients, representing a full complement of clinical acquisitions.
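Because the k-space data are fully sampled, a reference image can in principle be obtained by an inverse Fourier transform of each coil's data followed by a coil combination. The following is a minimal sketch of that idea on synthetic data, not the dataset's official reconstruction pipeline. The function names, the array layout (coils, rows, columns), the constant coil weights, and the choice of root-sum-of-squares combination are all assumptions made for illustration.

```python
import numpy as np


def image_to_kspace(image: np.ndarray) -> np.ndarray:
    """Centered 2D FFT over the last two axes (illustrative helper)."""
    shifted = np.fft.ifftshift(image, axes=(-2, -1))
    kspace = np.fft.fft2(shifted, axes=(-2, -1), norm="ortho")
    return np.fft.fftshift(kspace, axes=(-2, -1))


def kspace_to_image(kspace: np.ndarray) -> np.ndarray:
    """Centered 2D inverse FFT over the last two axes (illustrative helper)."""
    shifted = np.fft.ifftshift(kspace, axes=(-2, -1))
    image = np.fft.ifft2(shifted, axes=(-2, -1), norm="ortho")
    return np.fft.fftshift(image, axes=(-2, -1))


def root_sum_of_squares(coil_images: np.ndarray, coil_axis: int = 0) -> np.ndarray:
    """Combine per-coil complex images into one magnitude image."""
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=coil_axis))


# Synthetic stand-in for one slice: a random non-negative "phantom" seen by
# four hypothetical coils with constant complex weights (an assumption; real
# coil sensitivities vary across the image).
rng = np.random.default_rng(0)
phantom = rng.random((32, 32))
weights = np.array([1.0, 0.5j, -0.5, 0.25 - 0.25j])
coil_images = weights[:, None, None] * phantom

# Simulate fully sampled multicoil k-space, then reconstruct it.
kspace = image_to_kspace(coil_images)
recon = root_sum_of_squares(kspace_to_image(kspace))

# With constant weights, the combined image is the phantom scaled by the
# root-sum-of-squares of the weight magnitudes.
expected = phantom * np.sqrt(np.sum(np.abs(weights) ** 2))
```

With real files, the synthetic `kspace` array would be replaced by data read from the distributed k-space files; the reading step is omitted here because the file layout is not described in this record.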

Partitioning scheme

The 1594 k-space examples are partitioned into training, validation, multicoil testing, single-coil testing, and a hold-back set for a challenge. The 10012 DICOM examinations are not partitioned into training, validation, testing, and challenge sets; they are provided as a single dataset for auxiliary training or generalizability testing.

Missing information

The dataset does not provide diagnostic labeling, segmentations, text reports, statistics on the prevalence of pathology, information on metal implants, or demographic information.

Relationships between instances

The examples in the training and validation sets are identical for the single-coil and multicoil k-space datasets. For the challenge and test sets, unique examples are provided for the single-coil and multicoil datasets to prevent information sharing.
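The relationship described above can be checked mechanically once the example identifiers of each partition are listed. The sketch below assumes each dataset is represented as a dictionary mapping a partition name to a list of example identifiers; the partition names (`train`, `val`, `test`, `challenge`), the function name, and the file names are hypothetical and chosen only for illustration.

```python
def check_split_relationships(single_coil: dict, multicoil: dict) -> list:
    """Return a list of human-readable problems; an empty list means the
    partitions match the stated relationship."""
    problems = []

    # Training and validation examples should be identical in both datasets.
    for name in ("train", "val"):
        if set(single_coil[name]) != set(multicoil[name]):
            problems.append(f"{name}: single-coil and multicoil examples differ")

    # Test and challenge examples should not be shared between the datasets.
    for name in ("test", "challenge"):
        overlap = set(single_coil[name]) & set(multicoil[name])
        if overlap:
            problems.append(f"{name}: shared examples {sorted(overlap)}")

    return problems


# Hypothetical identifiers, not actual file names from the dataset.
single_coil_splits = {
    "train": ["ex001", "ex002"],
    "val": ["ex003"],
    "test": ["ex010"],
    "challenge": ["ex020"],
}
multicoil_splits = {
    "train": ["ex001", "ex002"],
    "val": ["ex003"],
    "test": ["ex011"],
    "challenge": ["ex021"],
}

split_problems = check_split_relationships(single_coil_splits, multicoil_splits)
```

Returning a list of problems rather than raising on the first mismatch makes it easier to report every inconsistency in one pass.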

Noise

No examinations were excluded owing to the presence of imaging artifacts from motion, pulsatile flow, and so forth, in either the k-space or the DICOM data.

Confidentiality

k-space data were deidentified via conversion to the vendor-neutral International Society for Magnetic Resonance in Medicine (ISMRM) raw data format. DICOM data were deidentified by using the Radiological Society of North America’s clinical trial processor tool. All metadata, as well as the DICOM images themselves, were manually inspected to ensure that no protected health information remained in the dataset.

Re-identification

Not possible due to deidentification of k-space data and DICOM data, manual inspection for PHI, and generation of random integer patient identifiers for DICOM.
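As an illustration of what generating random integer patient identifiers can look like, the sketch below maps each original patient key to a unique random integer, so that repeat examinations of the same patient stay linked without exposing the original key. This is an assumption about one reasonable approach, not the procedure the authors used; the function name, the 32-bit identifier width, and the sample keys are hypothetical.

```python
import secrets


def assign_random_ids(patient_keys, id_bits: int = 32) -> dict:
    """Map each distinct patient key to a unique random integer."""
    mapping = {}
    used = set()
    for key in patient_keys:
        if key in mapping:
            continue  # same patient, reuse the identifier already drawn
        candidate = secrets.randbits(id_bits)
        while candidate in used:  # redraw on the rare collision
            candidate = secrets.randbits(id_bits)
        used.add(candidate)
        mapping[key] = candidate
    return mapping


# Hypothetical examination list: one patient key per examination.
exam_patients = ["patient-a", "patient-b", "patient-a", "patient-c"]
id_map = assign_random_ids(exam_patients)
pseudonymized = [id_map[key] for key in exam_patients]
```

In practice the mapping itself would have to be discarded or stored securely, since anyone holding it could reverse the pseudonymization.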

Sensitive data

The dataset includes pathologic findings at a rate representative of a clinical patient population, as it consists of consecutive examinations.