RSNA Intracranial Hemorrhage Detection (RSNA-IHD) Dataset

The Radiological Society of North America Intracranial Hemorrhage Detection (RSNA-IHD) dataset is a collection of over 25,000 CT brain scans annotated by a cohort of over 60 volunteer radiologists from RSNA and the American Society of Neuroradiology to indicate the presence and subtypes of acute intracranial hemorrhage. The imaging data, totaling 874,035 images, were provided by three institutions. Initially compiled in 2019 for the RSNA Intracranial Hemorrhage Detection AI Challenge hosted on the Kaggle competition platform, it represents the largest publicly available collection of its kind. Additional information on the dataset and how to use it is provided in the Data Resource Publication (https://pubs.rsna.org/doi/10.1148/ryai.2020190211).

2025-11-21

https://doi.org/10.1148/atlas.1763403432204

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

RSNA Intracranial Hemorrhage Detection (RSNA-IHD) Dataset

Link

https://mira.rsna.org/dataset/1

Indexing

Keywords: Intracranial Hemorrhage, Brain Computed Tomography, Hemorrhage Classification, Neuroradiology, Multi-institutional Dataset, Expert Annotation, Machine Learning Challenge, Subarachnoid Hemorrhage, Intraventricular Hemorrhage, Subdural Hemorrhage, Epidural Hemorrhage, Intraparenchymal Hemorrhage, Computed Tomography Imaging

Content: CT, HN, NR

RadLex: RID4700, RID4710, RID6383

Author(s)

Adam E. Flanders

Luciano M. Prevedello

George Shih

Safwan S. Halabi

Jayashree Kalpathy-Cramer

Robyn Ball

John T. Mongan

Anouk Stein

Felipe C. Kitamura

Matthew P. Lungren

Gagandeep Choudhary

Lesley Cala

Luiz Coelho

Monique Mogensen

Fanny Morón

Elka Miller

Ichiro Ikuta

Vahe Zohrabian

Olivia McDonnell

Christie Lincoln

Lubdha Shah

David Joyner

Amit Agarwal

Ryan K. Lee

Jaya Nath

Organization(s)

Radiological Society of North America

American Society of Neuroradiology

Stanford University - Center for Artificial Intelligence in Medicine & Imaging (AIMI)

St. Michael’s | LKS-CHART | Diagnostic Imaging and Learning Algorithms (DILA)

Thomas Jefferson University Hospital

Universidade Federal de São Paulo

Version

1.0

License

Text: RSNA MIRA DATASET RESEARCH USE AGREEMENT

URL: https://docs.google.com/document/d/1r8_0yW-5XqxSqhFzFq2fV6L4NxIQ6drF0sBjXXJevXU/edit?tab=t.0#heading=h.1iah9825ct0r

Contact

informatics@rsna.org

Ethical review

Each contributing institution secured approval of its institutional review board and institutional compliance officers.

Comments

This dataset is composed of annotations of the five hemorrhage subtypes (subarachnoid, intraventricular, subdural, epidural, and intraparenchymal hemorrhage) typically encountered at brain CT. This 874,035-image, multi-institutional, and multinational brain hemorrhage CT dataset is the largest public collection of its kind that includes expert annotations from a large cohort of volunteer neuroradiologists for classifying intracranial hemorrhages. The intent for this challenge was to provide a large multi-institutional and multinational dataset to help develop machine learning algorithms that can assist in the detection and characterization of intracranial hemorrhage with brain CT.

Date

Created: 2019-12-09

Dataset

Motivation

The dataset was created to provide a large multi-institutional and multinational resource to support development of AI algorithms that can assist in the detection and characterization of intracranial hemorrhage with brain CT.

Sampling

Universidade Federal de São Paulo provided all brain CT examinations performed during a 1-year period. Stanford University preselected examinations based upon a normal versus abnormal assessment of radiology reports to provide a 50/50 sample of positive to negative examinations. Thomas Jefferson University Hospital extracted cases using simple natural language processing on radiology reports mentioning specific hemorrhage subtypes.
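The report-based case extraction described for Thomas Jefferson University Hospital can be illustrated with a minimal keyword-matching sketch. The actual search terms and tooling are not given in this record, so the term list and the function name `mentioned_subtypes` below are hypothetical, and the matching is deliberately naive (it does not handle negation such as "no subdural hemorrhage").

```python
import re

# Hypothetical search terms: the five subtype names listed in this record.
# The terms actually used by the contributing site are not documented here.
SUBTYPE_TERMS = [
    "subarachnoid",
    "intraventricular",
    "subdural",
    "epidural",
    "intraparenchymal",
]

# One case-insensitive, whole-word pattern per subtype term.
SUBTYPE_PATTERNS = {
    term: re.compile(rf"\b{term}\b", re.IGNORECASE) for term in SUBTYPE_TERMS
}


def mentioned_subtypes(report_text):
    """Return the sorted subtype terms that appear in a report's text."""
    return sorted(
        term
        for term, pattern in SUBTYPE_PATTERNS.items()
        if pattern.search(report_text)
    )


if __name__ == "__main__":
    report = "Acute subdural hematoma with a small Subarachnoid component."
    print(mentioned_subtypes(report))
```

A report would be kept as a candidate case when the returned list is non-empty. A filter this simple selects on mentions rather than confirmed findings, which is one reason extracted examinations still need expert annotation.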

Partitioning scheme

Data from each institution were divided into sets of 500 examinations, and the last 100 examinations in each segment were selected for the test and validation sets, which were reviewed independently by an additional two neuroradiologists. The training, test, and validation sets were disjoint at the patient level. Institutional representation was equally balanced, and hemorrhage subtypes were appropriately represented in the validation and test sets.
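The block-wise split described above can be sketched as follows. This is an illustration under stated assumptions, not the organizers' code: the function name `partition_exams` is hypothetical, a final block shorter than 500 examinations is assumed to give up its last 100 as well, and the sketch neither divides the held-out pool into test and validation sets nor enforces patient-level disjointness, since those details are not specified here.

```python
def partition_exams(exam_ids, block_size=500, holdout=100):
    """Split one institution's ordered examinations into training and held-out pools.

    Examinations are taken in consecutive blocks of `block_size`; the last
    `holdout` examinations of each block go to the held-out pool.
    """
    train, held_out = [], []
    for start in range(0, len(exam_ids), block_size):
        block = exam_ids[start:start + block_size]
        # Assumption: a short final block still contributes up to `holdout` items.
        cut = max(len(block) - holdout, 0)
        train.extend(block[:cut])
        held_out.extend(block[cut:])
    return train, held_out


if __name__ == "__main__":
    exams = list(range(1200))  # stand-in examination identifiers
    train, held_out = partition_exams(exams)
    print(len(train), len(held_out))
```

Applied per institution, this yields a held-out pool of roughly one fifth of the examinations, drawn evenly across the order in which the examinations were supplied.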

Missing information

Without knowledge of patient age or sex, it was impossible to assess for some conditions (eg, age-appropriate volume loss or white matter disease) that would have aided in designating an examination as “no hemorrhage/not normal.”

Relationships between instances

The training dataset contained serial imaging on abnormal examinations, and the temporal sequence of the evolution of the abnormalities was not needed for labeling.

Noise

The most common user error observed was under-labeling, or inadvertent designation of a single image label to reflect the entire examination. The second most common error was over-labeling by applying hemorrhage labels to images that extended beyond the visible feature. Misclassification of hemorrhage subtype was the third most common error identified.
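Under-labeling of the kind described above leaves a recognizable pattern: an examination with many images but a hemorrhage label on only one of them. The sketch below flags such examinations for manual review. It is a hypothetical heuristic, not the quality-control procedure used for this dataset, and the input layout (a dictionary mapping an examination identifier to a list of per-image label sets) is assumed for illustration.

```python
def flag_possible_underlabeling(exam_labels):
    """Return IDs of multi-image examinations with exactly one labeled image.

    `exam_labels` maps an examination ID to a list of per-image label sets;
    an empty set means the image carries no hemorrhage label.
    """
    flagged = []
    for exam_id, image_labels in exam_labels.items():
        labeled_images = sum(1 for labels in image_labels if labels)
        # A single labeled image in a multi-image series may mean the reader
        # marked one image to stand for the whole examination.
        if labeled_images == 1 and len(image_labels) > 1:
            flagged.append(exam_id)
    return flagged


if __name__ == "__main__":
    exams = {
        "exam_A": [set(), {"subdural"}, set()],    # one labeled image: flagged
        "exam_B": [{"subdural"}, {"subdural"}],     # consistently labeled
        "exam_C": [set(), set()],                   # no hemorrhage labels
    }
    print(flag_possible_underlabeling(exams))
```

A small hemorrhage can legitimately appear on a single image, so a flag here is a prompt for review rather than evidence of an error.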

Confidentiality

Imaging and related clinical data are fully de-identified.

Re-identification

Site-specific anonymization “signatures” could be used to discriminate one site from another, requiring another round of anonymization and synthetic unique identifier generation to normalize the examination metadata.