RSNA Intracranial Hemorrhage Detection (RSNA-IHD) Dataset
The Radiological Society of North America Intracranial Hemorrhage Detection (RSNA-IHD) dataset is a collection of over 25,000 CT brain scans annotated by a cohort of over 60 volunteer radiologists from RSNA and the American Society of Neuroradiology to show the presence and subtypes of acute intracranial hemorrhages. The imaging data, totaling 874,035 images, were provided by three institutions. Initially compiled in 2019 for the RSNA Intracranial Hemorrhage Detection AI Challenge hosted on the Kaggle competition platform, it represents the largest publicly available collection of its kind. Additional information on the dataset and how to use it is provided in the Data Resource Publication (https://pubs.rsna.org/doi/10.1148/ryai.2020190211).
2025-11-21
https://doi.org/10.1148/atlas.1763403432204
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/dataset.json
Name
RSNA Intracranial Hemorrhage Detection (RSNA-IHD) Dataset
Link
https://mira.rsna.org/dataset/1
Indexing
Keywords: Intracranial Hemorrhage, Brain Computed Tomography, Hemorrhage Classification, Neuroradiology, Multi-institutional Dataset, Expert Annotation, Machine Learning Challenge, Subarachnoid Hemorrhage, Intraventricular Hemorrhage, Subdural Hemorrhage, Epidural Hemorrhage, Intraparenchymal Hemorrhage, Computed Tomography Imaging
Content: CT, HN, NR
RadLex: RID4700, RID4710, RID6383
Author(s)
Adam E. Flanders
Luciano M. Prevedello
George Shih
Safwan S. Halabi
Jayashree Kalpathy-Cramer
Robyn Ball
John T. Mongan
Anouk Stein
Felipe C. Kitamura
Matthew P. Lungren
Gagandeep Choudhary
Lesley Cala
Luiz Coelho
Monique Mogensen
Fanny Morón
Elka Miller
Ichiro Ikuta
Vahe Zohrabian
Olivia McDonnell
Christie Lincoln
Lubdha Shah
David Joyner
Amit Agarwal
Ryan K. Lee
Jaya Nath
Organization(s)
Radiological Society of North America
American Society of Neuroradiology
Stanford University - Center for Artificial Intelligence in Medicine & Imaging (AIMI)
St. Michael’s | LKS-CHART | Diagnostic Imaging and Learning Algorithms (DILA)
Thomas Jefferson University Hospital
Universidade Federal de São Paulo
Version
1.0
License
Text: RSNA MIRA DATASET RESEARCH USE AGREEMENT
URL: https://docs.google.com/document/d/1r8_0yW-5XqxSqhFzFq2fV6L4NxIQ6drF0sBjXXJevXU/edit?tab=t.0#heading=h.1iah9825ct0r
Contact
informatics@rsna.org
Ethical review
Each contributing institution secured approval of its institutional review board and institutional compliance officers.
Comments
This dataset is composed of annotations of the five hemorrhage subtypes (subarachnoid, intraventricular, subdural, epidural, and intraparenchymal hemorrhage) typically encountered at brain CT. This 874,035-image, multi-institutional, and multinational brain hemorrhage CT dataset is the largest public collection of its kind that includes expert annotations from a large cohort of volunteer neuroradiologists for classifying intracranial hemorrhages. The intent for this challenge was to provide a large multi-institutional and multinational dataset to help develop machine learning algorithms that can assist in the detection and characterization of intracranial hemorrhage with brain CT.
Date
Created: 2019-12-09
Dataset
Motivation
The dataset was created to provide a large multi-institutional and multinational resource to support development of AI algorithms that can assist in the detection and characterization of intracranial hemorrhage with brain CT.
Sampling
Universidade Federal de São Paulo provided all brain CT examinations performed during a 1-year period. Stanford University preselected examinations based upon a normal versus abnormal assessment of radiology reports to provide a 50/50 sample of positive to negative examinations. Thomas Jefferson University Hospital extracted cases using simple natural language processing on radiology reports mentioning specific hemorrhage subtypes.
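The report-based case extraction described for Thomas Jefferson University Hospital can be illustrated with a minimal keyword-matching sketch. This is not the institution's actual pipeline, which is not specified here; the pattern table `SUBTYPE_PATTERNS` and the function `mentioned_subtypes` are hypothetical names introduced only for illustration, and the sketch assumes that a plain case-insensitive mention of a subtype term is enough to flag a report.

```python
import re

# Hypothetical pattern table: one case-insensitive pattern per hemorrhage
# subtype named in this dataset description. The real extraction rules
# are not documented here.
SUBTYPE_PATTERNS = {
    "subarachnoid": re.compile(r"\bsubarachnoid\b", re.IGNORECASE),
    "intraventricular": re.compile(r"\bintraventricular\b", re.IGNORECASE),
    "subdural": re.compile(r"\bsubdural\b", re.IGNORECASE),
    "epidural": re.compile(r"\bepidural\b", re.IGNORECASE),
    "intraparenchymal": re.compile(r"\bintraparenchymal\b", re.IGNORECASE),
}


def mentioned_subtypes(report_text: str) -> list[str]:
    """Return the subtypes whose term appears anywhere in the report text.

    This flags mentions only. It does not handle negation, so a report
    stating "no subdural hemorrhage" is still flagged for "subdural".
    """
    return [
        name
        for name, pattern in SUBTYPE_PATTERNS.items()
        if pattern.search(report_text)
    ]


if __name__ == "__main__":
    report = "Acute Subdural hematoma with small subarachnoid hemorrhage."
    print(mentioned_subtypes(report))
```

Because the match is on mentions rather than confirmed findings, a selection of this kind yields candidate examinations that still require expert review, which is consistent with the annotation process described for this dataset.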
Partitioning scheme
Data from each institution were divided into sets of 500 examinations, and the last 100 examinations in each segment were selected for the test and validation sets, which were reviewed independently by an additional two neuroradiologists. The training, test, and validation sets were disjoint at the patient level. Institutional representation was equally balanced, and hemorrhage subtypes were appropriately represented in the validation and test sets.
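The segment-based split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the organizers' code: the function name `partition_exams` and its parameters are hypothetical, the examination list is assumed to be already ordered for one institution, and a trailing segment shorter than 500 examinations is assumed to stay entirely in the training set because the description does not say how it was handled. The sketch also does not enforce the patient-level disjointness or the second-reader review mentioned above.

```python
from typing import Sequence


def partition_exams(
    exam_ids: Sequence[str],
    segment_size: int = 500,
    holdout_per_segment: int = 100,
) -> tuple[list[str], list[str]]:
    """Split one institution's ordered examinations into training and held-out lists.

    Each full segment of `segment_size` examinations contributes its last
    `holdout_per_segment` examinations to the held-out pool (test and
    validation); the rest go to training.
    """
    train: list[str] = []
    holdout: list[str] = []
    for start in range(0, len(exam_ids), segment_size):
        segment = list(exam_ids[start:start + segment_size])
        if len(segment) < segment_size:
            # Assumption: an incomplete trailing segment is kept for training.
            train.extend(segment)
            continue
        cut = segment_size - holdout_per_segment
        train.extend(segment[:cut])
        holdout.extend(segment[cut:])
    return train, holdout


if __name__ == "__main__":
    ids = [f"exam{i}" for i in range(1200)]
    train_ids, holdout_ids = partition_exams(ids)
    print(len(train_ids), len(holdout_ids))
```

Dividing the held-out pool further into separate test and validation sets is left out, since the description does not state how that division was made.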
Missing information
Without knowledge of patient age or sex, it was impossible to assess for some conditions (eg, age-appropriate volume loss or white matter disease) that would have aided in designating an examination as “no hemorrhage/not normal.”
Relationships between instances
The training dataset contained serial imaging on abnormal examinations, and the temporal sequence of the evolution of the abnormalities was not needed for labeling.
Noise
The most common user error observed was under-labeling, or inadvertent designation of a single image label to reflect the entire examination. The second most common error was over-labeling by applying hemorrhage labels to images that extended beyond the visible feature. Misclassification of hemorrhage subtype was the third most common error identified.
Confidentiality
Imaging and related clinical data are fully de-identified.
Re-identification
Site-specific anonymization “signatures” could be used to discriminate one site from another, requiring another round of anonymization and synthetic unique identifier generation to normalize the examination metadata.