ADMANI: Annotated Digital Mammograms and Associated Non-Image Datasets
dataset
2025-11-29
https://doi.org/10.1148/atlas.1764458851670
155

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

ADMANI: Annotated Digital Mammograms and Associated Non-Image Datasets

Link

https://doi.org/10.1148/ryai.220072

Indexing

Keywords: Mammography, Screening, Convolutional Neural Network, Breast Cancer Detection, Population Screening
Content: BR, OI, BQ, PH

Author(s)

Helen M. L. Frazer
Jennifer S. N. Tang
Michael S. Elliott
Katrina M. Kunicki
Brendan Hill
Ravishankar Karthik
Chun Fung Kwok
Carlos A. Peña-Solorzano
Yuanhong Chen
Chong Wang
Osamah Al-Qershi
Samantha K. Fox
Shuai Li
Enes Makalic
Tuong L. Nguyen
Daniel F. Schmidt
Prabhathi Basnayake Ralalage
Jocelyn F. Lippey
Peter Brotchie
John L. Hopper
Gustavo Carneiro
Davis J. McCarthy

Organization(s)

St Vincent’s BreastScreen
St Vincent’s Hospital Melbourne
BreastScreen Victoria
St Vincent’s Institute of Medical Research
University of Melbourne
University of Adelaide
Monash University
Ramaciotti Foundation

License

Text: Attribution 4.0 International (CC BY 4.0)
URL: https://creativecommons.org/licenses/by/4.0/

Contact

Helen.Frazer@svha.org.au

Funding

Supported in part by an Australian government grant (no. MRFAI000090), as part of the 2019 Medical Research Future Fund (MRFF) Applied Artificial Intelligence Research in Health grant opportunity, which funded the Transforming Breast Cancer Screening with AI (BRAIx) program. The BRAIx program also includes significant in-kind contributions from its grant partners: St Vincent’s Institute of Medical Research, St Vincent’s Hospital Melbourne, BreastScreen Victoria, University of Melbourne, and University of Adelaide. This work was also supported by a Ramaciotti Health Investment Grant from the Ramaciotti Foundation in Australia.

Ethical review

Use of the ADMANI datasets is governed under the executed BRAIx Multi-Institutional Agreement, with approvals by the human research ethics committee (approval nos. LNR/18/SVHM/162 and LNR/19/SVHM/123). All women sign a consent form at screening registration that provides for the use of the de-identified data for research purposes. A unique identifier is used for the purposes of the ADMANI datasets, with all image and non-image data de-identified.

Comments

The ADMANI datasets (ADMANI1, ADMANI2, and ADMANI3) are large-scale, multicenter, clinically curated breast screening mammographic datasets created for artificial intelligence algorithm development. They contain over 4.4 million images from 629,863 women and 1,048,345 screening episodes, enhanced with associated demographic and clinical non-image data.
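
As a rough sanity check on these totals, the per-episode and per-woman averages can be derived from the figures quoted above. This is only a sketch: the image count is given as "over 4.4 million", so 4,400,000 is treated here as a lower bound, and the variable names are illustrative rather than part of the dataset.

```python
# Totals quoted in the Comments section above.
n_images_lower_bound = 4_400_000   # "over 4.4 million" images, so a lower bound
n_women = 629_863
n_episodes = 1_048_345

# Average number of images per screening episode (lower bound).
images_per_episode = n_images_lower_bound / n_episodes

# Average number of screening episodes per woman.
episodes_per_woman = n_episodes / n_women

print(f"at least {images_per_episode:.2f} images per episode")
print(f"about {episodes_per_woman:.2f} episodes per woman")
```

The result of roughly four images per episode and between one and two episodes per woman is in line with the longitudinal, episode-based structure described in the Dataset section.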

Dataset

Motivation

To enable the development of AI-based algorithms to aid breast cancer detection in the mammographic screening population and support risk-based screening.

Sampling

The ADMANI2 and ADMANI3 datasets are large-scale, population-based, longitudinal resources reflecting the real-world screened population. The RSNA Challenge subset was randomly selected from a 3-year period of the dataset.

Partitioning scheme

The ADMANI datasets are structured into three temporal subsets (ADMANI1, ADMANI2, ADMANI3) covering different screening periods. A separate subset of 40,000 images from 10,000 episodes was randomly selected for the RSNA Breast Cancer Detection AI Challenge.
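
The selection procedure is described only as random, so the following is a minimal sketch of episode-level sampling rather than the method actually used. It assumes sampling is done per episode (which keeps all images of an episode together), and the function name, identifiers, and synthetic data are all hypothetical.

```python
import random

def sample_episodes(episode_to_images, n_episodes, seed=0):
    """Pick n_episodes episode IDs uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    chosen = rng.sample(sorted(episode_to_images), n_episodes)
    return {ep: episode_to_images[ep] for ep in chosen}

# Synthetic stand-in data: 50 episodes with four placeholder image names each.
episodes = {
    f"ep{i:03d}": [f"ep{i:03d}_img{j}" for j in range(4)] for i in range(50)
}

subset = sample_episodes(episodes, n_episodes=10)
n_images = sum(len(images) for images in subset.values())
print(len(subset), n_images)  # 10 episodes, 40 images
```

With four images per episode, the same ratio gives the 40,000 images from 10,000 episodes quoted above.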

Missing information

Interval cancer data from 2018 and 2019 are awaiting updates. Digital breast tomosynthesis images and breast ultrasound images are yet to be included.

Relationships between instances

The datasets are structured around individual screening episodes, defined as a single screening round including mammography, reading, assessment, and the subsequent 2-year screening interval.
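
One way to picture an episode as a record is sketched below. The class and field names are hypothetical and are not the ADMANI schema; the sketch also assumes a calendar date is available for the screen, which may not hold for the de-identified release.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ScreeningEpisode:
    """Hypothetical container for one screening round (not the ADMANI schema)."""
    woman_id: str                                    # de-identified unique identifier
    screen_date: date                                # assumed date of the mammography visit
    image_ids: list = field(default_factory=list)    # mammographic images
    reading: dict = field(default_factory=dict)      # radiologist reading data
    assessment: dict = field(default_factory=dict)   # assessment outcome, if any

    def interval_end(self) -> date:
        """End of the 2-year screening interval that follows the screen."""
        target_year = self.screen_date.year + 2
        try:
            return self.screen_date.replace(year=target_year)
        except ValueError:
            # 29 February has no counterpart two years later; fall back to the 28th.
            return self.screen_date.replace(year=target_year, day=28)

episode = ScreeningEpisode("W0001", date(2016, 2, 29))
print(episode.interval_end())  # 2018-02-28
```

Grouping such records by `woman_id` would give the longitudinal view of repeated screening rounds.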

Confidentiality

All image and non-image data are de-identified, and a unique identifier is used for the ADMANI datasets.

Re-identification

It is not possible to identify individuals as all image and non-image data are de-identified and a unique identifier is used.

Sensitive data

Patient data collected include age, country of birth, risk category, symptoms (e.g., lump, nipple discharge), and use of hormone replacement therapy. Radiologist reading data include lesion side and grade. Histopathologic data contain surgical specimen results, including lesion subtype. All data are de-identified.