OPTIMAM mammography subset for AI breast cancer risk prediction (Ellis et al., 2024)
2025-11-26
https://doi.org/10.1148/atlas.1764158185224
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/dataset.json
Name
OPTIMAM mammography subset for AI breast cancer risk prediction (Ellis et al., 2024)
Link
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11294956/
Indexing
Keywords: Breast cancer risk prediction, Screening mammography, Interval cancer, Screen-detected cancer, Mammographic density, UK NHS Breast Screening Programme, OPTIMAM
Content: BR
RadLex: RID10357
SNOMED: 254837009
Author(s)
Sam Ellis
Sandra Gomes
Matthew Trumble
Mark D. Halling-Brown
Kenneth C. Young
Nouman S. Chaudhry
Peter Harris
Lucy M. Warren
Organization(s)
Royal Surrey NHS Foundation Trust
University of Surrey
Contact
Sam Ellis (corresponding author): sam.ellis2@nhs.net
Funding
Creation and maintenance of the OPTIMAM Image Database were funded by Cancer Research UK (C30682/A28396). S.E. was supported by the Million Women Study (C16077/A29186).
Ethical review
Data were collected with approval from an ethical research committee specializing in research databases organized by the NHS Health Research Authority.
Comments
Retrospective study using a curated subset of the UK OPTIMAM Mammography Image Database to train and evaluate a deep learning model predicting 3-year breast cancer risk from negative screening mammograms.
Date
Published: 2024-05-22
References
[1] Ellis S, Gomes S, Trumble M, et al. "Deep Learning for Breast Cancer Risk Prediction: Application to a Large Representative UK Screening Cohort". Radiology: Artificial Intelligence. 2024-07-01. doi:10.1148/ryai.230431. PMID: 38775671. PMCID: PMC11294956.
[2] Halling-Brown MD, Warren LM, Ward D, et al. "OPTIMAM mammography image database: a large-scale resource of mammography images and clinical data". Radiology: Artificial Intelligence. 2020-01-01. PMID: 33937853. PMCID: PMC8082293. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8082293/
Dataset
Motivation
Develop and evaluate a UK-specific AI model to predict 3-year future breast cancer risk from negative screening mammograms.
Sampling
The eligible cohort comprised 5264 risk-positive and 191,488 risk-negative women. Random sampling was applied in two places: risk-negative women in the training split were randomly reduced by 50% to increase cancer prevalence, and where a risk-negative woman had multiple eligible episodes, one episode was selected at random.
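The two sampling steps described above can be sketched roughly as follows. This is a minimal illustration, not the study's code: the function names, the `(woman_id, episode_id)` tuple layout, and the fixed seeds are all assumptions introduced here.

```python
import random
from collections import defaultdict


def one_episode_per_woman(episodes, seed=0):
    """Pick one episode at random for each woman.

    `episodes` is an iterable of (woman_id, episode_id) pairs
    (a hypothetical layout, not the OPTIMAM schema).
    Returns a dict mapping woman_id -> chosen episode_id.
    """
    rng = random.Random(seed)
    by_woman = defaultdict(list)
    for woman_id, episode_id in episodes:
        by_woman[woman_id].append(episode_id)
    # Sort for reproducibility before drawing.
    return {w: rng.choice(sorted(eps)) for w, eps in sorted(by_woman.items())}


def halve_negatives(negative_ids, seed=0):
    """Randomly keep half (rounded down) of the risk-negative IDs."""
    rng = random.Random(seed)
    ids = sorted(negative_ids)
    return set(rng.sample(ids, len(ids) // 2))


def prevalence(n_pos, n_neg):
    """Fraction of risk-positive women in a group."""
    return n_pos / (n_pos + n_neg)
```

Comparing `prevalence(n_pos, n_neg)` with `prevalence(n_pos, n_neg // 2)` for a given split shows how halving the negatives raises the prevalence seen during training.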
Partitioning scheme
Stratified 60:20:20 split into training, validation, and test sets, preserving cancer prevalence in each. The validation set was balanced by cancer outcome at the patient level. The test set was left unmodified to reflect OPTIMAM demographics.
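A patient-level stratified split of this kind could look like the sketch below. It is illustrative only: the `labels` dict layout and function names are hypothetical, and balancing the validation set by downsampling the majority outcome is an assumption, since the record does not say how the balancing was done.

```python
import random


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split patient IDs into (train, val, test), stratified by outcome.

    `labels` maps patient_id -> 0 (risk-negative) or 1 (risk-positive).
    Each outcome group is shuffled and cut separately so that every
    split keeps roughly the same cancer prevalence.
    """
    rng = random.Random(seed)
    train, val, test = [], [], []
    for outcome in (0, 1):
        ids = sorted(p for p, y in labels.items() if y == outcome)
        rng.shuffle(ids)
        n_train = round(fractions[0] * len(ids))
        n_val = round(fractions[1] * len(ids))
        train.extend(ids[:n_train])
        val.extend(ids[n_train:n_train + n_val])
        test.extend(ids[n_train + n_val:])  # remainder goes to test
    return train, val, test


def balance_by_outcome(patient_ids, labels, seed=0):
    """Downsample the larger outcome group to match the smaller one.

    Assumption: balancing is done by random downsampling of patients.
    """
    rng = random.Random(seed)
    pos = sorted(p for p in patient_ids if labels[p] == 1)
    neg = sorted(p for p in patient_ids if labels[p] == 0)
    n = min(len(pos), len(neg))
    return rng.sample(pos, n) + rng.sample(neg, n)
```

In this sketch only the validation list would be passed through `balance_by_outcome`; the test list is used as returned, matching the description of an unmodified test set.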
Missing information
Only images from Hologic mammography systems were included; other manufacturers were underrepresented and their images were excluded. Examinations with confirmed cancer and images containing implants were excluded. Screening episodes lacking follow-up were also excluded.
Relationships between instances
One screening episode per woman; two views per breast (CC and MLO). In the training split, images of healthy contralateral breasts of risk-positive patients were removed.
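The training-split rule for contralateral breasts can be illustrated with a small filter. This is a sketch under stated assumptions: the `Image` record, its field names, and the `cancer_side` mapping are hypothetical, and each risk-positive patient is assumed to have a single affected side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    patient_id: str
    laterality: str  # "L" or "R"
    view: str        # "CC" or "MLO"


def drop_healthy_contralateral(images, cancer_side):
    """Remove images of the unaffected breast for risk-positive patients.

    `cancer_side` maps patient_id -> "L" or "R" for risk-positive
    patients only (assumed unilateral); risk-negative patients are
    absent from the mapping and keep all of their images.
    """
    kept = []
    for img in images:
        side = cancer_side.get(img.patient_id)
        if side is None or img.laterality == side:
            kept.append(img)
    return kept
```

With two views per breast, a risk-positive patient would contribute two training images after filtering and a risk-negative patient would still contribute four.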
External data
Data derived from the OPTIMAM Mammography Image Database collected at multiple UK NHSBSP sites.