Digital Breast Tomosynthesis (DBT) dataset
dataset | 2025-11-29 | https://doi.org/10.1148/atlas.1764445190500

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

Digital Breast Tomosynthesis (DBT) dataset

Link

https://pubmed.ncbi.nlm.nih.gov/38568095/

Indexing

Keywords: digital breast tomosynthesis, DBT, breast cancer, artificial intelligence, reader study, Hologic, GE HealthCare, sensitivity, specificity, reading time
Content: BR, OI
RadLex: RID10359, RID45682, RID49440
SNOMED: 254837009, 82711006, 1162814007

Author(s)

Eun Kyung Park
SooYoung Kwak
Weonsuk Lee
Joon Suk Choi
Thijs Kooi
Eun-Kyung Kim

Organization(s)

Lunit
Department of Radiology, Yongin Severance Hospital, College of Medicine, Yonsei University

License

Text: © 2024 by the Radiological Society of North America, Inc.
URL: https://pubs.rsna.org/doi/10.1148/ryai.230318

Funding

Supported by funds secured by Lunit.

Ethical review

Retrospective study approved by ethics review and central IRB; informed consent waived. Mammography examinations were de-identified according to HIPAA Safe Harbor.

Comments

Retrospective multi-institutional DBT dataset used to develop and validate an AI algorithm and to run a multi-reader multi-case study. Data de-identified per HIPAA Safe Harbor.

Date

Published: 2024-04-03

References

[1] Park EK, Kwak S, Lee W, Choi JS, Kooi T, Kim E-K. "Impact of AI for Digital Breast Tomosynthesis on Breast Cancer Detection and Interpretation Time". Radiology: Artificial Intelligence. 2024-05-01. doi:10.1148/ryai.230318. PMID: 38568095. PMCID: PMC11140510.

Dataset

Motivation

Develop and validate a deep learning AI algorithm for breast cancer detection on DBT and assess its impact on radiologist accuracy and interpretation time.

Sampling

Inclusion: female, ≥22 years, devices from Hologic or GE HealthCare, four-view screening or diagnostic DBT with full-field DM or synthetic 2D image
Exclusion: prior breast cancer, prior surgery or vacuum-assisted biopsy, implants or pacemakers on required images, inadequate image quality per MQSA
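
The inclusion and exclusion criteria above can be read as a single eligibility predicate. The following is a minimal sketch under stated assumptions, not the authors' selection pipeline: the `ExamRecord` class, its field names, and the `is_eligible` function are hypothetical, and the view labels are taken from the four views listed under "Relationships between instances".

```python
from dataclasses import dataclass

# Vendors and views named in this record; everything else here is illustrative.
ALLOWED_VENDORS = {"Hologic", "GE HealthCare"}
REQUIRED_VIEWS = {"R-CC", "R-MLO", "L-CC", "L-MLO"}


@dataclass
class ExamRecord:
    """Hypothetical per-examination metadata; field names are assumptions."""
    sex: str                      # "F" for female
    age_years: int
    vendor: str
    views: frozenset              # view labels present in the examination
    has_2d_image: bool            # full-field DM or synthetic 2D image available
    prior_breast_cancer: bool = False
    prior_surgery_or_vab: bool = False      # surgery or vacuum-assisted biopsy
    implant_or_pacemaker: bool = False      # visible on required images
    adequate_quality: bool = True           # image quality adequate per MQSA


def is_eligible(exam: ExamRecord) -> bool:
    """Return True when all inclusion criteria hold and no exclusion applies."""
    included = (
        exam.sex == "F"
        and exam.age_years >= 22
        and exam.vendor in ALLOWED_VENDORS
        and REQUIRED_VIEWS <= exam.views
        and exam.has_2d_image
    )
    excluded = (
        exam.prior_breast_cancer
        or exam.prior_surgery_or_vab
        or exam.implant_or_pacemaker
        or not exam.adequate_quality
    )
    return included and not excluded
```

Keeping inclusion and exclusion as separate boolean expressions mirrors the way the criteria are stated, which makes it easier to audit each clause against the text.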

Partitioning scheme

Development dataset split into training, tuning, test, and external test. Separate stand-alone validation set (n=2202). Separate cancer-enriched reader study set (n=258).

Missing information

Per-partition exact counts for training/tuning/test/external test splits not reported; image file formats and pixel resolutions not specified.

Relationships between instances

Each examination includes four views (R-CC, R-MLO, L-CC, L-MLO); lesion annotations include 3D location and feature type (soft tissue, calcification, or both).
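
One way to picture this structure is as an examination record that holds lesion annotations keyed to the four views. The sketch below is an assumption-laden illustration, not the dataset's actual storage format: the class names, the `(x, y, slice_index)` coordinate convention, and the attachment of each annotation to a single view are all hypothetical, since the record does not specify file formats or annotation encoding.

```python
from dataclasses import dataclass, field
from enum import Enum

# The four views named in this record.
VIEWS = ("R-CC", "R-MLO", "L-CC", "L-MLO")


class FeatureType(Enum):
    """Lesion feature types listed in this record."""
    SOFT_TISSUE = "soft tissue"
    CALCIFICATION = "calcification"
    BOTH = "both"


@dataclass(frozen=True)
class LesionAnnotation:
    """Hypothetical annotation: a 3D location plus a feature type."""
    view: str
    x: float              # in-plane column position (assumed convention)
    y: float              # in-plane row position (assumed convention)
    slice_index: int      # tomosynthesis slice (assumed convention)
    feature_type: FeatureType

    def __post_init__(self):
        # Reject annotations that point at a view outside the four-view set.
        if self.view not in VIEWS:
            raise ValueError(f"unknown view: {self.view}")


@dataclass
class Examination:
    """Hypothetical container for one four-view examination."""
    exam_id: str
    lesions: list = field(default_factory=list)

    def lesions_by_view(self) -> dict:
        """Group annotations by view; views without lesions map to []."""
        grouped = {view: [] for view in VIEWS}
        for lesion in self.lesions:
            grouped[lesion.view].append(lesion)
        return grouped
```

For example, `Examination("exam-001", [LesionAnnotation("L-CC", 10.0, 20.0, 5, FeatureType.BOTH)]).lesions_by_view()` returns a dictionary with one annotation under `"L-CC"` and empty lists for the other three views.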

External data

Development data were drawn from five data sources in the United States (2010–2021) and South Korea (2012–2018).

Confidentiality

De-identified according to HIPAA Safe Harbor standard.

Re-identification

Low risk due to de-identification per HIPAA Safe Harbor.

Sensitive data

Medical imaging data of breast examinations.