In-house digital breast tomosynthesis (DBT) dataset
dataset · 2026-01-24 · https://doi.org/10.1148/atlas.1769269901085

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

In-house digital breast tomosynthesis (DBT) dataset

Link

https://pmc.ncbi.nlm.nih.gov/articles/PMC10245183

Indexing

Keywords: digital breast tomosynthesis, DBT, breast cancer, Hologic, multi-institutional, United States, transformer, TimeSformer, Conv3D, de-identified, four-view studies
Content: BR
RadLex: RID10359, RID35976, RID10357
SNOMED: 254837009

Author(s)

Weonsuk Lee
Hyeonsoo Lee
Hyunjae Lee
Eun Kyung Park
Hyeonseob Nam
Thijs Kooi

Organization(s)

Lunit

Funding

Supported by funds secured by Lunit.

Ethical review

Data were retrospectively collected in compliance with HIPAA, and all patient data were de-identified using the Safe Harbor method. Institutional review board approval was not required. The authors did not have access to a link allowing re-identification.

Comments

The source article describes the dataset as an in-house, multi-institutional collection of four-view Hologic DBT studies used to train, validate, and test breast cancer classification models.

Date

Published: 2023-05-10

References

[1] Lee W, Lee H, Lee H, Park EK, Nam H, Kooi T. "Transformer-based Deep Neural Network for Breast Cancer Classification on Digital Breast Tomosynthesis Images". Radiology: Artificial Intelligence. 2023-05-01. doi:10.1148/ryai.220159. PMID: 37293346. PMCID: PMC10245183.

Dataset

Motivation

To develop and evaluate a transformer-based model that incorporates neighboring DBT sections for breast cancer detection.

Sampling

The dataset comprises 6829 four-view Hologic DBT studies from nine U.S. institutions. Cancer cases were confirmed by biopsy, benign cases by biopsy or at least 1 year of follow-up, and normal cases by at least 1 year of follow-up.
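The confirmation criteria above can be read as a simple rule per label. The following is a minimal sketch in Python, not the authors' pipeline: the function name `is_confirmed` and the fields `biopsy_proven` and `followup_years` are hypothetical and do not come from the dataset record.

```python
def is_confirmed(label: str, biopsy_proven: bool, followup_years: float) -> bool:
    """Check whether a study label meets the stated confirmation criterion.

    label: one of "cancer", "benign", "normal"
    biopsy_proven: True if a biopsy result supports the label (hypothetical field)
    followup_years: length of follow-up in years (hypothetical field)
    """
    if label == "cancer":
        # Cancer is confirmed by biopsy only.
        return biopsy_proven
    if label == "benign":
        # Benign is confirmed by biopsy or by at least 1 year of follow-up.
        return biopsy_proven or followup_years >= 1.0
    if label == "normal":
        # Normal is confirmed by at least 1 year of follow-up.
        return followup_years >= 1.0
    raise ValueError(f"unknown label: {label!r}")


# Example: a benign study without biopsy but with 14 months of follow-up.
print(is_confirmed("benign", biopsy_proven=False, followup_years=14 / 12))
```

Under these assumptions, a cancer label without biopsy is never treated as confirmed, regardless of follow-up length.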

Partitioning scheme

The studies were split into training (5174 studies), validation (1000 studies), and test (655 studies from one institution) sets.
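The partition sizes can be checked against the total of 6829 studies reported under Sampling. The following is a minimal sketch in Python; the names `split_sizes`, `TOTAL_STUDIES`, and `split_fractions` are illustrative and not part of the dataset record.

```python
# Study counts per partition, as stated in this record.
split_sizes = {"training": 5174, "validation": 1000, "test": 655}

# Total number of four-view studies reported under Sampling.
TOTAL_STUDIES = 6829


def split_fractions(sizes):
    """Return each partition's share of the total study count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# The three partitions account for every study in the collection.
assert sum(split_sizes.values()) == TOTAL_STUDIES

for name, share in split_fractions(split_sizes).items():
    print(f"{name}: {share:.1%}")
```

The shares work out to about 75.8% training, 14.6% validation, and 9.6% test.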

Missing information

Public access URL, exact participating institutions, detailed demographics, and full imaging acquisition parameters (beyond vendor and typical pixel spacing) are not provided.

Relationships between instances

Each study is a four-view DBT exam. For cancer and biopsy-proven benign cases, one mammogram per patient was included. Cancer studies were annotated at the section with the largest lesion cross-section. Labels include cancer, benign, and normal.

External data

Studies were collected from nine institutions in the United States through an external entity.

Confidentiality

De-identified per HIPAA Safe Harbor; no link for re-identification is available.

Re-identification

Authors state they do not have access to a link allowing re-identification of data.

Sensitive data

Protected health information was removed in accordance with the HIPAA Safe Harbor method.