Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation
model | 2025-11-30 | https://doi.org/10.1148/atlas.1764532270797

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation

Link

https://dx.doi.org/10.1148/ryai.220231

Indexing

Keywords: Brain Tumor Segmentation, Deep Learning Algorithms, Glioblastoma, Cancer, Machine Learning
Content: MR, NR, OI
RadLex: RID10312, RID4044, RID12775, RID35806
SNOMED: 126952004, 1163375002

Author(s)

Katharina V. Hoebel
Christopher P. Bridge
Sara Ahmed
Oluwatosin Akintola
Caroline Chung
Raymond Y. Huang
Jason M. Johnson
Albert Kim
K. Ina Ly
Ken Chang
Jay Patel
Marco Pinho
Tracy T. Batchelor
Bruce R. Rosen
Elizabeth R. Gerstner
Jayashree Kalpathy-Cramer

Organization(s)

Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital
Stephen E. and Catherine Pappas Center for Neuro-Oncology, Massachusetts General Hospital
Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology
MGH and BWH Center for Clinical Data Science
The University of Texas MD Anderson Cancer Center (Departments of Radiation Oncology, Diagnostic Radiology, Neuroradiology)
Brigham and Women’s Hospital (Departments of Radiology, Neurology)
University of Texas Southwestern Medical Center (Radiology and Advanced Imaging Research Center)
University of Colorado Anschutz Medical Campus (Ophthalmology)

Version

1.0

License

Text: © 2023 by the Radiological Society of North America, Inc.

Contact

Jayashree Kalpathy-Cramer (corresponding author), email: jayashree.kalpathy-cramer@cuanschutz.edu

Funding

NIH U01CA242879 (K.V.H., J.K.C.); NIH R01CA129371 and K23CA169021 (E.R.G.); P41EB015896 (Athinoula A. Martinos Center resources). Additional disclosures and institutional and grant support are listed in the article.

Ethical review

Institutional review board approval with waiver of written consent (secondary analysis of imaging data from two clinical trials; ClinicalTrials.gov identifiers NCT00756106 and NCT00662506).

Date

Updated: 2023-11-22
Published: 2023-11-22
Created: 2022-10-31

References

[1] Hoebel KV, Bridge CP, Ahmed S, et al. "Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation". Radiology: Artificial Intelligence. 2024 Jan;6(1):e220231. Published 2023-11-22. doi:10.1148/ryai.220231. PMID: 38197800. PMCID: PMC10831514.

Model

Architecture

Three-dimensional U-Net with Monte Carlo dropout (dropout probability 0.2 after each convolutional layer), trained on 64×64×16 voxel patches with a weighted cross-entropy loss. Ten Monte Carlo dropout samples are combined into the final segmentation with STAPLE (simultaneous truth and performance level estimation).
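
To make the aggregation step concrete, the sketch below implements a basic binary STAPLE expectation-maximization loop in NumPy. It is an illustration only, not the DeepNeuro implementation: the function name `staple_binary`, the initial sensitivity and specificity of 0.99, the global foreground prior, the 0.5 threshold, and the toy input (ten deterministic noisy copies of a reference mask standing in for Monte Carlo dropout samples) are all assumptions made for the example.

```python
import numpy as np

def staple_binary(samples, max_iter=50, tol=1e-6):
    """Binary STAPLE via EM (illustrative sketch, not the authors' code).

    samples: array of shape (n_samples, ...) with values in {0, 1}.
    Returns (w, p, q): per-voxel foreground probability, and the estimated
    sensitivity and specificity of each sample.
    """
    shape = np.shape(samples)
    d = np.asarray(samples, dtype=float).reshape(shape[0], -1)
    n = d.shape[0]
    eps = 1e-12
    prior = d.mean()          # assumed global foreground prior
    p = np.full(n, 0.99)      # assumed initial sensitivities
    q = np.full(n, 0.99)      # assumed initial specificities
    w = d.mean(axis=0)        # start from the voxelwise vote fraction
    for _ in range(max_iter):
        # E-step: posterior probability that each voxel is foreground.
        a = prior * np.prod(np.where(d == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(d == 1, 1 - q[:, None], q[:, None]), axis=0)
        w_new = a / (a + b + eps)
        # M-step: re-estimate each sample's sensitivity and specificity.
        p = np.clip((d * w_new).sum(axis=1) / (w_new.sum() + eps), eps, 1 - eps)
        q = np.clip(((1 - d) * (1 - w_new)).sum(axis=1) / ((1 - w_new).sum() + eps),
                    eps, 1 - eps)
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    return w.reshape(shape[1:]), p, q

# Toy data: a reference mask and 10 copies, each wrong on a different tenth of the voxels.
truth = np.zeros((8, 8, 4), dtype=np.uint8)
truth[2:6, 2:6, 1:3] = 1
flat_index = np.arange(truth.size).reshape(truth.shape)
samples = np.stack([np.where(flat_index % 10 == i, 1 - truth, truth) for i in range(10)])

prob, sensitivity, specificity = staple_binary(samples)
consensus = (prob >= 0.5).astype(np.uint8)  # assumed 0.5 threshold
```

Unlike a plain majority vote, STAPLE weights each sample by its estimated sensitivity and specificity. With many more samples, summing log-likelihoods instead of taking products would be the numerically safer choice.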

Availability

Model implemented in the open-source DeepNeuro framework; quantitative metrics computed with the pymia Python package. Data available from corresponding author upon request.

Clinical benefit

Automated segmentation of postoperative glioblastoma T2-FLAIR abnormality to support expert review and downstream tasks such as treatment planning and response assessment.

Clinical workflow phase

Clinical decision support systems; workflow optimization for segmentation review and editing.

Degree of automation

Assists decision making by generating automatic segmentations for expert review; not intended for fully autonomous clinical use.

Indications for use

Segmentation of areas of T2-weighted FLAIR abnormality corresponding to total tumor burden in postoperative glioblastoma patients; intended for use by specialists in neuro-oncology, neuroradiology, and radiation oncology within clinical/research environments.

Input

Three-channel MRI: T1-weighted precontrast, T1-weighted postcontrast, and T2-weighted FLAIR sequences (registered to FLAIR) from postoperative glioblastoma patients.

Instructions

Register T1 pre- and postcontrast to T2-FLAIR; perform brain extraction, N4 bias correction, and z-score normalization of the brain region. Apply the Monte Carlo dropout 3D U-Net with dropout p=0.2 on 64×64×16 patches; aggregate 10 stochastic predictions using STAPLE to obtain the final segmentation.
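
The intensity normalization and patch handling in these instructions can be sketched in NumPy as follows, assuming registration, brain extraction, and N4 bias correction have already been performed with external tools. The function names, the channel-first layout, setting voxels outside the brain mask to zero, and treating the last axis as the 16-voxel direction are assumptions for illustration and are not confirmed by the source.

```python
import numpy as np

PATCH_SHAPE = (64, 64, 16)  # patch size stated in the model description

def zscore_in_mask(volume, brain_mask):
    """Z-score a volume using the mean and std of brain voxels only."""
    vol = np.asarray(volume, dtype=np.float32)
    mask = np.asarray(brain_mask, dtype=bool)
    brain = vol[mask]
    std = max(float(brain.std()), 1e-8)  # guard against constant input
    out = np.zeros_like(vol)             # assumption: background set to 0
    out[mask] = (brain - brain.mean()) / std
    return out

def stack_channels(t1_pre, t1_post, flair, brain_mask):
    """Normalize each co-registered sequence and stack as (3, X, Y, Z)."""
    return np.stack([zscore_in_mask(v, brain_mask) for v in (t1_pre, t1_post, flair)])

def extract_patch(channels, corner, patch_shape=PATCH_SHAPE):
    """Crop one patch whose lowest-index corner is `corner` = (x, y, z)."""
    x, y, z = corner
    dx, dy, dz = patch_shape
    patch = channels[:, x:x + dx, y:y + dy, z:z + dz]
    if patch.shape[1:] != tuple(patch_shape):
        raise ValueError("patch extends beyond the volume")
    return patch

# Synthetic stand-ins for three registered sequences and a brain mask.
rng = np.random.default_rng(0)
shape = (96, 96, 24)
t1_pre, t1_post, flair = (rng.normal(100.0, 20.0, shape) for _ in range(3))
brain_mask = np.zeros(shape, dtype=bool)
brain_mask[8:88, 8:88, 2:22] = True

channels = stack_channels(t1_pre, t1_post, flair, brain_mask)
patch = extract_patch(channels, corner=(16, 16, 4))
```

Computing the statistics inside the brain mask keeps the large zero-valued background from skewing the mean and standard deviation of each channel.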

Limitations

Study focused on postoperative glioblastoma on T2-FLAIR; findings may not generalize to preoperative tumors or other pathologies. Expert rating used only T2-FLAIR images (no additional sequences during rating). Data were derived from two clinical trials, with a single manual ground-truth segmentation per case for training and testing and limited external validation. Interrater variability in perceived segmentation quality was high, and existing metrics (e.g., Dice) correlate poorly with expert perception.

Output

CDEs: RDE1278, RDE1281
Description: Binary segmentation mask of T2-FLAIR abnormality (total tumor burden) for postoperative glioblastoma.

Recommendation

Performance evaluation should include expert-centered assessment and consider metrics aligning with clinical perception (e.g., HD95, surface Dice) rather than relying solely on Dice score.
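
As a rough illustration of why a boundary-based metric can tell a different story than Dice, the sketch below computes both on toy 2D masks with NumPy only. It does not use the pymia API; the function names, the face-connectivity surface definition, the brute-force distance computation, and the choice to take the larger of the two directed 95th percentiles (HD95 definitions vary between packages) are assumptions of this example.

```python
import numpy as np

def dice(a, b):
    """Dice overlap of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

def surface_points(mask):
    """Coordinates of foreground voxels with a face-adjacent background voxel."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (1, -1):
            interior &= np.roll(padded, shift, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    return np.argwhere(mask & ~interior[core])

def hd95(a, b, spacing=1.0):
    """95th percentile Hausdorff distance between mask surfaces (brute force)."""
    pa = surface_points(a) * np.asarray(spacing, dtype=float)
    pb = surface_points(b) * np.asarray(spacing, dtype=float)
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("both masks must contain foreground")
    dist = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    # Assumed convention: larger of the two directed 95th percentiles.
    return max(np.percentile(dist.min(axis=1), 95), np.percentile(dist.min(axis=0), 95))

# Toy reference and two hypothetical predictions.
ref = np.zeros((50, 50), dtype=bool)
ref[10:30, 10:30] = True
shifted = np.zeros_like(ref)
shifted[12:32, 10:30] = True   # whole mask displaced by 2 pixels
spur = ref.copy()
spur[18:22, 30:40] = True      # correct mask plus a thin 10-pixel protrusion

for name, pred in (("shifted", shifted), ("spur", spur)):
    print(f"{name}: Dice={dice(ref, pred):.3f}, HD95={hd95(ref, pred):.1f}")
```

In this toy case the prediction with the protrusion scores the higher Dice but the larger HD95, so ranking segmentations by Dice alone would hide the boundary error. The pairwise distance matrix grows with the product of the two surface sizes, so full 3D volumes would call for a distance-transform-based implementation instead.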

Reproducibility

Processing and evaluation pipelines described (registration, preprocessing, model configuration); implementation using DeepNeuro and pymia. Data available upon request from corresponding author.

Use

Intended: Image segmentation
Out-of-scope: Image segmentation
Excluded: Decision support

User

Intended: Physician, Radiologist, Subspecialist diagnostic radiologist
Excluded: Layperson