Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation
2025-11-30
https://doi.org/10.1148/atlas.1764532270797
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/model.json
Name
Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation
Link
https://dx.doi.org/10.1148/ryai.220231
Indexing
Keywords: Brain Tumor Segmentation, Deep Learning Algorithms, Glioblastoma, Cancer, Machine Learning
Content: MR, NR, OI
RadLex: RID10312, RID4044, RID12775, RID35806
SNOMED: 126952004, 1163375002
Author(s)
Katharina V. Hoebel
Christopher P. Bridge
Sara Ahmed
Oluwatosin Akintola
Caroline Chung
Raymond Y. Huang
Jason M. Johnson
Albert Kim
K. Ina Ly
Ken Chang
Jay Patel
Marco Pinho
Tracy T. Batchelor
Bruce R. Rosen
Elizabeth R. Gerstner
Jayashree Kalpathy-Cramer
Organization(s)
Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital
Stephen E. and Catherine Pappas Center for Neuro-Oncology, Massachusetts General Hospital
Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology
MGH and BWH Center for Clinical Data Science
The University of Texas MD Anderson Cancer Center (Departments of Radiation Oncology, Diagnostic Radiology, Neuroradiology)
Brigham and Women’s Hospital (Departments of Radiology, Neurology)
University of Texas Southwestern Medical Center (Radiology and Advanced Imaging Research Center)
University of Colorado Anschutz Medical Campus (Ophthalmology)
Version
1.0
License
Text: © 2023 by the Radiological Society of North America, Inc.
Contact
Jayashree Kalpathy-Cramer (corresponding author), email: jayashree.kalpathy-cramer@cuanschutz.edu
Funding
NIH U01CA242879 (K.V.H., J.K.C.); NIH R01CA129371 and K23CA169021 (E.R.G.); P41EB015896 (Athinoula A. Martinos Center resources). Additional disclosures and institutional and grant support are listed in the article.
Ethical review
Institutional review board approval with waiver of written consent (secondary analysis of imaging data from two clinical trials; ClinicalTrials.gov identifiers NCT00756106 and NCT00662506).
Date
Updated: 2023-11-22
Published: 2023-11-22
Created: 2022-10-31
References
[1] Hoebel KV, Bridge CP, Ahmed S, et al. "Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation". Radiology: Artificial Intelligence. 2024 Jan;6(1):e220231. Published 2023-11-22. doi:10.1148/ryai.220231. PMID: 38197800. PMCID: PMC10831514.
Model
Architecture
Monte Carlo dropout three-dimensional U-Net trained on 64×64×16 voxel patches with dropout probability 0.2 after each convolutional layer; weighted cross-entropy loss; STAPLE used to combine 10 Monte Carlo dropout samples into final segmentation.
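The aggregation step can be illustrated with a simplified binary STAPLE (expectation-maximization) routine. This is a minimal sketch, not the authors' implementation: the function name `staple_binary`, the initial sensitivity and specificity of 0.9, the global foreground prior, and the 0.5 decision threshold are all assumptions made for illustration. It takes a stack of binary masks, such as 10 Monte Carlo dropout samples, and returns a consensus mask together with the per-voxel foreground probability.

```python
import numpy as np

def staple_binary(samples, n_iter=50, tol=1e-6):
    """Simplified binary STAPLE over a stack of masks (illustrative sketch).

    samples: array of shape (n_samples, ...) with values in {0, 1}.
    Returns (consensus_mask, foreground_probability).
    """
    d = np.asarray(samples, dtype=float)
    n = d.shape[0]
    flat = d.reshape(n, -1)            # (samples, voxels)
    prior = flat.mean()                # assumed global foreground prior
    p = np.full(n, 0.9)                # assumed initial sensitivities
    q = np.full(n, 0.9)                # assumed initial specificities
    eps = 1e-12                        # guards log(0) and division by zero
    w = flat.mean(axis=0)              # start from the voxel-wise vote fraction
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly foreground.
        log_fg = np.log(prior + eps) + (
            flat * np.log(p[:, None] + eps)
            + (1 - flat) * np.log(1 - p[:, None] + eps)
        ).sum(axis=0)
        log_bg = np.log(1 - prior + eps) + (
            (1 - flat) * np.log(q[:, None] + eps)
            + flat * np.log(1 - q[:, None] + eps)
        ).sum(axis=0)
        w_new = 1.0 / (1.0 + np.exp(np.clip(log_bg - log_fg, -50, 50)))
        # M-step: re-estimate each sample's sensitivity and specificity.
        p = (w_new * flat).sum(axis=1) / (w_new.sum() + eps)
        q = ((1 - w_new) * (1 - flat)).sum(axis=1) / ((1 - w_new).sum() + eps)
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    prob = w.reshape(d.shape[1:])
    return prob >= 0.5, prob
```

Calling `staple_binary(stack)` on an array of shape `(10, X, Y, Z)` yields a boolean mask of shape `(X, Y, Z)`. Unlike a plain majority vote, the estimate weights each sample by its inferred sensitivity and specificity.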
Availability
Model implemented in the open-source DeepNeuro framework; quantitative metrics computed with the pymia Python package. Data available from corresponding author upon request.
Clinical benefit
Automated segmentation of postoperative glioblastoma T2-FLAIR abnormality to support expert review and downstream tasks such as treatment planning and response assessment.
Clinical workflow phase
Clinical decision support systems; workflow optimization for segmentation review and editing.
Degree of automation
Assists decision making by generating automatic segmentations for expert review; not intended for fully autonomous clinical use.
Indications for use
Segmentation of areas of T2-weighted FLAIR abnormality corresponding to total tumor burden in postoperative glioblastoma patients; intended for use by specialists in neuro-oncology, neuroradiology, and radiation oncology within clinical/research environments.
Input
Three-channel MRI: T1-weighted precontrast, T1-weighted postcontrast, and T2-weighted FLAIR sequences (registered to FLAIR) from postoperative glioblastoma patients.
Instructions
Register T1 pre- and postcontrast to T2-FLAIR; perform brain extraction, N4 bias correction, and z-score normalization of the brain region. Apply the Monte Carlo dropout 3D U-Net with dropout p=0.2 on 64×64×16 patches; aggregate 10 stochastic predictions using STAPLE to obtain the final segmentation.
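The z-score normalization step can be sketched as follows, assuming registration, brain extraction, and N4 bias correction have already been applied upstream. The helper names `zscore_in_brain` and `stack_channels`, the per-channel normalization, and the choice to set voxels outside the brain mask to zero are assumptions for illustration; the record does not specify them.

```python
import numpy as np

def zscore_in_brain(volume, brain_mask):
    """Scale one MRI channel to zero mean and unit variance inside the brain mask."""
    volume = np.asarray(volume, dtype=np.float64)
    mask = np.asarray(brain_mask, dtype=bool)
    voxels = volume[mask]
    if voxels.size == 0:
        raise ValueError("brain mask is empty")
    std = voxels.std()
    if std == 0:
        raise ValueError("constant intensity inside brain mask")
    out = np.zeros_like(volume)        # assumption: background is set to zero
    out[mask] = (voxels - voxels.mean()) / std
    return out

def stack_channels(t1_pre, t1_post, flair, brain_mask):
    """Build the three-channel input: T1 precontrast, T1 postcontrast, T2-FLAIR."""
    channels = (t1_pre, t1_post, flair)
    return np.stack([zscore_in_brain(c, brain_mask) for c in channels], axis=0)
```

The result has shape `(3, X, Y, Z)`, from which 64×64×16 patches would then be drawn for the network.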
Limitations
The study focused on postoperative glioblastoma on T2-FLAIR; findings may not generalize to preoperative tumors or other pathologies. Expert raters saw only T2-FLAIR images, with no additional sequences available during rating. Data were derived from two clinical trials, with a single manual ground-truth segmentation per case for training and testing and limited external validation. Interrater variability in perceived segmentation quality was high, and existing metrics (e.g., Dice) correlate poorly with expert perception.
Output
CDEs: RDE1278, RDE1281
Description: Binary segmentation mask of T2-FLAIR abnormality (total tumor burden) for postoperative glioblastoma.
Recommendation
Performance evaluation should include expert-centered assessment and consider metrics aligning with clinical perception (e.g., HD95, surface Dice) rather than relying solely on Dice score.
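To make the difference between an overlap metric and a boundary metric concrete, the sketch below computes the Dice score and a 95th-percentile Hausdorff distance (HD95) with NumPy alone. It is an illustration under stated assumptions, not the pymia implementation used in the study: the function names are hypothetical, surface voxels are defined by 6-connectivity, HD95 is taken as the larger of the two directed 95th percentiles (other definitions exist), and the brute-force pairwise distance matrix is only practical for small masks.

```python
import numpy as np

def dice(a, b):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

def surface_points(mask, spacing):
    """Coordinates of foreground voxels that touch background (6-connectivity)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (1, -1):
            # A voxel stays interior only if this neighbour is also foreground.
            interior &= np.roll(padded, shift, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    surface = mask & ~interior[core]
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Larger of the two directed 95th-percentile surface distances."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("both masks must contain foreground")
    dists = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1)         # each surface point of a to nearest of b
    b_to_a = dists.min(axis=0)         # each surface point of b to nearest of a
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))
```

Passing the voxel spacing in millimetres as `spacing` reports HD95 in millimetres; with the default unit spacing the distance is in voxels.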
Reproducibility
Processing and evaluation pipelines described (registration, preprocessing, model configuration); implementation using DeepNeuro and pymia. Data available upon request from corresponding author.
Use
Intended: Image segmentation
Out-of-scope: Image segmentation
Excluded: Decision support
User
Intended: Physician, Radiologist, Subspecialist diagnostic radiologist
Excluded: Layperson