Characterizing Renal Structures with 3D Block Aggregate Transformers
2025-11-22
https://doi.org/10.1148/atlas.1763846039343
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/model.json
Name
Characterizing Renal Structures with 3D Block Aggregate Transformers
Link
https://github.com/Project-MONAI/model-zoo/tree/dev/models/renalStructures_UNEST_segmentation
Indexing
Keywords: Renal Substructures, Computed Tomography, Transformer Model
Content: GU, CT, BQ, OI
Author(s)
Xin Yu
Yucheng Tang
Yinchi Zhou
Riqiang Gao
Qi Yang
Ho Hin Lee
Thomas Li
Shunxing Bao
Yuankai Huo
Zhoubing Xu
Thomas A. Lasko
Richard G. Abramson
Bennett A. Landman
Organization(s)
Vanderbilt University
Siemens Healthineers
Vanderbilt University Medical Center
Funding
NIH Common Fund and National Institute of Diabetes, Digestive and Kidney Diseases U54DK120058, NSF CAREER 1452485, NIH grants 2R01EB006136, 1R01EB017230 (Landman), R01NS09529. VICTR CTSA award (ULTR000445 from NCATS/NIH) and Vanderbilt University Medical Center institutional funding. PCORI (contract CDRN-1306-04869).
Ethical review
Study approved by institutional review board (IRB).
Comments
This paper proposes UNesT, a novel 3D block aggregation transformer for segmenting renal cortex, medulla, and collecting system on contrast-enhanced CT scans. The model addresses data inefficiency challenges in transformer models for medical image analysis by using a hierarchical U-shape design with 3D block aggregation. It achieves state-of-the-art performance for renal substructure segmentation and demonstrates strong correlation and reproducibility for volumetric analysis, with generalizability validated on the KiTS dataset.
Date
Published: 2022-03-04
Model
Architecture
UNesT is a 3D U-shaped medical segmentation model with nested transformers. It uses a hierarchical transformer as the encoder and a convolution-based decoder. 3D block aggregation provides local communication between sequence representations without modifying self-attention, which improves data efficiency for small structures and small datasets.
Clinical benefit
Provides accurate and efficient quantification of renal structures, facilitating biomarker discovery for kidney morphology. The model enables volumetric analysis with strong correlation and reproducibility, and improves the derivation of visual and quantitative results for radiologists.
Clinical workflow phase
Diagnosis
Degree of automation
Fully automated
Indications for use
Segmentation of renal cortex, medulla, and pelvicalyceal system on contrast-enhanced CT scans for characterizing renal structures and performing volumetric analysis.
Input
3D contrast-enhanced Computed Tomography (CT) image sub-volumes.
Instructions
CT intensities are windowed to [-175, 275] HU and scaled to [0.0, 1.0]. Training was performed on a single NVIDIA RTX 2080 11GB GPU using PyTorch and MONAI, with a batch size of 1 and an input image sub-volume size of 96 × 96 × 96. The AdamW optimizer is used with a warm-up cosine scheduler (500 warm-up steps). The learning rate is initialized to 0.001 and decays to 1e-5 over 50K iterations. No pre-training is performed.
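The intensity windowing and learning-rate schedule described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the released training code: the function names `window_and_scale` and `warmup_cosine_lr` are hypothetical, and the linear shape of the warm-up is an assumption, since the record only gives the warm-up length, the initial and final learning rates, and the iteration count.

```python
import math

# CT window and training hyperparameters taken from the Instructions field.
HU_MIN, HU_MAX = -175.0, 275.0
BASE_LR, FINAL_LR = 1e-3, 1e-5
WARMUP_STEPS, TOTAL_STEPS = 500, 50_000


def window_and_scale(hu: float) -> float:
    """Clip one Hounsfield value to the CT window and map it linearly to [0, 1]."""
    clipped = min(max(hu, HU_MIN), HU_MAX)
    return (clipped - HU_MIN) / (HU_MAX - HU_MIN)


def warmup_cosine_lr(step: int) -> float:
    """Learning rate at a given step (0-based).

    Assumption: linear warm-up to BASE_LR over WARMUP_STEPS, followed by
    cosine decay to FINAL_LR at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        return BASE_LR * (step + 1) / WARMUP_STEPS
    progress = min(1.0, (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS))
    return FINAL_LR + 0.5 * (BASE_LR - FINAL_LR) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Air, soft tissue, and dense bone, as illustrative HU values.
    for hu in (-1000.0, 50.0, 1500.0):
        print(hu, window_and_scale(hu))
    for step in (0, 499, 25_000, 50_000):
        print(step, warmup_cosine_lr(step))
```

In practice the same clipping would be applied to a whole sub-volume at once with an array library; the scalar form is used here only to keep the sketch dependency-free.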
Limitations
Transformer-based models are often data-inefficient, leading to suboptimal performance on small structures and datasets. Renal segmentation datasets can vary due to imaging protocols, patient morphology, and institutional differences. Potential improvements include pre-registration of kidney regions of interest to reduce shape/size variations and incorporating dose usage in the segmentation loop to better identify adjacent tissues.
Output
Description: Segmentation masks for renal cortex, medulla, and pelvicalyceal system, enabling volumetric analysis of these kidney components.
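As a rough illustration of how volumetric analysis could follow from these masks, the sketch below counts voxels per label and multiplies by the voxel volume. The label-to-structure mapping (`1` cortex, `2` medulla, `3` pelvicalyceal system) and the function name `structure_volumes_ml` are assumptions made for the example; the actual label indices should be checked against the model's documentation.

```python
from collections import Counter
from itertools import chain

# Hypothetical label mapping, assumed for illustration only.
LABEL_NAMES = {1: "cortex", 2: "medulla", 3: "pelvicalyceal system"}


def structure_volumes_ml(mask, spacing_mm, label_names=LABEL_NAMES):
    """Return the volume in millilitres of each labelled structure.

    mask       -- nested list indexed [z][y][x] of integer labels
    spacing_mm -- voxel spacing (z, y, x) in millimetres
    """
    # 1 mL = 1000 mm^3
    voxel_ml = spacing_mm[0] * spacing_mm[1] * spacing_mm[2] / 1000.0
    # Flatten the 3D nested list and count voxels per label.
    counts = Counter(chain.from_iterable(chain.from_iterable(mask)))
    return {name: counts.get(label, 0) * voxel_ml
            for label, name in label_names.items()}


if __name__ == "__main__":
    # Tiny made-up 2 x 2 x 2 mask; real masks come from the model output.
    toy_mask = [[[0, 1], [1, 2]],
                [[2, 3], [0, 1]]]
    print(structure_volumes_ml(toy_mask, spacing_mm=(10.0, 10.0, 10.0)))
```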
Reproducibility
Pearson R of 0.9891 between the proposed method and manual standards for cortex volumetric analysis, indicating strong correlation. For medulla volume, the automatic segmentation method achieves better agreement (mean difference 0.01) than the inter-rater assessment (mean difference 0.29), indicating reliable reproducibility.
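For readers who want to compute the same kind of agreement statistics on their own data, the following sketch shows a Pearson correlation coefficient and a mean paired difference between two sets of volume measurements. The function names are hypothetical and the numbers in the usage section are made up for illustration; they are not the study's data, and the record does not state the exact procedure used to obtain the reported values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_difference(x, y):
    """Mean of the paired differences x - y (the bias in a Bland-Altman analysis)."""
    return sum(a - b for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    # Made-up volumes for illustration only.
    automatic = [98.0, 105.5, 120.2, 87.3]
    manual = [97.1, 106.0, 119.0, 88.0]
    print("Pearson R:", pearson_r(automatic, manual))
    print("Mean difference:", mean_difference(automatic, manual))
```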
Use
Intended: Image segmentation, Volumetric analysis, Biomarker discovery, Kidney morphology characterization
User
Intended: Radiologist, Physician, Researcher, Clinical Expert