Brain Imaging Generation with Latent Diffusion Models
Model · 2025-11-22 · https://doi.org/10.1148/atlas.1763762962562

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

Brain Imaging Generation with Latent Diffusion Models

Link

https://arxiv.org/abs/2209.07162

Indexing

Keywords: Synthetic data, Diffusion models, Generative models, Brain Imaging
Content: NR, BQ, MR

Author(s)

Walter H. L. Pinaya
Petru-Daniel Tudosiu
Jessica Dafflon
Pedro F. Da Costa
Virginia Fernandez
Parashkev Nachev
Sebastien Ourselin
M. Jorge Cardoso

Organization(s)

King’s College London
National Institute of Mental Health
Birkbeck College
University College London

Version

v1

Funding

WHLP and MJC are supported by Wellcome Innovations [WT213038/Z/18/Z]. PTD is supported by the EPSRC Research Council, part of the EPSRC DTP, grant Ref: [EP/R513064/1]. JD is supported by the Intramural Research Program of the NIMH (ZIC-MH002960 and ZIC-MH002968). PFDC is supported by the European Union’s HORIZON 2020 Research and Innovation Programme under the Marie Sklodowska-Curie Grant Agreement No 814302. PN is supported by Wellcome Innovations [WT213038/Z/18/Z] and the UCLH NIHR Biomedical Research Centre. This research has been conducted using the UK Biobank Resource (Project number: 58292).

Comments

This study explores using Latent Diffusion Models to generate synthetic high-resolution 3D brain images conditioned on age, sex, and brain structure volumes. A synthetic dataset of 100,000 brain images was created and made openly available to the scientific community.

Date

Published: 2022-09-15

Model

Architecture

A Latent Diffusion Model (LDM) combining an autoencoder, which compresses brain images into a lower-dimensional latent representation of 20 × 28 × 20, with a diffusion model for generative modeling. The diffusion model converts Gaussian noise into samples via an iterative denoising process over 1,000 steps. A hybrid conditioning approach combines concatenation of the conditioning variables with the input data and cross-attention mechanisms.
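The iterative denoising process described above can be sketched as follows. This is a minimal, hypothetical illustration using numpy: the linear beta schedule and the dummy noise predictor are assumptions standing in for the trained 3D denoising network, and `cond` stands in for the conditioning vector.

```python
import numpy as np

LATENT_SHAPE = (20, 28, 20)   # latent size reported in the model card
T = 1000                      # number of diffusion steps

# Assumed linear noise schedule (a common DDPM default, not the paper's exact values)
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(z_t, t, cond):
    """Stand-in for the trained denoiser; the real LDM conditions via
    concatenation of `cond` with the latent plus cross-attention."""
    return np.zeros_like(z_t)  # dummy: always predicts zero noise

def sample_latent(cond, rng):
    """Iteratively denoise Gaussian noise into a latent sample (DDPM reverse process)."""
    z = rng.standard_normal(LATENT_SHAPE)
    for t in reversed(range(T)):
        eps = predict_noise(z, t, cond)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (z - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(LATENT_SHAPE) if t > 0 else 0.0
        z = mean + np.sqrt(betas[t]) * noise
    return z

rng = np.random.default_rng(0)
cond = np.array([0.5, 1.0, 0.3, 0.6])  # e.g. normalized age, sex, volumes (illustrative)
latent = sample_latent(cond, rng)
print(latent.shape)  # (20, 28, 20)
```

The sampled latent would then be passed through the autoencoder's decoder to produce a full-resolution brain volume.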

Availability

The synthetic dataset of 100,000 brain images generated by this model is openly available on Academic Torrents, FigShare, and the HDR UK Gateway.

Clinical benefit

Addresses the limitation of small dataset sizes in medical imaging by generating synthetic data, complementing training datasets, and enabling medical image research at a larger scale without privacy concerns.

Degree of automation

Fully automated generation of synthetic high-resolution 3D brain MRI images.

Indications for use

Generation of synthetic high-resolution 3D brain MRI images for research purposes, data augmentation for machine learning model training, and exploring probabilistic distributions of brain images.

Input

High-resolution 3D T1w MRI images of the brain. Conditioning variables include age, sex, ventricular volume, and brain volume normalized for head size.
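For illustration, the four conditioning variables might be packed into a single vector like this. The normalization ranges and the sex coding are assumptions for the sketch, not values taken from the paper.

```python
def make_condition(age, sex, ventricular_vol, brain_vol):
    """Pack the four conditioning variables, scaled to roughly [0, 1].
    Ranges and coding here are illustrative assumptions."""
    return [
        (age - 40.0) / 40.0,  # assumed UK Biobank-like age range (40-80)
        float(sex),           # assumed coding: 0 = female, 1 = male
        ventricular_vol,      # assumed already normalized
        brain_vol,            # brain volume normalized for head size
    ]

cond = make_condition(age=63, sex=1, ventricular_vol=0.35, brain_vol=0.82)
print(cond)  # [0.575, 1.0, 0.35, 0.82]
```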

Instructions

The model can be conditioned on age, sex, ventricular volume, and brain volume normalized for head size to control data generation. Input images are pre-processed using rigid body registration to a common MNI space, with a 1 mm³ voxel size, and cropped to a volume of 160 × 224 × 160 voxels.
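The crop step above can be sketched as a simple center crop-or-pad to the target grid. This is a minimal numpy sketch, assuming the volume has already been rigidly registered to MNI space at 1 mm isotropic resolution; the 182 × 218 × 182 input grid is an assumption (a typical 1 mm MNI grid), not taken from the paper.

```python
import numpy as np

TARGET = (160, 224, 160)  # output volume size reported in the model card

def crop_or_pad(vol, target=TARGET):
    """Center-crop each axis to the target size, zero-padding axes
    that are smaller than the target."""
    out = np.zeros(target, dtype=vol.dtype)
    src, dst = [], []
    for dim, tgt in zip(vol.shape, target):
        if dim >= tgt:                      # crop this axis
            s = (dim - tgt) // 2
            src.append(slice(s, s + tgt))
            dst.append(slice(0, tgt))
        else:                               # pad this axis
            d = (tgt - dim) // 2
            src.append(slice(0, dim))
            dst.append(slice(d, d + dim))
    out[tuple(dst)] = vol[tuple(src)]
    return out

vol = np.ones((182, 218, 182), dtype=np.float32)  # assumed registered input grid
cropped = crop_or_pad(vol)
print(cropped.shape)  # (160, 224, 160)
```

In practice a medical-imaging toolkit (e.g. MONAI's spatial transforms) would handle this alongside the registration step.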

Limitations

While LDMs demonstrated more stable training and easier convergence than GAN-based baselines, generative models can still struggle to replicate essential finer details when the image resolution is too low. Extrapolating conditioning variables outside the training range can produce abnormally large ventricles or other signs of neurodegeneration.

Output

Description: Synthetic high-resolution 3D brain MRI images (160 × 224 × 160 voxels) that replicate properties from real training images and can be controlled by conditioning variables.

Reproducibility

LDMs demonstrated more stable training and easier convergence compared with GAN-based baselines (VAE-GAN, LSGAN) for high-resolution 3D images. The use of the DDIM sampler reduced sampling time from 142.3 ± 1.6 s to 7.6 ± 0.2 s per sample with minimal performance loss.
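The DDIM speedup comes from running the reverse process on a short subsequence of the 1,000 training timesteps with a deterministic update. A hypothetical sketch, assuming a uniform 50-step stride, η = 0, and a dummy denoiser in place of the trained network:

```python
import numpy as np

T, STEPS = 1000, 50
betas = np.linspace(1e-4, 0.02, T)          # assumed linear schedule
alpha_bars = np.cumprod(1.0 - betas)
timesteps = np.linspace(T - 1, 0, STEPS).astype(int)  # 50 of the 1000 steps

def predict_noise(z, t):
    return np.zeros_like(z)  # stand-in for the trained denoiser

def ddim_sample(shape, rng):
    """Deterministic DDIM (eta = 0) over a strided timestep subsequence."""
    z = rng.standard_normal(shape)
    for i, t in enumerate(timesteps):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[timesteps[i + 1]] if i + 1 < STEPS else 1.0
        eps = predict_noise(z, t)
        z0 = (z - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)  # predicted clean latent
        z = np.sqrt(a_prev) * z0 + np.sqrt(1.0 - a_prev) * eps
    return z

rng = np.random.default_rng(0)
latent = ddim_sample((20, 28, 20), rng)
print(latent.shape)  # (20, 28, 20)
```

Taking 50 network evaluations instead of 1,000 is what accounts for the roughly 19× reduction in per-sample time reported above.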

Use

Intended: Synthetic Data Generation, Data Augmentation, Medical Image Research, Machine Learning Model Training
Out-of-scope: Clinical Diagnosis, Patient Treatment, Direct Clinical Decision Making

User

Intended: Researcher, Machine Learning Engineer, Data Scientist, Medical Imaging Scientist
Out-of-scope: Clinician, Patient, General Public