Patient Reidentification from Chest Radiographs: An Interpretable Deep Metric Learning Approach and Its Applications
2025-12-03
https://doi.org/10.1148/atlas.1764779941378
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/model.json
Name
Patient Reidentification from Chest Radiographs: An Interpretable Deep Metric Learning Approach and Its Applications
Link
https://dx.doi.org/10.1148/ryai.230019
Indexing
Keywords: patient reidentification, deep metric learning, triplet loss, chest radiograph, EfficientNet-B3, StyleGAN2, GAN explainability, image retrieval, longitudinal abnormality prediction
Content: CH, RS
RadLex: RID10345, RID28784, RID28625
Author(s)
Matthew S. Macpherson
Charles E. Hutchinson
Carolyn Horst
Vicky Goh
Giovanni Montana
Organization(s)
University of Warwick
University Hospitals Coventry and Warwickshire NHS Trust
King’s College London
Guy’s and St Thomas’ NHS Foundation Trust
The Alan Turing Institute
Version
1.0
License
Text: © 2023 by the Radiological Society of North America, Inc.
Contact
g.montana@warwick.ac.uk
Funding
Supported by the Wellcome Trust (research grant) and EPSRC (student funding).
Ethical review
Retrospective study using de-identified DICOM data gathered under national governance (Governance Arrangements for Research Ethics Committees) and NHS data opt-out procedures; images from six UK hospitals (2006–2019).
Date
Published: 2023-09-20
Created: 2023-01-25
References
[1] Macpherson MS, Hutchinson CE, Horst C, Goh V, Montana G. "Patient Reidentification from Chest Radiographs: An Interpretable Deep Metric Learning Approach and Its Applications". Radiology: Artificial Intelligence. 2023;5(6):e230019. doi:10.1148/ryai.230019. PMID: 38074779. PMCID: PMC10698609.
Model
Architecture
Deep metric learning with an EfficientNet-B3 CNN backbone (ImageNet-initialized) producing 1536 features, followed by a fully connected projection head to an N-dimensional embedding (N = 4–128). Trained with triplet loss and online triplet mining. Logistic regression on the Euclidean distance between embeddings yields a same-patient probability. GAN-based explainability via a StyleGAN2 AC-GAN conditioned on the learned identity embeddings.
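The training objective above can be sketched as follows. This is a minimal pure-Python illustration of the hinge-style triplet loss on embedding vectors, not the authors' code; the `margin` value is an assumed hyperparameter, and toy 4-dimensional embeddings stand in for the network's output.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: pull same-patient pairs together and
    push different-patient pairs at least `margin` further apart."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy 4-dim embeddings (the paper's embeddings range from 4 to 128 dims).
a = [0.0, 0.0, 0.0, 0.0]   # anchor image embedding
p = [0.1, 0.0, 0.0, 0.0]   # another image of the same patient
n = [1.0, 1.0, 0.0, 0.0]   # image of a different patient
print(triplet_loss(a, p, n))  # 0.0 — this triplet already satisfies the margin
```

In practice the loss is computed over triplets selected by an online miner within each batch (as with PyTorch Metric Learning's miners), rather than on fixed triplets as here.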
Availability
Code implemented in PyTorch (v1.8.0); triplet loss/miner from PyTorch Metric Learning. No public code link provided.
Clinical benefit
- Patient identity confirmation and retrieval from chest radiograph databases.
- Potential longitudinal biomarker: change in identity representation associated with emergence of abnormalities.
- Model explainability through GAN-generated counterfactuals aids understanding of identity-relevant features.
Clinical workflow phase
Research method; potential future use in data management (patient identity confirmation) and longitudinal decision support.
Decision threshold
0.50 probability for binary patient confirmation (logistic regression on embedding distances).
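As a sketch, the confirmation rule can be written as a logistic function of the embedding distance, thresholded at 0.50. The coefficients `w` and `b` below are illustrative placeholders, not the paper's fitted values; only the negative sign of `w` (larger distance, lower same-patient probability) reflects the stated setup.

```python
import math

def same_patient_probability(distance, w=-4.0, b=3.0):
    """Logistic regression on Euclidean embedding distance.
    w and b are illustrative, not the paper's fitted coefficients."""
    return 1.0 / (1.0 + math.exp(-(w * distance + b)))

def confirm_identity(distance, threshold=0.50):
    """Binary patient confirmation at the 0.50 probability threshold."""
    return same_patient_probability(distance) >= threshold

print(confirm_identity(0.2))  # True — close pair, high same-patient probability
print(confirm_identity(2.0))  # False — distant pair
```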
Degree of automation
Fully automated image embedding, similarity computation, and retrieval; automated logistic regression decision for confirmation; automated GAN-based visualization for interpretability.
Indications for use
Research setting: reidentification and retrieval of adult patient frontal chest radiographs (AP/PA) across large datasets; exploratory longitudinal abnormality risk signaling from identity representation drift.
Input
Frontal chest radiographs (AP/PA) from adults (≥16 years). Image resolutions used: 256×256, 299×299, 512×512. Input formats included DICOM (internal, MIMIC-CXR) and PNG/JPG (ChestX-ray14, CheXpert).
Limitations
- Retrospective study; potential biases related to clinical populations and ascertainment.
- Limited time span per patient; few widely time-separated images.
- Only frontal views (no lateral) considered.
- Differences in preprocessing for external PNG/JPG datasets may affect generalization.
- Limited metadata (e.g., height/weight) precluded quantitative confirmation of some feature interpretations.
- Not validated for clinical deployment; thresholding vs. database size for forensic use not established.
Output
CDEs: RDE229, RDE1550
Description: - N-dimensional identity embedding per image and Euclidean distances for similarity.
- Probability that a pair of images belongs to the same patient (logistic regression on distances).
- Nearest-neighbor retrieval results from a database.
- Longitudinal abnormality score via identity-representation drift from a normal baseline image.
- GAN-generated counterfactual images illustrating identity-relevant principal components.
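The retrieval and longitudinal outputs above can be sketched as follows (pure Python, illustrative only). Treating the drift score as the plain Euclidean distance from a normal baseline embedding is an assumption for illustration; the database entries and `k` are toy values.

```python
import math

def dist(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve(query, database, k=3):
    """Return the k database entries nearest to the query embedding."""
    return sorted(database, key=lambda item: dist(query, item["embedding"]))[:k]

def drift_score(baseline, followup):
    """Longitudinal score: distance of a follow-up embedding from the
    patient's normal baseline embedding (larger = more drift)."""
    return dist(baseline, followup)

db = [
    {"id": "pt1", "embedding": [0.0, 0.0]},
    {"id": "pt2", "embedding": [1.0, 1.0]},
    {"id": "pt3", "embedding": [5.0, 5.0]},
]
top = retrieve([0.1, 0.1], db, k=2)
print([item["id"] for item in top])  # ['pt1', 'pt2']
```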
Regulatory information
Comment: Study demonstrates technical feasibility and potential applications; no regulatory submission reported.
Authorization status: Research-use only; not a cleared medical device.
Reproducibility
Detailed dataset splits, training regimen (optimizer, learning rates, early stopping), architectures (EfficientNet-B3, embedding sizes), and hardware (NVIDIA DGX-1 with 8×V100, 256 GB GPU memory) are reported. External validation on ChestX-ray14, CheXpert, and MIMIC-CXR without fine-tuning.
Sustainability
Training performed on 8×NVIDIA V100 GPUs (DGX-1). GAN trained for ~25 million image exposures at 256×256. No energy consumption metrics reported.
Use
Intended: Detection and diagnosis, Other
Out-of-scope: Decision support, Detection and diagnosis
Excluded: Detection, Other
User
Intended: Radiologist, Researcher
Out-of-scope: Patient, Layperson
Excluded: Layperson