Patient Reidentification from Chest Radiographs: An Interpretable Deep Metric Learning Approach and Its Applications
model · 2025-12-03 · https://doi.org/10.1148/atlas.1764779941378

Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

Patient Reidentification from Chest Radiographs: An Interpretable Deep Metric Learning Approach and Its Applications

Link

https://dx.doi.org/10.1148/ryai.230019

Indexing

Keywords: patient reidentification, deep metric learning, triplet loss, chest radiograph, EfficientNet-B3, StyleGAN2, GAN explainability, image retrieval, longitudinal abnormality prediction
Content: CH, RS
RadLex: RID10345, RID28784, RID28625

Author(s)

Matthew S. Macpherson
Charles E. Hutchinson
Carolyn Horst
Vicky Goh
Giovanni Montana

Organization(s)

University of Warwick
University Hospitals Coventry and Warwickshire NHS Trust
King’s College London
Guy’s and St Thomas’ NHS Foundation Trust
The Alan Turing Institute

Version

1.0

License

Text: © 2023 by the Radiological Society of North America, Inc.

Contact

g.montana@warwick.ac.uk

Funding

Supported by the Wellcome Trust (research grant) and EPSRC (student funding).

Ethical review

Retrospective study using de-identified DICOM data gathered under national governance (Governance Arrangements for Research Ethics Committees) and NHS data opt-out procedures; images from six UK hospitals (2006–2019).

Date

Published: 2023-09-20
Created: 2023-01-25

References

[1] Macpherson MS, Hutchinson CE, Horst C, Goh V, Montana G. "Patient Reidentification from Chest Radiographs: An Interpretable Deep Metric Learning Approach and Its Applications". Radiology: Artificial Intelligence. 2023;5(6):e230019. doi:10.1148/ryai.230019. PMID: 38074779. PMCID: PMC10698609.

Model

Architecture

Deep metric learning model with an EfficientNet-B3 CNN backbone (ImageNet-initialized) that produces 1536 features, followed by a fully connected projection head that maps them to an N-dimensional embedding (N between 4 and 128). The network is trained with triplet loss and online triplet mining. A logistic regression on the Euclidean distance between two embeddings gives the probability that both images come from the same patient. Explainability is GAN-based: a StyleGAN2 generator with AC-GAN conditioning on the learned identity embeddings.
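
To make the training objective concrete, the following is a minimal sketch of a triplet loss with online mining, written in plain Python rather than with the PyTorch Metric Learning classes used in the study. The batch-hard mining strategy, the margin of 0.2, and the function names are assumptions for illustration; the card does not state which miner or margin was used.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def batch_hard_triplet_loss(embeddings, patient_ids, margin=0.2):
    """Mean triplet loss over a batch, mining the hardest triplet per anchor.

    Assumed strategy (not confirmed by the source): for each anchor, the
    hardest positive is the farthest image of the same patient, and the
    hardest negative is the closest image of a different patient.
    Anchors without a positive or a negative in the batch are skipped.
    """
    losses = []
    for i, (anchor, pid) in enumerate(zip(embeddings, patient_ids)):
        positives = [
            euclidean(anchor, emb)
            for j, (emb, other) in enumerate(zip(embeddings, patient_ids))
            if other == pid and j != i
        ]
        negatives = [
            euclidean(anchor, emb)
            for emb, other in zip(embeddings, patient_ids)
            if other != pid
        ]
        if not positives or not negatives:
            continue
        # Hinge: push the hardest positive closer than the hardest negative
        # by at least `margin`.
        losses.append(max(0.0, max(positives) - min(negatives) + margin))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Toy 2-D embeddings: two images each for hypothetical patients "a" and "b".
    batch = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
    ids = ["a", "a", "b", "b"]
    print(batch_hard_triplet_loss(batch, ids))
```

In this toy batch the two patients are already well separated, so every hinge term is zero and the loss vanishes; overlapping patients would yield a positive loss.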

Availability

Code implemented in PyTorch (v1.8.0); triplet loss/miner from PyTorch Metric Learning. No public code link provided.

Clinical benefit

- Patient identity confirmation and retrieval from chest radiograph databases.
- Potential longitudinal biomarker: change in identity representation associated with emergence of abnormalities.
- Model explainability through GAN-generated counterfactuals aids understanding of identity-relevant features.

Clinical workflow phase

Research method; potential future use in data management (patient identity confirmation) and longitudinal decision support.

Decision threshold

0.50 probability for binary patient confirmation (logistic regression on embedding distances).
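
As an illustration of how such a threshold could be applied, the sketch below maps an embedding distance to a same-patient probability with a one-variable logistic model and compares it with 0.50. The intercept and slope are hypothetical placeholders, not the fitted values from the study; in practice they would be estimated from labelled image pairs.

```python
import math

# Hypothetical coefficients chosen for illustration only.
INTERCEPT = 4.0
SLOPE = -8.0  # negative: larger distances mean a lower same-patient probability


def same_patient_probability(distance, intercept=INTERCEPT, slope=SLOPE):
    """Logistic model on the Euclidean distance between two embeddings."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * distance)))


def is_same_patient(distance, threshold=0.5):
    """Binary confirmation: True when the probability reaches the threshold."""
    return same_patient_probability(distance) >= threshold


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.9):
        print(d, round(same_patient_probability(d), 3), is_same_patient(d))
```

With a 0.50 threshold the decision reduces to a cutoff on distance at -intercept/slope, which is 0.5 for these placeholder coefficients.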

Degree of automation

Fully automated image embedding, similarity computation, and retrieval; automated logistic regression decision for confirmation; automated GAN-based visualization for interpretability.

Indications for use

Research setting: reidentification and retrieval of adult patient frontal chest radiographs (AP/PA) across large datasets; exploratory longitudinal abnormality risk signaling from identity representation drift.

Input

Frontal chest radiographs (AP/PA) from adults (≥16 years). Image resolutions used: 256×256, 299×299, 512×512. Input formats included DICOM (internal, MIMIC-CXR) and PNG/JPG (ChestX-ray14, CheXpert).

Limitations

- Retrospective study; potential biases related to clinical populations and ascertainment.
- Limited time span per patient; few widely time-separated images.
- Only frontal views (no lateral) considered.
- Differences in preprocessing for external PNG/JPG datasets may affect generalization.
- Limited metadata (e.g., height/weight) precluded quantitative confirmation of some feature interpretations.
- Not validated for clinical deployment; thresholding vs. database size for forensic use not established.

Output

CDEs: RDE229, RDE1550
Description:
- N-dimensional identity embedding per image and Euclidean distances for similarity.
- Probability that a pair of images belongs to the same patient (logistic regression on distances).
- Nearest-neighbor retrieval results from a database.
- Longitudinal abnormality score via identity-representation drift from a normal baseline image.
- GAN-generated counterfactual images illustrating identity-relevant principal components.
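
Two of the outputs listed above, nearest-neighbor retrieval and the drift-based longitudinal score, can be sketched with precomputed embeddings. This is an assumption-laden illustration: the brute-force search, the function names, and the choice of plain Euclidean distance from the baseline embedding as the drift score are placeholders, since the card does not give the exact formulation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query, database, k=3):
    """Return the k database entries closest to the query embedding.

    `database` is a list of (image_id, embedding) pairs. Brute-force search
    is used here for clarity; a large archive would need an index.
    """
    scored = [(image_id, euclidean(query, emb)) for image_id, emb in database]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


def drift_score(baseline_embedding, follow_up_embedding):
    """Assumed drift measure: distance of a follow-up image's embedding
    from the embedding of the patient's normal baseline image."""
    return euclidean(baseline_embedding, follow_up_embedding)


if __name__ == "__main__":
    # Hypothetical image identifiers and 2-D embeddings.
    db = [("img1", [0.0, 0.0]), ("img2", [1.0, 0.0]), ("img3", [5.0, 5.0])]
    print(retrieve([0.9, 0.0], db, k=2))
    print(drift_score([0.0, 0.0], [3.0, 4.0]))
```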

Regulatory information

Comment: Study demonstrates technical feasibility and potential applications; no regulatory submission reported.
Authorization status: Research-use only; not a cleared medical device.

Reproducibility

Detailed dataset splits, training regimen (optimizer, learning rates, early stopping), architectures (EfficientNet-B3, embedding sizes), and hardware (NVIDIA DGX-1 with 8×V100, 256 GB GPU memory) are reported. External validation on ChestX-ray14, CheXpert, and MIMIC-CXR without fine-tuning.

Sustainability

Training performed on 8×NVIDIA V100 GPUs (DGX-1). GAN trained for ~25 million image exposures at 256×256. No energy consumption metrics reported.

Use

Intended: Detection and diagnosis, Other
Out-of-scope: Decision support, Detection and diagnosis
Excluded: Detection, Other

User

Intended: Radiologist, Researcher
Out-of-scope: Patient, Layperson
Excluded: Layperson