Dataset 1 (DS1): Frontal chest radiographs with radiograph-level and lesion-level annotations from Shenzhen People’s Hospital
2026-01-24
https://doi.org/10.1148/atlas.1769273751818
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/dataset.json
Name
Dataset 1 (DS1): Frontal chest radiographs with radiograph-level and lesion-level annotations from Shenzhen People’s Hospital
Link
https://pmc.ncbi.nlm.nih.gov/articles/PMC9530769
Indexing
Keywords: Chest radiograph, Frontal CXR, Bounding boxes, Lesion localization, Cardiomegaly, Pleural effusion, Mass, Nodule, Pneumonia, Pneumothorax, Tuberculosis, Fracture, Aortic calcification
Content: CH
RadLex: RID28788, RID50149, RID34539, RID5335, RID29116, RID39056, RID35057, RID5352
SNOMED: 233604007, 125605004, 36118008, 8186001, 56717001, 309529002, 72704001, 60046008, 786838002
Author(s)
Luyang Luo
Hao Chen
Yongjie Xiao
Yanning Zhou
Xi Wang
Varut Vardhanabhuti
Mingxiang Wu
Chu Han
Zaiyi Liu
Xin Hao Benjamin Fang
Efstratios Tsougenis
Huangjing Lin
Pheng-Ann Heng
Organization(s)
Shenzhen People’s Hospital, Luohu, Shenzhen, China
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
AI Research Laboratory, Imsight Technology, Shenzhen, China
Department of Radiology, Guangdong Provincial People’s Hospital, Guangzhou, China
Contact
Corresponding author: Hao Chen (email as provided in article: kh.tsu.esc@chj)
Funding
Key-Area Research and Development Program of Guangdong Province, China (2020B010165004, 2018B010109006); Hong Kong Innovation and Technology Fund (ITS/311/18FP); HKUST Bridge Gap Fund (BGF.005.2021); National Natural Science Foundation of China (U1813204); Shenzhen-HK Collaborative Development Zone.
Ethical review
Institutional ethical committee approval no. YB-2021–554; individual consent waived; all institutional data de-identified.
Comments
Retrospective single-center internal dataset used to develop and evaluate CheXNet (classification) and CheXDet (detection) models; includes both radiograph-level labels and fine-grained lesion bounding boxes for nine thoracic findings.
Date
Published: 2022
References
[1] Luo L, Chen H, Xiao Y, Zhou Y, Wang X, Vardhanabhuti V, et al. "Rethinking Annotation Granularity for Overcoming Shortcuts in Deep Learning–based Radiograph Diagnosis: A Multicenter Study". Radiology: Artificial Intelligence. 2022-07-20. doi:10.1148/ryai.210299. PMID: 36204545. PMCID: PMC9530769.
Dataset
Motivation
Evaluate whether fine-grained lesion-level annotations alleviate shortcut learning and improve generalizability of DL models for chest radiograph diagnosis.
Sampling
Retrospective collection from PACS between Jan 1, 2005 and Sep 2019 (the end date is given as "Sep 31, 2019", which is not a valid calendar date).
Partitioning scheme
Patient-disjoint splits: training n=28,673; tuning n=2,906; internal testing n=2,922 (34,501 radiographs in total); reader-study subset n=496 sampled from the internal testing set. Additional ablations used 20%, 40%, 60%, 80%, and 100% of the training set.
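The record reports the split sizes but not the procedure used to produce them. As a rough illustration of what "patient-disjoint" means in practice, the sketch below groups radiographs by patient before assigning them to partitions, so that no patient appears in more than one split. The function names, the `(patient_id, image_id)` record shape, the 80/10/10 patient fractions, and the choice to subsample the training set by patient for the ablations are all assumptions made for this example, not details taken from the source.

```python
import random
from collections import defaultdict


def patient_disjoint_split(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split (patient_id, image_id) pairs into train/tune/test by patient.

    Hypothetical helper: the fractions are illustrative, not the source's.
    Returns three dicts mapping patient_id -> list of image_ids.
    """
    by_patient = defaultdict(list)
    for patient_id, image_id in records:
        by_patient[patient_id].append(image_id)

    # Shuffle patients (not images) so all images of a patient stay together.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    cut1 = int(len(patients) * fractions[0])
    cut2 = cut1 + int(len(patients) * fractions[1])
    groups = (patients[:cut1], patients[cut1:cut2], patients[cut2:])
    return tuple({p: by_patient[p] for p in group} for group in groups)


def subsample_patients(split, fraction, seed=0):
    """Keep a random fraction of the patients in a split (ablation sketch)."""
    patients = sorted(split)
    keep = random.Random(seed).sample(patients, round(len(patients) * fraction))
    return {p: split[p] for p in keep}
```

Calling `subsample_patients(train, 0.2)` through `subsample_patients(train, 1.0)` would give training subsets analogous to the 20% to 100% ablations listed above, under the stated assumption that subsampling is done per patient.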
Missing information
Exact image pixel dimensions and acquisition parameters not reported; demographic breakdown per partition not reported.
Relationships between instances
Multiple radiographs per patient possible; labels include radiograph-level presence/absence for 9 findings and lesion-level bounding boxes for corresponding abnormalities.
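One way to picture the two annotation levels is as a per-radiograph record that carries a list of lesion boxes, from which the nine radiograph-level labels can be read off. The sketch below is illustrative only: the class and field names are hypothetical, the corner-coordinate box format is an assumption (the record does not report a coordinate convention), and it assumes a finding counts as present exactly when at least one box carries that finding. The finding names are taken from the keyword list above.

```python
from dataclasses import dataclass, field

# The nine findings named in this record's keywords.
FINDINGS = (
    "Cardiomegaly", "Pleural effusion", "Mass", "Nodule", "Pneumonia",
    "Pneumothorax", "Tuberculosis", "Fracture", "Aortic calcification",
)


@dataclass
class LesionBox:
    """One lesion-level annotation (coordinate convention is assumed)."""
    finding: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class Radiograph:
    """One frontal radiograph; a patient may have several of these."""
    patient_id: str
    image_id: str
    boxes: list = field(default_factory=list)

    def radiograph_labels(self):
        # Assumption: a finding is present iff at least one box is tagged with it.
        present = {box.finding for box in self.boxes}
        return {name: name in present for name in FINDINGS}
```

A radiograph with no boxes maps to all nine labels being absent, which is how a normal study would be represented under these assumptions.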
Noise
Not explicitly characterized; external datasets have differing label sources (radiologist-adjudicated vs NLP-derived).
External data
External testing used NIH Google (ChestX-ray14 subset), PadChest, and NIH ChestX-ray14; additional training for a comparison model used CheXpert.
Confidentiality
De-identified retrospective clinical images; internal dataset cannot be released for privacy and safety reasons.
Re-identification
All institutional data de-identified prior to research use.
Sensitive data
Medical imaging data with associated reports; PHI removed prior to analysis.