Lille University Hospital Emergency Brain MRI Reports (2022) — Vicuna LLM Evaluation Cohort
2025-11-26
https://doi.org/10.1148/atlas.1764132213655
Overview
Schema Version
https://atlas.rsna.org/schemas/2025-11/dataset.json
Name
Lille University Hospital Emergency Brain MRI Reports (2022) — Vicuna LLM Evaluation Cohort
Link
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11294959/
Indexing
Keywords: Large Language Model, Vicuna, Information extraction, Radiology reports, Brain MRI, Emergency department, Headache, French reports, Contrast medium, Free text
Content: ER, IN, MR, NR
RadLex: RID10312, RID45946, RID11587, RID39094, RID13060, RID10319, RID49531, RID11595
SNOMED: 25064002
Author(s)
Bastien Le Guellec
Alexandre Lefèvre
Charlotte Geay
Lucas Shorten
Cyril Bruge
Lotfi Hacein-Bey
Philippe Amouyel
Jean-Pierre Pruvo
Gregory Kuchcinski
Aghiles Hamroun
Organization(s)
CHU Lille–Université Lille, Department of Neuroradiology
CHU Lille–Université Lille, Department of Public Health
INclude Health Data Warehouse, CHU Lille
UC Davis Health, Department of Radiology
Université Lille, INSERM, Institut Pasteur de Lille, U1167-RID-AGE
INSERM U1172–LilNCog-Lille Neuroscience & Cognition, Université Lille
UAR 2014-US 41-PLBS–Plateformes Lilloises en Biologie & Santé, Université Lille
License
Text: CC BY 4.0
URL: https://creativecommons.org/licenses/by/4.0/
Funding
Authors declared no funding for this work.
Ethical review
Data warehouse approved by French data protection authority (ref. 2019–103). Study use approved by Lille University Hospital IRB in June 2023 (EDS2307251350).
Comments
Retrospective cohort of pseudonymized free-text emergency brain MRI reports (French) from CHU Lille (France) in 2022, used to evaluate an on-premise open-source LLM (Vicuna 13B) for information extraction tasks.
Date
Published: 2024-05-08
Created: 2022-01-01
References
[1] Le Guellec B, Lefèvre A, Geay C, Shorten L, Bruge C, Hacein-Bey L, Amouyel P, Pruvo JP, Kuchcinski G, Hamroun A. "Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports". Radiology: Artificial Intelligence. 2024-05-08. doi:10.1148/ryai.230364. PMID: 38717292. PMCID: PMC11294959.
Dataset
Motivation
Assess feasibility and performance of an on-premise open-source LLM for extracting clinically relevant information from real-life radiology reports.
Sampling
All consecutive emergency department brain MRI reports from January through December 2022 at a single French quaternary center; the subset of reports involving headache was identified by radiologist review.
Missing information
Raw report texts are not shared in the article; only translated or modified examples appear in its figures and tables. Imaging files are not included.
Relationships between instances
Each instance is a radiology report; reports are segmented into four sections (clinical context, protocol, results, conclusion).
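For readers who want to picture the instance layout, here is a minimal Python sketch of one report held as four section fields. The class name `ReportInstance`, its field names, and its helper methods are hypothetical, and the sample text is invented; the article does not publish a storage schema or the raw reports.

```python
from dataclasses import dataclass, fields


@dataclass
class ReportInstance:
    """Hypothetical container for one report, split into the four sections named above."""
    clinical_context: str = ""
    protocol: str = ""
    results: str = ""
    conclusion: str = ""

    def non_empty_sections(self) -> dict:
        # Keep only the sections that contain text, in declaration order.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name).strip()
        }

    def full_text(self) -> str:
        # Rebuild a single free-text report from the populated sections.
        return "\n\n".join(self.non_empty_sections().values())


# Invented placeholder text, not taken from the dataset.
example = ReportInstance(
    clinical_context="Headache for three days.",
    protocol="Brain MRI without contrast.",
    conclusion="No acute abnormality.",
)
```

Keeping the sections as separate fields would let a downstream task read only the part it needs, for example the protocol section when checking for contrast use.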
Noise
Reports authored by 43 radiologists (22 trainees, 21 board-certified) with variable phrasing; some reports lacked explicit mention of contrast use.
External data
No external datasets reported; all data from CHU Lille health data warehouse.
Confidentiality
Pseudonymized free-text reports extracted from institutional health data warehouse; no raw images shared.
Re-identification
Reports were pseudonymized using eHOP software by removing patient residence, name, and prescribing physician.
Sensitive data
Clinical free-text reports containing medical information; identifiers removed prior to analysis.