Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/dataset.json

Name

Lille University Hospital Emergency Brain MRI Reports (2022) — Vicuna LLM Evaluation Cohort

Link

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11294959/

Indexing

Keywords: Large Language Model, Vicuna, Information extraction, Radiology reports, Brain MRI, Emergency department, Headache, French reports, Contrast medium, Free text

Content: ER, IN, MR, NR

RadLex: RID10312, RID45946, RID11587, RID39094, RID13060, RID10319, RID49531, RID11595

SNOMED: 25064002

Author(s)

Bastien Le Guellec

Alexandre Lefèvre

Charlotte Geay

Lucas Shorten

Cyril Bruge

Lotfi Hacein-Bey

Philippe Amouyel

Jean-Pierre Pruvo

Gregory Kuchcinski

Aghiles Hamroun

Organization(s)

CHU Lille–Université Lille, Department of Neuroradiology

CHU Lille–Université Lille, Department of Public Health

INclude Health Data Warehouse, CHU Lille

UC Davis Health, Department of Radiology

Université Lille, INSERM, Institut Pasteur de Lille, U1167-RID-AGE

INSERM U1172–LilNCog-Lille Neuroscience & Cognition, Université Lille

UAR 2014-US 41-PLBS–Plateformes Lilloises en Biologie & Santé, Université Lille

License

Text: CC BY 4.0

URL: https://creativecommons.org/licenses/by/4.0/

Funding

Authors declared no funding for this work.

Ethical review

The data warehouse was approved by the French data protection authority (ref. 2019–103). Use of the data for this study was approved by the Lille University Hospital IRB in June 2023 (ref. EDS2307251350).

Comments

Retrospective cohort of pseudonymized free-text emergency brain MRI reports (French) from CHU Lille (France) in 2022, used to evaluate an on-premise open-source LLM (Vicuna 13B) for information extraction tasks.

Date

Published: 2024-05-08

Created: 2022-01-01

References

[1] Le Guellec B, Lefèvre A, Geay C, Shorten L, Bruge C, Hacein-Bey L, Amouyel P, Pruvo JP, Kuchcinski G, Hamroun A. "Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports". Radiology: Artificial Intelligence. 2024-05-08. doi:10.1148/ryai.230364. PMID: 38717292. PMCID: PMC11294959.

Dataset

Motivation

To assess the feasibility and performance of an on-premise open-source LLM for extracting clinically relevant information from real-life radiology reports.

Sampling

All consecutive emergency department brain MRI reports from January to December 2022 at a single French quaternary center; the subset of reports involving headache was identified by radiologist review.

Missing information

Raw report texts are not shared in the article; only translated or modified examples appear in its figures and tables. Imaging files are not included.

Relationships between instances

Each instance is a radiology report; reports are segmented into four sections (clinical context, protocol, results, conclusion).
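As a rough illustration of the instance structure described above, one report could be modeled as a record with one field per section. This is only a sketch: the class name, field names, and helper method are hypothetical, the example text is invented rather than taken from the dataset, and the source does not describe how the reports are actually stored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReportInstance:
    """One radiology report split into the four sections named in the card.

    Field names are hypothetical; the source only lists the section labels.
    """
    clinical_context: Optional[str] = None
    protocol: Optional[str] = None
    results: Optional[str] = None
    conclusion: Optional[str] = None

    def missing_sections(self) -> List[str]:
        # Return the names of sections that are absent or empty.
        return [name for name, value in vars(self).items() if not value]


# Invented example text, not drawn from the dataset.
example = ReportInstance(
    clinical_context="Headache.",
    protocol="Brain MRI without contrast injection.",
    results="No acute abnormality.",
)
print(example.missing_sections())
```

Keeping every section optional reflects the card's note that report phrasing varies between authors, so a consumer should not assume all four sections are always present.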

Noise

Reports authored by 43 radiologists (22 trainees, 21 board-certified) with variable phrasing; some reports lacked explicit mention of contrast use.

External data

No external datasets reported; all data from CHU Lille health data warehouse.

Confidentiality

Pseudonymized free-text reports extracted from institutional health data warehouse; no raw images shared.

Re-identification

Reports were pseudonymized using eHOP software, which removed the patient's name and place of residence and the reference to the prescribing physician.

Sensitive data

Clinical free-text reports containing medical information; identifiers removed prior to analysis.