Overview

Schema Version

https://atlas.rsna.org/schemas/2025-11/model.json

Name

Vicuna 13B (v1.3) for information extraction from emergency brain MRI reports

Link

https://huggingface.co/lmsys/vicuna-13b-v1.3

Indexing

Keywords: Large Language Model, Vicuna, Open Source, Information Extraction, Radiology Report, Brain, MRI, Headache

Content: IN, ER, NR, MR, RS

RadLex: RID39045, RID45846, RID49531, RID39094

Author(s)

Bastien Le Guellec

Alexandre Lefèvre

Charlotte Geay

Lucas Shorten

Cyril Bruge

Lotfi Hacein-Bey

Philippe Amouyel

Jean-Pierre Pruvo

Gregory Kuchcinski

Aghiles Hamroun

Organization(s)

CHU Lille–Université Lille, Department of Neuroradiology

CHU Lille–Université Lille, Department of Public Health

INclude Health Data Warehouse, CHU Lille–Université Lille

UC Davis Health, Department of Radiology

Université Lille, INSERM, CHU Lille, Institut Pasteur de Lille, U1167-RID-AGE

INSERM, U1172–LilNCog-Lille Neuroscience & Cognition, Université Lille

UAR 2014-US 41-PLBS–Plateformes Lilloises en Biologie & Santé, Université Lille

Version

1.3 (Vicuna 13B)

License

Text: CC BY 4.0 (article)

URL: https://creativecommons.org/licenses/by/4.0/

Contact

Bastien Le Guellec: rf.ellil-vinu@ute.celleugel.neitsab

Funding

Authors declared no funding for this work.

Ethical review

Data warehouse approved by French data protection authority (reference no. 2019–103). Use approved by Lille University Hospital IRB in June 2023 (EDS2307251350).

Date

Published: 2024-05-08

References

[1] Le Guellec B, Lefèvre A, Geay C, Shorten L, Bruge C, Hacein-Bey L, Amouyel P, Pruvo JP, Kuchcinski G, Hamroun A. "Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports". Radiology: Artificial Intelligence. 2024 Jul;6(4):e230364.. 2024-05-08. doi:10.1148/ryai.230364. PMID: 38717292. PMCID: PMC11294959.

Model

Architecture

Open-source large language model (Vicuna 13B v1.3), based on Meta LLaMA and fine-tuned on ShareGPT conversations; inference via FastChat; temperature set to 0.

Availability

Model weights: https://huggingface.co/lmsys/vicuna-13b-v1.3; Inference server: https://github.com/lm-sys/FastChat; Study scripts: https://github.com/BastienLeGuellec/RadioVicuna

Clinical benefit

Automated extraction of clinically relevant information from free-text radiology reports to enable cohort identification and reduce manual review time.

Clinical workflow phase

Workflow optimization; research data curation and retrospective cohort building.

Degree of automation

Automates information extraction from report text without additional training; on-premise execution.

Indications for use

Extraction of: symptom presence (headache) from clinical context, contrast medium use from protocol, normal/abnormal classification from conclusion, and causal linkage between findings and headache in emergency brain MRI reports.

Input

Pseudonymized free-text radiology reports (French), segmented into clinical context, protocol, results, and conclusion; prompts in English (sensitivity analysis with French prompts).

Instructions

Run Vicuna 13B v1.3 via FastChat with temperature=0; provide short few-shot prompts (4–6 contextual examples) tailored to each task; give the model only the report section relevant to the task; automate via provided Python script to output a table.

Limitations

Single-center, French-language reports; causal inference ground truth subjective (expert consensus); limited clinical context (only report text); seven reports lacked explicit contrast information; model may be outperformed by newer LLMs.

Output

CDEs: RDE149, RDE1425.0, RDE1424.0, RDE1454

Description: Tabular outputs per report for four tasks: presence of headache in indication; contrast injection in protocol; conclusion classified as normal/abnormal; inference whether main finding explains the headache.

Reproducibility

Fixed model/version (Vicuna 13B v1.3), temperature=0, prompts and scripts publicly available; on-premise execution supports version control; interactor via FastChat; ground-truth and evaluation described with CIs.

Sustainability

Ran on two NVIDIA Quadro RTX 6000 GPUs; compute time ~30 minutes (task 4, 227 reports) to ~3 hours (task 1, 2398 reports). Prompt engineering time: ~30 minutes (tasks 1–2) and ~1 hour (tasks 3–4).

Use

Intended: Report data extraction

User

Intended: Radiologist, Other, Researcher