Artificial intelligence applications in medical education: systematic review and meta‑analysis of performance and AI‑driven interventions

Hosseini, Seyyed Ali; Rezaeian, Ali; Amirkhani, Zahra

Ethics code: IR.LARUMS.REC.1404.019

Seyyed Ali Hosseini¹, Ali Rezaeian¹, Zahra Amirkhani*²

1- Department of Basic Sciences, School of Medicine, Larestan University of Medical Sciences, Lar, Iran
2- Department of Basic Sciences, School of Medicine, Larestan University of Medical Sciences, Lar, Iran, z.amirkhani1357@gmail.com

Abstract:

Background and Objective: The rapid integration of Artificial Intelligence (AI) into medical education has created a need for quantitative evidence synthesis. This study sought to benchmark AI performance on medical knowledge assessments and to evaluate the preliminary effectiveness of AI‑driven educational interventions.
Materials & Methods: A systematic review and dual meta‑analysis were conducted in accordance with PRISMA guidelines. Five databases (PubMed, Web of Science, Embase, Scopus, and Google Scholar) were searched from inception through February 28, 2024. AI performance was evaluated using 35 accuracy data points derived from seven benchmarking studies encompassing 2,341 examination questions. The effectiveness of educational interventions was assessed using data from eight Randomized Controlled Trials (RCTs) involving 574 medical, dental, and pharmacy students. Random‑effects models were applied, including a three‑level proportion meta‑analysis for AI accuracy to account for within‑study dependence and a Hedges’ g meta‑analysis for intervention outcomes. Heterogeneity was quantified using the I² statistic, alongside exploratory meta‑regression and sensitivity analyses.
Results: The analyses yielded two primary findings. First, AI models demonstrated a pooled accuracy of 70.9% (95% CI: 65.1%–75.9%) on standardized medical examinations, with performance improving across successive model generations. Second, AI‑based educational interventions showed a large pooled effect size; however, this estimate was unstable due to a highly influential outlier (g = 6.72). Exclusion of this study altered the pooled effect from g = 1.40 to g = 1.95, with substantial heterogeneity observed (I² = 93.6%). This variability was largely attributable to small, early‑stage studies reporting disproportionately large effects. Leave‑one‑out sensitivity analyses produced effect sizes ranging from g = 1.26 to g = 1.45, while an inverse association between sample size and effect magnitude (p < 0.001) suggested systematic overestimation in smaller trials.
Conclusion: Advanced AI systems exhibit robust medical knowledge; however, evidence supporting their educational effectiveness remains preliminary and potentially biased. While the findings are encouraging, they highlight the need for larger, methodologically rigorous trials to establish reliable effect sizes and to inform the responsible integration of AI into medical education.
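
As a rough illustration of the quantities named in the Methods (Hedges' g, a random-effects pooled estimate, I², and leave-one-out sensitivity analysis), the following minimal sketch shows how they could be computed. It is not the authors' analysis code: the abstract does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator here is an assumption, the function names are hypothetical, and the three-level proportion model for AI accuracy is not covered.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) for a two-group comparison of means."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups.
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    # Small-sample correction factor that turns Cohen's d into Hedges' g.
    j = 1 - 3 / (4 * df - 1)
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def random_effects(effects, variances):
    """Pool effects with the DerSimonian-Laird estimator (assumed here).

    Requires at least two studies. Returns the pooled effect, a 95%
    confidence interval, tau^2, and I^2 as a percentage.
    """
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


def leave_one_out(effects, variances):
    """Pooled effect after dropping each study in turn."""
    return [
        random_effects(effects[:i] + effects[i + 1:],
                       variances[:i] + variances[i + 1:])["pooled"]
        for i in range(len(effects))
    ]


if __name__ == "__main__":
    # Hypothetical effect sizes and variances, not data from the review.
    gs = [0.8, 1.1, 1.5, 6.0]
    vs = [0.10, 0.08, 0.12, 0.90]
    print(random_effects(gs, vs))
    print(leave_one_out(gs, vs))
```

With inputs like the hypothetical ones above, dropping the single extreme study shifts the pooled estimate noticeably, which is the kind of instability the leave-one-out analysis is meant to expose.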

Keywords: artificial intelligence, medical education, systematic review, meta-analysis, AI-driven educational interventions

Article Type: Review | Subject: Medical Education
Received: 2025/12/27 | Accepted: 2026/05/10

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Journal of Medical Education Development