<?xml version="1.0" encoding="utf-8"?>
<journal>
<title>Journal of Medical Education Development</title>
<title_fa>مجله توسعه آموزش در علوم پزشکی</title_fa>
<short_title>J Med Edu Dev</short_title>
<subject>Medical Sciences</subject>
<web_url>http://edujournal.zums.ac.ir</web_url>
<journal_hbi_system_id>53</journal_hbi_system_id>
<journal_hbi_system_user>kavandi</journal_hbi_system_user>
<journal_id_issn>2980-7670</journal_id_issn>
<journal_id_issn_online>2980-7670</journal_id_issn_online>
<journal_id_pii></journal_id_pii>
<journal_id_doi>10.61882/edcj</journal_id_doi>
<journal_id_iranmedex></journal_id_iranmedex>
<journal_id_magiran></journal_id_magiran>
<journal_id_sid></journal_id_sid>
<journal_id_nlai></journal_id_nlai>
<journal_id_science></journal_id_science>
<language>en</language>
<pubdate>
	<type>jalali</type>
	<year>1405</year>
	<month>1</month>
	<day>1</day>
</pubdate>
<pubdate>
	<type>gregorian</type>
	<year>2026</year>
	<month>4</month>
	<day>1</day>
</pubdate>
<volume>19</volume>
<number>2</number>
<publish_type>online</publish_type>
<publish_edition>1</publish_edition>
<article_type>fulltext</article_type>
<articleset>
	<article>


	<language>en</language>
	<article_id_doi></article_id_doi>
	<title_fa></title_fa>
	<title>Evaluation of AI support for medical training in resource-constrained settings: performance of GPT-5 Pro, Gemini 2.5 Pro, and DeepSeek V3 on real examination questions</title>
	<subject_fa>آموزش پزشکی</subject_fa>
	<subject>Medical Education</subject>
	<content_type_fa>پژوهشي اصیل</content_type_fa>
	<content_type>Original Research</content_type>
	<abstract_fa></abstract_fa>
	<abstract>&lt;span style=&quot;line-height:2;&quot;&gt;&lt;span style=&quot;font-family:Times New Roman;&quot;&gt;&lt;span style=&quot;font-size:16px;&quot;&gt;&lt;b&gt;Background &amp; Objective:&lt;/b&gt; Recent advances in Large Language Models (LLMs) have expanded their potential applications in medical education and assessment. This study compared the performance of GPT-5 Pro (OpenAI), Gemini 2.5 Pro (Google DeepMind), and DeepSeek V3 (DeepSeek AI) on authentic, faculty-validated Multiple-Choice Questions (MCQs) from an Algerian francophone Medical Faculty.&lt;br&gt;
&lt;b&gt;Materials &amp; Methods:&lt;/b&gt; This parallel, cross-sectional comparative evaluation was carried out under standardized online conditions. A total of 480 faculty-validated, non-public MCQs from a private subscription repository, covering four pre-clinical modules and four clinical modules, were presented to each model in independent chat sessions. Accuracy was compared across models using Cochran&amp;rsquo;s Q and pairwise McNemar tests with Holm correction. Intra-model subgroup analyses (module, study cycle, question type, response format, and temporal factors) used chi-square or Mann&amp;ndash;Whitney tests, with p &lt; 0.05 considered significant.&lt;br&gt;
&lt;b&gt;Results:&lt;/b&gt; Gemini 2.5 Pro achieved the highest accuracy (447/480, 93.1% [95% CI: 90.5 &amp;ndash; 95.1]), followed by GPT-5 Pro (430/480, 89.6% [95% CI: 86.5 &amp;ndash; 92.0]) and DeepSeek V3 (429/480, 89.4% [95% CI: 86.3 &amp;ndash; 91.8]). The overall difference in accuracy was significant (Cochran&amp;rsquo;s Q = 8.65, p = 0.013), with a small global effect size (Kendall&amp;rsquo;s W = 0.009). Pairwise testing showed Gemini 2.5 Pro performed better than both competitors (p = 0.049), whereas GPT-5 Pro and DeepSeek V3 did not differ (p = 1.000). Within-model accuracy was stable across subgroups; non-responses were rare (&lt; 2%) and did not change ranking.&lt;br&gt;
&lt;b&gt;Conclusion:&lt;/b&gt; All tested LLMs demonstrated strong competence on structured medical MCQs and may support supervised formative learning in resource-constrained settings. However, although between-model differences were statistically significant, their absolute educational impact was modest, and their effect on real learning outcomes remains uncertain. Key limitations include potential residual training overlap, single-source MCQ sampling, and absence of explanation-quality assessment; future multicenter longitudinal studies should evaluate open-ended clinical reasoning and learning outcomes.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;</abstract>
	<keyword_fa></keyword_fa>
	<keyword>large language models, education, medical, artificial intelligence, formative assessment, resource-limited settings</keyword>
	<start_page>4</start_page>
	<end_page>11</end_page>
	<web_url>http://edujournal.zums.ac.ir/browse.php?a_code=A-12-3487-1&amp;slc_lang=en&amp;sid=1</web_url>


<author_list>
	<author>
	<first_name>Redouene Sid Ahmed</first_name>
	<middle_name></middle_name>
	<last_name>Benazzouz</last_name>
	<suffix></suffix>
	<first_name_fa></first_name_fa>
	<middle_name_fa></middle_name_fa>
	<last_name_fa></last_name_fa>
	<suffix_fa></suffix_fa>
	<email>r.benazzouz@lagh-univ.dz</email>
	<code>5300319475328460037136</code>
	<orcid>0009-0000-3614-5883</orcid>
	<coreauthor>Yes</coreauthor>
	<affiliation>Faculty of Medicine, Laghouat University, Laghouat, Algeria</affiliation>
	<affiliation_fa></affiliation_fa>
	 </author>


	<author>
	<first_name>Massinissa</first_name>
	<middle_name></middle_name>
	<last_name>Benyagoub</last_name>
	<suffix></suffix>
	<first_name_fa></first_name_fa>
	<middle_name_fa></middle_name_fa>
	<last_name_fa></last_name_fa>
	<suffix_fa></suffix_fa>
	<email>m.benyagoub@lagh-univ.dz</email>
	<code>5300319475328460037137</code>
	<orcid></orcid>
	<coreauthor>No</coreauthor>
	<affiliation>Faculty of Medicine, Laghouat University, Laghouat, Algeria</affiliation>
	<affiliation_fa></affiliation_fa>
	 </author>


	<author>
	<first_name>Yacine</first_name>
	<middle_name></middle_name>
	<last_name>Boufatah</last_name>
	<suffix></suffix>
	<first_name_fa></first_name_fa>
	<middle_name_fa></middle_name_fa>
	<last_name_fa></last_name_fa>
	<suffix_fa></suffix_fa>
	<email>taha.boufatah@gmail.com</email>
	<code>5300319475328460037138</code>
	<orcid>0009-0001-9605-5756</orcid>
	<coreauthor>No</coreauthor>
	<affiliation>Faculty of Medicine, Laghouat University, Laghouat, Algeria</affiliation>
	<affiliation_fa></affiliation_fa>
	 </author>


	<author>
	<first_name>Fodhil</first_name>
	<middle_name></middle_name>
	<last_name>Sadeki</last_name>
	<suffix></suffix>
	<first_name_fa></first_name_fa>
	<middle_name_fa></middle_name_fa>
	<last_name_fa></last_name_fa>
	<suffix_fa></suffix_fa>
	<email>sadekifodhil@gmail.com</email>
	<code>5300319475328460037139</code>
	<orcid>0009-0009-5383-175X</orcid>
	<coreauthor>No</coreauthor>
	<affiliation>Faculty of Medicine, Laghouat University, Laghouat, Algeria</affiliation>
	<affiliation_fa></affiliation_fa>
	 </author>


	<author>
	<first_name>Mohamed Safouane</first_name>
	<middle_name></middle_name>
	<last_name>Benazzouz</last_name>
	<suffix></suffix>
	<first_name_fa></first_name_fa>
	<middle_name_fa></middle_name_fa>
	<last_name_fa></last_name_fa>
	<suffix_fa></suffix_fa>
	<email>ms.benazzouz@univ-alger.dz</email>
	<code>5300319475328460037140</code>
	<orcid></orcid>
	<coreauthor>No</coreauthor>
	<affiliation>Pasteur Institute of Algeria, Algiers, Algeria</affiliation>
	<affiliation_fa></affiliation_fa>
	 </author>


	<author>
	<first_name>Mounir</first_name>
	<middle_name></middle_name>
	<last_name>Ould Setti</last_name>
	<suffix></suffix>
	<first_name_fa></first_name_fa>
	<middle_name_fa></middle_name_fa>
	<last_name_fa></last_name_fa>
	<suffix_fa></suffix_fa>
	<email>ouldsettimounir@gmail.com</email>
	<code>5300319475328460037141</code>
	<orcid></orcid>
	<coreauthor>No</coreauthor>
	<affiliation>Alma Mater Europaea University, Vienna, Austria</affiliation>
	<affiliation_fa></affiliation_fa>
	 </author>


</author_list>


	</article>
</articleset>
</journal>
