Assessment of the Performance of Large Language Models ChatGPT, ChatGPT Plus, Gemini, and Microsoft Copilot in Responding to Oral and Maxillofacial Radiology Questions from the Turkish Dentistry Specialty Entrance Exam

Peker, Ramazan Berkay


Volume: 21, Issue: 3


Yeditepe J Dent. 2025; 21(3): 130-135 | DOI: 10.5505/yeditepe.2025.93265


Ramazan Berkay Peker
Trakya University, Faculty of Dentistry, Department of Oral, Dental and Maxillofacial Radiology, Edirne

INTRODUCTION and AIM: The aim of this study is to evaluate and compare the performance of four large language models (LLMs), one paid (ChatGPT-4o) and three free (ChatGPT-4o mini, Gemini, Microsoft Copilot), in answering the Oral, Dental and Maxillofacial Radiology questions asked in the Dentistry Specialty Training Entrance Examination (DUS) between 2012 and 2021.
MATERIALS and METHODS: The 123 questions asked in the DUS between 2012 and 2021 were divided into "oral diagnosis" and "radiology" categories and posed to the four LLMs. The data obtained according to the correctness of the LLMs' answers were analyzed using the Pearson chi-square test, Fisher's exact test with Monte Carlo correction, and the z-test with Bonferroni correction (p<0.05).
RESULTS: The rate of correct answers across all questions was 74% for ChatGPT-4o mini, 91.1% for ChatGPT-4o, 69.9% for Gemini, and 86.2% for Microsoft Copilot. A statistically significant association among the LLMs was found only for the answers given in the radiology category (p=0.054; p<0.001).
DISCUSSION and CONCLUSION: Among the four LLMs, the paid ChatGPT-4o and the free Microsoft Copilot performed statistically similarly to each other and answered more than 80% of the questions correctly. An analysis of previous studies shows that LLMs are developing rapidly. LLMs will be able to play an effective role in dental education in the future.
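
The abstract reports percentages only, not raw counts. As a rough illustration, the following sketch back-calculates the number of correct answers each percentage implies for 123 questions. The counts it produces are estimates inferred here from rounding, not figures stated by the author, and the names `reported_pct` and `estimate_correct` are hypothetical.

```python
# Back-calculate approximate correct-answer counts from reported percentages.
# Assumption: every model was asked all 123 questions, and the abstract's
# percentages are rounded to at most one decimal place.
TOTAL_QUESTIONS = 123  # stated in the abstract

reported_pct = {
    "ChatGPT-4o mini": 74.0,
    "ChatGPT-4o": 91.1,
    "Gemini": 69.9,
    "Microsoft Copilot": 86.2,
}


def estimate_correct(pct, total=TOTAL_QUESTIONS):
    """Return the integer count whose share of `total` is closest to `pct`."""
    return round(pct / 100 * total)


estimated = {model: estimate_correct(p) for model, p in reported_pct.items()}

for model, k in estimated.items():
    # Print the estimated count next to the percentage it implies.
    print(f"{model}: ~{k}/{TOTAL_QUESTIONS} = {k / TOTAL_QUESTIONS:.1%}")
```

Each estimated count rounds back to the reported percentage, which suggests the estimates are at least consistent with the abstract. The per-category split ("oral diagnosis" versus "radiology") cannot be recovered this way.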

Keywords: Artificial intelligence, large language models, ChatGPT, Microsoft Copilot, Gemini, multiple-choice questions

Assessment of the Performance of Large Language Models ChatGPT, ChatGPT Plus, Gemini, and Microsoft Copilot in Responding to Oral and Maxillofacial Radiology Questions from the Turkish Dentistry Specialty Entrance Exam

Ramazan Berkay Peker
Department of Dentomaxillofacial Radiology, Trakya University, Edirne, Türkiye

INTRODUCTION: The aim of this study was to evaluate and compare the performance of four large language models (LLMs), one paid (ChatGPT-4o) and three free (ChatGPT-4o mini, Gemini, Microsoft Copilot), in answering Oral, Dental and Maxillofacial Radiology questions asked in the Dental Specialty Training Entrance Examination between 2012 and 2021.
METHODS: The 123 questions asked in the Dental Specialty Training Entrance Examination between 2012 and 2021 were divided into "oral diagnosis" and "radiology" categories and posed to the four LLMs. The data obtained according to the accuracy of the answers given by the LLMs were analyzed using the Pearson chi-square test, Fisher's exact test with Monte Carlo correction, and the z-test with Bonferroni correction (p<0.05).
RESULTS: The correct answer rate for all questions was 74% for ChatGPT-4o mini, 91.1% for ChatGPT-4o, 69.9% for Gemini, and 86.2% for Microsoft Copilot. A statistically significant association was found among the LLMs only in the answers given to the radiology category (p<0.001).
DISCUSSION AND CONCLUSION: Among the four LLMs, ChatGPT-4o, a paid LLM, and Microsoft Copilot, a free LLM, performed statistically similarly to each other and answered more than 80% of the questions correctly. An analysis of previous studies shows that LLMs are developing rapidly. LLMs will be able to play an effective role in dental education in the future.
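
To make the statistical approach named in the methods more concrete, the following sketch runs a Pearson chi-square test across the four models and then pairwise two-proportion z-tests with a Bonferroni correction. It rests on stated assumptions: the correct-answer counts are back-calculated here from the reported percentages (91, 112, 86, and 106 of 123) and do not appear in the abstract, only overall accuracy is used because per-category counts are not given, and the Monte Carlo Fisher's exact test is omitted. The function names are hypothetical, and the study's actual analysis may have differed in its details.

```python
import math
from itertools import combinations

N = 123  # questions asked to each model (stated in the abstract)

# Assumed counts, back-calculated from the reported percentages.
correct = {
    "ChatGPT-4o mini": 91,
    "ChatGPT-4o": 112,
    "Gemini": 86,
    "Microsoft Copilot": 106,
}


def chi_square_models(counts, n):
    """Pearson chi-square on a (4 models x correct/incorrect) table, df = 3."""
    pooled = sum(counts.values()) / (n * len(counts))
    exp_right, exp_wrong = n * pooled, n * (1 - pooled)
    stat = sum(
        (c - exp_right) ** 2 / exp_right + ((n - c) - exp_wrong) ** 2 / exp_wrong
        for c in counts.values()
    )
    # Closed-form upper-tail probability of a chi-square variable with 3 df.
    p = math.erfc(math.sqrt(stat / 2)) + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2)
    return stat, p


def two_proportion_z(c1, c2, n):
    """Two-sided z-test for two proportions with equal sample size n."""
    pooled = (c1 + c2) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = (c1 - c2) / n / se
    return z, math.erfc(abs(z) / math.sqrt(2))


stat, p_overall = chi_square_models(correct, N)
print(f"chi-square = {stat:.2f}, p = {p_overall:.2g}")

pairs = list(combinations(correct, 2))
adjusted = {}
for a, b in pairs:
    _, p = two_proportion_z(correct[a], correct[b], N)
    # Bonferroni: multiply each p-value by the number of comparisons, cap at 1.
    adjusted[(a, b)] = min(1.0, p * len(pairs))
    print(f"{a} vs {b}: adjusted p = {adjusted[(a, b)]:.3g}")
```

Under these assumed counts, the pairwise comparison between ChatGPT-4o and Microsoft Copilot is not significant after correction, which agrees with the abstract's statement that the two performed similarly.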

Keywords: Artificial intelligence, large language models, ChatGPT, Microsoft Copilot, Gemini, multiple-choice questions

Corresponding Author: Ramazan Berkay Peker, Türkiye
Article Language: Turkish
