1. Performance of ChatGPT 3.5 and 4 on U.S. dental examinations: the INBDE, ADAT, and DAT
Mahmood DASHTI ; Shohreh GHASEMI ; Niloofar GHADIMI ; Delband HEFZI ; Azizeh KARIMIAN ; Niusha ZARE ; Amir FAHIMIPOUR ; Zohaib KHURSHID ; Maryam Mohammadalizadeh CHAFJIRI ; Sahar GHAEDSHARAF
Imaging Science in Dentistry 2024;54(3):271-275
Purpose:
Recent advancements in artificial intelligence (AI), particularly tools such as ChatGPT, developed by the U.S.-based AI research organization OpenAI, have transformed the healthcare and education sectors. This study investigated the effectiveness of ChatGPT in answering dentistry exam questions, demonstrating its potential to enhance professional practice and patient care.
Materials and Methods:
This study assessed the performance of ChatGPT 3.5 and 4 on U.S. dental examinations, specifically the Integrated National Board Dental Examination (INBDE), Dental Admission Test (DAT), and Advanced Dental Admission Test (ADAT), excluding image-based questions. ChatGPT's answers, generated using customized prompts, were evaluated against official answer sheets.
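The abstract does not specify the grading pipeline, so the sketch below is only a rough illustration of the protocol it describes: submit each text-based item to the model and compare the returned letter with the official key. It assumes the OpenAI Python client; the model identifiers, prompt wording, and item format are hypothetical placeholders, not the authors' actual setup.

```python
# Hypothetical sketch of the described protocol: prompt the model with each
# multiple-choice item and grade the reply against the official answer key.
# Requires the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def ask(model: str, question: str, choices: str) -> str:
    """Return the model's reply to one exam item."""
    prompt = (
        "Answer this dental exam question with the single letter of the "
        f"best choice.\n\n{question}\n{choices}"
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output makes grading reproducible
    )
    return resp.choices[0].message.content.strip()

def accuracy(model: str, items: list[dict]) -> float:
    """Fraction of items whose first returned letter matches the key."""
    correct = sum(
        ask(model, it["question"], it["choices"])[:1].upper() == it["key"]
        for it in items
    )
    return correct / len(items)
```

In practice the graded replies would be tallied per question category (knowledge-based, case history, and so on), as the results below report.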
Results:
ChatGPT 3.5 and 4 were tested with 253 questions from the INBDE, ADAT, and DAT exams. For the INBDE, both versions achieved 80% accuracy in knowledge-based questions and 66-69% in case history questions. On the ADAT, they scored 66-83% in knowledge-based questions and 76% in case history questions. ChatGPT 4 excelled on the DAT, with 94% accuracy in knowledge-based questions, 57% in mathematical analysis items, and 100% in comprehension questions, surpassing ChatGPT 3.5's rates of 83%, 31%, and 82%, respectively. The difference was statistically significant for knowledge-based questions (P = 0.009). Both versions showed similar patterns in their incorrect responses.
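The abstract does not name the significance test behind P = 0.009. One standard choice for comparing two models' correct/incorrect counts on the same item set is Fisher's exact test on a 2x2 table, sketched below with SciPy; the counts are illustrative placeholders, not the study's data, and the test itself is an assumption about the method.

```python
# Compare two models' correct/incorrect counts with Fisher's exact test.
# The counts are illustrative placeholders, not the study's data.
from scipy.stats import fisher_exact

table = [
    [47, 3],   # ChatGPT 4: correct, incorrect (placeholder counts)
    [41, 9],   # ChatGPT 3.5: correct, incorrect (placeholder counts)
]
odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, P = {p_value:.3f}")
```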
Conclusion:
Both ChatGPT 3.5 and 4 effectively handled knowledge-based, case history, and comprehension questions, with ChatGPT 4 proving more reliable and outperforming version 3.5. ChatGPT 4's perfect score on comprehension questions underscores its trainability in specific subjects. However, both versions exhibited weaker performance in mathematical analysis, marking this as an area for improvement.
2. Evaluation of deep learning and convolutional neural network algorithms for mandibular fracture detection using radiographic images: A systematic review and meta-analysis
Mahmood DASHTI ; Sahar GHAEDSHARAF ; Shohreh GHASEMI ; Niusha ZARE ; Elena-Florentina CONSTANTIN ; Amir FAHIMIPOUR ; Neda TAJBAKHSH ; Niloofar GHADIMI
Imaging Science in Dentistry 2024;54(3):232-239
Purpose:
The use of artificial intelligence (AI) and deep learning algorithms in dentistry, especially for processing radiographic images, has markedly increased. However, detailed information remains limited regarding the accuracy of these algorithms in detecting mandibular fractures.
Materials and Methods:
This meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Keywords addressing the accuracy of AI algorithms in detecting mandibular fractures on radiographic images were generated, and the PubMed/Medline, Scopus, Embase, and Web of Science databases were then searched. The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool was employed to evaluate potential bias in the selected studies. A pooled analysis of the relevant parameters was conducted in STATA version 17 (StataCorp, College Station, TX, USA) using the metandi command.
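metandi fits a bivariate random-effects model of logit-transformed sensitivity and specificity, which is more than a few lines of code; the sketch below, with hypothetical 2x2 counts, shows only the study-level quantities that feed such a model. It is a simplified stand-in, not a reimplementation of metandi.

```python
# Study-level diagnostic accuracy from 2x2 counts (hypothetical numbers).
# metandi pools these on the logit scale via a bivariate random-effects
# model; this sketch computes only the per-study inputs.
import math

studies = [  # (true pos, false pos, false neg, true neg)
    (45, 8, 3, 102),
    (88, 14, 6, 150),
    (30, 5, 2, 60),
]

for tp, fp, fn, tn in studies:
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)  # diagnostic odds ratio
    print(f"sens={sens:.3f} (logit {math.log(sens / (1 - sens)):+.2f}), "
          f"spec={spec:.3f}, DOR={dor:.1f}")
```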
Results:
Of the 49 studies reviewed, 5 met the inclusion criteria. All of the selected studies utilized convolutional neural network algorithms, albeit with varying backbone structures, and all evaluated panoramic radiography images. The pooled analysis yielded a sensitivity of 0.971 (95% confidence interval [CI]: 0.881-0.949), a specificity of 0.813 (95% CI: 0.797-0.824), and a diagnostic odds ratio of 7.109 (95% CI: 5.27-8.913).
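As a definitional aside (not drawn from the paper), the study-level quantities being pooled are

$$\mathrm{sensitivity}=\frac{TP}{TP+FN},\qquad \mathrm{specificity}=\frac{TN}{TN+FP},\qquad \mathrm{DOR}=\frac{TP\cdot TN}{FP\cdot FN},$$

with the pooled values above estimated jointly by the bivariate random-effects model rather than by plugging the pooled sensitivity and specificity into these ratios.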
Conclusion:
This review suggests that deep learning algorithms show potential for detecting mandibular fractures on panoramic radiography images. However, their effectiveness is currently limited by the small size and narrow scope of available datasets. Further research with larger and more diverse datasets is crucial to verify the accuracy of these tools in practical dental settings.