1. Trends in the Charges and Utilization of Computer-Assisted Navigation in Cervical and Thoracolumbar Spinal Surgery
Calista L. DOMINY ; Justin E. TANG ; Varun ARVIND ; Brian H. CHO ; Stephen SELVERIAN ; Kush C. SHAH ; Jun S. KIM ; Samuel K. CHO
Asian Spine Journal 2022;16(5):625-633
Methods:
Relevant data from the Nationwide Readmissions Database for 2015–2018 were analyzed, and computer-assisted cervical and thoracolumbar spinal surgery procedures were identified using International Classification of Diseases, 9th and 10th revision (ICD-9/ICD-10) codes. Patient demographics, surgical data, readmissions, and total charges were examined. Comorbidity burden was calculated using the Charlson and Elixhauser comorbidity indices. Complication rates were determined from diagnosis codes.
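As a rough illustration of this kind of claims-based cohort identification, the minimal sketch below filters a table of discharge records by procedure code. The column naming scheme and the code list are illustrative assumptions; the abstract does not report the exact codes used.

    import pandas as pd

    # Hypothetical ICD-10-PCS codes standing in for computer-assisted
    # navigation procedures; the study's actual code list is not given
    # in the abstract.
    NAVIGATION_CODES = {"8E09XBZ", "8E0WXBZ"}

    def find_navigation_cases(discharges: pd.DataFrame) -> pd.DataFrame:
        """Return discharges with any navigation procedure code,
        assuming HCUP-style columns proc_code_1 ... proc_code_15."""
        proc_cols = [c for c in discharges.columns if c.startswith("proc_code_")]
        mask = discharges[proc_cols].isin(NAVIGATION_CODES).any(axis=1)
        return discharges[mask]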
Results:
A total of 48,116 cervical cases and 27,093 thoracolumbar cases were identified as using computer-assisted navigation. No major differences in sex, age, or comorbidities over time were found. The utilization of computer-assisted navigation in cervical and thoracolumbar spinal fusion cases, normalized to each year's total case volume, increased from 2015 to 2018 (Pearson correlation coefficient=0.756, p=0.049; Pearson correlation coefficient=0.9895, p=0.010, respectively). Total charges for cervical and thoracolumbar cases trended upward over time, although neither trend reached statistical significance (Pearson correlation coefficient=0.758, p=0.242; Pearson correlation coefficient=0.766, p=0.234).
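For readers unfamiliar with the statistic being reported, a Pearson correlation over four yearly data points can be computed as in the sketch below; the utilization values are invented placeholders, not the study's data.

    from scipy.stats import pearsonr

    years = [2015, 2016, 2017, 2018]
    # Placeholder values for navigation utilization as a fraction of
    # each year's total fusion cases (not the study's data).
    utilization = [0.021, 0.025, 0.031, 0.038]

    r, p = pearsonr(years, utilization)
    print(f"Pearson r = {r:.3f}, p = {p:.3f}")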
Conclusions:
The use of computer-assisted navigation in spinal surgery increased significantly from 2015 to 2018. The average cost also rose over this period and was higher than the average cost of non-navigated spinal surgery. As computer-assisted navigation becomes more widely used and standardized in spinal surgery, the cost of care may increase for a growing number of patients. Further studies should therefore determine whether computer-assisted navigation is cost-effective and improves care.
2. Performance of a Large Language Model in the Generation of Clinical Guidelines for Antibiotic Prophylaxis in Spine Surgery
Bashar ZAIDAT ; Nancy SHRESTHA ; Ashley M. ROSENBERG ; Wasil AHMED ; Rami RAJJOUB ; Timothy HOANG ; Mateo Restrepo MEJIA ; Akiro H. DUEY ; Justin E. TANG ; Jun S. KIM ; Samuel K. CHO
Neurospine 2024;21(1):128-146
Objective:
Large language models, such as the chat generative pre-trained transformer (ChatGPT), have great potential for streamlining medical processes and assisting physicians in clinical decision-making. This study aimed to assess the potential of two ChatGPT models (GPT-3.5 and GPT-4.0) to support clinical decision-making by comparing their responses on antibiotic prophylaxis in spine surgery with accepted clinical guidelines.
Methods:
ChatGPT models were prompted with questions from the North American Spine Society (NASS) Evidence-Based Clinical Guidelines for Multidisciplinary Spine Care for Antibiotic Prophylaxis in Spine Surgery (2013). Their responses were then compared with the guideline recommendations and assessed for accuracy.
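The abstract does not describe the prompting setup; the sketch below shows one plausible way to pose a guideline question to both models through the OpenAI chat API and collect answers for manual grading. The model identifiers and the example question wording are assumptions for illustration only.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # One illustrative guideline-style question; the study drew its 16
    # questions from the 2013 NASS antibiotic-prophylaxis guideline.
    QUESTION = (
        "What is the evidence that preoperative antibiotic prophylaxis "
        "reduces surgical site infection after spine surgery?"
    )

    def ask(model: str, question: str) -> str:
        """Send one question to a chat model and return its answer text."""
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
        )
        return response.choices[0].message.content

    # Collect both models' answers for side-by-side grading against
    # the NASS recommendations.
    for model in ("gpt-3.5-turbo", "gpt-4"):
        print(f"--- {model} ---")
        print(ask(model, QUESTION))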
Results:
Of the 16 NASS guideline questions concerning antibiotic prophylaxis, ChatGPT's GPT-3.5 model answered 10 (62.5%) accurately, and GPT-4.0 answered 13 (81.3%) accurately. Twenty-five percent of GPT-3.5 answers were deemed overly confident, while 62.5% of GPT-4.0 answers directly cited the NASS guideline as evidence for their responses.
Conclusion:
ChatGPT demonstrated an impressive ability to answer clinical questions accurately. The GPT-3.5 model's performance was limited by its tendency to give overly confident responses and its failure to identify the most significant elements in its responses. The GPT-4.0 model's responses were more accurate and frequently cited the NASS guideline as direct evidence. While GPT-4.0 is still far from perfect, it showed a markedly better ability than GPT-3.5 to extract the most relevant available research. Thus, although ChatGPT has shown far-reaching potential, scrutiny should still be exercised regarding its clinical use at this time.