1. Incidence of incisional hernia following liver surgery for colorectal liver metastases. Does the laparoscopic approach reduce the risk? A comparative study
Ahmed HASSAN ; Kalaiyarasi ARUJUNAN ; Ali MOHAMED ; Vickey KATHERIA ; Kevin ASHTON ; Rami AHMED ; Daren SUBAR
Annals of Hepato-Biliary-Pancreatic Surgery 2024;28(2):155-160
Background/Aims:
No previous reports have compared the incidence of incisional hernia (IH) between laparoscopic and open resections for colorectal liver metastases (CRLM). This is the first comparative study.
Methods:
Single-center retrospective review of patients who underwent CRLM surgery between January 2011 and December 2018. IH related to liver surgery was confirmed by computed tomography. Patients were divided into laparoscopic liver resection (LLR) and open liver resection (OLR) groups. Data collected included age, sex, presence of diabetes mellitus, steroid intake, history of previous hernia or liver resection, subcutaneous and peri-renal fat thickness, preoperative creatinine and albumin, American Society of Anesthesiologists (ASA) score, major liver resection, surgical site infection, synchronous presentation, and preoperative chemotherapy.
Results:
Two hundred and forty-seven patients were included, with a mean follow-up period of 41 ± 29 months (mean ± standard deviation). Eighty-seven patients (35%) underwent LLR and 160 (65%) underwent OLR. The incidence of IH did not differ significantly between LLR and OLR at 1 year (10% vs. 10%) or at 3 years (19% vs. 19%) (p = 0.95). On multivariate analysis, previous hernia history (hazard ratio [HR], 2.22; 95% confidence interval [CI], 1.56–4.86) and subcutaneous fat thickness (HR, 2.22; 95% CI, 1.19–4.13) were independent risk factors. Length of hospital stay was shorter in LLR than in OLR (6 ± 4 days vs. 10 ± 8 days, p < 0.001).
Conclusions
In CRLM resection, no difference in the incidence of IH between LLR and OLR was found. Previous hernia and subcutaneous fat thickness were independent risk factors. Further studies are needed to assess modifiable risk factors for the development of IH after LLR.
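The hazard ratios above come from a multivariate time-to-event analysis. As a rough illustration of how such estimates are typically produced, the sketch below fits a Cox proportional hazards model with the lifelines Python library; the data frame, column names, and values are entirely hypothetical stand-ins, not the study cohort.

```python
# Minimal Cox proportional hazards sketch (hypothetical data, not the study's).
# Each covariate's exp(coef) in the summary is its hazard ratio with a 95% CI.
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical cohort: follow-up in months, IH event flag, candidate risk factors.
df = pd.DataFrame({
    "months_followup": [12, 36, 41, 8, 60, 24, 18, 48, 30, 55],
    "ih_event":        [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    "previous_hernia": [1, 0, 0, 1, 1, 0, 0, 0, 1, 1],
    "subcut_fat_high": [1, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    "laparoscopic":    [0, 1, 1, 0, 0, 1, 1, 1, 0, 0],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="months_followup", event_col="ih_event")
# exp(coef) is the hazard ratio per covariate, with its 95% confidence bounds.
print(cph.summary[["exp(coef)", "exp(coef) lower 95%", "exp(coef) upper 95%"]])
```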
2. Serum Branched Chain Amino Acids Are Associated with Type 2 Diabetes Mellitus in Jordan.
Mahmoud A ALFAQIH ; Zaina ABU-KHDAIR ; Rami SAADEH ; Nesreen SAADEH ; Ahmed AL-DWAIRI ; Othman AL-SHBOUL
Korean Journal of Family Medicine 2018;39(5):313-317
BACKGROUND:
Diabetes mellitus is a global public health problem caused by a lack of insulin secretion (type 1) or resistance to its action (type 2). A low insulin-to-glucagon ratio predicts an increase in the serum levels of branched chain amino acids, a feature confirmed in several populations. This relationship has not been assessed in Jordan. The objective of this study was to investigate the association between serum branched chain amino acids and type 2 diabetes mellitus in patients in Jordan.
METHODS:
Two hundred type 2 diabetes mellitus patients and an additional 200 non-diabetic controls were recruited. Age, body mass index, and waist circumference of the subjects were recorded. Branched chain amino acid, total cholesterol, and triglyceride levels were measured from the collected serum samples.
RESULTS:
Serum branched chain amino acid levels were significantly higher in type 2 diabetes mellitus patients than in non-diabetic individuals (P < 0.0001). In binomial regression analysis, serum branched chain amino acid levels remained significantly associated with diabetes mellitus and increased its risk (odds ratio, 1.004; 95% confidence interval, 1.001–1.006; P = 0.003).
CONCLUSION:
Type 2 diabetes mellitus is associated with higher branched chain amino acid levels in Jordan, independent of age, sex, body mass index, waist circumference, and total serum cholesterol and serum triglyceride levels.
Keywords: Amino Acids*; Body Mass Index; Cholesterol; Diabetes Mellitus; Diabetes Mellitus, Type 2*; Humans; Insulin; Jordan*; Public Health; Triglycerides; Waist Circumference
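For readers unfamiliar with how an odds ratio such as 1.004 per unit of a predictor arises, the following is a minimal sketch of a binomial (logistic) regression in Python using statsmodels: the OR is exp(coefficient), with its CI from exponentiating the coefficient's confidence bounds. The simulated data and coefficients are illustrative assumptions, not the Jordanian cohort.

```python
# Minimal logistic-regression sketch with fabricated data (not the study's).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400
bcaa = rng.normal(500, 80, n)   # hypothetical serum BCAA levels
bmi = rng.normal(28, 4, n)      # hypothetical body mass index

# Hypothetical outcome: diabetes probability rises with BCAA, adjusted for BMI.
logit_p = -9 + 0.004 * bcaa + 0.2 * bmi
y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(np.column_stack([bcaa, bmi]))
fit = sm.Logit(y, X).fit(disp=0)

or_bcaa = np.exp(fit.params[1])          # odds ratio per 1-unit BCAA increase
ci_low, ci_high = np.exp(fit.conf_int()[1])  # 95% CI for that odds ratio
print(f"OR per unit BCAA: {or_bcaa:.4f} (95% CI {ci_low:.4f}-{ci_high:.4f})")
```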
3. Performance of a Large Language Model in the Generation of Clinical Guidelines for Antibiotic Prophylaxis in Spine Surgery
Bashar ZAIDAT ; Nancy SHRESTHA ; Ashley M. ROSENBERG ; Wasil AHMED ; Rami RAJJOUB ; Timothy HOANG ; Mateo Restrepo MEJIA ; Akiro H. DUEY ; Justin E. TANG ; Jun S. KIM ; Samuel K. CHO
Neurospine 2024;21(1):128-146
Objective:
Large language models, such as chat generative pre-trained transformer (ChatGPT), have great potential for streamlining medical processes and assisting physicians in clinical decision-making. This study aimed to assess the potential of 2 ChatGPT models (GPT-3.5 and GPT-4.0) to support clinical decision-making by comparing their responses for antibiotic prophylaxis in spine surgery with accepted clinical guidelines.
Methods:
ChatGPT models were prompted with questions from the North American Spine Society (NASS) Evidence-based Clinical Guidelines for Multidisciplinary Spine Care for Antibiotic Prophylaxis in Spine Surgery (2013). The models' responses were then compared with the guideline recommendations and assessed for accuracy.
Results:
Of the 16 NASS guideline questions concerning antibiotic prophylaxis, ChatGPT's GPT-3.5 model answered 10 (62.5%) accurately and GPT-4.0 answered 13 (81%) accurately. Twenty-five percent of GPT-3.5 answers were deemed overly confident, while 62.5% of GPT-4.0 answers directly cited the NASS guideline as evidence for their responses.
Conclusion
ChatGPT demonstrated an impressive ability to answer clinical questions accurately. The GPT-3.5 model's performance was limited by its tendency to give overly confident responses and its inability to identify the most significant elements in its responses. The GPT-4.0 model's responses were more accurate and frequently cited the NASS guideline as direct evidence. While GPT-4.0 is still far from perfect, it showed a markedly better ability than GPT-3.5 to extract the most relevant available research. Thus, while ChatGPT has shown far-reaching potential, scrutiny should still be exercised regarding its clinical use at this time.
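The accuracy figures above are simple proportions over the 16 guideline questions (10/16 = 62.5%; 13/16 ≈ 81%). A minimal sketch of that tally, with hypothetical pass/fail grades standing in for the study's actual grading sheet:

```python
# Hypothetical per-question grades: (GPT-3.5 accurate?, GPT-4.0 accurate?).
# Arranged so the counts match those reported: 10 and 13 of 16 questions.
grades = {f"Q{i}": (i < 10, i < 13) for i in range(16)}

n = len(grades)
acc_35 = sum(g35 for g35, _ in grades.values()) / n
acc_40 = sum(g40 for _, g40 in grades.values()) / n
print(f"GPT-3.5: {acc_35:.1%} ({round(acc_35 * n)}/{n})")  # 62.5% (10/16)
print(f"GPT-4.0: {acc_40:.1%} ({round(acc_40 * n)}/{n})")  # 81.2% (13/16)
```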
4. Use of ChatGPT for Determining Clinical and Surgical Treatment of Lumbar Disc Herniation With Radiculopathy: A North American Spine Society Guideline Comparison
Mateo Restrepo MEJIA ; Juan Sebastian ARROYAVE ; Michael SATURNO ; Laura Chelsea Mazudie NDJONKO ; Bashar ZAIDAT ; Rami RAJJOUB ; Wasil AHMED ; Ivan ZAPOLSKY ; Samuel K. CHO
Neurospine 2024;21(1):149-158
Objective:
Large language models like chat generative pre-trained transformer (ChatGPT) have found success in various sectors, but their application in the medical field remains limited. This study aimed to assess the feasibility of using ChatGPT to provide accurate medical information to patients, specifically evaluating how well ChatGPT versions 3.5 and 4 aligned with the 2012 North American Spine Society (NASS) guidelines for lumbar disc herniation with radiculopathy.
Methods:
ChatGPT's responses to questions based on the NASS guidelines were analyzed for accuracy. Three new categories—overconclusiveness, supplementary information, and incompleteness—were introduced to deepen the analysis. Overconclusiveness referred to recommendations not mentioned in the NASS guidelines, supplementary information denoted additional relevant details, and incompleteness indicated omitted crucial information from the NASS guidelines.
Results:
Of the 29 clinical guidelines evaluated, ChatGPT-3.5 responded accurately to 15 (52%) and ChatGPT-4 to 17 (59%). ChatGPT-3.5 was overconclusive in 14 responses (48%) and ChatGPT-4 in 13 (45%). ChatGPT-3.5 provided supplementary information in 24 responses (83%) and ChatGPT-4 in 27 (93%). ChatGPT-3.5 was incomplete in 11 responses (38%) and ChatGPT-4 in 8 (23%).
Conclusion
ChatGPT shows promise for clinical decision-making, but both patients and healthcare providers should exercise caution to ensure safety and quality of care. While these results are encouraging, further research is necessary to validate the use of large language models in clinical settings.
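The four grading categories reduce to per-response boolean flags tallied over the 29 guideline statements. A minimal sketch of that bookkeeping, using hypothetical flags arranged to reproduce the GPT-3.5 counts reported above (15, 14, 24, and 11 of 29):

```python
# Hypothetical grading records; the real study's per-response labels are not public.
from dataclasses import dataclass

@dataclass
class GradedResponse:
    accurate: bool
    overconclusive: bool
    supplementary: bool
    incomplete: bool

def rates(responses):
    """Fraction of responses carrying each flag."""
    n = len(responses)
    return {
        "accurate": sum(r.accurate for r in responses) / n,
        "overconclusive": sum(r.overconclusive for r in responses) / n,
        "supplementary": sum(r.supplementary for r in responses) / n,
        "incomplete": sum(r.incomplete for r in responses) / n,
    }

# Illustrative set reproducing the GPT-3.5 counts: 15, 14, 24, 11 of 29.
gpt35 = [GradedResponse(i < 15, i < 14, i < 24, i < 11) for i in range(29)]
for category, rate in rates(gpt35).items():
    print(f"{category}: {rate:.0%}")  # 52%, 48%, 83%, 38%
```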