1. Performance of a Large Language Model in the Generation of Clinical Guidelines for Antibiotic Prophylaxis in Spine Surgery
Bashar ZAIDAT ; Nancy SHRESTHA ; Ashley M. ROSENBERG ; Wasil AHMED ; Rami RAJJOUB ; Timothy HOANG ; Mateo Restrepo MEJIA ; Akiro H. DUEY ; Justin E. TANG ; Jun S. KIM ; Samuel K. CHO
Neurospine 2024;21(1):128-146
Objective:
Large language models, such as chat generative pre-trained transformer (ChatGPT), have great potential for streamlining medical processes and assisting physicians in clinical decision-making. This study aimed to assess the potential of two ChatGPT models (GPT-3.5 and GPT-4.0) to support clinical decision-making by comparing their responses on antibiotic prophylaxis in spine surgery with accepted clinical guidelines.
Methods:
The ChatGPT models were prompted with questions from the North American Spine Society (NASS) Evidence-based Clinical Guidelines for Multidisciplinary Spine Care for Antibiotic Prophylaxis in Spine Surgery (2013). The models’ responses were then compared with the guideline recommendations and assessed for accuracy.
Results:
Of the 16 NASS guideline questions concerning antibiotic prophylaxis, 10 responses (62.5%) were accurate for GPT-3.5 and 13 (81%) were accurate for GPT-4.0. Twenty-five percent of GPT-3.5 answers were deemed overly confident, while 62.5% of GPT-4.0 answers directly used the NASS guideline as evidence for the response.
Conclusion:
ChatGPT demonstrated an impressive ability to answer clinical questions accurately. The GPT-3.5 model’s performance was limited by its tendency to give overly confident responses and its inability to identify the most significant elements in its responses. The GPT-4.0 model’s responses were more accurate and frequently cited the NASS guideline as direct evidence. While GPT-4.0 is still far from perfect, it showed an exceptional ability to extract the most relevant research available compared with GPT-3.5. Thus, although ChatGPT has shown far-reaching potential, its clinical use still warrants scrutiny at this time.
2. Surgeon Preference Regarding Wound Dressing Management in Lumbar Fusion Surgery: An AO Spine Global Cross-Sectional Study
Luca AMBROSIO ; Gianluca VADALÀ ; Javad TAVAKOLI ; Laura SCARAMUZZO ; Giovanni Barbanti BRODANO ; Stephen J. LEWIS ; So KATO ; Samuel K. CHO ; S. Tim YOON ; Ho-Joong KIM ; Matthew F. GARY ; Vincenzo DENARO
Neurospine 2024;21(1):204-211
Objective:
To evaluate the global practice pattern of wound dressing use after lumbar fusion for degenerative conditions.
Methods:
A survey issued by AO Spine Knowledge Forums Deformity and Degenerative was sent out to AO Spine members. The type of postoperative dressing employed, timing of initial dressing removal, and type of subsequent dressing applied were investigated. Differences in the type of surgery and regional distribution of surgeons’ preferences were analyzed.
Results:
Immediately after surgery, 60.6% of respondents used a dry dressing, 23.2% a plastic occlusive dressing, 5.7% glue, 6% a combination of glue and polyester mesh, 2.6% a wound vacuum, and 1.2% other dressings. The initial dressing was removed on postoperative day 1 (11.6%), 2 (39.2%), 3 (20.3%), 4 (1.7%), 5 (4.3%), 6 (0.4%), 7 or later (12.5%), or depending on drain removal (9.9%). Following initial dressing removal, 75.9% applied a dry dressing, 17.7% a plastic occlusive dressing, and 1.3% glue, while 12.1% used no dressing. The use of no additional coverage after initial dressing removal was significantly associated with a later dressing change (p < 0.001). Significant differences emerged when dressing management was compared among AO Spine regions (p < 0.001).
Conclusion:
Most spine surgeons applied a dry or plastic occlusive dressing immediately after surgery. The first dressing was most often changed within the first 3 postoperative days and replaced with the same type of dressing. While dressing policies tended not to vary according to the type of surgery, regional differences suggest that actual practice may be based on personal experience rather than available evidence.
