Interpretive radiology reports for lung cancer generated by GPT-4 large language model to enhance doctor–patient communication efficiency

Xiongwen YANG; Jian HUANG

VernacularTitle:大语言模型（GPT-4）简化肺癌影像学报告提高医患沟通效率的研究
Author: Xiongwen YANG ¹ ; Jian HUANG ²
Author Information

1. Department of Thoracic Surgery, Guizhou Provincial People’s Hospital, Guiyang, 550002, P. R. China
2. Department of Thoracic Surgery, Jiangxi Provincial Cancer Hospital, Nanchang, 330006, P. R. China
Publication Type: Journal Article
Keywords: GPT-4; large language model; lung cancer; radiology reports; doctor-patient communication; patient education; readability; artificial intelligence-assisted diagnosis
From: Chinese Journal of Clinical Thoracic and Cardiovascular Surgery 2026;33(02):231-240
Country: China
Language: Chinese
Abstract: Objective: To explore the application of the GPT-4 large language model in simplifying lung cancer radiology reports to improve patient comprehension and the efficiency of doctor–patient communication. Methods: A total of 362 radiology reports of patients with non-small cell lung cancer (NSCLC) were collected from two hospitals between September and December 2024. Interpretive radiology reports (IRRs) were generated using GPT-4. Original radiology reports (ORRs) and IRRs were compared through a radiologist consistency evaluation and volunteer-based assessments of reading time, comprehension scores, and simulated communication duration. Results: The average length of ORRs was (459.83±55.76) words, compared with (625.42±41.59) words for IRRs (P<0.001). No significant differences in expert consistency scores were observed between ORRs and IRRs across the dimensions of image interpretation accuracy, report detail completeness, explanatory depth and insight, and clinical practicality. Volunteers (simulated patients) read IRRs in less time than ORRs [(346.88±29.15) s versus (409.01±102.40) s], achieved higher comprehension scores [(7.83±1.04) points versus (5.53±0.94) points], and required shorter doctor–patient communication times [(317.31±57.81) s versus (714.20±56.67) s]; all differences were statistically significant (all P<0.001). Conclusion: GPT-4-generated IRRs significantly improve patient comprehension and shorten communication time while maintaining medical accuracy. These findings suggest a new approach to optimizing radiology report management and enhancing healthcare service quality.