1.Development and validation of an XGBoost-based prediction model for acute liver injury in statin users
Xianglong MENG ; Yuelin YU ; Yexiang SUN ; Peng SHEN ; Zhiqin JIANG ; Yu ZHU ; Yueqi YIN ; Siyan ZHAN ; Shengfeng WANG
Chinese Journal of Pharmacoepidemiology 2025;34(8):867-876
Objective To develop and validate a prediction model that identifies new statin users at high risk of developing acute liver injury (ALI) within 180 days, and to support early clinical intervention. Methods Data were sourced from the Yinzhou Regional Health Information Platform, covering statin initiators aged 18 years and older from January 1, 2010, to October 31, 2021. The dataset was divided into a derivation cohort and a temporal validation cohort based on the time of statin initiation. Predictors were selected using LASSO regression, and the model was constructed using the extreme gradient boosting (XGBoost) algorithm combined with cost-sensitive learning. Model performance was evaluated using Brier scores, Harrell's C-index, and calibration curves. Results A total of 126,440 statin initiators were included, with 90,542 in the derivation cohort and 35,898 in the validation cohort. Within 180 days of initial statin use, 412 (0.33%) patients developed ALI, including 305 (0.34%) in the derivation cohort and 107 (0.30%) in the validation cohort. The final model incorporated 16 predictors, which included demographic characteristics, lifestyle factors, family history, medical history, statin use, and concomitant medication use. The model demonstrated excellent overall performance [Brier score=0.0043, 95%CI (0.0038, 0.0049)], discrimination [Harrell's C-index=0.761, 95%CI (0.725, 0.794)], and calibration in internal validation. In temporal validation, the model also performed well [Brier score=0.0044, 95%CI (0.0036, 0.0052); Harrell's C-index=0.703, 95%CI (0.614, 0.781)]. Conclusion This study developed and validated a prediction model for ALI in statin users, providing clinicians with a reliable tool for individualized risk assessment. This model can help achieve risk stratification and reduce the occurrence of ALI.
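The two headline metrics in this abstract can be illustrated with a minimal sketch. It assumes a binary outcome with no censoring, in which case Harrell's C-index reduces to the share of concordant event/non-event pairs; the abstract does not describe how the authors computed it. The function names and the toy data are hypothetical and are not taken from the study.

```python
from itertools import product


def brier_score(outcomes, risks):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, risks)) / len(outcomes)


def c_statistic(outcomes, risks):
    """Share of (event, non-event) pairs where the event has the higher risk.

    Tied risks count as half a concordant pair. This is the binary-outcome,
    no-censoring simplification of Harrell's C-index (an assumption here).
    """
    events = [p for y, p in zip(outcomes, risks) if y == 1]
    non_events = [p for y, p in zip(outcomes, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            concordant += 1.0
        elif p_event == p_non_event:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical): six patients, two of whom develop the outcome.
toy_outcomes = [0, 0, 0, 1, 0, 1]
toy_risks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.8]

print("Brier score:", brier_score(toy_outcomes, toy_risks))
print("C-statistic:", c_statistic(toy_outcomes, toy_risks))
```

A lower Brier score indicates better overall accuracy of the predicted risks, while a C-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking.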
2.Research progress on big-data-driven analysis strategies for imbalanced data of rare events
Jiangjie ZHOU ; Yutong WANG ; Tian FENG ; Xianglong MENG ; Baosheng LIANG ; Shengfeng WANG
Chinese Journal of Pharmacoepidemiology 2025;34(8):952-961
Rare events arise in many disciplines and include rare adverse reactions to vaccines and drugs, rare clinical diseases, and low-probability clinical outcomes. Such events attract research interest because their occurrence often brings serious consequences that are difficult to quantify. In the context of big data, numerous methods have emerged for rare event data analysis, including sampling-based methods, class weighting, ensemble learning, and deep learning. This article systematically summarizes research progress on current rare event data analysis methods and introduces their basic principles and applicable scenarios. By analyzing the advantages and disadvantages of existing methods, it identifies the challenges of rare event research and explores potential research directions in related fields to provide a reference for researchers.
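As one illustration of the sampling-based family mentioned above, the following is a minimal sketch of random undersampling of the majority class. It is not drawn from the review itself; the function name, the `ratio` parameter, and the toy data are hypothetical.

```python
import random


def random_undersample(rows, labels, ratio=1.0, seed=0):
    """Keep every minority row (label 1) and a random subset of majority rows.

    `ratio` is the number of majority rows kept per minority row
    (hypothetical parameter; 1.0 yields a balanced sample).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(majority), int(round(ratio * len(minority))))
    kept = sorted(minority + rng.sample(majority, n_keep))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (hypothetical): 3 events among 100 records.
toy_labels = [1] * 3 + [0] * 97
toy_rows = list(range(100))

sampled_rows, sampled_labels = random_undersample(toy_rows, toy_labels)
print(len(sampled_rows), "rows kept,", sum(sampled_labels), "of them events")
```

Undersampling discards majority-class information and distorts the event rate, so predicted probabilities from a model trained on the resampled data generally need recalibration before they are read as absolute risks.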
3.Distribution characteristics and long-term change trend of body mass index in Chinese older adults aged 65 years and above
Li QI ; Chen CHEN ; Sirui CHEN ; Zhipei LI ; Sixin LIU ; Jinhui ZHOU ; Jiahao CHEN ; Hao QIAN ; Chun TAN ; Xianglong DAI ; Ziyue ZHU ; Jun WANG ; Xi MENG ; Wenhui SHI ; Yuebin LYU ; Xiaoming SHI
Chinese Journal of Preventive Medicine 2025;59(6):908-915
Objective: To describe the body mass index (BMI) level and long-term trends of Chinese older adults aged 65 and above. Methods: Older adults aged 65 and above from six waves (2002-2018) of the China Longitudinal Healthy Longevity Survey were selected as the study population. Multiple cross-sectional design with six survey waves conducted in 2002, 2005, 2008, 2011, 2014, and 2018 was adopted, enrolling 15 647, 15 358, 15 622, 9 166, 6 302, and 12 417 participants, respectively. Additionally, a total of 13 755 participants were included in the cohort study design. Relevant information was collected through questionnaires and physical examinations. The χ² trend test was used to compare the changes in the rates of underweight and overweight/obesity over the years, and the linear mixed-effects model (LMM) was used to fit trajectory curves of BMI changes with advancing age in older adults. Results: The baseline ages of the participants included in 2002, 2005, 2008, 2011, 2014, and 2018 were (85.16±11.26), (84.23±11.83), (84.99±12.16), (81.10±11.86), (78.89±11.30), and (83.08±12.42) years, respectively, with a relatively high proportion of females and rural residents. In the cohort study, the 13 755 participants had a median (Q1, Q3) follow-up time of 6.5 (5.2, 10.0) years, with a cumulative follow-up duration of 109 041 person-years. In each wave, males had higher BMI than females, urban residents had higher BMI than rural residents, and BMI gradually decreased with increasing age (all P<0.001). The mean BMI of older adults in China increased from (19.37±3.80) kg/m² in 2002 to (22.04±4.01) kg/m² in 2018 (P<0.001). Across all survey years, the prevalence of underweight was consistently higher in women than in men and in rural areas than in urban areas, with an upward trend as age increased (all P<0.001). In 2018, the underweight rates in the 65-79, 80-89, 90-99, and ≥100-year-old age groups were 8.0%, 16.7%, 26.2%, and 35.5%, respectively.
Meanwhile, the prevalence of overweight/obesity was higher in men than in women and in urban areas than in rural areas, showing a declining trend with advancing age (all P<0.001). The prevalence of underweight among the older adults decreased significantly from 45.2% in 2002 to 18.9% in 2018 (P<0.001), while the prevalence of overweight/obesity increased from 11.0% in 1998 to 29.6% in 2018 (P<0.001). The trajectory curves fitted by the LMM showed that individuals born in later decades had higher BMI levels at the same age compared to earlier cohorts. Conclusion: From 2002 to 2018, the BMI level among Chinese older adults showed an increasing trend. The prevalence of underweight showed a declining trend, while the rates of obesity and overweight increased. However, the underweight rate remained notably high among the oldest old.
4.Development and application of a rapid identification algorithm for cutaneous lupus erythematosus and its subtypes based on medical insurance databases
Yutong WANG ; Xianglong MENG ; Yu PAN ; Chen WEI ; Hui JIN ; Shengfeng WANG
Chinese Journal of Pharmacoepidemiology 2025;34(7):743-752
Objective To develop and validate data extraction and patient identification algorithms for cutaneous lupus erythematosus (CLE) and its two subtypes, discoid lupus erythematosus (DLE) and subacute cutaneous lupus erythematosus (SCLE), and to enable high-efficiency patient identification in large-scale electronic health databases. Methods This study utilized data from the 2013-2017 National Insurance Claims for Epidemiological Research (NICER) to construct data extraction and rapid patient identification algorithms. The manual verification results were used as the gold standard to assess the sensitivity and specificity of the algorithms. Additionally, the basic characteristics of the identified patients were analyzed. Results Initially, standardized expressions were developed based on medical terminology and diagnostic codes. These were further refined with input from clinicians to include potential synonyms and common misspellings, improving the preliminary screening expressions. Through iterative verification by clinicians and data management engineers, a final disease-specific screening algorithm was established. The developed extraction and identification algorithms for all three target diseases demonstrated strong performance, with sensitivity values of 0.985, 1.000, and 0.991, and specificity values of 0.997, 0.999, and 0.998 for CLE, DLE, and SCLE, respectively. A total of 34,554 CLE cases, including 2,879 DLE cases and 623 SCLE cases, were identified between 2013 and 2017, with a higher prevalence among females than males. Conclusion This study developed and validated an identification algorithm for CLE patients based on medical insurance databases, demonstrating high performance. The proposed algorithm provides a methodological framework and empirical evidence for designing and optimizing big data-driven rapid patient identification algorithms in dermatology research.
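The validation step described above, comparing algorithm output with manually verified labels, can be sketched as follows. The screening expression is purely hypothetical (one English term, one abbreviation, and the ICD-10 code L93.0 for discoid lupus erythematosus); the study's actual expressions are not given in the abstract, and the toy records are invented for illustration.

```python
import re

# Hypothetical screening expression for DLE, not the study's actual algorithm.
DLE_PATTERN = re.compile(r"discoid\s+lupus|\bDLE\b|\bL93\.0\b", re.IGNORECASE)


def screen(diagnosis_text):
    """Return True when the free-text diagnosis matches the screening expression."""
    return bool(DLE_PATTERN.search(diagnosis_text))


def sensitivity_specificity(flagged, gold):
    """Compare algorithm flags with gold-standard labels (both lists of bool)."""
    tp = sum(1 for f, g in zip(flagged, gold) if f and g)
    fn = sum(1 for f, g in zip(flagged, gold) if not f and g)
    tn = sum(1 for f, g in zip(flagged, gold) if not f and not g)
    fp = sum(1 for f, g in zip(flagged, gold) if f and not g)
    return tp / (tp + fn), tn / (tn + fp)


# Toy records (hypothetical): diagnosis text and manually verified DLE status.
toy_records = [
    ("Discoid lupus erythematosus", True),
    ("DLE, scalp", True),
    ("L93.0", True),
    ("chronic discoid LE", True),        # missed: wording not covered
    ("systemic lupus erythematosus", False),
    ("psoriasis", False),
    ("eczema", False),
    ("rule out discoid lupus", False),   # false positive: negated mention
]

toy_flagged = [screen(text) for text, _ in toy_records]
toy_gold = [is_case for _, is_case in toy_records]
sens, spec = sensitivity_specificity(toy_flagged, toy_gold)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

The missed and falsely flagged toy records show why iterative refinement with clinicians, as the abstract describes, matters: synonyms raise sensitivity, and handling of negated or tentative mentions protects specificity.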
5.Current approaches and challenges in addressing class imbalance in medical prediction models
Xianglong MENG ; Yutong WANG ; Xin ZHANG ; Siyan ZHAN ; Shengfeng WANG
Chinese Journal of Epidemiology 2025;46(9):1632-1639
With the rise of personalized medicine and the rapid development of big data technology, medical prediction models have become increasingly important in disease diagnosis, prognosis assessment, and risk stratification. However, class imbalance is a common problem in medical data. It can bias models toward the majority class at the expense of the minority class, reducing their detection power and clinical application value. This paper systematically summarizes traditional methods for addressing class imbalance, including data pre-processing and algorithm-level strategies, and introduces applications of newer technologies such as generative adversarial networks and transfer learning. It also suggests key considerations and potential research focuses for addressing class imbalance, to help researchers select appropriate strategies.
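One algorithm-level strategy of the kind summarized above is cost-sensitive weighting of the loss function. The sketch below uses inverse-frequency ("balanced") class weights inside a weighted log loss. It is an illustrative assumption rather than a method taken from the paper, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: w_c = n / (k * n_c) for each class c."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Weighted average of the per-sample binary log loss."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Toy data (hypothetical): one event in ten, every patient predicted at 10% risk.
toy_y = [0] * 9 + [1]
toy_p = [0.1] * 10

toy_weights = balanced_class_weights(toy_y)
plain = weighted_log_loss(toy_y, toy_p, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(toy_y, toy_p, toy_weights)
print("class weights:", toy_weights)
print("unweighted loss:", plain, "weighted loss:", weighted)
```

With these weights each class contributes the same total weight to the loss, so an error on the single event costs as much as errors spread across all nine non-events. The trade-off is that the resulting predicted probabilities are shifted upward and may need recalibration.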
6.Beverage Interventions in Metabolic Dysfunction-associated Steatotic Liver Disease
Jiawen WEI ; Meng XIA ; Yujun CHEN ; Yong YANG ; Ying ZHANG ; Jiangyin ZHANG ; Kuikui CHEN ; Xianglong QIU
Journal of Kunming Medical University 2025;46(10):145-155
Metabolic dysfunction-associated steatotic liver disease (MASLD) has become the most prevalent chronic liver disease worldwide, and China is facing a severe challenge from a rapidly increasing MASLD burden. Beverages, as an important modifiable factor, have become a research focus for primary prevention and lifestyle management of MASLD. This article reviews beverage consumption trends, provides an in-depth analysis of the mechanisms and health effects of sugar-sweetened beverages, alcoholic drinks, coffee, and tea on MASLD, summarizes their potential pathogenic and protective pathways, and explores comprehensive strategies including beverage intervention, lifestyle coordination, functional beverage development, psychological and behavioral mechanism regulation, and targeted population prevention. The aim is to provide a theoretical basis and practical guidance for the localized and precise prevention and control of MASLD.
7.Consensus on informed consent for orthodontic treatment
Yang CAO ; Bing FANG ; Zuolin JIN ; Hong HE ; Yuxing BAI ; Lin WANG ; Haiping LU ; Zhihe ZHAO ; Tianmin XU ; Weiran LI ; Min HU ; Jinlin SONG ; Jun WANG ; Fang JIN ; Ding BAI ; Xianglong HAN ; Yuehua LIU ; Bin YAN ; Jie GUO ; Jiejun SHI ; Yongming LI ; Zhihua LI ; Xiuping WU ; Jiangtian HU ; Linyu XU ; Lin LIU ; Yi LIU ; Yanqin LU ; Wensheng MA ; Shuixue MO ; Liling REN ; Shuxia CUI ; Yongjie FAN ; Jianguang XU ; Lulu XU ; Zhijun ZHENG ; Peijun WANG ; Rui ZOU ; Chufeng LIU ; Lunguo XIA ; Li HU ; Weicai WANG ; Liping WU ; Xiaoxing KOU ; Jiali TAN ; Yuanbo LIU ; Bowen MENG ; Yuantao HAO ; Lili CHEN
Chinese Journal of Stomatology 2025;60(12):1327-1336
This consensus was developed by the Orthodontic Society of the Chinese Stomatological Association to provide a systematic, scientific, and practical guideline for informed consent in orthodontic care. Orthodontic treatment is typically lengthy, highly individualized, and involves multiple factors such as growth and development, occlusal function, and facial esthetics. Rapid technological advances and diverse risk profiles make the traditional reliance on orthodontist experience or institutional templates insufficient to ensure patients' full understanding and autonomous decision-making. To address this, the expert panel conducted extensive reviews of domestic and international guidelines, analyzed representative dispute cases, and performed multicenter patient-clinician surveys. Using a multi-round Delphi method, the group established a standardized informed consent framework covering the initial consultation, treatment, and retention phases. The consensus emphasizes that informed consent is not only a fundamental legal and ethical requirement but also a key step in building trust, improving patient compliance, and enhancing treatment satisfaction. Orthodontists should clearly and comprehensively explain treatment plans, potential risks, uncertainties, and associated costs, while respecting the autonomy of patients or guardians, and maintain continuous communication and dynamic evaluation throughout the treatment process. The release of this consensus provides unified and authoritative guidance for clinical orthodontics, helping to standardize informed consent, enhance its transparency, safeguard patient rights, reduce medical risks, and promote high-quality, sustainable development of orthodontic practice.
