1.Preliminary exploration of multi-omics data fusion methods for high-dimensional small-sample datasets in traditional Chinese medicine.
Nian WANG ; Cheng-Cheng YU ; Hu YANG ; Zhong WANG ; Jun LIU
China Journal of Chinese Materia Medica 2025;50(1):278-284
With advances in big data and artificial intelligence technologies, the extensive application of omics technologies in traditional Chinese medicine (TCM) research has generated large experimental datasets, enabling the exploration of cross-scale correlations among massive data and driving a shift toward a data-intensive research paradigm. The emerging approach of multi-omics data fusion analysis, which emphasizes technical and computational tools, presents a potential breakthrough in this field. The holistic perspective of TCM aligns with the concept of multi-omics data fusion, yet the data encountered are high-dimensional with small sample sizes, necessitating data processing techniques such as dimensionality reduction. The current challenge lies in selecting suitable analytical methods for these data to enhance the systematic understanding of physiological functions and of disease diagnosis and treatment processes. This paper explores the theories and frameworks of multi-omics data fusion, analyzes methods for fusing high-dimensional, small-sample multi-omics data in TCM, and aims to provide insights for advancing TCM research.
Medicine, Chinese Traditional/methods*; Humans; Computational Biology/methods*; Genomics/methods*; Sample Size; Artificial Intelligence; Multiomics
2.Impact of incorrect designation of working correlation structure matrix on sample size estimation in 2×2 cross design: a simulation study.
Peiyu ZHANG ; Ziheng XIE ; Yan ZHUANG
Journal of Southern Medical University 2025;45(11):2495-2503
OBJECTIVES:
To investigate the impact of incorrect specification of the working correlation structure matrix on estimated sample size in a 2×2 crossover design based on the generalized estimating equation (GEE).
METHODS:
Using Monte Carlo simulation, the influence of incorrect specification of the working correlation structure matrix on sample size estimation was evaluated under different conditions, controlling the total sample size n, the proportion θ of subjects assigned to the AB sequence (s=1), the correlation coefficient ρ, and the placebo effect OR. Bias and mean square error (MSE) were used to assess the difference between the sample size estimates and the theoretical values.
RESULTS:
When the correct working correlation structure is independent, correctly specifying the working correlation structure matrix gives better sample size estimates than incorrect specification. However, when the correct structure is equal-correlation (exchangeable) and the correlation coefficient is close to 0, with the other factors small (n≤50, θ≤0.5, OR=2 in this article), the bias of the sample size estimate under the correctly specified working correlation structure matrix can exceed the bias under the incorrectly specified one.
CONCLUSIONS:
Under most conditions, incorrectly specifying the working correlation structure matrix causes the estimated sample size to deviate significantly from the theoretical value, but under certain conditions its impact on the estimated sample size is small.
Sample Size; Monte Carlo Method; Humans; Cross-Over Studies; Computer Simulation; Research Design; Bias
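Illustrative note for entry 2: the abstract names bias and MSE as its evaluation criteria but does not give formulas. Under the usual Monte Carlo definitions, and with assumed notation (R simulation replicates, \hat{n}_r the sample size estimated in replicate r, n_0 the theoretical sample size; none of these symbols appear in the abstract), the two criteria might be written as:

```latex
\mathrm{Bias} = \frac{1}{R}\sum_{r=1}^{R} \hat{n}_r - n_0,
\qquad
\mathrm{MSE} = \frac{1}{R}\sum_{r=1}^{R} \left(\hat{n}_r - n_0\right)^2
             = \mathrm{Bias}^2 + \frac{1}{R}\sum_{r=1}^{R}\left(\hat{n}_r - \bar{n}\right)^2,
\qquad
\bar{n} = \frac{1}{R}\sum_{r=1}^{R} \hat{n}_r .
```

The decomposition on the right shows why a specification with smaller bias can still have a larger MSE if its estimates vary more across replicates.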
3.Comparison of 7 methods for sample size determination based on confidence interval estimation for a single proportion.
Mi Lai YU ; Xiao Tong SHI ; Bi Qing ZOU ; Sheng Li AN
Journal of Southern Medical University 2023;43(1):105-110
OBJECTIVE:
To compare different methods for calculating sample size based on confidence interval estimation for a single proportion with different event incidences and precisions.
METHODS:
We compared 7 methods for confidence interval estimation for a single proportion, namely Wald, Agresti-Coull add z², Agresti-Coull add 4, Wilson Score, Clopper-Pearson, Mid-p, and Jeffreys. The sample size was calculated using the search method with different parameter settings (proportion of specified events and half width of the confidence interval [ω=0.05, 0.1]). With Monte Carlo simulation, the estimated sample size was used to simulate and compare the width of the confidence interval, the coverage of the confidence interval and the ratio of the noncoverage probability.
RESULTS:
For a high precision requirement (ω=0.05), the Mid-p method and the Clopper-Pearson method performed better when the incidence of events was low (p<0.15). In other settings, the performance of the 7 methods did not differ significantly except for the poor symmetry of the Wald method. In the setting of ω=0.1 with a very low p (0.01-0.05), failure of iteration occurred with nearly all the methods except for the Clopper-Pearson method.
CONCLUSION:
Different sample size determination methods based on confidence interval estimation should be selected for single proportions with different parameter settings.
Confidence Intervals; Sample Size; Computer Simulation; Monte Carlo Method; Probability
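Illustrative note for entry 3: the "search method" can be sketched as stepping n upward until the confidence interval half width at the anticipated proportion drops to ω or below. The sketch below covers only the Wald and Wilson score intervals, evaluated at the expected proportion rather than by simulation; the function names are hypothetical and this is not the authors' code.

```python
from math import sqrt
from statistics import NormalDist


def wald_half_width(p, n, conf=0.95):
    """Half width of the Wald interval at anticipated proportion p."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * sqrt(p * (1 - p) / n)


def wilson_half_width(p, n, conf=0.95):
    """Half width of the Wilson score interval at anticipated proportion p."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)


def smallest_n(half_width_fn, p, omega, conf=0.95, n_max=1_000_000):
    """Linear search for the smallest n whose half width is at most omega."""
    for n in range(1, n_max + 1):
        if half_width_fn(p, n, conf) <= omega:
            return n
    raise ValueError("no n up to n_max reaches the requested half width")


if __name__ == "__main__":
    # Anticipated proportion 0.5, half width 0.05, 95% confidence.
    print(smallest_n(wald_half_width, 0.5, 0.05))    # 385
    print(smallest_n(wilson_half_width, 0.5, 0.05))  # 381
```

Both half-width functions decrease monotonically in n, so the first n that satisfies the condition is the minimum. Exact methods such as Clopper-Pearson would need beta quantiles and are left out of this sketch.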
4.Statistical methods for relative risk estimation and applications in case-cohort study.
Jia Yi TUO ; Jing Hao BI ; Zhuo Ying LI ; Qiu Ming SHEN ; Yu Ting TAN ; Hong Lan LI ; Hui Yun YUAN ; Yong Bing XIANG
Chinese Journal of Epidemiology 2022;43(3):392-396
Objective: To systematically introduce the design of the case-cohort study, the statistical methods for relative risk estimation, and their application in this design. Methods: First, we introduced the basic principles of case-cohort study design. Second, Prentice's method, the Self-Prentice method and the Barlow method for weighted Cox proportional hazards regression models were described in detail. Finally, data from the Shanghai Women's Health Study were used as an example to analyze the association between obesity and liver cancer incidence in the full cohort and the case-cohort sample, and the parameter estimates from each method were compared. Results: A significant association was observed between obesity and risk of liver cancer incidence in women in both the full cohort and the case-cohort sample. In the Cox proportional hazards regression model, the partial regression coefficients of the full cohort and the case-cohort sample fluctuated with the adjustment for confounding factors, but their hazard ratio estimates were close. The standard error of the partial regression coefficient differed between the two: it was larger in the case-cohort sample than in the full cohort, resulting in a wider 95% confidence interval for the relative risk. In the weighted Cox proportional hazards regression model, the standard error of the partial regression coefficient from Prentice's method was closer to that of the full cohort than those from the Self-Prentice and Barlow methods, and its 95% confidence interval for the hazard ratio was closer to that of the full cohort. Conclusions: By collecting and analyzing data from sub-cohort members and patients with the disease, the case-cohort design could yield parameter estimates close to those of the full cohort while reducing sample size and improving research efficiency. The results suggested that Prentice's method would be preferred in case-cohort design.
China/epidemiology*; Cohort Studies; Female; Humans; Proportional Hazards Models; Risk; Sample Size
5.Sample size calculations in health research: Contemporary issues and practices
Amiel Nazer C. Bermudez ; Kim L. Cochon
Philippine Journal of Health Research and Development 2022;26(2):77-80
Sample size computations, which should be done at the planning stage of the study, are necessary for research to estimate a population parameter or test a hypothesis. For causal analysis of observational databases, sample size computations are generally not needed. Post-hoc power analyses, which are typically done with non-significant findings, should not be performed, since reporting post-hoc power is nothing more than reporting p values differently. While sample size calculations are typically based on the tradition of significance testing, sample size calculations based on precision are feasible, if not preferred, alternatives. Sample size calculations depend on several factors such as the study objective, scale of measurement of the outcome variable, study design, and sampling design. Computing sample size is not as straightforward as presented in textbooks, but specific strategies can be used in the face of challenges and constraints.
Sample Size; Power, Psychological
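Illustrative note for entry 5: the claim that post-hoc power merely restates the p value can be seen in the simplest case, a two-sided z-test in which the observed effect is plugged in as if it were the true effect. This is a minimal sketch under that assumption; the function name is hypothetical and the abstract itself gives no formula.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """Post-hoc ("observed") power of a two-sided z-test.

    The observed z statistic is recovered from the p value and then
    treated as the true standardized effect, so the result depends on
    nothing except p_value and alpha.
    """
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_value / 2)   # |z| implied by the p value
    z_crit = nd.inv_cdf(1 - alpha / 2)    # two-sided critical value
    return nd.cdf(z_obs - z_crit) + nd.cdf(-z_obs - z_crit)


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20, 0.50):
        print(f"p = {p:.2f} -> observed power = {observed_power(p):.3f}")
```

Because the function takes only the p value and α as inputs, a non-significant result always maps to low observed power; a result exactly at p = α gives observed power of about 0.5.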
6.Sample size estimation in acupuncture and moxibustion clinical trials.
Jing HU ; Bo LI ; Hui-Na ZHANG ; Wei-Hong LIU ; Shuo FENG
Chinese Acupuncture & Moxibustion 2021;41(10):1147-1152
Appropriate sample size estimation is very important in the design of clinical trials. However, insufficient or inappropriate sample size estimation is still a prominent problem in currently published acupuncture and moxibustion clinical trials. At present, the superiority test, non-inferiority test and equivalence test are widely used in acupuncture and moxibustion clinical trials. This article focuses on the application of these three hypothesis test types, their calculation methods, and the use of PASS 11 software. In view of the problems in sample size estimation in acupuncture and moxibustion clinical trials, the particularities of sample size estimation in this field are summarized from the aspects of parameter setting, the ratio of intervention group to control group, and multi-group comparison, in order to guide acupuncture clinical researchers to estimate sample size correctly when conducting clinical trials.
Acupuncture; Acupuncture Therapy; Clinical Trials as Topic; Moxibustion; Sample Size
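Illustrative note for entry 6: for a continuous outcome compared between two equal-sized parallel groups, the textbook normal-approximation formulas for the three test types can be sketched as follows. The abstract does not state which formulas or PASS procedures it covers, so the function names, the common standard deviation `sigma`, and the default α and power are assumptions made for illustration only.

```python
from math import ceil
from statistics import NormalDist

_z = NormalDist().inv_cdf  # standard normal quantile function


def n_one_sided(sigma, true_diff, delta, alpha=0.025, power=0.80):
    """Per-group n for H0: diff <= delta versus H1: diff > delta.

    delta > 0 is a superiority margin, delta = 0 is plain one-sided
    superiority, and delta < 0 is a non-inferiority test with margin
    -delta.
    """
    effect = true_diff - delta
    if effect <= 0:
        raise ValueError("true_diff must exceed delta")
    return ceil(2 * sigma**2 * (_z(1 - alpha) + _z(power)) ** 2 / effect**2)


def n_equivalence(sigma, true_diff, margin, alpha=0.025, power=0.80):
    """Per-group n for two one-sided tests of |diff| < margin."""
    gap = margin - abs(true_diff)
    if gap <= 0:
        raise ValueError("|true_diff| must be smaller than margin")
    z_beta = _z(1 - (1 - power) / 2)
    return ceil(2 * sigma**2 * (_z(1 - alpha) + z_beta) ** 2 / gap**2)


if __name__ == "__main__":
    # SD 10, no true difference, margin 5, one-sided alpha 0.025, power 80%.
    print(n_one_sided(10, 0, -5))   # non-inferiority: 63 per group
    print(n_equivalence(10, 0, 5))  # equivalence: 85 per group
```

Unequal allocation ratios and multi-group comparisons, which the article also discusses, would need adjusted formulas and are not covered by this sketch.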
7.Influence of group sample size on statistical power of tests for quantitative data with an imbalanced design.
Qihong LIANG ; Xiaolin YU ; Shengli AN
Journal of Southern Medical University 2020;40(5):713-717
OBJECTIVE:
To explore the relationship between sample size in the groups and statistical power of ANOVA and Kruskal-Wallis test with an imbalanced design.
METHODS:
The sample sizes of the two tests were estimated by SAS program with given parameter settings, and Monte Carlo simulation was used to examine the changes in power when the total sample size varied or remained fixed.
RESULTS:
In ANOVA, when the total sample size was fixed, increasing the sample size in the group with a larger mean square error improved the statistical power, but an excessively large difference in the sample sizes between groups led to reduced power. When the total sample size was not fixed, a larger mean square error in the group with increased sample size was associated with a greater increase of the statistical power. In the Kruskal-Wallis test, when the total sample size was fixed, increasing the sample size in groups with large mean square errors increased the statistical power irrespective of the sample size difference between the groups; when the total sample size was not fixed, a larger mean square error in the group with increased sample size resulted in an increased statistical power, and the increment was similar to that for a fixed total sample size.
CONCLUSIONS:
The relationship between statistical power and sample size in groups is affected by the mean square error, and increasing the sample size in a group with a large mean square error increases the statistical power. In the Kruskal-Wallis test, increasing the sample size in a group with a large mean square error is more cost-effective than increasing the total sample size to improve the statistical power.
Computer Simulation; Models, Statistical; Monte Carlo Method; Sample Size
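Illustrative note for entry 7: the ANOVA pattern reported above (power rises when the noisier group gets more subjects, then falls when the imbalance becomes extreme) has a simple analogue in a two-group z-test with known, unequal standard deviations. The sketch below uses that simpler test as a stand-in; it is not the authors' SAS simulation, and the function name and numeric settings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(n1, n2, sd1, sd2, mean_diff, alpha=0.05):
    """Analytic power of a two-sided two-sample z-test with known SDs."""
    nd = NormalDist()
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # SE of the mean difference
    shift = abs(mean_diff) / se            # standardized true effect
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Fixed total of 60 subjects; group 2 is three times as variable.
    for n1, n2 in [(45, 15), (30, 30), (15, 45), (2, 58)]:
        power = z_test_power(n1, n2, sd1=1.0, sd2=3.0, mean_diff=1.5)
        print(f"n1={n1:2d}, n2={n2:2d}: power = {power:.3f}")
```

In this simplified setting the standard error is smallest when group sizes are proportional to the standard deviations (here 15 and 45), which is why moving subjects toward the noisier group helps only up to a point.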
8.Application of conditional inference forest in time-to-event data analysis.
Yingxin LIU ; Pei KANG ; Jun XU ; Shengli AN
Journal of Southern Medical University 2020;40(4):475-482
OBJECTIVE:
To explore the application and advantages of conditional inference forest in survival analysis.
METHODS:
We used simulation experiments and real data to compare the predictive performance of 4 models, namely the Cox proportional hazards model, accelerated failure time model, random survival forest model and conditional inference forest model, based on their Brier scores.
RESULTS:
The simulation experiments suggested that both forest models had more accurate and robust predictive performance than the two regression models. The conditional inference forest model was superior to the other models in analyzing time-to-event data with polytomous covariates, collinearity or interaction, especially for a large sample size and a high censoring rate. The analysis of real data demonstrated that the conditional inference forest model had the best predictive performance among the 4 models.
CONCLUSIONS:
Compared with the commonly used survival analysis methods, conditional inference forest model performs better especially when the data contain polytomous covariates with collinearity and interaction.
Data Analysis; Proportional Hazards Models; Sample Size; Survival Analysis
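Illustrative note for entry 8: the Brier score used to rank the four models is, at a fixed time horizon, the mean squared gap between each subject's observed survival status and the model's predicted survival probability. The sketch below shows only that core idea and assumes no subject is censored before the horizon; real survival analyses (presumably including this study, given its censoring rates) need a censoring-weighted version. The function name is hypothetical.

```python
def brier_score(event_times, predicted_survival, horizon):
    """Brier score at `horizon`, assuming no censoring before `horizon`.

    event_times        : observed event time for each subject
    predicted_survival : model-predicted P(T > horizon) for each subject
    """
    if len(event_times) != len(predicted_survival):
        raise ValueError("inputs must have the same length")
    if not event_times:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for time, prob in zip(event_times, predicted_survival):
        still_event_free = 1.0 if time > horizon else 0.0
        total += (still_event_free - prob) ** 2
    return total / len(event_times)


if __name__ == "__main__":
    times = [2, 5, 8, 10]
    sharp = brier_score(times, [0.1, 0.4, 0.9, 0.8], horizon=6)
    coin_flip = brier_score(times, [0.5, 0.5, 0.5, 0.5], horizon=6)
    print(sharp, coin_flip)  # lower is better
```

A model that always predicts 0.5 scores 0.25, so values well below 0.25 indicate useful predictions at that horizon.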
9.Impact of metabolic syndrome on short-term outcome of carotid revascularization: a large sample size study in Chinese population.
Xue-Song BAI ; Yao FENG ; Tao WANG ; Xiao ZHANG ; Chang-Lin YANG ; Ya-Bing WANG ; Yang HUA ; Jie LU ; Feng-Shui ZHU ; Yan-Fei CHEN ; Peng GAO ; Ren-Jie YANG ; Yan MA ; Li-Qun JIAO
Chinese Medical Journal 2020;133(22):2688-2695
BACKGROUND:
Metabolic syndrome (MetS) is relatively common worldwide and an important risk factor for cardiovascular diseases. It is closely linked to arterial stiffness of the carotid artery. However, the association of MetS with the safety of carotid revascularization has been rarely studied. The aim of this study was to observe the current status of MetS and its components in Chinese carotid revascularized patients, and investigate the impact on major adverse clinical events (MACEs) after carotid endarterectomy (CEA) or carotid artery stenting (CAS).
METHODS:
From January 2013 to December 2017, patients undergoing CEA or CAS in the Neurosurgery Department of Xuanwu Hospital were retrospectively recruited. The changes in prevalence of MetS and each component with time were investigated. The primary outcome was 30-day post-operative MACEs. Univariable and multivariable analyses were performed to identify the impact of MetS on CEA or CAS.
RESULTS:
A total of 2068 patients who underwent CEA (766 cases) or CAS (1302 cases) were included. The rate of MetS was 17.9%; the prevalence rate of MetS increased with time. The occurrence rate of MACEs in CEA was 3.4% (26 cases) and in CAS, 3.1% (40 cases). There was no statistical difference between the two groups (3.4% vs. 3.1%, P = 0.600). For CEA patients, univariate analysis showed that the MACE (+) group had increased diabetes history (53.8% vs. 30.9%, P = 0.014) and MetS (34.6% vs. 15.8%, P = 0.023). For CAS patients, univariate analysis showed that the MACE (+) group had increased coronary artery disease history (40.0% vs. 21.6%, P = 0.006) and internal carotid artery tortuosity (67.5% vs. 37.6%, P < 0.001). Furthermore, the MACE (+) group had higher systolic blood pressure (143.38 ± 22.74 vs. 135.42 ± 17.17 mmHg, P = 0.004). Multivariable analysis showed that the influencing factors for MACEs in CEA included history of diabetes (odds ratio [OR] = 2.345; 95% confidence interval [CI] = 1.057-5.205; P = 0.036) and MetS (OR = 2.476; 95% CI = 1.065-5.757; P = 0.035). The influencing factors for MACEs in CAS included systolic blood pressure (OR = 1.023; 95% CI = 1.005-1.040; P = 0.010), coronary artery disease (OR = 2.382; 95% CI = 1.237-4.587; P = 0.009) and internal carotid artery tortuosity (OR = 3.221; 95% CI = 1.637-6.337; P = 0.001).
CONCLUSIONS:
The prevalence rate of MetS increased with time in carotid revascularized patients. MetS is a risk factor for short-term MACEs after CEA, but not after CAS.
Carotid Arteries/surgery*; Carotid Stenosis/surgery*; China/epidemiology*; Endarterectomy, Carotid/adverse effects*; Humans; Metabolic Syndrome/epidemiology*; Retrospective Studies; Risk Factors; Sample Size; Stents/adverse effects*; Stroke; Time Factors; Treatment Outcome
10.Maintenance of pegylated liposomal doxorubicin/carboplatin in patients with advanced ovarian cancer: randomized study of an Asian Gynecologic Oncology Group
Chyong Huey LAI ; Elizabeth VALLIKAD ; Hao LIN ; Lan Yan YANG ; Shih Ming JUNG ; Hsueh Erh LIU ; Yu Che OU ; Hung Hsueh CHOU ; Cheng Tao LIN ; Huei Jean HUANG ; Kuan Gen HUANG ; Jiantai QIU ; Yao Ching HUNG ; Tzu I WU ; Wei Yang CHANG ; Kien Thiam TAN ; Chiao Yun LIN ; Angel CHAO ; Chee Jen CHANG
Journal of Gynecologic Oncology 2020;31(1):5-
sample size, it suggests that maintenance carboplatin-PLD chemotherapy could improve PFS in advanced ovarian cancer.
Arm; Asian Continental Ancestry Group; Bias (Epidemiology); Carboplatin; Disease-Free Survival; Doxorubicin; Drug Therapy; Follow-Up Studies; Humans; Maintenance Chemotherapy; Ovarian Neoplasms; Prognosis; Quality of Life; Recombination, Genetic; Sample Size