1. Construction and analysis of a combined discriminative model of random forest and feedforward neural network for peripheral blood RNA sequencing data in bipolar disorder
Xiangwen WANG ; Shunkang FENG ; Hong CHEN ; Shenghai WANG ; Ping SUN
Chinese Journal of Psychiatry 2024;57(4):213-220
Objective: To identify characteristic genes of bipolar disorder using the random forest method and to construct a discriminative model for bipolar disorder using a neural network approach. Methods: The study used gene expression data from individuals with bipolar disorder (n=20) and healthy controls (n=15) in the GSE23848 dataset. Background correction was performed using negative control probes, and normalization used both negative and positive control probes. Differentially expressed genes were identified through linear model analysis and empirical Bayesian statistical methods. A random forest model was built to extract features from the differentially expressed genes, and a neural network model was constructed from the characteristic genes identified by the random forest model. The discriminative performance of the model was validated on an independent external dataset, GSE39653, which included bipolar disorder patients (n=8) and healthy controls (n=24). Biological functions of the characteristic genes were explored through gene ontology (GO) analysis and protein-protein interaction (PPI) networks. Results: A total of 1 330 differentially expressed genes related to bipolar disorder were identified, and 35 characteristic genes were selected for model construction. The final model was a feedforward neural network with four hidden layers and four dropout layers, with 50 433 trainable parameters. Bootstrap resampling with 1 000 iterations was used to calculate confidence intervals for sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and accuracy; all four metrics were 1. In the GSE39653 external validation set, the model's AUC was 0.72. Enrichment analysis suggested that the functions of the characteristic genes in the model are related to mitochondrial structure and energy metabolism. Conclusion: The random forest method can identify characteristic genes of bipolar disorder, and a diagnostic model that combines random forests and feedforward neural networks shows good classification performance in bipolar disorder.
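The bootstrap step described in the Results can be illustrated with a minimal sketch. It assumes a percentile bootstrap, which the abstract does not specify, and uses synthetic labels and scores rather than the study's data; the function names `auc` and `bootstrap_auc_ci` are hypothetical. It also shows why a perfectly separating model yields a degenerate interval: every resample that contains both classes again has an AUC of 1.

```python
import random


def auc(labels, scores):
    """AUC as the Mann-Whitney probability that a case outscores a control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    idx = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        # Resample subjects with replacement.
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # AUC is undefined unless both classes are present
        stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example only: 20 cases and 15 controls with perfectly
# separated scores, mirroring the group sizes but not the actual data.
labels = [1] * 20 + [0] * 15
scores = [0.9] * 20 + [0.1] * 15
print(auc(labels, scores))
print(bootstrap_auc_ci(labels, scores))
```

When the scores separate the two groups perfectly, the resampled AUC cannot vary, so the interval collapses to a single point. Performance on an independent dataset is therefore the more informative estimate.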
