1. Metabolic Syndrome Prediction Using Machine Learning Models with Genetic and Clinical Information from a Nonobese Healthy Population
Eun Kyung CHOE ; Hwanseok RHEE ; Seungjae LEE ; Eunsoon SHIN ; Seung Won OH ; Jong Eun LEE ; Seung Ho CHOI
Genomics & Informatics 2018;16(4):e31-
The prevalence of metabolic syndrome (MS) in the nonobese population is not low, yet identifying MS and mitigating its risk are difficult in this population. We aimed to develop an MS prediction model for nonobese Koreans from genetic and clinical factors using machine learning methods. The model was built from clinical and genetic polymorphism information with five machine learning algorithms, including naïve Bayes classification (NB). The analysis was performed in two stages (training and test sets). Model A used only clinical information (age, sex, body mass index, smoking status, alcohol consumption status, and exercise status); model B added genetic information (10 polymorphisms) to model A. Of the 7,502 nonobese participants, 647 (8.6%) had MS. In the test set analysis, under the maximum sensitivity criterion, NB showed the highest sensitivity: 0.38 for model A and 0.42 for model B. The specificity of NB was 0.79 for model A and 0.80 for model B. Comparing the two NB models, model B (clinical and genetic information; area under the receiver operating characteristic curve [AUC] = 0.69) performed better than model A (clinical information only; AUC = 0.65). We designed a prediction model for MS in a nonobese population using clinical and genetic information. This model might help convince nonobese individuals with MS to undergo health checks and adopt behaviors associated with a preventive lifestyle.
Alcohol Drinking; Bayes; Body Mass Index; Classification; Life Style; Machine Learning; Polymorphism, Genetic; Prevalence; ROC Curve; Sensitivity and Specificity; Smoke; Smoking
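The abstract above compares models by sensitivity, specificity, and AUC. As a reminder of how those three figures are defined, here is a minimal Python sketch. It is not the authors' code: the labels, scores, and the 0.35 threshold are hypothetical toy values, and the function names are chosen only for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: share of true cases flagged; specificity: share of non-cases cleared.
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random case outscores a random non-case.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = has MS, 0 = does not; scores are model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.2, 0.8, 0.3, 0.1, 0.1, 0.05]
threshold = 0.35  # assumed cut-off, not taken from the paper
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={area:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of figure.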
2. Validation of the Utility of the Genetically Shared Regions of Chromosomes (GD-ICS) Measuring Method in Identifying Complicated Genetic Relatedness
Sohee CHO ; Eunsoon SHIN ; YoonGi PARK ; Haeun YOU ; Eun Young LEE ; Jong-Eun LEE ; Soong Deok LEE
Journal of Korean Medical Science 2024;39(27):e198-
Background:
Relatives share more genomic regions than unrelated individuals, with closer relatives sharing more regions. This concept, paired with the increased availability of high-throughput single nucleotide polymorphism (SNP) genotyping technologies, has made it feasible to measure the shared chromosomal regions between individuals to assess their degree of relatedness. However, such techniques have remained conceptual rather than practical when it comes to applying measures or indices. Recently, we developed an index called “genetic distance-based index of chromosomal sharing (GD-ICS)” utilizing large-scale SNP data from Korean family samples and demonstrated its potential for practical applications in kinship determination. In the current study, we present validation results from various real cases demonstrating the utility of this method in resolving complex familial relationships where information obtained from traditional short tandem repeats (STRs) or lineage markers is inconclusive.
Methods:
We obtained large-scale SNP data through microarray analysis from Korean individuals involving 13 kinship cases and calculated GD-ICS values using the method described in our previous study. Based on the GD-ICS reference constructed for Korean families, each disputed kinship was evaluated and validated using a combination of traditional STRs and lineage markers.
Results:
The cases comprised those A) that were found to be inconclusive using the traditional approach, B) for which it was difficult to apply traditional testing methods, and C) that were more conclusively resolved using the GD-ICS method. This method has overcome the limitations faced by traditional STRs in kinship testing, particularly in a paternity case with STR mutational events and in confirming distant kinship where the individual of interest is unavailable for testing. It has also been demonstrated to be effective in identifying various relationships without specific presumptions and in confirming a lack of genetic relatedness between individuals.
Conclusion:
This method has been proven effective in identifying familial relationships across diverse complex and practical scenarios. It is not only useful when traditional testing methods fail to provide conclusive results, but it also enhances the resolution of challenging kinship cases, which suggests its applicability in various types of practical casework.