1.Big Data Analysis Using Modern Statistical and Machine Learning Methods in Medicine.
Changwon YOO ; Luis RAMIREZ ; Juan LIUZZI
International Neurourology Journal 2014;18(2):50-57
In this article, we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic), and environmental variables. Every year, the data collected in biomedical and behavioral science become larger and more complicated. Thus, those of us in medicine need to be aware of this trend and understand the statistical tools available for analyzing these datasets. Many statistical analyses aimed at such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce the concepts of well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, and of modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude with a promising modern statistical method, Bayesian networks, which is suitable for analyzing big datasets that consist of different types of large clinical, genomic, and environmental data. Such statistical models built from big data will provide a more comprehensive understanding of human physiology and disease.
Bayes Theorem; Behavioral Sciences; Computational Biology; Data Interpretation, Statistical; Dataset; Gene Expression; Humans; Learning; Logistic Models; Machine Learning*; Models, Statistical; Physiology; Polymorphism, Single Nucleotide; Statistics as Topic*; Systems Biology
2.Aspirin Resistance May Not Be Associated with Clinical Outcome after Acute Ischemic Stroke: Comparison with Three Different Platelet Function Assays.
Nam Tae YOO ; Hyo Jin BAE ; Ji Eun KIM ; Ri Young GOH ; Jin Yeong HAN ; Moo Hyeon KIM ; Jae Kwan CHA
Korean Journal of Stroke 2012;14(1):35-42
BACKGROUND: Aspirin resistance (AR) in platelet function assays shows substantial variation depending on the method used to evaluate it. METHODS: In this study, we prospectively compared the results of Multiplate impedance platelet aggregometry (IPA) with those of light transmission aggregometry (LTA) and the VerifyNow® system in determining the prevalence of AR, and investigated the correlation between its presence and poor outcome (modified Rankin scale >2) in 105 patients treated with aspirin after acute ischemic stroke (AIS). RESULTS: After 5 days of aspirin use, 15 patients (14.3%) were classified as aspirin-resistant by IPA, 24 patients (22.9%) by LTA, and 14 patients (13.3%) by VerifyNow. Good agreement was found between the results of IPA and VerifyNow (R=0.674, P<0.01). The concordance rate of AR detection was high between VerifyNow and IPA (κ=0.72, P<0.01), albeit quite low between LTA and IPA. Regarding its influence on clinical outcome after AIS, there was no significant relationship between the occurrence of poor outcome and the presence of AR in any of the three platelet function assays. CONCLUSION: This study reveals that the incidence of AR in AIS might be highly test-specific. IPA seems to be similar to VerifyNow as a platelet function test.
Aspirin; Blood Platelets; Electric Impedance; Humans; Incidence; Light; Platelet Function Tests; Prevalence; Prospective Studies; Stroke
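The concordance statistic quoted in the abstract above (Cohen's kappa between IPA and VerifyNow) can be illustrated with a short sketch. The abstract reports only the marginal counts (15 of 105 patients resistant by IPA, 14 of 105 by VerifyNow) and the kappa value; the individual cell counts below (11, 4, 3, 87) are an assumption chosen to be consistent with those figures, not data taken from the paper, and the function name `cohen_kappa` is hypothetical.

```python
def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two binary raters from a 2x2 agreement table."""
    n = both_pos + only_a + only_b + both_neg
    # Observed agreement: fraction of patients classified the same way by both assays.
    observed = (both_pos + both_neg) / n
    # Marginal positives for each assay.
    a_pos = both_pos + only_a
    b_pos = both_pos + only_b
    # Agreement expected by chance, given the marginals.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n**2
    return (observed - expected) / (1 - expected)


# ASSUMED cell counts (not reported in the abstract), consistent with
# 15/105 resistant by IPA and 14/105 resistant by VerifyNow.
kappa = cohen_kappa(both_pos=11, only_a=4, only_b=3, both_neg=87)
print(round(kappa, 2))
```

Because kappa corrects for chance agreement, it can be much lower than the raw agreement rate when resistance is rare, which is one reason it is preferred over simple percent agreement for comparing assays.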
3.Comparison of clinical utility between diaphragm excursion and thickening change using ultrasonography to predict extubation success.
Jung Wan YOO ; Seung Jun LEE ; Jong Deog LEE ; Ho Cheol KIM
The Korean Journal of Internal Medicine 2018;33(2):331-339
BACKGROUND/AIMS: Both diaphragmatic excursion and change in muscle thickening are measured using ultrasonography (US) to assess diaphragm function and mechanical ventilation weaning outcomes. However, which parameter better predicts successful extubation remains to be determined. The aim of this study was to compare the clinical utility of these two diaphragmatic parameters in predicting extubation success. METHODS: This study included patients who underwent an extubation trial in the medical or surgical intensive care unit of a university-affiliated hospital from May 2015 through February 2016. Diaphragm excursion and percent of thickening change (Δtdi%) were measured using US within 24 hours before extubation. RESULTS: Sixty patients were included; 78.3% (47/60) of these patients were successfully extubated, whereas 21.7% (13/60) were not. The median degree of excursion was greater in patients with extubation success than in those with extubation failure (1.65 cm vs. 0.8 cm, p < 0.001). Patients with extubation success had a greater Δtdi% than those with extubation failure (42.1% vs. 22.5%, p = 0.03). The areas under the receiver operating characteristic curve for excursion and Δtdi% were 0.836 (95% confidence interval [CI], 0.717 to 0.919) and 0.698 (95% CI, 0.566 to 0.810), respectively (p = 0.017). CONCLUSIONS: Diaphragm excursion seems more accurate than change in diaphragm thickness for predicting extubation success.
Critical Care; Diaphragm*; Humans; Respiration, Artificial; Ultrasonography*; Weaning
4.Relationship between the Intake of Children's Favorite Foods and Policy based on Special Act on Safety Control of Children's Dietary Life
Taejung WOO ; Jihye YOO ; Kyung Hea LEE
Korean Journal of Community Nutrition 2019;24(2):106-116
OBJECTIVES: This study examined the status of children's intake of their favorite foods and its relationship with the policy environment based on the Special Act on Safety Control of Children's Dietary Life, in order to suggest a supportive policy strategy. METHODS: The subjects were 4th grade elementary school students (n=1,638) from 45 schools in seven areas (Seoul, Daegu, Daejeon, Gyeonggi, Chungnam, Jeonbuk, and Gyeongnam). The children completed a self-administered questionnaire in class under the supervision of the teacher. The questionnaire consisted of items such as sociodemographic characteristics, frequency of intake of children's favorite foods, and policy cognition. A t-test and ANOVA were applied to explore the relationship between the frequency of intake of children's favorite foods and policy cognition. The survey was conducted from August 2016 to September 2016. RESULTS: The frequency of ‘high-calorie low nutrient foods intake’ (HCLN) was significantly higher in boys than in girls (p<0.01). Children who received information on their favorite foods from the internet had a higher frequency of HCLN than those who used other sources (p<0.01). Time spent on TV viewing, computer usage, and smartphone usage was associated with a higher frequency of HCLN and a lower intake of healthy favorite foods (all p<0.001). The intake frequency of healthy favorite foods was positively correlated with policy cognition, including policy perception, usefulness, necessity, buying intention, and educational experience. CONCLUSIONS: This study showed a correlation between the frequency of intake of children's favorite foods and policy. In particular, the frequency of intake of children's healthy favorite foods showed a more meaningful relationship with the policy than did the frequency of HCLN. This study also found that the consumption of children's healthy favorite foods was positively correlated with educational experience. To develop a supportive policy for a good dietary environment for children, there is a need to focus on how to collaborate across multiple levels of influence, such as the national, school, and family levels.
Child; Chungcheongnam-do; Cognition; Daegu; Eating; Female; Gyeonggi-do; Humans; Intention; Internet; Jeollabuk-do; Organization and Administration; Smartphone
6.Complication After Gastrectomy for Gastric Cancer According to Hospital Volume: Based on Korean Gastric Cancer Association-Led Nationwide Survey Data
Sang-Ho JEONG ; Moon-Won YOO ; Miyeong PARK ; Kyung Won SEO ; Jae-Seok MIN ;
Journal of Gastric Cancer 2023;23(3):462-475
Purpose:
This study aimed to analyze the incidence of and risk factors for complications following gastric cancer surgery in Korea and to compare complications among hospitals according to the annual number of gastrectomies performed.
Materials and Methods:
A retrospective analysis was conducted using data from 12,244 patients from 64 Korean institutions. Complications were classified using the Clavien-Dindo classification (CDC). Univariate and multivariate analyses were performed to identify the risk factors for severe complications.
Results:
Postoperative complications occurred in 14% of the patients, severe complications (CDC IIIa or higher) in 4.9%, and postoperative death in 0.2%. The study found that age, stage, American Society of Anesthesiologists (ASA) score, Eastern Cooperative Oncology Group (ECOG) score, hospital stay, approach methods, and extent of gastric resection showed statistically significant differences depending on hospital volumes (P<0.05). In the univariate analysis, patient age, comorbidity, ASA score, ECOG score, approach methods, extent of gastric resection, tumor-node-metastasis (TNM) stage, and hospital volume were significant risk factors for severe complications. However, only age, sex, ASA score, ECOG score, extent of gastric resection, and TNM stage were statistically significant in the multivariate analysis (P<0.05). Hospital volume was not a significant risk factor in the multivariate analysis (P=0.152).
Conclusions:
Hospital volume was not a significant risk factor for complications after gastric cancer surgery. The differences in the frequencies of complications based on hospital volumes may be attributed to larger hospitals treating patients with younger age, lower ASA scores, better general conditions, and earlier TNM stages.
7.Causal Inference Network of Genes Related with Bone Metastasis of Breast Cancer and Osteoblasts Using Causal Bayesian Networks.
Sung Bae PARK ; Chun Kee CHUNG ; Efrain GONZALEZ ; Changwon YOO
Journal of Bone Metabolism 2018;25(4):251-266
BACKGROUND: The causal networks among genes that are commonly expressed in osteoblasts and during bone metastasis (BM) of breast cancer (BC) are not well understood. Here, we developed a machine learning method to obtain a plausible causal network of genes that are commonly expressed during BM and in osteoblasts in BC. METHODS: We selected BC genes that are commonly expressed during BM and in osteoblasts from the Gene Expression Omnibus database. Bayesian Network Inference with Java Objects (Banjo) was used to obtain the Bayesian network. Genes registered as BC related genes were included as candidate genes in the implementation of Banjo. Next, we obtained the Bayesian structure and assessed the prediction rate for BM, conditional independence among nodes, and causality among nodes. Furthermore, we reported the maximum relative risks (RRs) of combined gene expression of the genes in the model. RESULTS: We mechanistically identified 33 significantly related and plausibly involved genes in the development of BC BM. Further model evaluations showed that 16 genes were enough for a model to be statistically significant in terms of maximum likelihood of the causal Bayesian networks (CBNs) and for correct prediction of BM of BC. Maximum RRs of combined gene expression patterns showed that the expression levels of UBIAD1, HEBP1, BTNL8, TSPO, PSAT1, and ZFP36L2 significantly affected development of BM from BC. CONCLUSIONS: The CBN structure can be used as a reasonable inference network for accurately predicting BM in BC.
Bayes Theorem; Breast Neoplasms*; Breast*; Gene Expression; Indonesia; Machine Learning; Methods; Neoplasm Metastasis*; Osteoblasts*
8.Erratum to: Causal Inference Network of Genes Related with Bone Metastasis of Breast Cancer and Osteoblasts Using Causal Bayesian Networks
Sung Bae PARK ; Chun Kee CHUNG ; Efrain GONZALEZ ; Changwon YOO
Journal of Bone Metabolism 2019;26(1):61-61
The Acknowledgement was published incorrectly.
9.Development of a graphical model of causal gene regulatory networks using medical big data and Bayesian machine learning
Journal of the Korean Medical Association 2022;65(3):167-172
Data collection in medicine and biomedical science is becoming larger and more complicated with each passing day. Machine learning methods have been applied to elucidate interactions among genes and between genes and their environment. Current Concepts: Many machine learning methods have been used to determine statistical meaning or relationships in the prediction or progression of diseases by creating causal networks based on medical big data. Through these analyses, the occurrence and progression of diseases have been shown to be related to several genes and environmental factors. However, these methods cannot identify the key upstream regulators inferred from genomic, clinical, and environmental medical data. Discussion and Conclusion: The causal Bayesian network (CBN) is a machine learning method that can be used to understand a causal network inferred from gene expression data. The CBN can help identify the key upstream regulators by examining the causal network inferred from medical big data that include genomic information. Regulating these identified key upstream factors may improve clinical outcomes. Therefore, the CBN may be a powerful and flexible tool in the era of precision medicine.
10.The first case of novel variants of the FSHR mutation causing primary amenorrhea in 2 siblings in Korea
Sukdong YOO ; Ju Young YOON ; Changwon KEUM ; Chong Kun CHEON
Annals of Pediatric Endocrinology & Metabolism 2023;28(1):54-60
Follicle-stimulating hormone receptor (FSHR) mutation is a rare cause of amenorrhea. We report the first case of FSHR mutations in Korea. Two female siblings, aged 16 (patient 1) and 19 (patient 2) years, were referred to the pediatric endocrinology clinic because of primary amenorrhea despite normal breast budding. A gonadotropin-releasing hormone stimulation test showed markedly elevated luteinizing hormone and follicle-stimulating hormone with a relatively low level of estrogen, suggesting hypergonadotropic hypogonadism. Pelvic magnetic resonance imaging revealed a bicornuate uterus in patient 1 and uterine hypoplasia with thinning of the endometrium in patient 2. The progesterone challenge test produced no withdrawal bleeding. After two months of combined oral contraceptives, menarche occurred, followed by menstruation at regular intervals. To determine the genetic cause of amenorrhea in these patients, whole exome sequencing (WES) was performed, which revealed compound heterozygous FSHR mutations, c.1364T>G (p.Val455Gly) in exon 10 and c.374T>G (p.Leu125Arg) in exon 4; both were novel mutations and were confirmed by Sanger sequencing. The patients maintained regular menstruation and showed improved bone mineral density while taking combined oral contraceptives, calcium, and vitamin D. Therefore, FSHR mutations can be a cause of amenorrhea in Koreans, and WES facilitates the diagnosis of rare causes of amenorrhea.