1.How to Increase Your “Power”
Hip & Pelvis 2018;30(1):1-4
No abstract available.
Data Accuracy
;
Data Interpretation, Statistical
;
Statistics as Topic
;
Biomedical Research
;
Analysis of Variance
3.Big Data Analysis Using Modern Statistical and Machine Learning Methods in Medicine.
Changwon YOO ; Luis RAMIREZ ; Juan LIUZZI
International Neurourology Journal 2014;18(2):50-57
In this article, we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic), and environmental variables. Every year, the data collected in biomedical and behavioral science become larger and more complicated. Thus, in medicine, we need to be aware of this trend and understand the statistical tools that are available to analyze these datasets. Many statistical analyses aimed at such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge resulting from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce the concepts of well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, and of modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude with Bayesian networks, a promising modern statistical method that is suitable for analyzing big datasets consisting of different types of large data from clinical, genomic, and environmental sources. Such statistical models built from big data will provide a more comprehensive understanding of human physiology and disease.
Bayes Theorem
;
Behavioral Sciences
;
Computational Biology
;
Data Interpretation, Statistical
;
Dataset
;
Gene Expression
;
Humans
;
Learning
;
Logistic Models
;
Machine Learning*
;
Models, Statistical
;
Physiology
;
Polymorphism, Single Nucleotide
;
Statistics as Topic*
;
Systems Biology
4.Test of Validity and Reliability of the Adolescent Mental Problem Questionnaire for Korean High School Students.
Soo Jin KIM ; Chung Sook LEE ; Young Ran KWEON ; Mi Ra OH ; Bo Young KIM
Journal of Korean Academy of Nursing 2009;39(5):700-708
PURPOSE: This study was done to test the validity and reliability of the Adolescent Mental Problem Questionnaire (AMPQ) for Korean high school students. METHODS: The AMPQ was designed to assess adolescents' mental health status and problem behavior (Ahn, 2006). A methodological study design was used, with exploratory factor analysis, Pearson's correlation coefficients, and the fitness of the modified model used to assess validity. Cronbach's alpha coefficients and the alternative-form method were used to assess reliability. The AMPQ was tested with a sample of 36,313 high school students. The participants consisted of 18,701 males and 17,612 females. RESULTS: Seven factors were extracted through factor analysis: 'Psychiatric problems', 'Delinquency', 'Academic troubles', 'Family problems', 'Hazardous behavior', 'Harmful circumstance', and 'Eating problems'. These factors explained 51.1% of the total variance. The fitness of the modified model was good (chi-square=38,413.76, Goodness of Fit Index [GFI]=.94, Adjusted Goodness of Fit Index [AGFI]=.93, Comparative Fit Index [CFI]=.95, Root Mean Square Error of Approximation [RMSEA]=.05), and concurrent validity with the Korea-Youth Self-Report [K-YSR] was .63. Cronbach's alpha coefficient of the 31 items was .85. CONCLUSION: The results of the present study suggest that the modified AMPQ instrument may be useful for efficiently assessing mental health status and problem behavior in late-adolescent high school students.
Adolescent
;
Data Interpretation, Statistical
;
Factor Analysis, Statistical
;
Female
;
Humans
;
Male
;
*Mental Health
;
Psychometrics
;
*Questionnaires
;
Republic of Korea
;
Students/*psychology
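The reliability statistic reported in the abstract above, Cronbach's alpha, can be sketched in a few lines. This is a minimal illustration of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), and not the authors' analysis; the function name `cronbach_alpha` and the toy responses are hypothetical and are not AMPQ data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy responses: 5 respondents x 4 items on a 0-3 scale.
toy_responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
]
print(round(cronbach_alpha(toy_responses), 3))
```

When every respondent gives the same score on all items, the items are perfectly consistent and the function returns 1.0; values near the .85 reported for the 31 AMPQ items are conventionally read as good internal consistency.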
5.Recent advances in Bayesian inference of isolation-with-migration models
Genomics & Informatics 2019;17(4):37-
Isolation-with-migration (IM) models have become popular for explaining population divergence in the presence of migration. Bayesian methods are commonly used to estimate IM models, but they have been limited to small data analysis or simple model inference. Recently, three methods, IMa3, MIST, and AIM, resolved these limitations. Here, we describe the major problems addressed by these three software packages and compare the differences among their inference methods, despite their use of the same standard likelihood function.
Bayes Theorem
;
Gene Flow
;
Likelihood Functions
;
Phylogeny
;
Statistics as Topic
6.Comparison of Bayesian interim analysis and classical interim analysis in group sequential design.
Lingling YUAN ; Zhiying ZHAN ; Xuhui TAN
Journal of Southern Medical University 2015;35(11):1638-1642
OBJECTIVE: To explore the differences between Bayesian interim analysis and classical interim analysis.
METHODS: To compare the means of two independent samples (control and treatment), a superiority hypothesis test was established. In line with the data requirements for group sequential design, the Type I error of Bayesian interim analysis based on various prior distributions, the power, the average sample size, and the average stage were estimated in the interim analysis.
RESULTS: In the Pocock and O'Brien & Fleming designs, the Type I errors in the Bayesian interim analysis based on the skeptical prior distribution and the handicap prior distribution were controlled at around 0.05. When the powers of these two classical designs were both 80%, the Bayesian powers of the skeptical prior distribution and the handicap prior distribution were markedly lower. The powers of the non-informative prior distribution and the enthusiastic prior distribution were distinctly higher than 80%.
CONCLUSION: In the Bayesian interim analysis based on the skeptical prior distribution and the handicap prior distribution, the Type I errors can be well controlled. Bayesian interim analyses using these two prior distributions, compared with the analysis adopting the O'Brien & Fleming method, can markedly increase the possibility of ending clinical trials ahead of time. The Bayesian interim analyses based on these two distributions do not have practical value for group sequential design with the Pocock method.
Bayes Theorem ; Data Interpretation, Statistical ; Sample Size
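The classical side of the comparison above rests on the fact that repeated interim looks inflate the Type I error unless the stopping boundary is widened. The sketch below illustrates only that point by Monte Carlo simulation under the null hypothesis; it does not reproduce the study's Bayesian priors or its simulation settings. It assumes two equally spaced looks, a two-sided overall level of 0.05, and the tabulated Pocock constant 2.178 for that case; the function name `rejection_rate` and the trial count are hypothetical choices.

```python
import math
import random


def rejection_rate(boundary, n_looks=2, n_trials=20000, seed=0):
    """Estimate the overall Type I error of a group sequential z-test under H0.

    Each look adds one equal block of information. The trial stops and rejects
    at the first look where |Z| exceeds the constant boundary.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        cumulative = 0.0
        for look in range(1, n_looks + 1):
            # Standardized stage-wise increment; mean 0 because H0 is true.
            cumulative += rng.gauss(0.0, 1.0)
            z = cumulative / math.sqrt(look)
            if abs(z) > boundary:
                rejections += 1
                break
    return rejections / n_trials


# Unadjusted 1.96 at every look versus the Pocock constant for two looks.
naive_rate = rejection_rate(1.96)
pocock_rate = rejection_rate(2.178)
print(f"unadjusted: {naive_rate:.3f}  Pocock: {pocock_rate:.3f}")
```

With the unadjusted boundary the estimated error should land near the theoretical value of about 0.08 for two looks, while the Pocock boundary should bring it back to roughly 0.05, which is the benchmark the Bayesian priors in the abstract are compared against.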
7.Knowledge discovery in database and its application in clinical diagnosis.
Journal of Biomedical Engineering 2004;21(4):677-680
Nowadays the tremendous amount of data has far exceeded our human ability for comprehension, and this has been particularly true for the medical database. However, traditional statistical techniques are no longer adequate for analyzing this vast collection of data. Knowledge discovery in database and data mining play an important role in analyzing data and uncovering important data patterns. This paper briefly presents the concepts of knowledge discovery in database and data mining, then describes the rough set theory, and gives some examples based on rough set.
Artificial Intelligence
;
Clinical Medicine
;
Data Interpretation, Statistical
;
Databases as Topic
;
Databases, Factual
;
Decision Making, Computer-Assisted
;
Diagnosis
;
Factor Analysis, Statistical
;
Knowledge
;
Mathematical Computing
;
Medical Records Systems, Computerized
8.Pelvic Injury Discriminative Model Based on Data Mining Algorithm.
Fei-Xiang WANG ; Rui JI ; Lu-Ming ZHANG ; Peng WANG ; Tai-Ang LIU ; Lu-Jie SONG ; Mao-Wen WANG ; Zhi-Lu ZHOU ; Hong-Xia HAO ; Wen-Tao XIA
Journal of Forensic Medicine 2022;38(3):350-354
OBJECTIVES:
To reduce the dimension of characteristic information extracted from pelvic CT images by using principal component analysis (PCA) and partial least squares (PLS) methods, to establish a support vector machine (SVM) classification and identification model that identifies whether pelvic injury is present from the reduced-dimension data, and to evaluate the feasibility of its application.
METHODS:
Eighty percent of 146 normal and injured pelvic CT images were randomly selected as the training set for model fitting, and the remaining 20% were used as the testing set to verify accuracy. Through CT image input, preprocessing, feature extraction, feature information dimension reduction, feature selection, parameter selection, model establishment and model comparison, a discriminative model of pelvic injury was established.
RESULTS:
The PLS dimension reduction method was better than the PCA method and the SVM model was better than the naive Bayesian classifier (NBC) model. The accuracy of the modeling set, leave-one-out cross validation and testing set of the SVM classification model based on 12 PLS factors was 100%, 100% and 93.33%, respectively.
CONCLUSIONS:
In the evaluation of pelvic injury, the pelvic injury data mining model based on CT images reaches high accuracy, which lays a foundation for automatic and rapid identification of pelvic injuries.
Algorithms
;
Bayes Theorem
;
Data Mining
;
Least-Squares Analysis
;
Support Vector Machine
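The validation scheme described in the abstract above (a random 80/20 split plus leave-one-out cross validation) can be sketched independently of the classifier. The example below is a minimal illustration under stated assumptions, not the authors' pipeline: a nearest-centroid classifier stands in for the SVM, the two-dimensional features are synthetic rather than PLS factors from CT images, and all function names are hypothetical.

```python
import random


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose mean feature vector is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [f for f, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], x))


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, fit on the rest, and score the prediction."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)


def split_80_20(xs, ys, seed=0):
    """Randomly assign 80% of the samples to training and the rest to testing."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    cut = int(0.8 * len(order))
    train, test = order[:cut], order[cut:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])


# Synthetic, well-separated data: 146 samples, label 0 near (0, 0), label 1 near (10, 10).
rng = random.Random(1)
labels = [i % 2 for i in range(146)]
features = [[10 * y + rng.uniform(-1, 1), 10 * y + rng.uniform(-1, 1)] for y in labels]

train_x, train_y, test_x, test_y = split_80_20(features, labels)
test_hits = sum(nearest_centroid_predict(train_x, train_y, x) == y
                for x, y in zip(test_x, test_y))
print(len(train_x), len(test_x), test_hits / len(test_x))
print(leave_one_out_accuracy(train_x, train_y))
```

Leave-one-out validation refits the model once per training sample, which is affordable at this sample size and uses nearly all the data for each fit; the held-out 20% then gives a separate estimate that the model never saw during fitting.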
9.Effect of Normalization on Detection of Differentially-Expressed Genes with Moderate Effects.
Seoae CHO ; Eunjee LEE ; Youngchul KIM ; Taesung PARK
Genomics & Informatics 2007;5(3):118-123
The existing literature offers little guidance on how to decide which method to use to analyze one-channel microarray measurements when dealing with large, grouped samples. Most previous methods have focused on two-channel data; therefore, they cannot be easily applied to one-channel microarray data. Thus, a more reliable method is required to determine an appropriate combination of individual basic processing steps for a given dataset in order to improve the validity of one-channel expression data analysis. We address key issues in evaluating the effectiveness of basic statistical processing steps of microarray data that can affect the final outcome of gene expression analysis without focusing on the intrinsic data underlying biological interpretation.
Analysis of Variance
;
Dataset
;
Gene Expression
;
Statistics as Topic
10.Significance of Microsatellite Instability in Early Gastric Cancer Treated by Endoscopic Submucosal Dissection.
Kyoung Min KIM ; Yeon Soo KIM ; Joo Young CHO ; In Sup JUNG ; Wan Jung KIM ; Ik Seong CHOI ; Chang Beom RYU ; Jin Oh KIM ; Joon Seong LEE ; So Young JIN ; Chan Sup SHIM ; Boo Sung KIM
The Korean Journal of Gastroenterology 2008;51(3):167-173
BACKGROUND/AIMS: Microsatellite instability (MSI) is defined as a change of any length, due to either insertion or deletion of repeating units, in a microsatellite within a tumor when compared to normal tissue. MSI is closely related to genetic instability, particularly in hereditary nonpolyposis colorectal cancer. MSI is found in 10-50% of all gastric cancers, suggesting that MSI may play an important role in carcinogenesis. The aim of this study was to investigate the relationship between microsatellite instability and clinicopathologic features in early gastric cancers (EGCs) treated by endoscopic submucosal dissection (ESD). METHODS: We analyzed the clinicopathological features of 95 EGC specimens, including MSI, histologic type, mucin phenotype, p53, VEGF, location of cancer, depth of invasion, incidence of synchronous and metachronous cancer, age, and gender, derived from 94 patients treated by ESD during a recent 19-month period. RESULTS: MSI was observed in 13 (13.7%) of the 95 specimens. The incidence of MSI was higher in patients with cancer in the lower part of the stomach and in female patients. There was no significant relation between MSI and clinicopathologic features including histologic type, mucin phenotype, p53, VEGF, and depth of invasion. CONCLUSIONS: Our results demonstrate that there is no relationship between MSI and clinicopathologic features except tumor location and gender in EGCs treated by ESD. However, further studies are needed to evaluate the significance of MSI in EGCs.
Adult
;
Aged
;
Aged, 80 and over
;
DNA Mutational Analysis
;
Data Interpretation, Statistical
;
Endoscopy, Gastrointestinal
;
Female
;
Humans
;
Male
;
*Microsatellite Instability
;
Middle Aged
;
Mucins/analysis
;
Neoplasm Staging
;
Predictive Value of Tests
;
Stomach Neoplasms/*diagnosis/genetics/surgery
;
Tumor Suppressor Protein p53/analysis
;
Vascular Endothelial Growth Factor A/analysis