An Approach to Survey Data with Nonresponse: Evaluation of KEPEC Data with BMI.
- Author:
Jieun BAEK¹; Weechang KANG; Youngjo LEE; Byung Joo PARK
Author Information
1. Department of Statistics, Seoul National University College of Natural Science, Korea.
- Publication Type: Original Article
- Keywords:
Algorithm;
Questionnaire;
Log-linear Models;
Body Mass Index;
Korean Elderly Pharmacoepidemiologic Cohort
- MeSH:
Aged;
Body Mass Index;
Humans;
Life Style;
Linear Models;
Postal Service;
Surveys and Questionnaires;
Smoke;
Smoking
- From: Korean Journal of Preventive Medicine 2002;35(2):136-140
- Country: Republic of Korea
- Language: Korean
- Abstract:
OBJECTIVES: A common problem in analyzing survey data is incompleteness caused by nonresponse or missing values. The mail questionnaire survey conducted to collect lifestyle variables from members of the Korean Elderly Pharmacoepidemiologic Cohort (KEPEC) in 1996 contains some nonresponse and missing data. An appropriate statistical method was applied to evaluate the missing-data pattern in a specific KEPEC data set in which the independent variables were fully observed and the response variable, BMI, had missing values.
METHODS: The study subjects were 8,689 elderly people. First, BMI and the variables that significantly influenced BMI were categorized. After a log-linear model was fitted, the probability of a person falling in each category was estimated. The EM algorithm was implemented with a log-linear model to determine the missing-data mechanism underlying the nonresponse.
RESULTS: Age, smoking status, and preference for spicy hot food were selected as the variables that influenced BMI. When the nonignorable and ignorable nonresponse log-linear models including these variables were fitted, the difference in deviance between the two models was 0.0034 (df=1).
CONCLUSION: Inference about the variables in large samples carries considerable risk if the pattern of missing data is not considered. These results indicate that the missing BMI data constitute ignorable nonresponse. Therefore, when BMI in the KEPEC data is analyzed, inference can be made without modeling the missing data.
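The reported deviance difference can be read as a likelihood-ratio comparison of the two nested models. The sketch below is one way to check that reading. It assumes the difference of 0.0034 is referred to a chi-square distribution with 1 degree of freedom and uses a 5% significance level; the abstract states neither the reference distribution nor the level, so both are assumptions, and the function name `chi2_sf_df1` is hypothetical.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    With 1 df, X is the square of a standard normal Z, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if x < 0:
        raise ValueError("a deviance difference cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


# Value reported in the abstract: deviance(ignorable) - deviance(nonignorable).
deviance_difference = 0.0034

# Assumed significance level (not stated in the abstract).
# 3.841 is the standard chi-square critical value for df = 1 at alpha = 0.05.
CRITICAL_VALUE_5PCT = 3.841

p_value = chi2_sf_df1(deviance_difference)
reject_ignorable = deviance_difference > CRITICAL_VALUE_5PCT

print(f"p-value = {p_value:.4f}")
print("reject ignorable model" if reject_ignorable else "retain ignorable model")
```

Under these assumptions the p-value is roughly 0.95, far above 0.05, so the extra parameter of the nonignorable model does not improve the fit and the simpler ignorable model is retained. This agrees with the conclusion stated in the abstract.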