Statistical Methods for Multivariate Missing Data in Health Survey Research.
- Author:Dong Kee KIM(1); Eun Cheol PARK; Myong Sei SOHN; Han Joong KIM; Hyung Uk PARK; Chae Hyung AHN; Jong Gun LIM; Ki Jun SONG
Author Information
1. Department of Biostatistics, Yonsei University College of Medicine, Korea.
- Publication Type:Original Article
- Keywords:Missing data; Multivariate normal data; EM algorithm; Biostatistics; Resource-Based Relative Value Scale
- MeSH:Biostatistics; Dataset; Health Surveys*; Models, Statistical; Relative Value Scales
- From:Korean Journal of Preventive Medicine 1998;31(4):875-884
- Country:Republic of Korea
- Language:Korean
- Abstract:
Missing observations are common in medical and health survey research, and several statistical methods have been proposed to handle them. The EM (Expectation-Maximization) algorithm is an efficient way of handling missing data based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations, adopting the EM algorithm to handle them. We assumed that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We applied the proposed method to data from a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we applied complete-case analysis, which uses only completely observed cases, and available-case analysis, which uses all available information. Residual and normal probability plots were evaluated to assess the assumption of normality. The residual sum of squares from the EM algorithm was smaller than those from the complete-case and available-case analyses.
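
The three estimators named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`em_mvn`, `complete_case`, `available_case`), the NaN coding of missing values, the starting values, and the convergence rule are all assumptions made for illustration. The EM routine follows the textbook scheme for multivariate normal data: the E-step fills in the expected sufficient statistics of the missing entries given the observed ones, and the M-step re-estimates the mean vector and covariance matrix from them. The available-case function shows one common pairwise variant, which may differ from the variant used in the paper.

```python
import numpy as np

def em_mvn(X, n_iter=200, tol=1e-8):
    """ML estimates of mean and covariance for MVN data with NaN-coded gaps.

    Illustrative sketch; assumes each variable is observed at least twice.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Assumed starting values: available-case means and variances.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        s1 = np.zeros(p)        # running sum of E[x_i]
        s2 = np.zeros((p, p))   # running sum of E[x_i x_i^T]
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            c = np.zeros((p, p))  # conditional covariance of missing part
            if m.all():
                # Nothing observed: the row contributes the current estimates.
                x, c = mu.copy(), sigma.copy()
            elif m.any():
                s_oo = sigma[np.ix_(o, o)]
                s_mo = sigma[np.ix_(m, o)]
                # b = s_mo @ inv(s_oo), computed without an explicit inverse.
                b = np.linalg.solve(s_oo, s_mo.T).T
                x[m] = mu[m] + b @ (x[o] - mu[o])
                c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - b @ s_mo.T
            s1 += x
            s2 += np.outer(x, x) + c
        # M-step: update parameters from the expected sufficient statistics.
        mu_new = s1 / n
        sigma_new = s2 / n - np.outer(mu_new, mu_new)
        change = max(np.abs(mu_new - mu).max(), np.abs(sigma_new - sigma).max())
        mu, sigma = mu_new, sigma_new
        if change < tol:
            break
    return mu, sigma

def complete_case(X):
    """Mean and ML covariance using only rows with no missing value."""
    X = np.asarray(X, dtype=float)
    cc = X[~np.isnan(X).any(axis=1)]
    return cc.mean(axis=0), np.cov(cc, rowvar=False, bias=True)

def available_case(X):
    """Per-variable means and pairwise covariances from all usable rows."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    mu = np.nanmean(X, axis=0)
    cov = np.empty((p, p))
    for j in range(p):
        for k in range(p):
            ok = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            dj = X[ok, j] - X[ok, j].mean()
            dk = X[ok, k] - X[ok, k].mean()
            cov[j, k] = np.mean(dj * dk)
    return mu, cov
```

Calling `em_mvn(X)` on an array whose missing survey responses are coded as `np.nan` returns the estimated mean vector and covariance matrix; on fully observed data all three functions reduce to the ordinary sample mean and maximum-likelihood covariance. Unlike the EM estimate, the pairwise covariance matrix from `available_case` is not guaranteed to be positive semidefinite.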