Data distribution and transformation in population based sampling survey of viral load in HIV positive men who have sex with men in China
10.3760/cma.j.issn.0254-6450.2017.11.011
- VernacularTitle:中国MSM人群HIV感染者病毒载量抽样调查数据分布特征及数据转换研究
- Author:Zhi DOU 1; Jun CHEN; Zhen JIANG; Weilu SONG; Jie XU; Zunyou WU
Author Information
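1. Division of Prevention and Intervention, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention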
- Keywords:Human immunodeficiency virus; Viral load; Distribution characteristics
- From:Chinese Journal of Epidemiology 2017;38(11):1494-1498
- Country:China
- Language:Chinese
- Abstract:
Objective: To characterize the distribution of population viral load (PVL) data in HIV-infected men who have sex with men (MSM), fit a distribution function, and identify appropriate parameters for estimating PVL.
Methods: The detection limit of viral load (VL) was ≤ 50 copies/ml. Box-Cox transformation and normality tests were used to describe the distribution of the original and transformed PVL data; a stable distribution function was then fitted and assessed with a goodness-of-fit test.
Results: The original PVL data followed a skewed distribution with a coefficient of variation of 622.24%, and were multimodal after Box-Cox transformation with the optimal parameter λ = -0.11. The PVL data above the detection limit remained skewed and heavy-tailed after Box-Cox transformation with the optimal λ = 0 (equivalent to a log transformation). The transformed data above the detection limit fitted a stable distribution (SD) function (α=1.70, β=-1.00, γ=0.78, δ=4.03).
Conclusions: The original PVL data included censored values below the detection limit, and the data above the detection limit were non-normally distributed with a large degree of variation. When the proportion of censored data is large, replacing the censored values with half the detection limit is inappropriate. The log-transformed data above the detection limit fitted the SD. The median (M) and interquartile range (IQR) of the log-transformed data can be used to describe the central tendency and dispersion of the data above the detection limit.
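The summary statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the example values, and the choice of the natural logarithm and of the "inclusive" quantile method are all assumptions made for illustration (the abstract does not state the log base or the quantile convention). Values at or below 50 copies/ml are treated as censored and excluded, then the Box-Cox transform with λ = 0 is applied and the median and IQR are computed.

```python
import math
import statistics

DETECTION_LIMIT = 50.0  # copies/ml, as stated in the abstract


def box_cox(x, lam):
    """Box-Cox transform of one positive value; lam == 0 gives the natural log."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def summarize_detectable(viral_loads, limit=DETECTION_LIMIT):
    """Median and IQR of log-transformed values above the detection limit."""
    detectable = [v for v in viral_loads if v > limit]
    logged = [box_cox(v, 0) for v in detectable]
    # Quartile convention is an assumption; other methods give slightly
    # different cut points on small samples.
    q1, median, q3 = statistics.quantiles(logged, n=4, method="inclusive")
    return {
        "n_censored": len(viral_loads) - len(detectable),
        "median": median,
        "iqr": q3 - q1,
    }


# Hypothetical viral loads (copies/ml), invented for illustration only.
example = [20, 50, 35, 120, 800, 4500, 15000, 62000, 310000]
summary = summarize_detectable(example)
print(summary)
print(f"CV of raw data: {coefficient_of_variation(example):.1f}%")
```

Using log base 10 instead of the natural log would rescale the median and IQR by a constant factor without changing the shape of the distribution. Fitting the stable distribution itself is not shown here, since the parameterization used in the study is not given in the abstract.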