Biological and medical researchers often collect count data in clusters at multiple time points. Such data can exhibit excess zeros and a wide range of dispersion levels. Our research was motivated by a dental dataset with these features: the Iowa Fluoride Study (IFS), which was designed to investigate the effects of various dietary and nondietary factors on caries development in a cohort of Iowa school children at ages 5, 9, and 13. To analyze the multiyear IFS data, we propose a novel longitudinal marginal regression model based on generalized estimating equations. The model is zero‐inflated with a Conway–Maxwell–Poisson (CMP) count distribution, which is flexible enough to accommodate all levels of dispersion (underdispersion, equidispersion, and overdispersion). The parameters of interest are estimated with a modified expectation–solution algorithm that accounts for the clustered and temporal correlation structure. We fit the proposed zero‐inflated CMP model in a comprehensive secondary analysis of the IFS dataset, which yields a number of notable conclusions that also make clinical sense. We also show that this modeling approach outperforms two popular competing models: the zero‐inflated Poisson and negative binomial models. In simulation studies, we further evaluate the performance of the point estimators, the variance estimators, and the large‐sample confidence intervals for the parameters of interest, and we demonstrate that the longitudinal CMP model can correctly identify time‐varying dispersion patterns.
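To make the zero‐inflated CMP distribution concrete, the following is a minimal sketch of its probability mass function. It assumes the standard CMP parameterization, P(Y = y) proportional to lambda^y / (y!)^nu, with a point mass at zero mixed in with probability pi. The function names, the parameter names (lam, nu, pi), and the truncation of the normalizing series at max_terms are illustrative assumptions, not taken from the paper, and the sketch does not cover the estimating equations or the expectation–solution algorithm.

```python
import math


def cmp_log_weight(y, lam, nu):
    """Unnormalized log-probability of a CMP count: y*log(lam) - nu*log(y!)."""
    return y * math.log(lam) - nu * math.lgamma(y + 1)


def cmp_pmf(y, lam, nu, max_terms=200):
    """CMP probability of y, assuming lam > 0 and nu > 0.

    The normalizing constant is an infinite series; here it is truncated
    at max_terms (an assumption that is adequate only when the terms
    decay quickly, e.g. moderate lam and nu not close to zero).
    """
    log_w = [cmp_log_weight(j, lam, nu) for j in range(max_terms)]
    m = max(log_w)
    # log-sum-exp for numerical stability
    log_z = m + math.log(sum(math.exp(w - m) for w in log_w))
    return math.exp(cmp_log_weight(y, lam, nu) - log_z)


def zicmp_pmf(y, lam, nu, pi, max_terms=200):
    """Zero-inflated CMP: structural zero with probability pi, else CMP."""
    base = (1.0 - pi) * cmp_pmf(y, lam, nu, max_terms)
    return pi + base if y == 0 else base
```

Under this parameterization, nu = 1 recovers the Poisson distribution, nu < 1 gives overdispersion, and nu > 1 gives underdispersion, which is the flexibility the abstract refers to.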