Background: Several studies have estimated the prevalence of cigarette smoking in Iran, but none used a statistical model to account for unobserved smokers. The present study aimed to estimate the prevalence of cigarette smoking more accurately using a mixture of truncated Poisson distributions.
Methods: A cross-sectional study was conducted in 2009 in Hamadan, western Iran. Using cluster sampling, 1146 men and women aged ≥18 years were enrolled. Data were collected by an expert group of psychologists and sociologists. A mixture of truncated Poisson distributions was fitted to the number of cigarettes smoked daily by smokers. The number of mixture components, together with their means and weights, was determined using the Bayesian information criterion. From the fitted model, the number of cigarette smokers who answered the relevant question incorrectly was estimated. To assess the validity of the results, a simulation study was conducted using CAMCR software.
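The fitting and correction steps described in the Methods can be illustrated with a minimal sketch. It assumes an EM algorithm for a zero-truncated Poisson mixture and a Horvitz-Thompson-type population-size estimator; the abstract does not state which fitting algorithm or estimator was used, and CAMCR may work differently. The function names (`fit_mixture`, `estimate_population`, and the helpers) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def log_ztp(x, lam):
    """Log-probability of a count x >= 1 under a zero-truncated Poisson(lam)."""
    return (x * math.log(lam) - lam - math.lgamma(x + 1)
            - math.log1p(-math.exp(-lam)))


def solve_truncated_mean(xbar, iters=200):
    """Solve lam / (1 - exp(-lam)) = xbar by fixed-point iteration."""
    lam = max(xbar, 1e-6)
    for _ in range(iters):
        lam = xbar * (1.0 - math.exp(-lam))
    return max(lam, 1e-6)  # guard against collapse to zero


def component_logs(x, weights, lams):
    """Per-component log(weight * density) and their log-sum for one count."""
    logs = [math.log(max(w, 1e-300)) + log_ztp(x, l)
            for w, l in zip(weights, lams)]
    m = max(logs)
    return logs, m + math.log(sum(math.exp(v - m) for v in logs))


def fit_mixture(counts, k, iters=300):
    """EM fit of a k-component zero-truncated Poisson mixture (assumed method)."""
    freq = Counter(counts)          # work on the frequency table of counts
    xs = sorted(freq)
    n = len(counts)
    lo, hi = xs[0], xs[-1]
    lams = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]  # spread starts
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp_sum = [0.0] * k        # sum of responsibilities per component
        resp_x = [0.0] * k          # responsibility-weighted sum of counts
        for x in xs:
            logs, total = component_logs(x, weights, lams)
            for j in range(k):
                r = freq[x] * math.exp(logs[j] - total)   # E-step
                resp_sum[j] += r
                resp_x[j] += r * x
        for j in range(k):                                # M-step
            weights[j] = resp_sum[j] / n
            lams[j] = solve_truncated_mean(resp_x[j] / max(resp_sum[j], 1e-300))
    loglik = sum(freq[x] * component_logs(x, weights, lams)[1] for x in xs)
    n_params = 2 * k - 1            # k means plus k - 1 free weights
    bic = -2.0 * loglik + n_params * math.log(n)
    return {"weights": weights, "lams": lams, "loglik": loglik, "bic": bic}


def estimate_population(n_observed, weights, lams):
    """Horvitz-Thompson-type estimate of total size, including hidden zeros."""
    return n_observed * sum(w / (1.0 - math.exp(-l))
                            for w, l in zip(weights, lams))
```

In this sketch, one would call `fit_mixture` for several values of `k`, keep the fit with the smallest `bic`, and pass its weights and means to `estimate_population`; the estimated number of unobserved smokers is then the estimated total minus the number of observed smokers. The weights here are on the truncated scale, which is why each is divided by the probability of a nonzero count for its component.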
Results: A four-component mixture of Poisson distributions provided the best fit to the count data. After correction for underestimation, the prevalence of cigarette smoking in the population was 20.6% (36.2% in men and 3.3% in women). In the simulation study, the bias of the estimated prevalence was close to zero and the root mean square error was estimated at 2.5.
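For reference, the two simulation summaries reported above are conventionally defined as follows. The notation is an assumption introduced here, not taken from the study: $R$ is the number of simulation replicates, $p$ the true prevalence used to generate the data, and $\hat{p}_r$ the estimate from replicate $r$.

```latex
\mathrm{Bias}(\hat{p}) = \frac{1}{R}\sum_{r=1}^{R}\left(\hat{p}_r - p\right),
\qquad
\mathrm{RMSE}(\hat{p}) = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\left(\hat{p}_r - p\right)^{2}}
```

A bias near zero indicates that the estimator is centered on the true value, while the root mean square error also reflects the spread of the estimates around it.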
Conclusion: The number of unobserved cases can be estimated by fitting a model to truncated count data. A mixture of truncated Poisson distributions is particularly useful for estimating population size when the main objective of a study is to investigate negative traits about which participants may answer incorrectly.