Evaluation with statistical quality measures
A. Degree of missingness
B. Performance of imputation method
C. Accuracy of imputation results
D. Variability of statistics based on the imputed dataset
Figure 1. Missing data patterns for standard micro (=observation by variable) data
Figure 2. Missing data patterns for multivariate time series and univariate cross-sectional time series
Figure 3. Missing data patterns for multivariate cross-sectional time series
analysis”), a procedure that simply excludes all observation units with missing values from further analysis, or similar approaches are used instead of proper imputation techniques. With these procedures, a large share of information is lost, and biased estimates are a frequent consequence. Researchers have repeatedly demonstrated that estimates based on imputed datasets outperform estimates based on reduced datasets that ignore observation units and/or variables with missing values, irrespective of the underlying imputation method (e.g. Colledge et al., 1978; Little
International organizations collect data from national authorities to create multivariate cross-sectional time series for their analyses. Because data from countries whose statistical systems are not yet well established may be incomplete, bridging data gaps is a crucial challenge. This paper investigates data structures and missing data patterns in the cross-sectional time series framework, reviews missing value imputation techniques used for micro data in official statistics, and discusses their applicability to cross-sectional time series. It presents statistical methods and quality indicators that enable the (comparative) evaluation of imputation processes and completed datasets.
where data are missing through multiply-imputed datasets estimated by a linear regression. The distinguishing characteristic of the Multiple Imputation method is that, as its name suggests, instead of imputing a single point estimate for a missing data point, it produces a set of plausible estimates, building the uncertainty associated with the missing data into the ultimate estimate (Rubin 1987). Once these plausible datasets are produced, the results from them are aggregated, often by averaging, to produce the final estimate.
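The procedure described above can be illustrated with a minimal sketch. It is not the implementation used in the chapter. It assumes a single fully observed predictor `x`, a target `y` with missing entries coded as NaN, and the mean of `y` as the statistic of interest. The function name `multiple_impute_mean` and its parameters are hypothetical. For brevity, each imputation adds only residual noise around one fitted regression line; a fully proper multiple imputation would also redraw the regression parameters for every completed dataset. The variance is pooled in the usual way, as the within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import numpy as np


def multiple_impute_mean(x, y, m=20, seed=0):
    """Estimate the mean of y (NaN = missing) from m regression-imputed datasets.

    Simplified illustration: only residual noise is drawn per imputation;
    the regression coefficients are held fixed (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = ~np.isnan(y)
    n_obs, n_mis = int(obs.sum()), int((~obs).sum())

    # Fit y = b0 + b1 * x on the observed cases only.
    X_obs = np.column_stack([np.ones(n_obs), x[obs]])
    X_mis = np.column_stack([np.ones(n_mis), x[~obs]])
    beta, *_ = np.linalg.lstsq(X_obs, y[obs], rcond=None)
    resid = y[obs] - X_obs @ beta
    sigma = np.sqrt(resid @ resid / (n_obs - 2))

    estimates, variances = [], []
    for _ in range(m):
        # One plausible completed dataset: prediction plus random noise.
        y_imp = y.copy()
        y_imp[~obs] = X_mis @ beta + rng.normal(0.0, sigma, size=n_mis)
        estimates.append(y_imp.mean())
        variances.append(y_imp.var(ddof=1) / y_imp.size)

    estimates = np.asarray(estimates)
    pooled = estimates.mean()            # final estimate: average over datasets
    within = float(np.mean(variances))   # average sampling variance
    between = estimates.var(ddof=1)      # spread caused by the missing data
    total_var = within + (1.0 + 1.0 / m) * between
    return pooled, total_var


# Usage with synthetic data (values are illustrative only).
rng = np.random.default_rng(42)
x_demo = rng.uniform(0.0, 10.0, size=200)
y_demo = 1.0 + 2.0 * x_demo + rng.normal(0.0, 1.0, size=200)
y_demo[rng.random(200) < 0.3] = np.nan   # remove roughly 30% of the values
estimate, variance = multiple_impute_mean(x_demo, y_demo, m=20, seed=1)
print(f"pooled mean: {estimate:.3f}, pooled variance: {variance:.5f}")
```

The between-imputation term is what distinguishes this from single imputation: it carries the uncertainty caused by the missing values into the variance of the final estimate.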
The chapter uses Predictive Mean