Abstract
Missing data frequently occurs in quantitative social research. For example, in a survey of individuals, some of those selected for interview will not agree to participate (unit non-response) and others who do agree to be interviewed will not always answer all the questions (item non-response).
At its most benign, missing data reduces the achieved sample size, and consequently the precision of estimates. However, missing data can also result in biased inferences about outcomes and relationships of interest. Broadly, if the underlying, unseen responses of individuals in the survey frame who have one or more missing responses differ systematically from the responses of individuals whose responses are all observed, then any analysis restricted to the fully observed subset risks producing biased inferences for the target population.
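The bias described above can be illustrated with a small simulation. The sketch below is not taken from the paper: the outcome distribution, the response probabilities and the function name `simulate_complete_case_bias` are all assumptions chosen for illustration. It generates a hypothetical outcome for every individual in a survey frame, lets the chance of responding depend on the unseen value of that outcome, and compares the mean over the whole frame with the mean over respondents only.

```python
import random
import statistics


def simulate_complete_case_bias(n=20000, seed=0):
    """Compare the full-frame mean with the respondents-only mean.

    All numbers here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    # Hypothetical outcome for every individual in the survey frame.
    outcome = [rng.gauss(50.0, 10.0) for _ in range(n)]

    # Assumed missingness mechanism: individuals with an outcome of 50 or
    # more respond with probability 0.5, the others with probability 0.9,
    # so non-response depends on the unseen value itself.
    observed = [
        y for y in outcome
        if rng.random() < (0.9 if y < 50.0 else 0.5)
    ]

    full_mean = statistics.fmean(outcome)
    complete_case_mean = statistics.fmean(observed)
    return full_mean, complete_case_mean, len(observed)


full_mean, cc_mean, n_obs = simulate_complete_case_bias()
print(f"mean over the whole frame:   {full_mean:.2f}")
print(f"mean over respondents only:  {cc_mean:.2f}")
print(f"respondents: {n_obs} of 20000")
```

Under these assumptions the respondents-only mean falls below the mean over the whole frame, because individuals with higher values are less likely to respond. The smaller number of respondents also shows the loss of sample size, and hence precision, that occurs even when no bias is present.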
Thus every researcher needs to take seriously the potential consequences of missing data. This paper describes the use of Multiple Imputation (MI) to correct estimates for missing data under a general assumption about the cause of, or reason for, the missing data, generally termed the missingness mechanism. MI has robust theoretical properties while being flexible, generalisable and readily available in a range of statistical software.