/*******************
imputey.do

Example file exploring the question of imputing the dependent variable
in a regression

Written by Russell Dimond, Summer 2012
for the Social Science Computing Cooperative at UW-Madison
********************/

clear all
set more off

// generate random data
// x1-x3 drawn independently from standard normal distribution
// y is sum of x's, plus normal error term
set obs 10000
set seed 4409
forval i=1/3 {
	gen x`i'=invnorm(runiform())
}
egen y=rowtotal(x*)
replace y=y+invnorm(runiform())

// drop values at random
foreach var of varlist * {
	replace `var'=. if runiform()<.2
}
misstable sum, gen(miss_)

// complete cases analysis
reg y x*

preserve

// leave y out of imputation model
mi set wide
mi register imputed x*
mi register regular y miss_*
mi impute chained (regress) x*, add(10)

// note coefficients biased towards zero
mi estimate: reg y x*

// correlation of y and x1 in the observed data (m=0),
// then among the imputed values of x1 in imputations 1-5
mi xeq 0: cor y x1
mi xeq 1/5: cor y x1 if miss_x1

restore
preserve

// include y in imputation model
mi set wide
mi register imputed y x*
mi register regular miss_*
mi impute chained (regress) x* y, add(10)

// don't use imputed values of y in analysis
mi estimate: reg y x* if !miss_y

// use imputed values of y
mi estimate: reg y x*

// correlation of y and x1 in the observed data (m=0),
// then among the imputed values of x1 in imputations 1-5
mi xeq 0: cor y x1
mi xeq 1/5: cor y x1 if miss_x1

restore
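
// Optional benchmark (a sketch added for illustration, not part of the
// original example): regenerate the same data with the same seed, skip
// the step that drops values, and fit the regression to the full data.
// Because y is built as x1+x2+x3 plus a standard normal error, each
// coefficient should come out close to 1, which gives a reference point
// for the complete cases and imputed estimates above.
// Note that this replaces the data currently in memory.
clear
set obs 10000
set seed 4409
forval i=1/3 {
	gen x`i'=invnorm(runiform())
}
egen y=rowtotal(x*)
replace y=y+invnorm(runiform())

// regression on the full data, before any values are set to missing
reg y x*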