2/17/2016 - Jann Spiess (Harvard) - Robust Post-Matching Inference
Publication information:
Abstract
Title: Robust Post-Matching Inference
Abstract:
Nearest-neighbor matching (Cochran, 1953; Rubin, 1973) is a popular nonparametric tool to create balance between treatment and control groups in non-experimental data. As a preprocessing step for regression analysis, it reduces the dependence on parametric modeling assumptions (Ho et al., 2007). In this paper, we show how to obtain valid standard error estimates for linear regression after nearest-neighbor matching without replacement. We show that standard error estimates that ignore the matching step are not generally valid if the regression model is misspecified, and can lead to severe over- or underestimation of the asymptotic variance of the post-matching estimator. As a remedy, we offer two easily implementable tools for inference that are robust to misspecification: First, we show that standard errors clustered at the match level are valid. Second, we show that a simple nonparametric block bootstrap procedure yields a valid approximation of the distribution of the post-matching estimator. A simulation study and an empirical example demonstrate the importance and usefulness of our results.
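The two remedies named in the abstract, standard errors clustered at the match level and a block bootstrap over matched sets, can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation. It assumes greedy 1:1 matching on a single scalar covariate, simulated data, a cluster-robust sandwich variance without finite-sample correction, and a bootstrap that resamples whole matched pairs while holding the matching fixed. All function and variable names (`greedy_match`, `cluster_se`, `pair_bootstrap_se`) are hypothetical.

```python
import numpy as np

def greedy_match(x_treated, x_control):
    """Greedy 1:1 nearest-neighbor matching without replacement (scalar covariate)."""
    available = np.ones(len(x_control), dtype=bool)
    matches = np.empty(len(x_treated), dtype=int)
    for i, x in enumerate(x_treated):
        dist = np.abs(x_control - x)
        dist[~available] = np.inf      # controls already used cannot be reused
        j = int(np.argmin(dist))
        matches[i] = j
        available[j] = False
    return matches

def ols(X, y):
    """Least-squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def cluster_se(X, y, clusters):
    """OLS with a cluster-robust sandwich variance (no small-sample correction)."""
    beta = ols(X, y)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]  # summed score within one matched pair
        meat += np.outer(score, score)
    V = bread @ meat @ bread
    return beta, np.sqrt(np.diag(V))

def pair_bootstrap_se(X, y, clusters, n_boot=500, seed=0):
    """Block bootstrap: resample matched pairs as blocks, refit OLS each time."""
    rng = np.random.default_rng(seed)
    ids = np.unique(clusters)
    rows_by_pair = [np.flatnonzero(clusters == g) for g in ids]
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        sampled = rng.integers(0, len(ids), size=len(ids))
        rows = np.concatenate([rows_by_pair[k] for k in sampled])
        draws[b] = ols(X[rows], y[rows])
    return draws.std(axis=0, ddof=1)

# Simulated data (assumption): the outcome is nonlinear in x, so the
# linear regression fitted below is misspecified.
rng = np.random.default_rng(42)
n_t, n_c = 200, 600
x_t = rng.normal(0.5, 1.0, n_t)
x_c = rng.normal(0.0, 1.0, n_c)
y_t = 1.0 + x_t + 0.5 * x_t**2 + rng.normal(size=n_t)
y_c = x_c + 0.5 * x_c**2 + rng.normal(size=n_c)

m = greedy_match(x_t, x_c)                        # matched control per treated unit
pair_id = np.concatenate([np.arange(n_t), np.arange(n_t)])
d = np.concatenate([np.ones(n_t), np.zeros(n_t)])
x = np.concatenate([x_t, x_c[m]])
y = np.concatenate([y_t, y_c[m]])
X = np.column_stack([np.ones(2 * n_t), d, x])     # intercept, treatment, covariate

beta, se_cluster = cluster_se(X, y, pair_id)
se_boot = pair_bootstrap_se(X, y, pair_id)
print(f"treatment coefficient: {beta[1]:.3f}")
print(f"pair-clustered SE:     {se_cluster[1]:.3f}")
print(f"pair-bootstrap SE:     {se_boot[1]:.3f}")
```

In this sketch both procedures treat the matched pair, rather than the individual observation, as the sampling unit, which is the idea the abstract describes. The exact estimators and the conditions under which they are valid are those stated in the paper, not in this illustration.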
Video: Applied Stats 2/17/16 - Jann Spiess (YouTube)