Ying Jin presents Quantifying the role of distribution shift in effect generalization with scientific replication data

Publication information:

Ying Jin presents Quantifying the role of distribution shift in effect generalization with scientific replication data. 2024.

Abstract

Effect estimates from executions of the same experiment often differ, not only because of statistical uncertainty but also because of various distribution shifts. These shifts can arise from differences in participant characteristics, changes in intermediate variables, or variations in how participants respond to a treatment. Understanding their impact on effect estimates is crucial for scientific understanding, result reporting, and informing study design for effect generalization.

In this talk, I will introduce some recent advances in quantifying the role of distribution shift in effect generalization. First, I will present a methodological framework that quantifies the contribution of distribution shifts to the effect discrepancy between paired studies, motivated by the need for such understanding in scientific replications. The framework decomposes the effect discrepancy into additive contributions of sampling uncertainty, covariate shift, mediation shift, and residual shift. It then estimates these quantities using generalizability and post-selection inference techniques. When applied to psychological experiments, our methods provide differentiated insights into the factors driving the effect discrepancies between original studies and their replications.
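To make the idea of an additive decomposition concrete, here is a minimal population-level sketch in Python. It is an illustration under strong simplifying assumptions, not the estimator from the paper: it uses a single discrete covariate, treats the covariate distributions and conditional effects as known, and therefore has no sampling-uncertainty term. It also omits the mediation-shift term. The function name `decompose_discrepancy`, the strata labels, and all numbers are hypothetical.

```python
def decompose_discrepancy(p1, p2, tau1, tau2):
    """Split theta2 - theta1 into covariate-shift and residual-shift parts.

    p1, p2     : dict mapping stratum -> covariate probability in study 1 / 2
    tau1, tau2 : dict mapping stratum -> conditional treatment effect in study 1 / 2
    All inputs are assumed known (hypothetical population quantities).
    """
    strata = sorted(p1)
    # Average effect in each study: conditional effects weighted by that
    # study's own covariate distribution.
    theta1 = sum(p1[x] * tau1[x] for x in strata)
    theta2 = sum(p2[x] * tau2[x] for x in strata)
    # Effect study 1 would show if it had study 2's covariate distribution.
    theta1_reweighted = sum(p2[x] * tau1[x] for x in strata)
    return {
        "total": theta2 - theta1,
        # Change explained by the covariate mix alone.
        "covariate_shift": theta1_reweighted - theta1,
        # Remaining change, due to different conditional effects.
        "residual_shift": theta2 - theta1_reweighted,
    }


# Hypothetical original study and replication.
p_original = {"old": 0.25, "young": 0.75}
p_replication = {"old": 0.75, "young": 0.25}
tau_original = {"old": 0.5, "young": 1.0}
tau_replication = {"old": 0.0, "young": 1.0}

parts = decompose_discrepancy(p_original, p_replication, tau_original, tau_replication)
print(parts)
```

By construction the two components sum to the total discrepancy, because the reweighted effect is added and subtracted once. In practice every quantity here would have to be estimated from data, which is where the sampling-uncertainty term and the inference techniques mentioned above come in.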

In addition, I will discuss findings from ongoing work on the nature of distribution shifts between studies, using multi-site replication data. Such data provides valuable insights because it contains a large number of replications with minimal publication bias. Our results shed light on the relationship between observable covariate shift and unknown conditional shift in the variability of experimental outcomes.

The paper is available on arXiv: https://arxiv.org/abs/2309.01056

