Luis Costa, Vivek F. Farias, Patricio Foncea, Jingyuan (Donna) Gan, Ayush Garg, Ivo Rosa Montenegro, Kumarjit Pathak, Tianyi Peng, and Dusan Popovic, Generalized Synthetic Control for TestOps at ABI: Models, Algorithms, and Infrastructure, To appear in INFORMS Journal on Applied Analytics (Winner, Daniel H. Wagner Prize 2022)
We describe a novel optimization-based approach, Generalized Synthetic Control (GSC), to learning from experiments conducted in the world of physical retail. GSC solves the long-standing problem of learning from physical retail experiments when treatment effects are small, the environment is highly noisy and non-stationary, and interference and adherence problems are commonplace. The use of GSC has been shown to yield an approximately 100x increase in power relative to typical inferential methods and forms the basis of a new large-scale testing platform, 'TestOps'. TestOps was developed and has been broadly implemented as part of a collaboration between Anheuser-Busch InBev (ABI) and an MIT team of operations researchers and data engineers. TestOps currently runs physical experiments impacting approximately 135M USD in revenue every month and routinely identifies innovations that result in a 1-2% increase in sales volume. The vast majority of these innovations would have remained unidentified absent our novel approach to inference: prior to our implementation, statistically significant conclusions could be drawn on only about 6% of all experiments, a fraction that has now increased by over an order of magnitude.
Our contribution, compared to the state-of-the-art experimental platforms available commercially today (such as the MasterCard APT, Optimizely or Adobe platforms), is thus three-fold:
- (Much) Higher Power: Our efficient non-convex panel data solver yields an effectively 100x increase in power relative to the standard difference-in-differences (DID) approach inherent in commercial experimentation platforms. Indeed, because the control data and portions of the test data are routinely corrupted, there was no obvious inference method outside of DID, available on a commercial platform, that would have applied prior to this implementation.
- Far Fewer Assumptions: Relative to existing techniques such as DID, our approach dispenses with the need for the so-called parallel trends assumption, which is very difficult to justify in the physical retail environment. Our approach also dispenses with the need to assume independence in the observed noise and can flexibly accommodate complex temporal and cross-sectional correlations as well as endogenous treatment assignments (existing sophisticated alternatives to DID do not allow for this). Both of these features give a great deal of confidence in inferential results that would otherwise be questioned given the high-stakes nature of a rollout.
- Robust Optimization: Our differentiation is facilitated by viewing the inherent estimation task through the lens of robust optimization. This is a first on a commercially deployed platform, and we believe this is a lens that is particularly useful and interpretable in business environments such as ABI.
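To make the contrast with DID concrete, the following is a minimal illustrative sketch on simulated data. It is not the GSC method or the TestOps solver described above. It compares a plain DID estimate with a basic synthetic-control-style estimate that fits unconstrained least-squares weights on pre-period control data (a simplification of the classical simplex-constrained synthetic control). The function names, the two-factor data-generating process, and every parameter value are assumptions made purely for illustration. Treated units are given a steeper trend than controls, so parallel trends fails by construction.

```python
import numpy as np


def simulate_panel(n_ctrl=20, n_treated=5, T=60, T0=40, tau=0.05, seed=0):
    """Hypothetical two-factor panel; treated units trend more steeply than controls."""
    rng = np.random.default_rng(seed)
    n = n_ctrl + n_treated
    t = np.arange(T)
    f = t / T                           # common upward trend
    g = np.sin(2 * np.pi * t / 12)      # common seasonal pattern
    a = rng.uniform(0.5, 1.5, size=n)   # unit loadings on the trend
    b = rng.normal(0.0, 0.5, size=n)    # unit loadings on seasonality
    treated = np.arange(n_ctrl, n)
    a[treated] = 2.0                    # violates parallel trends on purpose
    Y = np.outer(a, f) + np.outer(b, g) + rng.normal(0.0, 0.1, size=(n, T))
    Y[treated, T0:] += tau              # small additive treatment effect after T0
    return Y, treated


def did_estimate(Y, treated, T0):
    """Difference-in-differences on group means (relies on parallel trends)."""
    ctrl = np.setdiff1d(np.arange(Y.shape[0]), treated)
    d_treated = Y[treated, T0:].mean() - Y[treated, :T0].mean()
    d_ctrl = Y[ctrl, T0:].mean() - Y[ctrl, :T0].mean()
    return d_treated - d_ctrl


def synthetic_control_estimate(Y, treated, T0):
    """Fit control weights on the pre-period, then compare against the post-period."""
    ctrl = np.setdiff1d(np.arange(Y.shape[0]), treated)
    y_pre = Y[treated, :T0].mean(axis=0)    # average treated series, pre-period
    X_pre = Y[ctrl, :T0].T                  # shape (T0, n_ctrl)
    # Unconstrained least squares; classical synthetic control would instead
    # restrict the weights to be non-negative and sum to one.
    w, *_ = np.linalg.lstsq(X_pre, y_pre, rcond=None)
    y_post_hat = Y[ctrl, T0:].T @ w         # counterfactual for the treated average
    return (Y[treated, T0:].mean(axis=0) - y_post_hat).mean()


TAU = 0.05
Y, treated = simulate_panel(tau=TAU)
did = did_estimate(Y, treated, T0=40)
sc = synthetic_control_estimate(Y, treated, T0=40)
print(f"true effect = {TAU:.3f}, DID = {did:.3f}, synthetic control = {sc:.3f}")
```

In this toy setting the DID estimate absorbs the difference in trends between treated and control units, while the weighted combination of controls tracks the treated units' trajectory and leaves a much smaller error. The sketch ignores the corrupted data, interference, and endogenous assignment issues that the approach above is designed to handle.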