Jacob Feldman, Dennis J. Zhang, Xiaofei Liu, and Nannan Zhang (2021) Customer Choice Models vs. Machine Learning: Finding Optimal Product Displays on Alibaba. Operations Research 70(1):309-328. (Best OM Paper in Operations Research Award: Finalist, pdf, implementation details)
We compare the performance of two approaches for finding the optimal set of products to display to customers landing on Alibaba’s two online marketplaces, Tmall and Taobao. We conducted a large-scale field experiment, in which we randomly assigned 10,421,649 customer visits during a one-week-long period to one of the two approaches and measured the revenue generated per customer visit. The first approach we tested was Alibaba’s current practice, which embeds product and customer features within a sophisticated machine-learning algorithm to estimate the purchase probabilities of each product for the customer at hand. The products with the largest expected revenue (revenue × predicted purchase probability) are then made available for purchase. Our second approach, which we developed and implemented in collaboration with Alibaba engineers, uses a featurized multinomial logit (MNL) model to predict purchase probabilities for each arriving customer. We used historical sales data to fit the MNL model, and then, for each arriving customer, we solved a cardinality-constrained assortment-optimization problem under the MNL model to find the optimal set of products to display. Our field experiments revealed that the MNL-based approach generated 5.17 renminbi (RMB) per customer visit, compared with the 4.04 RMB per customer visit generated by the machine-learning-based approach when both approaches were given access to the same set of the 25 most important features. This improvement represents a 28% gain in revenue per customer visit, which corresponds to a 4 million RMB improvement over the week in which the experiments were conducted. Motivated by the results of our initial field experiment, Alibaba then implemented a full-featured version of our MNL-based approach, which now serves the majority of customers in this setting. 
Using another small-scale field experiment, we estimate that our new MNL-based approach that utilizes the full feature set is able to increase Alibaba’s annual revenue by 87.26 million RMB (12.42 million U.S. dollars).
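To make the contrast between the two approaches concrete, here is a minimal sketch; it is not the authors' implementation. It assumes a plain (non-featurized) MNL model with the no-purchase weight normalized to 1, uses made-up revenues and preference weights, and finds the best assortment by brute force, which is only practical for a handful of products. All function names are hypothetical, and the ranking baseline uses each product's standalone MNL purchase probability as a stand-in for the machine-learning prediction.

```python
from itertools import combinations


def mnl_revenue(assortment, revenues, weights):
    """Expected revenue per visit under MNL (no-purchase weight fixed at 1)."""
    denom = 1.0 + sum(weights[i] for i in assortment)
    return sum(revenues[i] * weights[i] for i in assortment) / denom


def best_assortment(revenues, weights, k):
    """Brute-force search over all assortments with 1..k products."""
    best_set, best_rev = (), 0.0
    for size in range(1, k + 1):
        for s in combinations(range(len(revenues)), size):
            rev = mnl_revenue(s, revenues, weights)
            if rev > best_rev:
                best_set, best_rev = s, rev
    return best_set, best_rev


def top_k_by_expected_revenue(revenues, weights, k):
    """Rank by revenue x standalone purchase probability and keep the top k."""
    def score(i):
        return revenues[i] * weights[i] / (1.0 + weights[i])
    ranked = sorted(range(len(revenues)), key=score, reverse=True)
    return tuple(sorted(ranked[:k]))


# Toy data (assumed, not from the paper).
revenues = [10.0, 4.0, 3.0]
weights = [1.0, 5.0, 5.0]

ranked_set = top_k_by_expected_revenue(revenues, weights, k=2)
optimal_set, optimal_rev = best_assortment(revenues, weights, k=2)
print(ranked_set, mnl_revenue(ranked_set, revenues, weights))
print(optimal_set, optimal_rev)
```

In this toy instance the ranking rule displays products 0 and 1 and earns about 4.29 per visit, while the assortment optimizer displays product 0 alone and earns 5.0, because adding the cheaper but more attractive product 1 pulls purchases away from product 0.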
Research questions:
The necessity to better understand this trade-off with regard to the wealth of existing choice models has given rise to two general research problems that over the last decade have guided much of the work in the field of revenue management. These two problems are summarized below.
- Estimation: Can a choice model's parameters be efficiently and accurately estimated from historical data?
- Assortment Optimization: Given a fully specified customer choice model, is it possible to develop efficient algorithms with provable performance guarantees for the uncapacitated assortment problem and variants thereof?
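As an illustration of the estimation problem, the sketch below fits a plain MNL model by maximum likelihood using gradient ascent. It is a simplification under stated assumptions: the paper uses a featurized MNL model, whereas here each product gets its own utility parameter, the no-purchase weight is fixed at 1, and the data, step size, and function name `fit_mnl` are all made up for illustration.

```python
import math


def fit_mnl(observations, n_products, steps=2000, lr=1.0):
    """Gradient ascent on the average MNL log-likelihood.

    observations: list of (offered, chosen) pairs, where `offered` is a tuple
    of product indices and `chosen` is a product index or None (no purchase).
    Returns the estimated preference weights v_i = exp(u_i).
    """
    u = [0.0] * n_products
    for _ in range(steps):
        grad = [0.0] * n_products
        for offered, chosen in observations:
            denom = 1.0 + sum(math.exp(u[j]) for j in offered)
            for j in offered:
                grad[j] -= math.exp(u[j]) / denom  # minus predicted probability
            if chosen is not None:
                grad[chosen] += 1.0  # plus observed choice
        u = [u_i + lr * g / len(observations) for u_i, g in zip(u, grad)]
    return [math.exp(u_i) for u_i in u]


# Toy history (assumed): products 0 and 1 shown together in 10 visits,
# with 5 purchases of product 0, 3 of product 1, and 2 visits with no purchase.
history = [((0, 1), 0)] * 5 + [((0, 1), 1)] * 3 + [((0, 1), None)] * 2
weights = fit_mnl(history, n_products=2)
print(weights)
```

With these counts the estimates should approach 2.5 and 1.5, the weights that reproduce the observed choice shares of 50%, 30%, and 20%.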
Machine-learning algorithms
The current system implements various models and ensembles their predictions together for both estimation problems. These models include regularized logistic regression (Ravikumar et al. 2010), gradient-boosted decision trees (Friedman 2002), and deep learning (LeCun et al. 2015). As of the time our system is deployed (i.e., March 2018), regularized logistic regression and gradient-boosted decision trees contribute the most to the final prediction outcome due to their superior prediction performance compared to that of deep neural networks. The implementation of these machine learning algorithms is conducted offline using historical purchases from a seven-day rolling window.
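The seven-day rolling window can be pictured with a small sketch. This is only an assumption about how such a window might be selected; the record layout, the function name `rolling_window`, and the choice to exclude the training day itself are hypothetical and not taken from the paper.

```python
from datetime import date, timedelta


def rolling_window(records, as_of, days=7):
    """Keep records dated within the `days` days before `as_of`.

    Assumption: the window covers [as_of - days, as_of), so the training
    day itself is excluded.
    """
    start = as_of - timedelta(days=days)
    return [r for r in records if start <= r["date"] < as_of]


# Toy purchase log (assumed data).
purchases = [
    {"date": date(2018, 2, 28), "product": "A"},
    {"date": date(2018, 3, 1), "product": "B"},
    {"date": date(2018, 3, 7), "product": "C"},
    {"date": date(2018, 3, 8), "product": "D"},
]
training_set = rolling_window(purchases, as_of=date(2018, 3, 8))
print([r["product"] for r in training_set])
```

Rerunning the selection each day with a new `as_of` date moves the window forward, so the models are always retrained on the most recent week of purchases.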
A nice introduction to customer choice models is provided in the book: Guillermo Gallego and Huseyin Topaloglu, Revenue Management and Pricing Analytics, Springer, 2019. (Downloadable via the CYCU VPN)