A Two-Tier Approach to Buy It Again Recommendations Using Category and Item Models
Authors:
(1) Amit Pande, Data Sciences, Target Corporation, Brooklyn Park, Minnesota, USA;
(2) Kunal Ghosh, Data Sciences, Target Corporation, Brooklyn Park, Minnesota, USA;
(3) Rankyung Park, Data Sciences, Target Corporation, Brooklyn Park, Minnesota, USA.
Table of Links
Abstract and 1 Introduction
- Literature Review
- Model
- Experiments
- Deployment Journey
- Future Directions and References
ABSTRACT
Buy It Again (BIA) recommendations are crucial for retailers: they improve user experience and site engagement by suggesting items that customers are likely to buy again based on their own repeat purchasing patterns. Most existing BIA studies analyze guests’ personalized behaviour at item granularity. This finer level of granularity might be appropriate for small businesses or small datasets for search purposes. However, this approach can be infeasible for big retailers, which have hundreds of millions of guests and tens of millions of items. For such datasets, it is more practical to have a coarse-grained model that captures customer behaviour at the item category level. In addition, customers commonly explore variants of items within the same categories, e.g., trying different brands or flavors of yogurt. A category-based model may be more appropriate in such scenarios. We propose a recommendation system called a hierarchical PCIC model that consists of a personalized category model (PC model) and a personalized item model within categories (IC model). The PC model generates a personalized list of categories that customers are likely to purchase again. The IC model ranks the items within each category that guests are likely to reconsume. The hierarchical PCIC model captures the general consumption rate of products using survival models. Trends in consumption are captured using time series models. Features derived from these models are used in training a category-grained neural network. We compare PCIC to twelve existing baselines on four standard open datasets. PCIC improves NDCG by up to 16% while improving recall by around 2%. We were able to scale and train PCIC (in around 8 hours) on a large dataset of 100M guests and 3M items where the repeat categories of a guest outnumber repeat items. PCIC was deployed and A/B tested on the site of a major retailer, leading to significant gains in guest engagement.
1 INTRODUCTION
Note [1]
With the advent of e-commerce, recommendation systems have become a hot topic for research. Personalized recommendations are a key asset for successful apps or sites across a wide variety of industries including music or video streaming services, e-commerce platforms, gaming, finance, and banks. Digitization came late to the grocery shopping experience, as many people previously preferred to shop for groceries in person. However, digital grocery sales skyrocketed with the advent of Covid-19 as most shoppers switched to digital orders backed by digital fulfillment, order pickup, drive-up, or personal shopper [8]. With this change in shoppers’ behavior, a lot of attention went both to next basket recommendation (NBR) [12, 16, 18–21], which suggests items customers would like to purchase or consume next, and to building personalized virtual aisles to aid the customer shopping experience. Effective personalized recommendations improve customer lifetime value (LTV) by increasing repeat purchases and by allowing customers to explore new relevant items. This presents a strong opportunity, especially for an omni-channel retailer, to design strategies that keep customers engaged by facilitating their shopping experience. Making customers’ recurring purchases quick is thus paramount to improving their shopping experience and to freeing their time to purchase novel discretionary items.
Given a sequence of baskets that a customer has purchased or consumed in the past, the goal of an NBR system is to generate the next basket of items that the customer would like to purchase or consume. Within a basket, items have no temporal order and are equally important. The NBR problem can be further divided into two related but distinct problems. The first is repeat purchase recommendation, called the Buy It Again (BIA) problem, where the goal is to recommend items that customers have already purchased and to do so at times when the customers might be running out of the item(s). The second is adjacent inspiration recommendation, or the You might also like problem, where the goal is to inspire customers to shop for items that may complement ones they have bought before or ones similar customers have purchased. Although many research papers on next basket recommendations lump the two subproblems together, most retailers implement them as entirely distinct products on their apps and webpages.
Existing work in BIA recommendations has focused on modeling item repurchase probabilities by using variants of recurrent neural networks or statistical models. Large retailers handle hundreds of millions of items and guests, but the majority of repurchase transactions are on a small subset of items and guests. This can lead to underfitting for item-grained models, as the data ends up being represented sparsely in a very high dimensional space. In the worst case, training itself may become infeasible due to computational resource limitations.
In this work, we emphasize the effectiveness of personalized category frequency modeling on BIA predictions. Customers will often explore variants of an item or new items within a category for reasons such as the desire to try different brands, the need to satisfy varying taste preferences in the customer’s family, or the presence of discounts on alternative items. Category-based repurchase modeling can effectively capture higher abstraction information on these item repurchase dynamics. As shown in Figure 1, the percentage of items that have high numbers of repurchases is small (Figure 1a), but most categories demonstrate high levels of repurchases (Figure 1b). The discrepancy means that models geared toward category repurchases may be more effective at satisfying guest preferences. Furthermore, due to the aforementioned sparsity, it is far more difficult to train performant BIA recommendation models on item repurchases than it is on category repurchases.
Good Buy It Again predictions therefore depend on both personalized purchase frequency and repeat purchase prediction models. More specifically, we observe that item-level purchase frequency is often too sparse to predict customer repurchases reliably, and we discuss how personalized category frequency can be a better choice, since customers often like to explore new items within a category.
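The contrast between item-level and category-level repurchases can be illustrated with a small sketch. The purchase log below is entirely hypothetical (the guest IDs, item names, and categories are invented for illustration and do not come from the paper's data), and `repeat_fraction` is an illustrative helper, not part of the PCIC implementation.

```python
from collections import Counter

# Hypothetical purchase log of (guest_id, item_id, category) tuples.
# All values are invented for illustration only.
purchases = [
    ("g1", "yogurt_brand_a", "yogurt"),
    ("g1", "yogurt_brand_b", "yogurt"),
    ("g1", "yogurt_brand_c", "yogurt"),
    ("g1", "milk_2pct", "milk"),
    ("g1", "milk_2pct", "milk"),
    ("g2", "yogurt_brand_a", "yogurt"),
    ("g2", "yogurt_brand_d", "yogurt"),
    ("g2", "bread_wheat", "bread"),
]


def repeat_fraction(keys):
    """Fraction of distinct (guest, key) pairs purchased more than once."""
    counts = Counter(keys)
    repeated = sum(1 for c in counts.values() if c > 1)
    return repeated / len(counts)


# Item granularity: a guest switching yogurt brands never "repeats" an item.
item_repeat = repeat_fraction((g, item) for g, item, _ in purchases)

# Category granularity: the same guest repeats the yogurt category.
category_repeat = repeat_fraction((g, cat) for g, _, cat in purchases)

print(f"item-level repeat fraction:     {item_repeat:.2f}")
print(f"category-level repeat fraction: {category_repeat:.2f}")
```

In this toy log, guests who rotate between yogurt brands contribute no repeat signal at item granularity but a clear one at category granularity, which is the kind of discrepancy the text attributes to Figure 1.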
In this paper, we propose a two-tier PCIC model for BIA recommendations. The personalized category model (PC model) predicts which categories customers will buy again on their next visit, and the personalized item within categories model (IC model) provides personalized ranks of items in categories. PC is a neural network that outputs category-level likelihoods for each customer. Input features to PC are generated by an ensemble of time-series machine learning algorithms that captures personalized consumption rates of each category and predicts when customers will buy items in each category. IC is a regression model that predicts category-agnostic item ranking. The outputs of the two models are combined to generate personalized BIA item recommendations for individual customers.
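As a rough sketch of how two such tiers could be combined, the snippet below multiplies a category-level likelihood by an item-level score and ranks a guest's previously purchased items by the product. The multiplicative rule is an assumption made for illustration; this section does not specify the actual combination used in PCIC. The function name `combine_pcic` and all scores are hypothetical.

```python
def combine_pcic(category_scores, item_scores, item_category, k=3):
    """Rank a guest's past items by (category likelihood * item score).

    Assumption: a simple product is used as the combination rule.
    The paper's actual rule is not stated in this section.
    """
    combined = {
        item: category_scores.get(item_category[item], 0.0) * score
        for item, score in item_scores.items()
    }
    # Sort by descending combined score; break ties by item name.
    return sorted(combined, key=lambda item: (-combined[item], item))[:k]


# Hypothetical PC-tier output: likelihood the guest rebuys each category.
category_scores = {"yogurt": 0.9, "milk": 0.6, "bread": 0.2}

# Hypothetical IC-tier output: score for each previously purchased item.
item_scores = {
    "yogurt_a": 0.5,
    "yogurt_b": 0.8,
    "milk_2pct": 0.9,
    "bread_wheat": 0.7,
}

# Mapping from item to its category.
item_category = {
    "yogurt_a": "yogurt",
    "yogurt_b": "yogurt",
    "milk_2pct": "milk",
    "bread_wheat": "bread",
}

top_items = combine_pcic(category_scores, item_scores, item_category, k=3)
print(top_items)
```

Under this assumed rule, an item with a modest item-level score can still rank highly when its category is very likely to be repurchased, which reflects the intent of letting the category tier carry most of the repurchase signal.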
We compare PCIC to twelve existing state-of-the-art baseline algorithms on four standard open datasets. PCIC improves NDCG by up to 16% while improving recall by around 2%. We were able to scale and train PCIC on a large dataset of 100M guests and 3M items where the repeat categories of a guest outnumber repeat items. PCIC was deployed on an Apache Spark cluster, allowing us to train and score the model in around 8 hours. It was A/B tested on the site of a major retailer, leading to significant gains in guest engagement.
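For readers unfamiliar with the two metrics, the following is a minimal sketch of recall@k and NDCG@k using the common binary-relevance definitions. It assumes an item is relevant if it appears in the guest's next basket; the exact evaluation protocol and cutoffs used in the experiments are not given in this section, and the function names and example data are hypothetical.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Share of relevant items that appear in the top-k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant)
    # Each hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # Ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: ranked recommendations vs. the actual next basket.
recommended = ["a", "b", "c", "d"]
next_basket = {"a", "c", "e"}

recall = recall_at_k(recommended, next_basket, k=3)
ndcg = ndcg_at_k(recommended, next_basket, k=3)
print(f"recall@3 = {recall:.3f}, NDCG@3 = {ndcg:.3f}")
```

Recall only counts how many relevant items are retrieved, while NDCG also rewards placing them near the top, which is why a method can improve NDCG substantially while recall moves only slightly.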
The main contributions of this work are summarized below:
(1) We propose a hierarchical PCIC model for Buy It Again recommendations which combines coarse prediction by a personalized category model (PC model) and finer-grained prediction by a personalized item within categories model (IC model). We show how the model supports our insights that customers tend to explore brands, sizes, flavors, etc. similar to a given item within a category.
(2) We demonstrate that the proposed PCIC model outperforms existing baselines on public datasets. We also show that PCIC scales to large datasets.
(3) We deploy PCIC in a commercial setting to provide BIA recommendations for millions of customers. We demonstrate improved guest experience on the site as evidenced by multiple A/B tests. We discuss our experiences deploying and scaling PCIC.
[1] The short version of this paper appears in RecSys 2023