Comparing and combining statistical and machine learning approaches for customer lifetime value prediction in freemium settings

Key Info

Basic Information

Chair of Marketing (Lehrstuhl für Marketing)


  • Dr. Julian Runge

Seminal papers on the prediction of customer lifetime value in marketing rely on stochastic models to predict customer purchasing and spending (Schmittlein et al. 1987; Fader et al. 2005). These models were mostly developed on traditional retail datasets. New business models such as "freemium," developed in particular to offer and sell digital goods (Voigt and Hinz 2016; Sifa et al. 2018), produce datasets with different statistical characteristics. For example, only a small subset of freemium users become paying customers, which often leads to zero inflation and class imbalance (Weiss 2004; Sifa et al. 2018). In addition, in the case of non-contractual in-app purchases, customers spend drastically different amounts on a firm's catalog, which leads to skewed, outlier-heavy spending distributions (Sifa et al. 2018). The stochastic models prevalent in the marketing literature are ill-equipped to deal with some of these challenges.
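To make these data characteristics concrete, the following minimal Python sketch simulates per-user spending and reports how zero-inflated and skewed it is. Everything in it is an illustrative assumption rather than a property of the thesis dataset: the 3% conversion rate, the log-normal spend distribution and its parameters, and the helper names `simulate_freemium_spend` and `summarize` are all hypothetical.

```python
import random
import statistics


def simulate_freemium_spend(n_users=10_000, conversion_rate=0.03, seed=0):
    """Simulate total spend per user (hypothetical parameters).

    Most users never pay (zero inflation); payers draw their spend
    from a log-normal distribution, which has a heavy right tail.
    """
    rng = random.Random(seed)
    spend = []
    for _ in range(n_users):
        if rng.random() < conversion_rate:
            # Assumed log-normal spend for paying users.
            spend.append(rng.lognormvariate(1.5, 1.2))
        else:
            spend.append(0.0)
    return spend


def summarize(spend):
    """Return simple diagnostics for zero inflation and skew."""
    payers = sorted(s for s in spend if s > 0)
    top_n = max(1, len(payers) // 10)  # size of the top decile of payers
    return {
        "share_zero": 1 - len(payers) / len(spend),
        "payer_mean": statistics.mean(payers),
        "payer_median": statistics.median(payers),
        "top_decile_revenue_share": sum(payers[-top_n:]) / sum(payers),
    }


summary = summarize(simulate_freemium_spend())
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In data of this shape, the share of zero spenders is very high, the mean spend among payers sits well above the median, and a small group of high spenders accounts for a disproportionate share of revenue. Running the same diagnostics on a real dataset is one simple way to check whether the issues described above are present before choosing a modeling approach.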
In this master's thesis, the student will draw on the scientific literature in marketing and computer science to develop perspectives on how to compare and combine stochastic models with machine learning approaches, with the goal of predicting the lifetime value of customers acquired under such new business models more accurately. The student will have the opportunity to work with a real-world dataset and to receive insights and guidance from an advanced analytics company active in this space, whose representatives will join some of the meetings. The aim is to help the student develop their own research contribution that is relevant to the practitioner frontier.
Data manipulation and analysis should be implemented in commonly used data analysis software, ideally R or Python. The code should be included in the final seminar paper or be made available and referenced as an online appendix.

D. Schmittlein, D. Morrison, and R. Colombo (1987). “Counting Your Customers: Who Are They and What Will They Do Next?” Management Science.
P. Fader, B. Hardie, and K. Lee (2005). “‘Counting Your Customers’ the Easy Way: An Alternative to the Pareto/NBD Model.” Marketing Science.
S. Voigt and O. Hinz (2016). “Making Digital Freemium Business Models a Success: Predicting Customers’ Lifetime Value via Initial Purchase Information.” Business and Information Systems Engineering.
R. Sifa, J. Runge, C. Bauckhage, and D. Klapper (2018). “Customer Lifetime Value Prediction in Non-Contractual Freemium Settings: Chasing High-Value Users Using Deep Neural Networks and SMOTE.” Proceedings of the 51st Hawaii International Conference on System Sciences (HICSS).
G. Weiss (2004). “Mining with Rarity: A Unifying Framework.” ACM SIGKDD Explorations.