Google’s LightweightMMM and Meta’s Robyn are the two leading open-source automated Marketing Mix Modeling (MMM) tools, both aiming to democratize econometrics and marketing science. They share the same goal, but their approaches, implementations, and overall paradigms differ considerably.
In this article, we present a comparison and breakdown of the two tools regarding different aspects such as implementation, modeling, optimization, documentation, and community. At the time of writing, the stable versions of the libraries are 0.1.6 for LightweightMMM and 3.7.2 for Robyn.
First Things First: License
Although the two terms are sometimes used interchangeably, “open source” does not always mean “free”. In the case of LightweightMMM and Robyn, however, the equivalence holds: both libraries have licenses that allow modification, distribution, and private as well as commercial use. One difference is that Robyn uses the MIT License, which is silent on trademark use, while LightweightMMM uses the Apache License 2.0, which explicitly excludes it. Robyn also appears more open to external contributors, as its documentation has a dedicated section on how to contribute to the codebase, which LightweightMMM lacks. Overall, both of these libraries are free (as in “free beer”) and free (as in “free speech”), embracing the open-source mindset.
Diving In: Implementation Details
Google’s LightweightMMM is a super-easy-to-install Python library built on Google’s new machine learning framework JAX. Since the release of their famous TensorFlow framework in 2015, Google has been at the forefront of open-source machine learning and data science.
[Figure: Proportion of publications that use TensorFlow vs. PyTorch]
But due to various reasons (including serious design flaws), TensorFlow has clearly been losing ground to its competitor, Meta’s PyTorch. This led Google to restart the race from scratch (dropping TensorFlow slowly and quietly) by introducing JAX and building new tools such as LightweightMMM on top of it. In fact, PyTorch’s influence on the data science field has been so significant that LightweightMMM’s probability distribution module relies heavily on NumPyro, whose documentation states: “the design of the distributions module largely follows from PyTorch”.
Meta’s Robyn is implemented in the R programming language but also requires a Python installation on the side, which makes it harder to install, maintain, and deploy as software. The R portion relies on Meta’s forecasting library Prophet, and the optimization module uses the Python library Nevergrad, also developed by Meta. It is somewhat unclear why Robyn was developed in R, given that Prophet has a Python implementation as well.
One advantage of LightweightMMM over Robyn is that, thanks to JAX, it can utilize GPU hardware (if available) to speed up model training significantly. Robyn, on the other hand, compensates through its optimization module: the techniques Nevergrad implements are embarrassingly easy to parallelize, and Robyn exploits exactly this to speed up the process (more on optimization later).
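To make the “embarrassingly parallel” point concrete, here is a toy sketch of the pattern using only Python’s standard library. This is not Robyn’s actual code: the quadratic `evaluate` function is a made-up stand-in for fitting one model candidate, and a real workload would use processes rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def evaluate(params):
    # Stand-in for "fit one model candidate and return its error".
    # A real MMM run would train a model here (and use processes, not threads).
    theta, lam = params
    return (theta - 0.3) ** 2 + (lam - 0.1) ** 2

rng = random.Random(42)
candidates = [(rng.random(), rng.random()) for _ in range(100)]

# Each candidate is scored independently of all the others, so the
# search parallelizes trivially: just map evaluate over the candidates.
with ThreadPoolExecutor(max_workers=8) as pool:
    scores = list(pool.map(evaluate, candidates))

best = candidates[min(range(len(candidates)), key=scores.__getitem__)]
print(best)
```

Because no candidate depends on another’s result, adding workers scales the search almost linearly, which is the property Robyn leans on.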
The way LightweightMMM and Robyn approach MMM is rather different. Robyn takes a frequentist approach (as opposed to a Bayesian one) and models the marketing phenomena essentially as a “curve fitting” problem using the Prophet library.
There is some confusion in the MMM community because Prophet may use Bayesian methods (e.g. MCMC) to estimate uncertainty intervals, but in essence it does not take a purely Bayesian approach to modeling time series. It is a specific implementation of a GAM (Generalized Additive Model), in which a time series is assumed to be composed of additive components such as trend, seasonality, holidays, and noise. One of the main advantages of using the Prophet library is that it ships with holiday calendars for numerous countries, which can be used directly for modeling.
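As a toy illustration of that additive structure (not Prophet’s actual internals, just the decomposition it assumes), a weekly series can be built up component by component:

```python
import numpy as np

# GAM-style additive structure assumed by Prophet:
#   y(t) = trend(t) + seasonality(t) + holidays(t) + noise(t)
t = np.arange(104)                              # two years of weekly data
trend = 0.5 * t                                 # linear growth
seasonality = 10 * np.sin(2 * np.pi * t / 52)   # yearly cycle
holidays = np.where(t % 52 == 51, 25.0, 0.0)    # a year-end spike each year
noise = np.random.default_rng(1).normal(0, 1, t.size)

y = trend + seasonality + holidays + noise
print(y.shape)
```

Fitting such a model is then a matter of estimating each component from data, which is what makes the approach feel like curve fitting.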
Frequentists think in terms of cost functions and Bayesians think in terms of priors.
At this point, a completely valid question would be: if we know that marketing is a non-linear phenomenon (e.g. the law of diminishing returns), why do we use a linear model? The answer is that we stick to the linear model to keep the benefits mentioned above, but we transform the input data with certain non-linear transformations (saturation curves, adstock, etc.) to capture the non-linearities of the domain. All of this can be seen in the Ridge Regression formula in Robyn’s documentation.
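The pattern behind that formula, i.e. a ridge regression on non-linearly transformed media inputs, can be sketched in a few lines of NumPy. This is a simplified toy, not Robyn’s implementation: the geometric adstock and Hill saturation shapes below are assumed stand-ins for the transforms Robyn’s documentation describes, and the ridge fit uses the textbook closed form.

```python
import numpy as np

def geometric_adstock(x, theta=0.5):
    # Carryover effect: each period retains a theta fraction of the
    # previous period's adstocked spend.
    out = np.zeros_like(x, dtype=float)
    carry = 0.0
    for i, v in enumerate(x):
        carry = v + theta * carry
        out[i] = carry
    return out

def hill_saturation(x, alpha=2.0, gamma=None):
    # S-shaped diminishing-returns curve; gamma is the half-saturation point.
    if gamma is None:
        gamma = x.mean()
    return x**alpha / (x**alpha + gamma**alpha)

def ridge_fit(X, y, lam=1.0):
    # Closed-form ridge regression: beta = (X'X + lam*I)^-1 X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(0)
spend = rng.uniform(0, 100, size=(104, 2))  # two media channels, weekly

# The model stays linear; only the inputs are transformed non-linearly.
transformed = np.column_stack(
    [hill_saturation(geometric_adstock(spend[:, j])) for j in range(2)]
)
y = 3.0 * transformed[:, 0] + 1.5 * transformed[:, 1] + rng.normal(0, 0.1, 104)

beta = ridge_fit(transformed, y, lam=0.1)
print(beta)  # roughly recovers the true coefficients [3.0, 1.5]
```

The linear model’s interpretability is preserved, while the transforms encode carryover and diminishing returns.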
LightweightMMM approaches the problem in a purely probabilistic way with Bayesian modeling. In the Bayesian paradigm, rather than thinking in terms of point observations or point estimates, everything is modeled as a probability distribution. The main advantage of this approach is that users can, in theory, encode their domain knowledge, past experience, or expectations as prior distributions. Another advantage is that the uncertainty estimates have genuine probabilistic interpretations and can be read as measures of risk.
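As a minimal illustration of the prior-to-posterior idea (a toy Beta-Binomial grid approximation for a single channel’s conversion rate, not LightweightMMM’s actual machinery):

```python
import numpy as np

# Grid approximation of a Bayesian update for a conversion rate.
grid = np.linspace(0.001, 0.999, 999)

# Prior: past experience suggests rates around 10% (a Beta(2, 18) shape).
prior = grid ** (2 - 1) * (1 - grid) ** (18 - 1)
prior /= prior.sum()

# Data: 30 conversions out of 200 impressions (a 15% observed rate).
likelihood = grid**30 * (1 - grid) ** 170

# Posterior ∝ prior × likelihood; normalize over the grid.
posterior = prior * likelihood
posterior /= posterior.sum()

mean = (grid * posterior).sum()
print(mean)  # posterior mean, pulled between the prior (0.10) and the data (0.15)
```

The posterior is a full distribution, so any quantile of it is a genuine probability statement, which is what gives Bayesian uncertainty estimates their risk interpretation.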
However, in practice, it may not be very realistic to expect digital marketing experts to conclude that their past experience with a certain marketing phenomenon actually follows a Lewandowski-Kurowicka-Joe distribution, as implemented in NumPyro’s LKJCholesky distribution. Although this sounds ridiculous, it is exactly the kind of capability that LightweightMMM provides. When it comes to geo-level hierarchical modeling, LightweightMMM supports it out of the box. As of December 2022, Robyn does not support any hierarchical breakdown dimensions, but it has been communicated that this is on the 2023 roadmap.
Google’s LightweightMMM cites several publications [3, 4, 5, 6] to support its argument for using a Bayesian approach to MMM. We think it is worth mentioning that none of these publications appeared in peer-reviewed conferences or journals, and that all of them were written by people working at Google. Robyn does not cite evidence-based, peer-reviewed research for its design choices either; still, we believe both libraries have put significant thought into their implementation choices and serve as powerful tools.
Supporting Aspects: Documentation and Community
When it comes to source code and documentation, neither library is close to following the best practices of scalable software development. Robyn is full of unused, commented-out code while missing comments on the active code snippets. Development speed is clearly prioritized over robustness in Robyn, which often results in backward-compatibility issues; in other words, updating the library tends to break things. LightweightMMM has slightly better code quality, but Robyn’s documentation is far more in-depth and comprehensive.
In terms of community, Robyn undeniably takes the cake. Having an active Facebook group (Robyn Open Source MMM Users) dedicated to a transparent roadmap and all things Project Robyn, users are able to ask questions, get help analyzing results and receive insight into new feature requests. This peer-to-peer network gives Robyn users a massive upper hand. With Robyn developers actively posting around Robyn’s next development priorities and chiming in to assist, it’s reassuring to know answers to your questions are simply a post away or already exist in the discussion board. Additionally, the issues/bug tracker in Robyn’s GitHub is significantly more active than that of LightweightMMM.
Both frameworks are developed by the R&D teams of tech giants and serve as powerful tools for Marketing Mix Modeling. Both are open-source, and neither is a production-ready application. They take different paradigms and approaches to MMM, which is great for the community. At this stage, Robyn seems to be slightly ahead when it comes to democratizing marketing science.