Machine learning is the talk of the town but is it going to crush statistical forecasting completely? The recent success of machine learning in the M5 competition has inspired some commentators to predict the death of statistical forecasting, but I think they are cutting corners. While machine learning is rightfully claiming its place in demand forecasting, there are good reasons to believe that even many years from now it will be used in conjunction with statistical forecasting, as a complementary rather than substitute technique. Here’s why.
Let me first briefly summarize the principles of statistical forecasting. The technology is all about using statistical techniques such as regression, sampling, and hypothesis testing on historical time series of sales data. The technique allows historical trends, seasonal patterns, and correlations with external factors to be identified. Trends are extrapolated, based on the assumption that past behavior is the best predictor of the future.
The frequently used Holt-Winters model, for example, essentially extrapolates the average values, trends, and cyclical behavior of the past to predict the future. The figure below illustrates what it does, showing the historical time series in blue and the extrapolated data in red.
Now, you don’t need a master’s degree in statistics to understand that the Covid-19 pandemic severely disturbed the time series in 2020 and into 2021, which undermines their predictive value. Demand forecasters now need to apply several additional techniques to smooth out or otherwise adjust the historical time series before using them to compute the forecast.
Does machine learning (ML) solve that problem? Not at all. Just like statistical forecasting, ML uses historical data, including the recent time series where its predictive value was undermined by the virus. The old GIGO principle (garbage in, garbage out) applies to ML as much as to statistical forecasting.
But ML does use these historical data in a completely different way, and that presents opportunities. Put simply, the neural network technique links inputs to outputs. There’s no need to define the model equation upfront. You just feed the machine with huge amounts of (carefully selected) input and output data, perhaps shipment data from a large number of products at different time intervals. Then, you let the machine identify patterns in the way the data are linked. This is the so-called learning phase during which the machine is trained and ‘wires’ its neural network
This immediately highlights one of the biggest appeals of ML: you can relatively easily feed it with a much broader range of data, with almost no modeling effort. You can add sales or shipment data from different products in the same product family, product master data, calendar data, promotional data, orders, point-of-sale data, the weather, and economic factors. ML identifies patterns somewhat ingenuously but if you’re feeding it with the right data you can make it smarter and smarter over time.
Will this mean the end of statistical forecasting, as some have suggested after ML’s success in the M5 competition? Not so fast. The M5 competition used only one particular dataset, and we’re not so sure that the winning models would also work well on other datasets. There are also some downsides to ML, for example:
Machine learning does have great potential but it does not guarantee the best forecast. Naive as they are, machines sometimes see patterns where there are none. And they are vulnerable to disrupted historical data in much the same way as statistical forecasting. I think it’s very likely that ML and statistical forecasting will be used as complementary techniques for many years to come. And neither can do without human intelligence, but that’s another story.
Do you want to know more about demand forecasting? Let me know how I can help!
Lennert oversees the R&D of OMP for Demand Management. He is mostly driven by looking for innovations that make our customer’s demand planning journey more manageable and, at the same time, more effective.