Time Series First Principles Series
This post dives into the third principle of a good time series forecast, the future is similar to the past. Check out the initial post in this series to get a high level view of each principle.
Here Comes The Sun
For whatever you’re trying to forecast, it will be a lot easier to do with with machine learning (ML) if the future is similar to the past. It’s as simple as that.
When you open the weather app on your phone, have you ever looked at when the sun is expected to rise and set? If you’re on the new human optimization craze about getting morning sunlight, you most likely have. That forecast is down to the minute, most likely even second, and has a high degree of accuracy. Is the forecast accurate because of expert human judgement, or the type of weather related features fed into a ML model? Nope. It’s accurate because the sun has risen and set at relatively the same time, based on time of year, for millions of years. We don’t expect future sun rises and sun sets to change that much going forward, that’s why your weather app gives you an exact time for the sun rise but gives you only a percent probability of rain. Even then, that chance of rain may not even be accurate. It’s almost a joke now how many times in Seattle I’ve seen a dry forecast only to step out of my house and have it immediately start raining. At least I know the exact minute when the sun will set that day.
Handling A Changing Future
Your business is most likely not like the sun. It’s constantly changing, reacting to market forces and industry competitors. The best way to teach a model about your expectations of the future is to give it data about the past and future.
Let’s use an example of a monthly revenue forecast for a product. If you only use historical sales data to forecast the product, then you are making the assumption that the future of the product will be almost identical to the past, especially the most recent past. For some established products in mature industries this could be totally fine, but often this is not the case.
One thing to try is adding features that can explain how outside forces impact the product. For example, how much money consumers have to spend might greatly impact who buys your product. So using an economic feature like consumer sentiment can help a model adjust it’s predictions based on changes in consumer spending habits.
We can add features into our data in two ways. The first is to just give historical values of that feature. This will force us to only use historical lags of the feature when training a ML model, since we don’t know what the future value of that feature will be. We can take that original feature and create new features (this is called feature engineering). Ones that are a 3 month lag, 6 month lag, or 12 month lag of the original data. Often macro data like consumer sentiment can be a lagging indicator. Meaning their impact is delayed and takes a while to flow through the economy. Changes in consumer sentiment from 6 months ago can actually have a strong correlation with how customers purchase our product today.
A second way is to use both historical values and future values. We could use a future forecast of consumer sentiment in our model, in addition to using the historical data. That way a model can learn from any lagged relationships as well as understand how changes in consumer sentiment impact our product in real time. These future values can either come from an expert forecast (like from famous economists) or created by your own ML solution.
The Future Must Always Learn From The Past
You might have a ton of ideas for new kinds of future information your can encode as features to train a model. In order to use this data we need to make sure there are historical examples for a model to learn from. The upcoming 2024 presidential election in the US could have a large impact on your business, which will impact your future forecast for the rest of 2024. We know exactly when the election is going to happen, so it’s easy to give that information into a ML model to learn from.
The catch is we need to make sure that we show examples from the past to allow the model to learn how previous elections impact our business and how the model should handle similar events in the future. If we only have product sales data from the last three years, then we cannot feed it future election data because we don’t have the data from the 2020 or 2016 US presidential elections.
If we know something is going to happen in the future, but we can’t quantify it with historical data for a model, then we need to go old school. Instead we need to use our domain expertise about the business to take the ML output, without knowledge of the future event, and make a manual adjustment to the forecast based on the expected impact of the future. For the election example, maybe your product sales will grow as we get closer to the election, so if you don’t have enough historical data for a model to learn about the election’s impact you can make a manual adjustment to the ML forecast based on your assumption about the election’s impact.
This kind of hybrid approach, ML first with a light human touch second, can create a powerful combination. A ML model can do 80-90% of the initial work and a human can make the final manual adjustments based on their domain expertise. This allows a human to add more insight into a forecast that is not easily quantifiable for a ML model to learn from.
New Time Series
A new product at your company might be exciting, but is harder to forecast accurately with ML models. The lack of historical data will make it hard for any new ML model to learn from. Initial trends and seasonality may not always carry into the future. For example there might be a big spike in initial sales around release but then taper off over time. The new product may not even be on sale yet, so you are now tasked with forecasting something with zero historical data.
If the time series in question has some historical data, ideally more than one year of historical observations, a good way to deal with it is to train a ML model with the new time series alongside similar existing products with a lot of historical data. This is sometimes called a “global model”, where a model learns from multiple time series instead of one. Training on one specific time series is sometimes called a “local model”. Training a global model allows the ML model to learn general trends and seasonality patters across similar time series and apply it to the newer time series. This can work well if the other time series are similar to one another.
If the product you want to forecast has no historical data, then you are in a tough boat. Traditional time series methods cannot help you, since they all rely on quality historical data. What you can do is take a more traditional machine learning regression approach. This involves taking all historical products that have launched over time and training a model to understand the initial demand of a new product and how it either grows or shrinks over time. For example with iPhone sales, you can train a model on the initial sales of each iPhone model from V1 to V14, then use that model to predict the kind of demand the latest V15 iPhone might have. This type of approach would need a more detailed post to explain fully, but hopefully you get the broader picture.
To learn more about forecasting new time series, check out the forecasting bible written by our forecasting godfather Rob Hyndman. The chapter on judgemental forecasts goes deep into forecasting new products and discusses other approaches you can take.
Reversal
When using future values of a feature, there is a risk of compounding errors. Let’s go back to the consumer sentiment example. If you create your own expectation of future consumer sentiment, or use an economist’s prediction, there is a good chance the forecast will be wrong. If the forecast about consumer sentiment is wrong, and that forecast is fed into a model to predict your product’s sales, then your sales forecast will be even more wrong. The errors compound. Add in other features and you can see how the house of cards can tumble pretty fast. Always be weary about using future values of features where you don’t have 100% confidence in their future value. For example using future holiday features are great because they will always happen on a specific day with 100% certainty, but trying to tell a model where inflation is headed in the future can get you in trouble.
Having a human make manual adjustments after the initial ML forecast can add unneeded human bias to the final forecast. This bias can sometimes be wrong and hurt the accuracy of the forecast. It’s good practice to capture these adjustments and always report on the forecast accuracy of the pure ML model and the accuracy for the model + human adjustments. That way you can track how helpful the manual adjustments are, and remember why they were made in the first place.
Final Thoughts
When embarking on the journey of time series forecasting, remember it’s more art than exact science—akin to predicting rain in Seattle. The key takeaway? Use the past as a guide but sprinkle in educated guesses about the future with caution. Whether you’re launching new products or navigating established markets, blending machine learning with a dash of human intuition can create robust forecasts. May your forecasts be as reliable as the sunrise, with just enough flexibility to handle an unexpected downpour.