Multistep Horizon Forecasting With finnts

New feature that improves forecast accuracy
time-series
machine-learning
finance
finnts
Author

Mike Tokic

Published

May 13, 2024

TL;DR

I’m excited to announce that we just released a new feature in our machine learning forecast package, finnts, centered around multistep horizon forecasting. It’s a mouthful to say but at a high level it helps improve forecast accuracy by optimizing models to be accurate at each period of a forecast horizon. For example, a 3 month forecast would then be optimized for the forecast in each month (period) of the future forecast.

Let’s dive in to how multivariate modeling used to work in the package and how multistep horizon forecasting can help.

How It Used to Work

Let’s use an example of a monthly revenue forecast for our business’s main product. In practice we would want more than a year of data but let’s just keep it simple today.

Date Revenue
2023-01-01 120
2023-02-01 135
2023-03-01 140
2023-04-01 145
2023-05-01 150
2023-06-01 155
2023-07-01 160
2023-08-01 165
2023-09-01 170
2023-10-01 175
2023-11-01 180
2023-12-01 185

If we wanted to forecast the next 3 months of revenue using multivariate machine learning models we would have to do some feature engineering to get our data in good shape. This involves creating lags on our target variable. Let’s try create some lags on this data.

Date Revenue Revenue_Lag1 Revenue_Lag2 Revenue_Lag3
2023-01-01 120
2023-02-01 135 120
2023-03-01 140 135 120
2023-04-01 145 140 135 120
2023-05-01 150 145 140 135
2023-06-01 155 150 145 140
2023-07-01 160 155 150 145
2023-08-01 165 160 155 150
2023-09-01 170 165 160 155
2023-10-01 175 170 165 160
2023-11-01 180 175 170 165
2023-12-01 185 180 175 170
2024-01-01 ??? 185 180 175
2024-02-01 ??? ??? 185 180
2024-03-01 ??? ??? ??? 185

We added some rows onto the bottom of the data, to allow us to forecast out the next 3 months after our historical data ends. That’s what we have question marks “???” for those values. We also added lags for a 1 month, 2 month, and 3 month lag. But hey, looks like we have a problem. We have more question marks for a few future months for lag 1 and lag 2. If we wanted to forecast the next three months we wouldn’t be able to use those lags, since once we get out further in the forecast horizon we start to have missing lag data.

This means that the smallest lag we could use would always be equal to or greater than the forecast horizon. Since our forecast horizon is 3 than the smallest lag we could use to train a model on would be lag 3. This approach can yield good results, but it removes a lot of potential signal in the data. Revenue next month is most likely impacted by how revenue grew in the current month, but if our forecast horizon is long a lot of this insight has to get thrown away before we can train models. Imagine a forecast horizon of 12. For a monthly forecast this limits our lags to 12 months or more, which is a really bummer since our business can drastically change within 3-6 months, and not using that information in our model can hurt forecast accuracy.

How Multistep Horizon Forecasting Works

Multistep horizon helps fix this issue that allows us to use smaller lags while still being able to have long forecast horizons. In our 3 month forecast horizon example, we can keep the lag 1 and lag 2 features, but how the model gets trained will be different.

In the non-multistep horizon approach, a specific model is trained once on the data using lags that are equal or greater than the forecast horizon. When we run a multistep horizon approach, we can actually train multiple sub models under the hood of a specific model. In our 3 month forecast horizon, here’s how one model like linear regression will be trained.

  • For the first month in the forecast horizon, we can use all available lags. Lag 1, lag 2, and lag 3 of revenue will all be used to predict the first month.
  • In the second month of the forecast horizon, we will use lag 2 and 3 to predict the second month.
  • In the third month of the forecast horizon, we will use lag 3 to predict the third month.

Are you starting to get the hang of it? With multistep horizon forecasting we can still have one model that under the hood has multiple sub models that are each optimized on forecasting out a specific part of our forecast horizon. This allows us to have greater accuracy in the first few periods of our forecast horizon. In a non-mulitstep horizon approach, we are always optimizing for the last period in a forecast horizon. If the forecast horizon is 12 months, the way we do the feature engineering and train models is optimized for forecasting out the 12th month. When running a multistep horizon approach, we instead optimize for every period of the forecast horizon.

This kind of approach is so crucial to forecasters in the corporate finance space. Often these financial analysts are tasked with always forecasting out the rest of the entire fiscal year, even though they might only care about the next 3 months, since they are most likely going to be re-creating a new forecast in the following quarter. Multistep horizon forecasting allows these analysts to still forecast out long forecast horizons like 9 or 12 months, while still being able to optimize for the next 1-3 months. How cool is that!

Reversal

If each specific model can have 2-5 sub models under the hood, the amount of time needed to train these models can multiply by the same amount. Make sure to keep that in mind if run time is a big factor in your forecasting process.

A multistep horizon forecast may not result in a more accurate forecast for smaller forecast horizons. Some time series may have a strong relationship with a 12 month lag, but less with a 1 month or 2 month lag. This means there is strong yearly seasonality in the data. If there is not a strong relationship with 1 month or 2 month lag, then having multiple sub models optimize for each future month in a multistep horizon approach may not result in more accurate forecasts. Consider doing some exploratory data analysis to see what kind of relationships there are with historical lags of your target variable and other features.

Final Thoughts

The new multistep horizon forecasting approach in finnts allows users to create even more accurate forecasts, regardless of their forecast horizon length. If you’d like to learn more check out the official finnts documentation to see how you can use the newest multistep horizon feature!