Demand sensing and forecasting: Designing a unified solution for retail and manufacturing use cases
Mar 20, 2024 • 20 min read
For retailers, brands, and manufacturers, the ability to make informed decisions about procurement, transportation, workforce allocation, distribution, and pricing hinges on one critical factor—precise demand forecasting. Each decision, whether it pertains to manufacturer procurement, inventory allocation and distribution, labor planning, inventory rebalancing, or price optimization, requires an accurate prediction of demand or sales. These forecasts must cater to varying levels of aggregation, time horizons, and different accuracy and quality requirements.
Long-term forecasts rely predominantly on trends, seasonality, and variables that can be predicted well in advance. However, the landscape changes dramatically when it comes to short-term forecasts. Here, the ability to rapidly assimilate and act upon ongoing, immediate, and fluctuating factors, ranging from market disruptions and competitive dynamics to more transient elements like weather conditions, becomes crucial. This capability, known as demand sensing, is integral in applications like price management where forecasting must converge with scenario simulation to evaluate diverse decision outcomes.
Providing accurate demand forecasts is complex for several reasons:
- High scalability: Companies often need to produce forecasts for billions of SKU x Channel x Location combinations. This requires a highly scalable solution that handles varied demand patterns across categories and is supported by proper configuration and monitoring tools.
- Forecasting new and slow-moving products: Products with limited sales history, particularly new and slow-moving items, present a significant forecasting challenge due to the scarcity of historical data.
- Tool adoption by business users: For business users to adopt forecasting tools, two crucial factors must be met: interpretability of the forecasting models and the capability to simulate various scenarios.
- Hierarchical consistency: Forecasts need to be consistent across various levels—from SKU to product category, store, region, etc.
- Uncertainty estimation: Business users need to understand the uncertainty of the forecast and how it propagates into the outcomes of their decisions.
- Economic and market conditions: The continuous shifts in economic and market conditions directly affect demand, necessitating a forecasting approach that can adapt and respond to these changes.
- Cross-product effects: The interdependencies between products due to cannibalization and halo effects add another layer of complexity, especially when simulating different market scenarios.
- Data availability and quality: Particularly in B2B settings, the challenge is often the limited amount of data available for forecasting, compounded by issues of data quality, such as untracked stockouts and other discrepancies.
Navigating these challenges requires a comprehensive approach that goes beyond traditional forecasting methods. This involves embracing advanced techniques such as probabilistic forecasting, hierarchical forecasting, cross-effects forecasting, and attribute-based forecasting. Moreover, selecting the right machine learning modeling approaches is crucial for enhancing forecast accuracy and utility across different use cases.
Demand Sensing and Forecasting Starter Kit
Enter the Demand Sensing and Forecasting Starter Kit, a solution for comprehensive demand analysis and accurate long-term and short-term forecasting.
A structured approach to demand sensing and forecasting
The Demand Sensing and Forecasting Starter Kit plays a pivotal role in aligning business strategies with market dynamics, ensuring organizations can proactively meet future demands.
Use cases
This section outlines the primary use cases that a robust demand sensing and forecasting approach supports, highlighting the specific requirements of each:
- Financial planning: Accurate forecasts are crucial for budgeting, financial projections, and investment planning. This use case requires a blend of short-term and long-term forecasting to navigate market volatility and ensure financial stability.
- Inventory allocation and distribution: Effective inventory management hinges on understanding future demand to optimize stock levels across locations, minimizing holding costs, and reducing stockouts.
- Markdown optimization: Predicting the right timing and depth of discounts for products can significantly impact profitability. This involves analyzing past sales trends and predicting future demand elasticity.
- Price optimization: Setting the optimal price for products requires analyzing historical data and forecasting how changes in pricing will affect future demand, considering factors like competitor pricing and customer sensitivity.
- Merchandise procurement: Forecasting demand at the product level helps businesses make informed procurement decisions, ensuring they can meet customer demand without overstocking.
- Labor planning: Accurate demand forecasts enable businesses to optimize workforce allocation, ensuring sufficient staffing during peak times without overspending during slower periods.
- Inventory rebalancing: This involves redistributing products between locations to align inventory levels with demand, reducing the likelihood of stockouts in high-demand areas and excess stock in others.
Capabilities
To support these use cases, a comprehensive demand sensing and forecasting solution must offer the following capabilities:
- Demand sensing: Quickly detects short-term market changes, allowing businesses to adjust supply chain and inventory strategies in real time.
- Demand forecasting: Enables long-term strategic planning by transforming historical data into actionable insights for future customer demand.
- Hierarchical forecasting: Provides multi-level forecasting from item-level detail to broader category and market levels, ensuring consistency across business segments.
- Probabilistic forecasting: Offers a range of potential outcomes to better prepare for demand variability, enhancing risk mitigation in planning.
- Cross-effects forecasting: Analyzes the interplay between products to inform strategic product launches and portfolio management.
- Attribute-based forecasting: Utilizes product attributes for forecasting, especially useful for new products without historical sales data.
Solution approach
The demand sensing and forecasting process is structured as a comprehensive pipeline consisting of several layers, each designed to refine and enhance the quality of the forecasts produced. This systematic approach ensures that the output is both actionable and tailored to the specific needs of various use cases.
- Input layer: The pipeline initiates with the input layer, which is fundamental for data ingestion. This layer is tasked with collecting a wide array of data, encompassing historical sales data, real-time data streams, and forecasts of external signals such as weather conditions. Additionally, it incorporates user inputs like forecast horizons, planned future promotions, and pricing strategies, while also undertaking the crucial process of data cleaning to ensure the integrity of the data for subsequent analysis.
- Data pre-processing layer: Following data ingestion, the data pre-processing layer takes over, focusing on the manipulation and preparation of data. This involves various forms of data aggregation, feature engineering, selection of the most indicative features, and the computation of product embeddings. This layer is pivotal in transforming raw data into a structured format that is optimized for predictive modeling.
- Modeling layer: At the core of the pipeline lies the modeling layer. Here, machine learning models are developed, trained, and tuned with the prepared data. This stage is critical for the actual generation of demand forecasts, leveraging the insights and features identified in the previous layers. Model evaluation also occurs here, ensuring the reliability and accuracy of the forecasts generated.
- Data post-processing layer: Once the models have generated forecasts, the data post-processing layer engages. Its functions include further data aggregation and disaggregation, reconciliation of forecasts to ensure consistency across different levels, and the application of methods to visualize and interpret forecasts. This layer is crucial for refining the forecasts and making them understandable and actionable for decision-makers.
- Output layer: The culmination of the pipeline is the output layer, which delivers the forecasts in formats aligned with specific use cases. This includes short-term forecasts for immediate decision-making, long-term forecasts for strategic planning, probabilistic forecasts to gauge uncertainty, attribute-based forecasts for products with limited historical data, hierarchical forecasts to ensure consistency across various levels of aggregation, and cross-effects forecasts to understand the impact of new product introductions or changes in product mix.
Figure 1 provides a schematic overview of the demand sensing and forecasting pipeline, illustrating the transition from data ingestion through to the generation of diverse forecast types. This visual representation underscores the pipeline’s capability to cater to a broad spectrum of forecasting needs through its structured, layered approach.
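As a rough illustration of the layered structure, the pipeline can be sketched as a chain of small functions. Everything in this sketch is an assumption made for illustration: the function names are hypothetical, and a moving average stands in for the real modeling layer.

```python
from statistics import mean

def input_layer(raw_sales):
    """Ingest and clean: drop missing or negative observations."""
    return [x for x in raw_sales if x is not None and x >= 0]

def preprocess_layer(sales, window=3):
    """Feature engineering stand-in: keep the most recent `window` observations."""
    return sales[-window:]

def modeling_layer(features):
    """Stand-in model: forecast the next period as the mean of recent sales."""
    return mean(features)

def postprocess_layer(point_forecast, horizon):
    """Repeat the point forecast over the requested horizon."""
    return [round(point_forecast, 1)] * horizon

def run_pipeline(raw_sales, horizon=2):
    """Output layer: chain the layers and return the forecast."""
    cleaned = input_layer(raw_sales)
    features = preprocess_layer(cleaned)
    point_forecast = modeling_layer(features)
    return postprocess_layer(point_forecast, horizon)

# Hypothetical weekly sales with one missing and one invalid record.
forecast = run_pipeline([12, None, 15, -1, 18, 21], horizon=2)
print(forecast)
```

Keeping each layer behind its own function boundary is one way to let the modeling step be swapped without touching ingestion or post-processing.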
In subsequent chapters, we will explore each of these areas in greater detail, discussing their theoretical foundations and demonstrating their application through practical business scenarios.
Demand sensing
Demand sensing is a sophisticated forecasting approach that transcends traditional models by integrating a variety of data sources, signals, and advanced analytical methods. It primarily focuses on a short-term forecast horizon, typically up to 6 weeks ahead, allowing businesses to react swiftly and effectively to imminent market shifts.
The cornerstone of demand sensing lies in its ability to process and analyze diverse data types. These include real-time point of sale data, weather forecasts, social media trends, competitor activities, promotional events, and even macroeconomic indicators. By harnessing these varied data streams, demand sensing models can capture a holistic view of the factors influencing consumer demand.
To make sense of this extensive data, demand sensing employs a range of machine learning models and algorithms. Techniques such as time series analysis, regression models, neural networks, and ensemble methods are commonly used. These models are adept at identifying patterns and extracting meaningful insights from large, complex datasets. Additionally, advanced analytics like causal modeling (causal inference methods) and sentiment analysis play a crucial role in interpreting the impact of external events and trends on consumer behavior.
An exemplary application of demand sensing is evident in how it enables businesses to respond proactively to weather-related fluctuations in demand. For instance, consider the scenario of an early November winter storm illustrated in Figure 2. Traditional forecasting methods might not accurately predict the sudden spike in demand for heating products associated with such an event. However, demand sensing models, equipped with real-time weather data and predictive analytics, can anticipate this shift.
These models, integrating advanced algorithms and real-time data, as depicted in Figure 2, analyze weather patterns and correlate them with potential impacts on consumer behavior. This enables businesses to forecast an increase in demand even without promotional discounts for specific products like electric radiators in this case. With this predictive insight, retailers can proactively manage inventory and supply chain logistics, ensuring availability and optimal distribution of these high-demand items.
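The weather example can be reduced to a very small sketch: learn how much sales rose on past storm days, then apply that uplift to days with a storm in the forecast. The data, the function names, and the simple ratio-based uplift are all assumptions for illustration; a real demand sensing model would combine many more signals.

```python
from statistics import mean

# Hypothetical daily history: (units of electric radiators sold, storm flag).
history = [(20, 0), (22, 0), (19, 0), (41, 1), (21, 0), (39, 1), (20, 0)]

def weather_uplift(history):
    """Ratio of average sales on storm days to average sales on normal days."""
    storm = [units for units, flag in history if flag == 1]
    normal = [units for units, flag in history if flag == 0]
    return mean(storm) / mean(normal)

def sense_demand(baseline, storm_flags, uplift):
    """Scale the baseline forecast on days where a storm is forecast."""
    return [baseline * uplift if flag else baseline for flag in storm_flags]

uplift = weather_uplift(history)
baseline = mean(units for units, flag in history if flag == 0)

# Weather forecast for the next four days: storm expected on days 2 and 3.
short_term = sense_demand(baseline, storm_flags=[0, 1, 1, 0], uplift=uplift)
print([round(x, 1) for x in short_term])
```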
Demand forecasting
Demand forecasting is a critical component of supply chain and business planning, enabling companies to make informed decisions about production, inventory management, staffing, and other operational aspects. This process involves analyzing historical sales data, market trends, and other relevant factors to predict future customer demand over a typical horizon of 6 to 52 weeks.
At the heart of demand forecasting lie both statistical and machine learning models. These range from classical statistical methods like ARIMA (AutoRegressive Integrated Moving Average) and Exponential Smoothing, to advanced machine learning techniques such as Random Forests, Gradient Boosting Models, and deep learning methods such as Transformers. The choice of model often depends on the specific characteristics of the data, including seasonality, trend patterns, and responsiveness to external factors, and will be discussed in a separate chapter at the end of this blog post.
Incorporating big data analytics in demand forecasting allows for a deeper and more nuanced understanding of demand patterns. Analysis of extensive datasets encompassing customer behavior, market dynamics, and economic indicators enables these models to deliver more precise and detailed predictions. This level of sophistication in forecasting is particularly effective at higher aggregation levels, providing planners with valuable insights to manage production, labor, inventory, and capacity more efficiently.
A practical example of demand forecasting in action is in the optimization of inventory levels for seasonal products. Unlike demand sensing, which excels in the short-term horizon and at a granular level, demand forecasting provides a more accurate prediction at higher aggregation levels, crucial for seasonal planning. For instance, consider a major online retailer preparing for the holiday season. By analyzing historical sales data, current market trends, and upcoming promotional events, the retailer can use demand forecasting to estimate the expected sales volume of various products.
This predictive insight allows the retailer to optimize inventory levels, ensuring sufficient stock is available to meet customer demand without overstocking. Effective demand forecasting helps the retailer avoid potential revenue loss due to stockouts, and minimizes excess inventory that could lead to costly markdowns post-holiday season.
Through the application of sophisticated forecasting models, businesses can achieve a balance between supply and demand, ensuring operational efficiency and customer satisfaction. This empowers retailers to harness these insights, transforming data into actionable strategies for inventory management and operational planning.
Hierarchical forecasting
Hierarchical forecasting addresses the complexities of forecasting across various levels of data aggregation, such as SKU-level, category-level, store-level, and regional-level forecasts. This approach is particularly beneficial for large organizations with diverse product ranges and multiple sales channels. It ensures consistency and alignment across different levels of the organization, providing a coherent view that aids in strategic decision-making. An example of this tree hierarchy, based on product category or store location, is depicted in Figure 4.
The first crucial step in hierarchical forecasting is constructing a tree that accurately reflects the business structure and data relationships. Incorporating deep domain knowledge in this construction, especially factors like store location or product category, is essential for enhancing the accuracy of the forecasts. Figures 4 and 5 illustrate simple examples of such a hierarchy, demonstrating how products are grouped based on category or location (geography).
Hierarchical forecasting employs various reconcilers to ensure consistency among forecasts at different levels, including:
- Bottom-up reconciliation: Generating forecasts at the lowest (SKU) level and aggregating them upwards.
- Top-down reconciliation: Starting with a forecast at the top level and disaggregating it down to lower levels, such as SKUs.
- Middle-out reconcilers: Basing predictions on a middle level of the hierarchy, using bottom-up for higher levels and top-down for lower levels.
- Min-trace reconciliation: A sophisticated approach that produces coherent forecasts by minimizing the total variance of the reconciled forecast errors (the trace of their covariance matrix), leveraging advanced statistical estimators.
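The two simplest reconcilers can be sketched in a few lines. The hierarchy, the numbers, and the choice of historical sales shares for the top-down split are assumptions for illustration; min-trace reconciliation needs an estimate of the forecast error covariance and is not shown here.

```python
# Hypothetical two-level hierarchy: one store with three SKUs.
sku_history = {"sku_a": [10, 12, 14], "sku_b": [5, 5, 5], "sku_c": [25, 23, 21]}
sku_forecast = {"sku_a": 16.0, "sku_b": 5.0, "sku_c": 19.0}  # base SKU forecasts
store_forecast = 44.0  # base store forecast, produced independently

def bottom_up(sku_forecast):
    """The store forecast is defined as the sum of the SKU forecasts."""
    return sum(sku_forecast.values())

def top_down(store_forecast, sku_history):
    """Split the store forecast by each SKU's historical share of sales."""
    totals = {sku: sum(h) for sku, h in sku_history.items()}
    grand_total = sum(totals.values())
    return {sku: store_forecast * t / grand_total for sku, t in totals.items()}

bu_store = bottom_up(sku_forecast)
td_skus = top_down(store_forecast, sku_history)

print("bottom-up store forecast:", bu_store)
print("top-down SKU forecasts:", {k: round(v, 1) for k, v in td_skus.items()})
```

In this example the base forecasts are incoherent (the SKU forecasts do not sum to the store forecast). Bottom-up resolves this by trusting the SKU level, top-down by trusting the store level.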
Consider an online retailer implementing hierarchical forecasting. By creating a hierarchical tree that accounts for store locations, product categories, and individual SKUs, the retailer can accurately forecast demand at each level. This approach allows for inventory and marketing strategies that are tailored to each store, aligning them with both local demand and broader market trends. Figure 6 showcases how this process is facilitated. It features a stacked area graph demonstrating how forecasts at the SKU level aggregate to correctly match store-level sales forecasts, illustrating the capability to ensure accurate and coherent forecasting across different levels.
This approach supports various reconciler strategies, enabling retailers to test and identify the most effective method for their specific context. This ensures accurate, reliable demand forecasts across all organizational levels, enhancing strategic decision-making and operational efficiency.
Probabilistic forecasting
Probabilistic forecasting not only provides forecasts but also accompanies each with a probability and a confidence interval, offering an uncertainty estimate for every prediction. It includes an option to configure these confidence intervals, allowing businesses to tailor the uncertainty bounds according to their specific risk tolerance and decision-making processes. This method acknowledges and quantifies the inherent uncertainty in forecasting, allowing businesses to make more informed decisions under varying conditions.
The essence of probabilistic forecasting lies in utilizing statistical models to generate probability distributions or confidence intervals for future outcomes. Techniques such as Monte Carlo simulations, Bayesian inference models, and ensemble methods are frequently used. These models excel in outlining a spectrum of possible outcomes and their probabilities, offering a comprehensive view of potential future scenarios.
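A minimal Monte Carlo style sketch of this idea resamples historical forecast errors around a point forecast and reads an interval off the simulated values. The data, the mean-based point forecast, and the 90% interval are assumptions chosen for illustration.

```python
import random
from statistics import mean, quantiles

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical weekly demand history.
history = [100, 96, 108, 103, 99, 111, 105, 98, 107, 102]
point_forecast = mean(history)
residuals = [x - point_forecast for x in history]

def simulate(point_forecast, residuals, n_paths=5000):
    """Draw simulated outcomes by adding resampled residuals to the forecast."""
    return [point_forecast + random.choice(residuals) for _ in range(n_paths)]

samples = simulate(point_forecast, residuals)

# quantiles(..., n=20) returns 19 cut points at 5%, 10%, ..., 95%.
cuts = quantiles(samples, n=20)
lower, median, upper = cuts[0], cuts[9], cuts[18]

print(f"point forecast: {point_forecast:.1f}")
print(f"90% interval: [{lower:.1f}, {upper:.1f}]")
```

Changing which cut points are read from `cuts` is the simplest way to make the interval width configurable.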
Probabilistic forecasting is particularly useful in inventory management, especially in situations marked by high volatility or seasonal demand fluctuations. Consider a retailer managing inventory for a seasonally popular product. Utilizing probabilistic forecasting, the retailer gains insight not only into the most likely demand level but also into the range of possible demand levels, bracketed by the upper and lower bounds of a confidence interval, as illustrated in Figure 7.
This nuanced understanding enables the retailer to make risk-aware inventory decisions. For instance, if the probability of higher demand is significant, the retailer might choose to increase stock levels, thereby mitigating the risk of stockouts during peak periods. On the other hand, the knowledge of potential lower demand scenarios helps in avoiding overstocking.
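To make the risk trade-off concrete, the sketch below turns a forecast distribution into a stock level for a chosen service level. The normal distribution, its parameters, and the function names are assumptions for illustration only.

```python
from statistics import NormalDist

# Assumed forecast distribution for peak-week demand (hypothetical numbers).
demand = NormalDist(mu=500, sigma=60)

def stock_for_service_level(demand, service_level):
    """Stock level that covers demand with probability `service_level`."""
    return demand.inv_cdf(service_level)

def stockout_probability(demand, stock):
    """Probability that demand exceeds the given stock level."""
    return 1 - demand.cdf(stock)

cautious = stock_for_service_level(demand, 0.95)  # accept more surplus risk
lean = stock_for_service_level(demand, 0.50)      # accept more stockout risk

print(f"95% service level: stock {cautious:.0f} units")
print(f"50% service level: stock {lean:.0f} units")
print(f"stockout risk at 550 units: {stockout_probability(demand, 550):.2f}")
```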
These probabilistic models empower retailers to adeptly navigate the complexities of inventory management. By grasping the full range of potential outcomes and their associated probabilities, businesses are better positioned to optimize their inventory, balancing the dual objectives of satisfying customer demand and minimizing the risk of surplus stock.
Cross-effects forecasting
Cross-effects forecasting is an advanced approach that considers the interdependencies and influences among products within a portfolio, as well as the impact of competitors’ actions. This method is particularly crucial for understanding complex market dynamics, including cannibalization, halo effects, and competitive influences.
Cannibalization occurs when a new product’s introduction leads to a decrease in the sales of an existing product within the same portfolio. This is common when both products target similar customer segments. Accurate forecasting and management of cannibalization are essential for optimizing overall demand and product life cycles.
In contrast, the halo effect describes a scenario where the success of one product positively influences the sales of other related products in the portfolio. This often happens when a flagship product enhances brand awareness or value, leading to increased demand for other products under the same brand umbrella.
Moreover, competitor actions can significantly impact product performance. For example, a competitor’s aggressive marketing campaign or the launch of a rival product can affect the demand for similar products in your portfolio. Understanding and forecasting these competitive effects are key to maintaining market position and making strategic decisions.
Cross-effects forecasting plays a vital role in retail, particularly in managing a product portfolio. When introducing a new product line, it’s essential to forecast not only the demand for this new line but also its impact on existing products. Consider a retailer launching a new range of eco-friendly household products. This approach can analyze and predict how this introduction might affect the sales of existing household items. Figure 8 showcases a stacked area plot, demonstrating forecasts with and without considering cannibalization and halo effects. This visualization highlights the critical impact these factors can have on demand forecasting.
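A deliberately naive way to quantify such effects is to compare average sales of existing products before and after a launch and apply the resulting ratio to a baseline forecast. The products, numbers, and function names below are hypothetical, and a plain before/after comparison ignores trend, seasonality, and other confounders that a production model would need to control for.

```python
from statistics import mean

# Hypothetical weekly sales of two existing products.
# The new eco-friendly line launches at week index 4.
launch_week = 4
sales = {
    "standard_detergent": [100, 102, 98, 100, 80, 82, 78, 80],  # possibly cannibalized
    "reusable_cloths": [40, 38, 42, 40, 48, 50, 46, 48],        # possible halo effect
}

def cross_effect(series, launch_week):
    """Relative change in average sales after the launch versus before it."""
    before = mean(series[:launch_week])
    after = mean(series[launch_week:])
    return after / before - 1

def adjust(baseline_forecast, effect):
    """Apply an estimated cross-effect to a forecast that ignores the launch."""
    return baseline_forecast * (1 + effect)

effects = {name: cross_effect(s, launch_week) for name, s in sales.items()}
for name, effect in effects.items():
    print(f"{name}: {effect:+.0%}, adjusted forecast {adjust(100.0, effect):.0f}")
```

A negative effect suggests cannibalization and a positive one suggests a halo effect, under the strong assumption that nothing else changed at launch.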
Incorporating cross-effects forecasting allows businesses to make well-informed decisions regarding product launches, marketing, and inventory management. It ensures a balanced and optimized product portfolio, aligned with market dynamics and consumer preferences.
Attribute-based forecasting
Attribute-based forecasting is an innovative method that utilizes detailed product attributes, such as text descriptions and images, to predict demand, especially for new or slow-moving products that lack historical sales data. By employing advanced technologies like text and image embedders, these attributes are transformed into quantifiable, vectorized data representations for analysis. This approach also considers attributes like product type, size, color, etc., to forecast demand based on the performance of similar products.
To find similar products in the embedding space, several similarity measurement techniques are utilized, such as cosine similarity, Euclidean distance, Jaccard similarity, and others. These techniques help in effectively identifying products with attributes that closely match those of the new product. Once similar products are identified, the typical approach is to average the forecasts of these products. The number of similar products considered, the K in K Nearest Neighbors (KNN), can be configured according to the business’s requirements. This averaged forecast is then commonly used as a surrogate to predict the demand for the new product, assuming that products with similar attributes will exhibit similar market performance.
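The nearest-neighbor averaging described above can be sketched as follows. The three-dimensional embeddings, product names, and forecasts are made up for illustration; real text or image embeddings would have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical catalog: product -> (embedding, weekly demand forecast).
catalog = {
    "usb_c_cable_1m": ([0.9, 0.1, 0.0], 120.0),
    "usb_c_cable_2m": ([0.8, 0.2, 0.1], 100.0),
    "hdmi_cable": ([0.5, 0.5, 0.1], 60.0),
    "desk_lamp": ([0.0, 0.1, 0.9], 30.0),
}

def knn_forecast(new_embedding, catalog, k=2):
    """Average the forecasts of the k most similar existing products."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(new_embedding, item[1][0]),
        reverse=True,
    )
    neighbors = ranked[:k]
    names = [name for name, _ in neighbors]
    forecast = sum(f for _, (_, f) in neighbors) / k
    return names, forecast

# Embedding of the new product (also hypothetical).
neighbors, forecast = knn_forecast([0.85, 0.15, 0.05], catalog, k=2)
print(neighbors, forecast)
```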
For instance, when a retailer is introducing a new USB cable, the approach can analyze its text descriptions and images, generating embeddings for the cable and comparing them with embeddings of existing products to find the most similar ones. Figure 9 showcases this process, illustrating how a search for products similar to the new USB cable is performed based on text and image embeddings. The figure also demonstrates how the average of the forecasts from these similar products is used to predict the demand for the new USB cable.
By employing attribute-based forecasting, retailers can make well-informed stocking and marketing decisions for new products, thereby reducing the uncertainties and risks typically associated with new product introductions.
Selecting the right ML technique and best practices
This chapter explores various techniques for demand sensing and forecasting, providing recommendations and best practices for their use.
Exponential Smoothing
Error-Trend-Seasonality (ETS) is an old time-series forecasting method that has been widely used for its simplicity and effectiveness in capturing seasonal patterns. However, in modern, more complex environments where there’s a lot of data and non-linear relationships are common, ETS’s simplicity can be a limitation. Its inability to incorporate external factors or adapt to rapid market changes makes it less advisable for contemporary demand forecasting.
The simplest of the ETS methods is called “Simple Exponential Smoothing” (SES), which is suitable for data with no trend or seasonal pattern. An alternative method for applying exponential smoothing while capturing trend and seasonality in the data is to use Holt-Winters’ Seasonal Method.
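Simple Exponential Smoothing is compact enough to write out directly. In this sketch the level is initialized with the first observation and `alpha` is fixed by hand; both are simplifying assumptions, since libraries usually estimate them from the data.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series.

    alpha in (0, 1] controls how strongly recent observations are weighted.
    """
    level = series[0]  # initialize the level with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Hypothetical demand history with no trend or seasonality.
demand = [10, 12, 11, 13]
forecast = simple_exponential_smoothing(demand, alpha=0.5)
print(forecast)
```

With `alpha=1.0` the forecast collapses to the last observation, and values close to 0 make it react very slowly to new data.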
ETS lacks advanced forecasting capabilities and it is not suitable for probabilistic forecasting, attribute-based forecasting, cross-effect forecasting, etc.
Benefits: Fast to train. High interpretability; decisions made by the model can be easily explained.
Disadvantages: Cannot cover non-linear complexities of relationships in data. Limited in handling external factors or sudden market changes. Potential loss of forecast accuracy due to model simplification.
ARIMA/SARIMA
Autoregressive Integrated Moving Average (ARIMA), and its seasonal counterpart SARIMA, have been staples in time-series forecasting. These models are powerful for data with a clear trend or seasonal pattern but require the data to be stationary, or to be made stationary through differencing, to function correctly. While they can model complex time-series datasets, their reliance on manual parameter selection (e.g., p, d, q values for ARIMA) can be cumbersome and not intuitive for users without statistical training. Additionally, they struggle with large datasets and external data factors, limiting their applicability in dynamic market conditions.
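To show what the p, d, q parameters control, here is a hand-rolled sketch of an ARIMA(1,1,0)-style forecast without a constant: one round of differencing (d=1), one autoregressive term (p=1), and no moving-average terms (q=0). The least-squares fit and the sample data are assumptions for illustration; statistical libraries typically use maximum likelihood and handle many more cases.

```python
def difference(series):
    """First-order differencing, the 'I' (integrated) step with d=1."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise."""
    num = sum(prev * cur for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den

def forecast_next(series):
    """One-step-ahead forecast: last value plus the predicted next difference."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]

# Hypothetical sales with growth that is slowing down.
sales = [100, 104, 106, 107, 107.5]
next_value = forecast_next(sales)
print(next_value)
```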
While ARIMA/SARIMA can be adapted for short-term forecasting, their traditional formulation may not quickly respond to the rapid changes typical in demand sensing scenarios.
These models are not inherently designed for hierarchical demand forecasting either, which involves forecasting demand at multiple aggregation levels within a hierarchical structure. Although ARIMA/SARIMA models can be applied to individual time series within the hierarchy, they do not directly capture the hierarchical relationships or ensure consistency across levels.
ARIMA/SARIMA models can support probabilistic demand forecasting because they produce uncertainty estimates, in the form of prediction intervals, alongside their point predictions.
However, they are not suitable for incorporating non-temporal features like text or image embeddings directly, making them less viable for attribute-based forecasting.
The linear nature and focus on individual time series of ARIMA/SARIMA models may not adequately capture the complex interdependencies between products or services, such as the cross-product effect phenomenon.
Benefits: Capable of modeling a wide range of time-series data. High accuracy for data with clear trends or seasonal patterns.
Disadvantages: Requires stationary data. Manual tuning of parameters can be complex and non-intuitive. Struggles with large datasets and incorporating external variables.
Linear Regression/GLM Regression
Linear Regression (LR) and Generalized Linear Models (GLM) regression are foundational techniques for demand forecasting, prized for their interpretability and ability to estimate causal effects. These models work well when relationships between the forecasted demand and independent variables are linear and well-understood. However, their simplicity can be a drawback in scenarios where the underlying relationships are complex and nonlinear. They are recommended for use cases where understanding the relationship between variables is more important than achieving the highest possible accuracy.
LR and GLM can be effective for demand sensing and forecasting when the relationship between demand and its predictors is relatively linear or can be transformed to be linear. Their linear nature limits the complexity of relationships they can model, making them less ideal for datasets with intricate patterns, high volatility, or significant nonlinear interactions.
Although these models are primarily used for point predictions, by applying MCMC (Markov Chain Monte Carlo) or bootstrapping, they can be utilized for probabilistic demand forecasting.
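The bootstrapping route can be sketched with a one-variable linear regression: refit the line on resampled data many times and read an interval from the spread of the resulting predictions. The price and sales data and the function names are hypothetical. Note that this pairs bootstrap captures uncertainty in the fitted line only; a full predictive interval would also need to account for residual noise.

```python
import random
from statistics import quantiles

random.seed(0)  # fixed seed so the sketch is reproducible

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def bootstrap_forecasts(xs, ys, x_new, n_boot=2000):
    """Refit the line on resampled (x, y) pairs and predict at x_new each time."""
    pairs = list(zip(xs, ys))
    predictions = []
    while len(predictions) < n_boot:
        sample = random.choices(pairs, k=len(pairs))
        sx, sy = zip(*sample)
        if len(set(sx)) < 2:  # skip degenerate resamples with one distinct x
            continue
        slope, intercept = fit_line(sx, sy)
        predictions.append(intercept + slope * x_new)
    return predictions

# Hypothetical price points and units sold.
prices = [10, 11, 12, 13, 14, 15, 16, 17]
units = [98, 93, 91, 84, 82, 74, 71, 67]

slope, intercept = fit_line(prices, units)
point = intercept + slope * 18
cuts = quantiles(bootstrap_forecasts(prices, units, x_new=18), n=20)
lower, upper = cuts[0], cuts[-1]  # 5% and 95% cut points
print(f"forecast at price 18: {point:.1f}, 90% interval [{lower:.1f}, {upper:.1f}]")
```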
However, due to their linear nature and assumptions, LR and GLM are limited in their effectiveness for complex cross-effects forecasting and are not good options for attribute-based forecasting, especially with high-dimensional text and image embeddings.
Benefits: Fast to train. High interpretability (model decisions can be easily explained).
Disadvantages: Cannot naturally cover seasonality or non-linear relationships. Simplification leads to potential loss of quality in forecasts.
Gradient Boosting Decision Tree (GBDT)
GBDT models, such as XGBoost, LightGBM, and CatBoost, have gained popularity for their ability to handle non-linear data and improve accuracy over linear models. By sequentially correcting errors from previous trees, GBDT can model complex patterns and interactions between variables. The trade-off, however, is a loss of interpretability compared to linear models. GBDT is advisable when the primary goal is accuracy, and the model’s complexity can be managed.
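The idea of sequentially correcting errors can be shown with a toy gradient boosting loop for squared error, using one-split trees (stumps) on a single feature. This is a teaching sketch under those assumptions, with made-up data; it is not how XGBoost, LightGBM, or CatBoost are implemented.

```python
def fit_stump(xs, residuals):
    """Find the single split that best fits the residuals under squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def fit_gbdt(xs, ys, n_trees=50, learning_rate=0.3):
    """Each new stump is fitted to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    trees = []
    preds = [base] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)

def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical step-like (non-linear) relationship between price tier and demand.
prices = [1, 2, 3, 4, 5, 6, 7, 8]
demand = [50, 48, 47, 30, 28, 27, 10, 9]

baseline = lambda x: sum(demand) / len(demand)  # constant prediction
boosted = fit_gbdt(prices, demand)
print(f"baseline MSE: {mse(baseline, prices, demand):.1f}")
print(f"boosted MSE:  {mse(boosted, prices, demand):.3f}")
```

The error here is measured on the training data, so a low value says nothing about overfitting; a held-out set would be needed for that.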
In comparison to previously mentioned models, GBDT can be an effective tool for accurate demand sensing and forecasting by leveraging lagged variables and external factors.
Although not designed for probabilistic forecasting, GBDT can be adapted for this purpose through techniques such as quantile regression, bootstrapping, model stacking, or parametric distribution fitting.
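Quantile regression works by replacing squared error with the quantile (pinball) loss, which penalizes over- and under-forecasting asymmetrically. The sketch below shows only the loss itself and checks, on made-up data, that the constant minimizing it is the corresponding empirical quantile. Plugging the loss into a GBDT library is not shown.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def best_constant(demand, q):
    """Observed value that minimizes the pinball loss when used as a constant forecast."""
    candidates = sorted(set(demand))
    return min(candidates, key=lambda c: pinball_loss(demand, [c] * len(demand), q))

# Hypothetical right-skewed demand history.
demand = [8, 10, 11, 12, 13, 15, 18, 20, 25, 40, 60]

p50 = best_constant(demand, 0.5)
p90 = best_constant(demand, 0.9)
print("median forecast:", p50)
print("90th percentile forecast:", p90)
```

Training one model per quantile level (for example 0.1, 0.5, and 0.9) is a common way to obtain an interval from a point-forecast model.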
GBDT models do not inherently support hierarchical structures but can be adapted for hierarchical forecasting using bottom-up, top-down, or middle-out approaches and post-hoc reconciliation.
They are highly effective for attribute-based forecasting due to their flexibility in handling various types of features and modeling complex relationships within the data. For high-dimensional attributes such as image or text embeddings, deep learning forecasting methods are a better option.
Benefits: Handles non-linear data well. Improves accuracy over linear models.
Disadvantages: Reduced interpretability compared to simpler models. Model complexity can lead to overfitting if not managed properly.
Prophet
Prophet is a forecasting tool designed by Facebook (now Meta) for handling time series data with ease, even when the data exhibits strong seasonal effects, irregularities, and missing values. It stands out for its user-friendly approach to forecasting, enabling analysts and developers to produce high-quality forecasts without needing deep expertise in time series analysis. Prophet utilizes a decomposable time series model with three main components: trend, seasonality, and holidays. It is built on an additive model that incorporates various components to model different aspects of time series data, making it particularly suited for forecasting tasks with complex seasonal patterns or when data includes holidays and other special events.
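The additive structure described above can be illustrated without the library itself. The sketch below hand-codes a linear trend, a sinusoidal weekly pattern, and a one-day holiday bump and sums them. All component shapes and numbers are assumptions for illustration; this does not use or reproduce Prophet's API, which fits these components from data.

```python
import math

def trend(t, base=100.0, slope=0.5):
    """Linear growth component."""
    return base + slope * t

def weekly_seasonality(t, amplitude=10.0):
    """Smooth pattern that repeats every 7 days."""
    return amplitude * math.sin(2 * math.pi * t / 7)

def holiday_effect(t, holidays):
    """Extra demand on specific days, zero otherwise."""
    return holidays.get(t, 0.0)

def additive_forecast(t, holidays):
    """Forecast = trend + seasonality + holiday effect."""
    return trend(t) + weekly_seasonality(t) + holiday_effect(t, holidays)

# Hypothetical holiday on day 14 adding 30 units of demand.
holidays = {14: 30.0}
forecast = [additive_forecast(t, holidays) for t in range(21)]
print([round(v, 1) for v in forecast])
```

Because the components are simply added, each can be inspected or adjusted separately, which is the source of the interpretability noted above.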
Prophet can be effectively used for both demand forecasting and demand sensing. Still, for modeling complex interdependencies, non-linear relationships, or high-frequency demand sensing, machine learning models (Random Forests, GBDT) or deep learning models might offer better performance.
Prophet can be utilized for hierarchical forecasting with some manual effort through independent forecasts at each hierarchical level followed by manual aggregation and reconciliation.
One of Prophet’s biggest strengths is its ability to generate forecasts that include uncertainty intervals, making it well-suited for probabilistic demand forecasting.
While Prophet offers a robust platform for time series forecasting with the capability to include additional regressors, its direct integration with image or text embeddings for attribute-based forecasting is limited. For applications where image data significantly influences demand predictions, exploring deep learning models that can natively integrate and learn from both time series and image data is more appropriate.
Directly modeling cross-effects such as cannibalization between products using Prophet is challenging, as Prophet is primarily designed for univariate time series forecasting.
Benefits: Handles missing data and trend changes well. Intuitive and easily adjustable for seasonality and holidays.
Disadvantages: May not perform as well with non-daily data or datasets lacking strong seasonal effects. Requires domain knowledge for setting holidays and seasonality adjustments.
Deep learning techniques
Deep learning techniques, such as Long Short-Term Memory (LSTM) networks and Transformer-based networks, have shown great promise in demand forecasting, especially in environments with large datasets or where capturing long-term dependencies is crucial. These models can automatically learn features from raw data, reducing the need for manual feature engineering. However, they require large amounts of data to train effectively and are computationally intensive. Deep learning is most effective in scenarios where traditional models fail to capture complex patterns or when the dataset is sufficiently large and rich.
Deep learning models can be effectively used for both demand sensing and demand forecasting.
Architectures like Recurrent Neural Networks (RNNs), LSTMs, and Temporal Convolutional Networks (TCNs) are particularly good at modeling multivariate time series data and can learn hierarchical relationships. For example, in hierarchical forecasting, RNNs and variants such as LSTMs and Gated Recurrent Units (GRUs) can be extended to handle multiple time series simultaneously, learning the temporal patterns at each level of the hierarchy. By representing the hierarchical structure as a graph, Graph Neural Networks (GNNs) can explicitly model the relationships between different nodes (e.g., products or regions) in a hierarchy. Attention-based models, particularly Transformers, can effectively handle long-range dependencies and complex relationships in data. For hierarchical forecasting, Transformers can be designed to attend to different parts of the hierarchy differently, learning which levels or nodes are most important for predicting demand at other levels.
Deep learning offers several methods for generating probabilistic forecasts: Bayesian Neural Networks (BNNs); Monte Carlo (MC) Dropout, which approximates Bayesian inference by keeping dropout active at prediction time; quantile regression incorporated into model training; and mixture density loss functions.
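Of these options, quantile regression is the easiest to show in isolation because it only changes the loss function. The sketch below implements the quantile (pinball) loss in plain Python and, as a stand-in for a trained network, searches for the constant forecast that minimizes it. The demand values and function names are hypothetical; in a real model the same loss would be minimized by gradient descent over the network weights.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Average quantile (pinball) loss for forecasts of a single quantile."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted
        # Under-forecasts are weighted by `quantile`, over-forecasts by `1 - quantile`.
        total += max(quantile * error, (quantile - 1.0) * error)
    return total / len(y_true)

# Hypothetical daily demand with one spike.
demand = [10, 12, 9, 30, 11, 13, 10, 12, 14, 11, 15]

def best_constant_forecast(values, quantile):
    """Pick the observed value that minimizes the pinball loss when used as a constant forecast."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda c: pinball_loss(values, [c] * len(values), quantile))

p50 = best_constant_forecast(demand, 0.5)
p90 = best_constant_forecast(demand, 0.9)
print(p50, p90)
```

Minimizing the loss at the 0.5 quantile recovers the median of the sample, while the 0.9 quantile lands higher. Training one output per quantile therefore gives a forecast interval rather than a single point.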
Some deep learning models can process sequential data from multiple series, learning the dynamic interactions over time. For forecasting multiple steps into the future, sequence-to-sequence (Seq2Seq) models with an encoder-decoder architecture can be trained to predict future sales of all products simultaneously, taking cross-effects into account in the prediction. In addition, specially designed loss functions that penalize inaccuracies in forecasting the target product while considering the impact of other products can be used to enhance model accuracy.
Deep learning models are well suited to attribute-based forecasting, especially when using the rich information contained in high-dimensional image or text embeddings.
Benefits: Capable of capturing complex, non-linear relationships. Ability to automatically extract and learn important features from raw data. Effective on large datasets.
Disadvantages: Requires significant amounts of data and computational resources. Lower interpretability compared to traditional models.
Vector Autoregression (VAR)
Vector Autoregression (VAR) is a statistical model used to capture the linear interdependencies among multiple time series. It’s particularly effective in situations where the variables influence each other over time. In the context of demand forecasting, VAR can be instrumental in understanding how variables such as sales, marketing spend, economic indicators, and other external factors interact and impact demand.
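As a rough illustration of these linear interdependencies, the sketch below rolls a first-order VAR forward for two series, where each next value is an intercept plus a weighted sum of the previous values of both series. The coefficients, the intercepts, and the choice of sales and marketing spend as the two series are assumptions for illustration; in practice the parameters would be estimated from stationary historical data.

```python
# Hypothetical VAR(1) parameters for two series: [weekly sales, marketing spend].
INTERCEPT = [20.0, 5.0]
COEFFICIENTS = [
    [0.6, 0.3],  # sales depends on its own lag and on lagged marketing spend
    [0.1, 0.5],  # marketing spend depends on lagged sales and on its own lag
]

def var1_step(previous):
    """Compute y_t = c + A * y_{t-1} for one time step."""
    return [
        INTERCEPT[i] + sum(COEFFICIENTS[i][j] * previous[j] for j in range(len(previous)))
        for i in range(len(INTERCEPT))
    ]

def var1_forecast(last_observation, steps):
    """Iterate the one-step equation to forecast several steps ahead."""
    path, current = [], list(last_observation)
    for _ in range(steps):
        current = var1_step(current)
        path.append(current)
    return path

forecast_path = var1_forecast([100.0, 40.0], steps=3)
for sales, spend in forecast_path:
    print(round(sales, 2), round(spend, 2))
```

The off-diagonal coefficients are what distinguish VAR from fitting each series separately: setting them to zero would reduce the model to two independent autoregressions.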
VAR can be a powerful tool for both demand sensing and forecasting when dealing with multiple interrelated time series, but its complexity and high dimensionality mean it is not always the best approach. For hierarchical demand forecasting, methods specifically designed to handle hierarchical structures, such as the Hierarchical Time Series (HTS) method, are generally more suitable than VAR.
Benefits: Captures relationships between multiple interrelated time series. Useful for multivariate time series forecasting.
Disadvantages: High dimensionality can lead to model complexity and overfitting. Requires stationary data, making preprocessing necessary.
Automated Machine Learning (AutoML)
AutoML platforms like Google Cloud’s AutoML and AWS SageMaker offer streamlined development pipelines, automating many of the tedious aspects of model training, such as feature selection and hyperparameter tuning. These tools can significantly accelerate development cycles, making advanced modeling techniques more accessible. However, they may lack flexibility, and the models they produce can suffer from issues related to accuracy, runtime, and costs. AutoML is best suited for scenarios where speed of deployment is critical, and the limitations regarding flexibility and cost are manageable.
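To give a sense of what such platforms automate, the sketch below performs a tiny hyperparameter search by hand: it tries several smoothing factors for simple exponential smoothing and keeps the one with the lowest holdout error. This is not how any particular AutoML product works. The model choice, the series, the grid, and the function names are all assumptions for illustration, and real platforms search over far larger spaces of models and features.

```python
def exponential_smoothing_forecast(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def holdout_error(series, alpha, holdout):
    """Mean absolute error of rolling one-step forecasts over the last `holdout` points."""
    errors = []
    for t in range(len(series) - holdout, len(series)):
        prediction = exponential_smoothing_forecast(series[:t], alpha)
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)

# Hypothetical weekly demand and candidate smoothing factors.
series = [100, 104, 101, 108, 112, 109, 115, 120, 118, 125, 129, 127]
grid = [0.1, 0.3, 0.5, 0.7, 0.9]

scores = {alpha: holdout_error(series, alpha, holdout=4) for alpha in grid}
best_alpha = min(scores, key=scores.get)
print(best_alpha, round(scores[best_alpha], 2))
```

Even this toy search retrains the model once per candidate and per holdout point, which hints at why runtime and cost grow quickly when the search is automated at scale.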
Benefits: Accelerates model development. Makes advanced modeling techniques more accessible.
Disadvantages: May lack flexibility in model customization. Potential issues with accuracy, runtime, and costs.
Conclusion
Selecting the right forecasting method depends on various factors, including the nature of the data, the business context, and the specific requirements of accuracy, interpretability, and computational resources. While no one-size-fits-all solution exists, understanding the strengths and limitations of each method can help businesses tailor their forecasting approaches to meet their unique challenges, ultimately leading to more informed decision-making and better outcomes.
Final words
Leveraging years of experience and numerous successful projects in demand forecasting, Grid Dynamics developed the Demand Sensing and Forecasting Starter Kit. This culmination of expertise and the best practices described in this blog post is designed to redefine how businesses approach demand forecasting. Dive into the future with our starter kit, and contact us today.