Forecasting (FCST)

Updated 2024.05.17

What is Forecasting?

Forecasting is AI content that analyzes time series data to predict future values. It learns patterns from historical data using prediction models and applies them to project future trends and values. This helps users understand how their data evolves and anticipate changes, supporting better decision-making.

When to use Forecasting?

Forecasting AI content is an effective tool applicable in sales, supply chain management (SCM), procurement, and marketing. Here are some use cases for each field:

Sales: Predict product demand to optimize inventory management and sales. Analyze customer purchase patterns to provide personalized services to individual customers.
Supply Chain Management (SCM): Manage inventory effectively through demand forecasting to reduce costs and improve productivity. Use prediction models to identify and mitigate supply chain risks.
Procurement: Forecast supplier capabilities to create effective procurement plans. Predict price fluctuations of raw materials or components to optimize procurement costs.
Marketing: Predict consumer purchase behavior and preferences to leverage targeted marketing, forecast advertising effectiveness, and optimize advertising budgets to improve marketing ROI.

Key Features

Forecasting is designed to deliver accurate predictions with modest training resources. Most data preprocessing and prediction steps are automated, so users with limited knowledge of time series analysis can apply and deploy prediction models quickly.

Automated Preprocessing and Modeling

Forecasting simplifies the complex preprocessing tasks required for time series data analysis. It selectively applies preprocessing tasks such as time data type conversion, handling duplicate values, and managing missing values, making it easy to prepare data suitable for time series models. This minimizes the time and effort spent on developing data preprocessing code, enabling users to build prediction models more efficiently and quickly.
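
To make the three preprocessing tasks named above concrete, here is a minimal sketch using only the Python standard library. It is not Forecasting's actual implementation: the function name `preprocess`, the keep-last rule for duplicates, and the forward-fill rule for missing values are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def preprocess(rows, time_format="%Y-%m-%d", step=timedelta(days=1)):
    """Hypothetical helper: rows is a list of (time_string, value or None)."""
    # 1) Time data type conversion: parse strings into datetime objects.
    parsed = [(datetime.strptime(t, time_format), v) for t, v in rows]

    # 2) Duplicate handling (assumed rule): keep the last record per timestamp.
    by_time = {}
    for ts, value in parsed:
        by_time[ts] = value

    # 3) Missing-value handling (assumed rule): walk the full time range and
    #    forward-fill timestamps that are absent or whose value is None.
    series = []
    current, end = min(by_time), max(by_time)
    last_value = None
    while current <= end:
        value = by_time.get(current)
        if value is None:
            value = last_value
        series.append((current, value))
        last_value = value
        current += step
    return series


raw = [
    ("2024-03-01", 5.0),
    ("2024-03-02", 6.0),
    ("2024-03-02", 6.5),   # duplicate timestamp
    ("2024-03-04", None),  # missing value; 2024-03-03 is absent entirely
]
clean = preprocess(raw)
for ts, value in clean:
    print(ts.date(), value)
```

The result is a regular daily series with one value per timestamp, which is the kind of input a time series model generally expects.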

Easy Analysis Reports

Forecasting provides interactive reports on automated training/inference results according to the AIOps pipeline. Users can perform quick data exploration based on these reports to rapidly identify data anomalies and review prediction results.

Fast Speed and Low Memory Requirements

Forecasting ensures efficient AI model operations through code optimization, providing models with excellent accuracy while guaranteeing fast speed and low memory usage.

Wide Range of Models (TBD)

Forecasting is built on Darts and is designed to offer multiple models. Currently, only the N-BEATS model (nbeats) is available; new models will be added through periodic updates. Users choose a model by name and do not need in-depth knowledge of Darts or N-BEATS.

Quick Start

Installation

Install ALO. Learn more: Start ALO
Download the contents using the git URL: https://github.com/mellerikat-aicontents/Forecasting.git
Installation code: git clone https://github.com/mellerikat-aicontents/Forecasting.git solution (Run this within the ALO installation folder)

Data Preparation

Prepare the train data file for training and the test data file for inference (if not conducting inference, the test data file is not needed).
Data files must contain the columns specified in the yaml file (y_column, time_column, groupkey_column, x_covariates, static_covariates). Only CSV files are currently supported. Learn more: Forecasting Data Structure
The time_column in the data file must have the same time_format as specified in the yaml file (e.g., 2023-03-01 -> %Y-%m-%d).
There should be no missing values in the time_column and groupkey_column.
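
The rules above can be checked before running the pipeline. The following is a minimal sketch, not part of Forecasting or ALO: the helper `validate_rows` is hypothetical, and the column names are borrowed from the sample settings (`log_date`, `shop_name`, `target`).

```python
import csv
import io
from datetime import datetime


def validate_rows(rows, time_column, time_format, groupkey_column=None):
    """Hypothetical helper: return a list of problems found in the rows."""
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        raw_time = (row.get(time_column) or "").strip()
        if not raw_time:
            problems.append(f"line {line_no}: missing {time_column}")
        else:
            try:
                # The value must parse with the time_format from the yaml file.
                datetime.strptime(raw_time, time_format)
            except ValueError:
                problems.append(
                    f"line {line_no}: {raw_time!r} does not match {time_format}"
                )
        if groupkey_column is not None:
            if not (row.get(groupkey_column) or "").strip():
                problems.append(f"line {line_no}: missing {groupkey_column}")
    return problems


# Illustrative data only: one wrongly formatted date, one empty group key.
sample_csv = """log_date,shop_name,target
2023-03-01,A,10
2023/03/02,A,12
2023-03-03,,11
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
problems = validate_rows(rows, "log_date", "%Y-%m-%d", groupkey_column="shop_name")
for problem in problems:
    print(problem)
```

For a real file, replace `io.StringIO(sample_csv)` with an opened CSV file and pass the same column names and time_format that you set in the yaml file.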

Example file for univariable/univariate (if time_column is 'day' and y_column is 'average temperature'):

| day | average temperature |
| --- | --- |
| 2024-03-01 | 5 |
| 2024-03-02 | 6 |
| ... | ... |

Example file for multivariable/univariate (if time_column is 'day', y_column is 'average temperature', and x_covariates are [minimum temperature, maximum temperature]):

| day | minimum temperature | maximum temperature | average temperature |
| --- | --- | --- | --- |
| 2024-03-01 | -1 | 8 | 5 |
| 2024-03-02 | 0 | 9 | 6 |
| ... | ... | ... | ... |

Essential Parameter Settings

Modify the following data paths in forecasting/experimental_plan.yaml to point to the files you want to test. If you only run training, you do not need to modify load_inference_data_path.
```
external_path:
    - load_train_data_path: ./sample_data/train_input_path/
    - load_inference_data_path: ./sample_data/inference_input_path/
```

Modify the essential fields in the readiness asset of the train_pipeline to match your data. The following settings are essential for training/inference. Set sample_frequency to the frequency at which the data is actually collected; supported values are yearly, monthly, weekly, daily, hourly, minutely, and secondly. If the data is organized by groups such as store names or region names, set groupkey_column to the column that distinguishes these groups. If there are variables that affect predictions, add them to x_covariates if they change over time, or to static_covariates if they do not.

    - step: readiness
      args:
        - y_column: target                     # (str), The target column to be predicted
          time_column: log_date                # (str), The column containing time data
          time_format: "%Y-%m-%d"              # (str), The format of the time data
          sample_frequency: daily              # (str), Data sampling frequency (yearly, monthly, weekly, daily, hourly, minutely, secondly)
          input_chunk_length: 6                # (int), Number of past data points the model uses as input
          forecast_periods: 3                  # (int), Length of forecast (number of data points)
          ## Add the following only if applicable.
          groupkey_column: shop_name           # (str), Group column name
          x_covariates: []                     # (list), Time-varying independent variables
          static_covariates: []                # (list), Group-dependent, time-invariant variables
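
To illustrate how input_chunk_length and forecast_periods relate to the data, the sketch below splits a series into pairs of past and future points. It rests on an assumption not confirmed on this page: that the model reads input_chunk_length consecutive points and predicts the next forecast_periods points. The helper `make_windows` is hypothetical and is not how Forecasting or Darts builds training samples internally.

```python
def make_windows(values, input_chunk_length, forecast_periods):
    """Hypothetical helper: split a series into (past, future) pairs."""
    windows = []
    total = input_chunk_length + forecast_periods
    # Slide one step at a time; each window needs `total` consecutive points.
    for start in range(len(values) - total + 1):
        past = values[start:start + input_chunk_length]
        future = values[start + input_chunk_length:start + total]
        windows.append((past, future))
    return windows


daily_target = list(range(1, 11))  # ten daily observations: 1..10
pairs = make_windows(daily_target, input_chunk_length=6, forecast_periods=3)
for past, future in pairs:
    print(past, "->", future)
```

Under this assumption, a series needs at least input_chunk_length + forecast_periods points (9 with the sample settings) to yield a single pair, so very short groups may not be usable.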

Modify the essential fields in the train asset of the train_pipeline to match your data. The following settings are essential for training/inference.
```
    - step: train
      args:
        - forecaster_name: nbeats                     # (str), The model to be used
```
Optional parameter settings are described on a separate page. Learn more: Forecasting Parameter

Execution

Run in terminal or Jupyter notebook. Learn more: Develop AI Solution
The execution results will include the trained model files, prediction results, and performance metrics.

Topics

Forecasting Version: 2.1.0, ALO Version: 2.3.1
