FCST Features
Feature Overview
FCST Pipeline
The FCST AI Contents pipeline is built from assets, each of which is a functional unit. The training pipeline and the inference pipeline each consist of five assets.
Train Pipeline
Input - Readiness - Preprocess - Train - Output
Inference Pipeline
Input - Readiness - Preprocess - Inference - Output
Each step in a pipeline corresponds to one asset.
Input Asset
For training, Forecasting AI receives tabular time series data that contains the time path and the target label. The input asset reads these data files and passes them on to the assets that follow. As shown in the diagram, the data reaches the train asset in the training pipeline and the inference asset in the inference pipeline.
Readiness Asset
Checks the quality of the Ground Truth data. If there are missing values in the time path or target label, a warning is displayed and the missing values are excluded before proceeding with training.
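The check described above can be sketched as follows. This is a minimal illustration, not the FCST implementation: the function name drop_incomplete_rows, the sample column names date and sales, and the rule that an empty string counts as missing are all assumptions made for the example.

```python
import warnings


def drop_incomplete_rows(rows, time_column, y_column):
    """Return rows whose time and target values are both present.

    A value counts as missing when it is None or an empty/whitespace string
    (an assumption for this sketch).
    """
    def is_missing(value):
        return value is None or str(value).strip() == ""

    kept = [
        row for row in rows
        if not is_missing(row.get(time_column)) and not is_missing(row.get(y_column))
    ]
    dropped = len(rows) - len(kept)
    if dropped:
        # Warn, then continue with the remaining rows.
        warnings.warn(f"{dropped} row(s) with a missing time or target value were excluded")
    return kept


# Hypothetical sample data.
rows = [
    {"date": "2024-01-01", "sales": "10"},
    {"date": "", "sales": "12"},            # missing time value
    {"date": "2024-01-03", "sales": None},  # missing target value
    {"date": "2024-01-04", "sales": "15"},
]
clean_rows = drop_incomplete_rows(rows, time_column="date", y_column="sales")
```

Only the first and last rows remain in clean_rows, and a single warning reports the two excluded rows.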
Preprocess Asset
Performs the preprocessing needed for time series prediction. It selectively applies tasks such as time data type conversion, duplicate value handling, and missing value handling to prepare the data for the time series model.
Modeling (Train) Asset
The train asset in Forecasting AI performs several tasks. It first reads the time series data passed from the input asset and organizes the data based on the time path. Then, it creates and trains the model according to the model name and parameters specified in the experimental plan. Once training is complete, it saves the performance and prediction results from the training process.
Modeling (Inference) Asset
The inference asset reads the inference data paths received from the input asset, reads the data, loads the best model saved in the previous training phase, performs the prediction, and saves the results. In the experimental stage, if the file contains ground truth labels, it also saves the performance metrics.
Output Asset
The output asset handles the results needed for each project. Currently, if the inference data includes a single prediction, the result contains the predicted value. For multiple predictions, the result contains the predicted values for each type. The result field can store up to 32 characters.
Usage Tips
Choosing Data Paths
Forecasting AI receives data inputs in CSV file format. Specify the data path and CSV file name in the configuration file (experimental_plan.yaml). ALO copies the CSV files to the analysis environment. The data should include the time path (time_column), independent variables (x_covariates, static_covariates), and the target variable (y_column). If the data is organized into groups, specify the group column (groupkey_column) to perform predictions for each group.
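To illustrate the column roles described above, the sketch below parses CSV text, checks that the required columns exist, and splits the rows by group. It is a hedged example using only the Python standard library; load_groups, the "__all__" fallback key, and the sample columns date, store, and sales are hypothetical and are not part of FCST or ALO.

```python
import csv
import io
from collections import defaultdict


def load_groups(csv_text, time_column, y_column, groupkey_column=None):
    """Parse CSV text, verify required columns, and split rows by group."""
    reader = csv.DictReader(io.StringIO(csv_text))
    required = [time_column, y_column] + ([groupkey_column] if groupkey_column else [])
    missing = [name for name in required if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")

    groups = defaultdict(list)
    for row in reader:
        # Without a group column, all rows form a single series.
        key = row[groupkey_column] if groupkey_column else "__all__"
        groups[key].append(row)
    return dict(groups)


# Hypothetical sample data with one group column.
sample_csv = """date,store,sales
2024-01-01,A,10
2024-01-01,B,7
2024-01-02,A,12
"""
groups = load_groups(sample_csv, "date", "sales", groupkey_column="store")
```

Here groups holds one list of rows per store, which mirrors the idea of predicting each group separately.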
Selecting a Model
Currently, nbeats is the only supported model; more models will be added in future updates. To select a prediction model, modify the forecaster_name field of the train asset in the experimental_plan.yaml file. Because different models may perform better on different datasets, compare several models once more become available.
Choosing Preprocessing Options
Forecasting AI offers preprocessing options for handling missing values, duplicate values, and similar data issues. Choose the options that match your data to improve model accuracy.
Improving Performance and Resource Efficiency
Increasing the number of epochs can improve performance on large or complex datasets, at the cost of longer training time. When you have a large amount of data and sufficient resources, increasing batch_size can speed up training and may also improve model performance. Adjust train_ratio to balance the amount of data used for training and validation.
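The effect of train_ratio can be pictured with a small sketch. It assumes the split is chronological (the earliest part of the series is used for training and the rest for validation), which is a common choice for time series but is not confirmed by this document; chronological_split is a hypothetical helper.

```python
def chronological_split(series, train_ratio):
    """Split an ordered series into (train, validation) without shuffling."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1 (exclusive)")
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


values = list(range(10))  # stand-in for ten time-ordered observations
train, valid = chronological_split(values, train_ratio=0.8)
```

With ten observations and a ratio of 0.8, the first eight values go to training and the last two to validation.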
Understanding and Utilizing Prediction Results
The prediction results from the inference asset are saved by the output asset. The results include predicted values, which can be used to make better decisions. Both single and multiple predictions are supported.
Detailed Features
This section describes each feature of Forecasting AI in detail.
Train Pipeline: Input Asset
Read Time Series Data
Reads the time series data file and identifies the time path and target label.
Train Pipeline: Readiness Asset
Check Data Quality
Checks for missing values in the time path and target label columns of the time series data file. If there are missing values, it displays a warning and excludes the rows with missing values before proceeding with the training pipeline.
Train Pipeline: Preprocess Asset
Preprocess Time Series Data
Performs the necessary preprocessing, such as time data type conversion, duplicate value handling, and missing value handling, to prepare the data for time series modeling.
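The three preprocessing tasks can be sketched in a few lines. The specific policies here are assumptions chosen for illustration, not the behavior of the FCST preprocess asset: time strings are parsed with a fixed format, the first record is kept for each duplicate timestamp, and missing target values are forward-filled. The function name and the sample columns are hypothetical.

```python
from datetime import datetime


def preprocess(rows, time_column, y_column, time_format="%Y-%m-%d"):
    """Convert time strings, drop duplicate timestamps, forward-fill the target."""
    # 1) Time data type conversion, then sort by time (sorted() is stable).
    parsed = sorted(
        ({**row, time_column: datetime.strptime(row[time_column], time_format)}
         for row in rows),
        key=lambda r: r[time_column],
    )
    seen = set()
    result = []
    last_value = None
    for row in parsed:
        # 2) Duplicate handling: keep the first record for each timestamp.
        if row[time_column] in seen:
            continue
        seen.add(row[time_column])
        # 3) Missing value handling: forward fill from the previous value.
        value = row[y_column]
        if value in (None, ""):
            value = last_value  # stays None if there is no earlier value
        else:
            value = float(value)
        last_value = value
        result.append({**row, y_column: value})
    return result


# Hypothetical sample data: unsorted, one duplicate, one missing target.
rows = [
    {"date": "2024-01-02", "sales": ""},
    {"date": "2024-01-01", "sales": "10"},
    {"date": "2024-01-01", "sales": "11"},
    {"date": "2024-01-03", "sales": "14"},
]
cleaned = preprocess(rows, "date", "sales")
```

The result contains one row per date, in time order, with the missing value on 2024-01-02 filled from the previous day.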
Train Pipeline: Train Asset
Initialize Model
Creates the nbeats model as specified in the forecaster_name field of the experimental plan. Additional models will be supported in future updates.
Fit Model on Training Data
Trains the model for a specified number of epochs using the training dataset. During training, it saves the model with the best performance based on validation data.
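Keeping the best model during training generally follows the pattern below. This is a generic sketch, not the FCST training loop: it assumes "best performance" means the lowest validation loss, and train_with_best_checkpoint, toy_train_step, and toy_validate are hypothetical stand-ins for the real model and data.

```python
import copy


def train_with_best_checkpoint(model_state, train_step, validate, epochs):
    """Run `epochs` training steps and return the state with the lowest validation loss."""
    best_loss = float("inf")
    best_state = copy.deepcopy(model_state)
    for _ in range(epochs):
        model_state = train_step(model_state)
        loss = validate(model_state)
        if loss < best_loss:
            # Snapshot the state so later epochs cannot overwrite it.
            best_loss = loss
            best_state = copy.deepcopy(model_state)
    return best_state, best_loss


# Toy stand-ins: the "model" is a single weight that approaches 3.0 and then overshoots.
def toy_train_step(state):
    return {"w": state["w"] + 1.0}


def toy_validate(state):
    return abs(state["w"] - 3.0)


best_state, best_loss = train_with_best_checkpoint(
    {"w": 0.0}, toy_train_step, toy_validate, epochs=5
)
```

Although training runs for five epochs, the returned state is the one from the third epoch, where the validation loss was lowest.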
Evaluate Score and Save Outputs
Calculates various performance metrics for the validation dataset and saves a summary of the performance. Saves the prediction results (prediction.csv) for the training dataset and the confusion matrix (confusion.csv) showing the true and predicted classes for the training dataset.
Inference Pipeline: Input Asset
Read Inference File or Generate It
Reads the inference file if available. If only images are provided, it generates the inference file based on the image paths.
Inference Pipeline: Inference Asset
Predict Time Series Data
Reads the data based on the time path in the inference file and uses the saved model from the training phase to perform predictions.
Inference Pipeline: Output Asset
Save Inference Results
Saves the inference results, which include the predicted values. If there are no ground truth labels, performance metrics are not saved.
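The rule that metrics are saved only when ground truth labels exist can be sketched as follows. Mean absolute error is used purely as an example metric; the document does not state which metrics FCST computes, and evaluate_if_labeled is a hypothetical helper.

```python
def evaluate_if_labeled(predictions, actuals=None):
    """Return the mean absolute error when ground truth is supplied, otherwise None."""
    if actuals is None:
        return None  # no ground truth labels: nothing to score
    if len(actuals) != len(predictions):
        raise ValueError("predictions and actuals must have the same length")
    if not predictions:
        return None
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(predictions)


# Hypothetical predictions, scored with and without ground truth.
mae = evaluate_if_labeled([10.0, 12.0, 14.0], actuals=[11.0, 12.0, 16.0])
no_score = evaluate_if_labeled([10.0, 12.0, 14.0])
```

The first call yields a score because labels are present; the second returns None, so there would be no performance result to save.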
FCST Version: 2.1.0