FCST Parameter

Updated 2024.05.05

Overview of experimental_plan.yaml

To apply AI Contents to your data, describe the data and the Contents functions you want to use in the experimental_plan.yaml file. When you install AI Contents in the solution folder, a pre-written experimental_plan.yaml is placed under the contents folder. By entering the 'data information' and modifying or adding the 'user arguments' provided by each asset in this YAML file, you can use ALO to build a data analysis model with the settings you want.

Structure of experimental_plan.yaml

The experimental_plan.yaml contains various settings needed to run ALO. By modifying the 'data path' and 'user arguments' parts of these settings, you can use AI Contents immediately.

Entering Data Paths (external_path)

  • The external_path parameters specify the paths from which files are loaded and to which they are saved. If save_train_artifacts_path and save_inference_artifacts_path are left empty, the modeling artifacts are saved to the default train_artifacts and inference_artifacts folders.
external_path:
- load_train_data_path: ./solution/sample_data/train
- load_inference_data_path: ./solution/sample_data/test
- save_train_artifacts_path:
- save_inference_artifacts_path:
| Parameter Name | Default | Description and Options |
| --- | --- | --- |
| load_train_data_path | ./sample_data/train/ | Path to the folder containing training data (no file names) |
| load_inference_data_path | ./sample_data/test/ | Path to the folder containing inference data (no file names) |

User Parameters (user_parameters)

  • The step under user_parameters represents the asset name. Below, step: input indicates the input asset step.
  • args represents the user arguments for the input asset (step: input). User arguments are data analysis-related settings provided by each asset. Refer to the User arguments explanation below for details.
user_parameters:
    - train_pipeline:
        - step: input
          args:
            - file_type
            ...
          ui_args:
            ...

Explanation of User Arguments

What are User Arguments?

User arguments are parameters that control each asset's operation, written under args in each asset step of experimental_plan.yaml. AI Contents provide user arguments so that various functions can be applied to the data. Users can modify and add user arguments to perform modeling that suits their data. User arguments are divided into "Required arguments," which are pre-written in experimental_plan.yaml, and "Custom arguments," which users can add by referring to the guide provided by each asset.

Required Arguments

  • Required arguments are basic arguments that are immediately visible in experimental_plan.yaml. Most Required arguments have default values. If default values exist, the user does not need to set values separately for the arguments to operate with the default values.
  • Among the Required arguments in experimental_plan.yaml, data-related arguments must be set by the user (e.g., y_column, time_column).

Custom Arguments

  • Custom arguments are not written in experimental_plan.yaml but are functionalities provided by the asset. Users can add these arguments under 'args' of each asset in experimental_plan.yaml.

The FCST pipeline consists of the Input, Readiness, Preprocess, Modeling (train/inference), and Output assets, and each asset provides user arguments suited to its function. Start with the Required user arguments already written in experimental_plan.yaml, then add Custom arguments to build an FCST model tailored to your data.


Summary of User Arguments

Below is a summary of the user arguments for FCST. Click on the 'Argument Name' to go to the detailed explanation of the respective arguments.

Default

  • The 'Default' field indicates the default value of the respective user argument.
  • If there is no default value, it is indicated as '-'.
  • If there is logic in the default, it is indicated as 'Refer to the explanation'. Click on the 'Argument Name' for detailed explanation.

ui_args

  • The 'ui_args' column in the table below indicates whether the ui_args function, which allows changing argument values in the AI Conductor UI, is supported.
  • O: If you enter the argument name under ui_args in experimental_plan.yaml, you can change the argument values in the AI Conductor UI.
  • X: ui_args functionality is not supported.
  • For a detailed explanation of ui_args, refer to the "Write UI Parameter" guide.
  • The FCST experimental_plan.yaml pre-writes all potential ui_args for user arguments under ui_args_detail.

Required User Settings

  • The 'Required User Settings' column in the table below indicates whether the user must check and change the user argument for the AI Contents to operate.
  • O: These are generally arguments related to the task and data that the user must check before modeling.
  • X: If the user does not change the value, modeling proceeds with the default value.
| Asset Name | Argument Type | Argument Name | Default | Description | Required User Settings | ui_args |
| --- | --- | --- | --- | --- | --- | --- |
| Input | Custom | file_type | csv | Enter the file extension of the input data. | O | X |
| Input | Custom | encoding | utf-8 | Enter the encoding type of the input data. | X | X |
| Readiness | Required | y_column | target | Enter the name of the target column to predict. | O | O |
| Readiness | Required | time_column | time | Enter the name of the column containing time information. | O | O |
| Readiness | Required | time_format | "%Y-%m-%d" | Enter the format of the time information. | O | O |
| Readiness | Required | sample_frequency | daily | Enter the frequency of the time information. Available values: yearly, monthly, weekly, daily, hourly, minutely, secondly | O | O |
| Readiness | Required | input_chunk_length | 6 | Enter the length of the input time series for the model. Please enter the value based on the unit set in sample_frequency. | O | O |
| Readiness | Required | forecast_periods | 3 | Enter the length of the time series to predict for the model. Please enter the value based on the unit set in sample_frequency. | O | O |
| Readiness | Custom | groupkey_column | None | Enter the name of the column containing group key information, if available. | X | X |
| Readiness | Custom | x_covariates | [] | Enter a list of names of x columns that change over time, if available. | X | X |
| Readiness | Custom | static_covariates | [] | Enter a list of names of columns containing unique information for each group, such as franchise names or equipment types, if available. | X | X |
| Readiness | Custom | static_cov_unify_method | latest | If static_covariates are not the same within a group, unify them into one value. Choices are "oldest" (earliest value), "latest" (most recent value), "most_common" (most frequent value). | X | X |
| Preprocess | Custom | normalizing_method | minmax | Enter the data normalization method. | X | X |
| Preprocess | Custom | encoding_method | onehot | Enter the encoding method for categorical variables. | X | X |
| Preprocess | Custom | linear_interpolation | False | If True, linear interpolation will fill in any missing values within the time series data for each group. Recommended only if missing values in the middle of data are a problem. | X | X |
| Preprocess | Custom | global_padding_interpolation | False | If True, pads the time series data for each group to match the minimum and maximum time indices. Recommended only if the start and end times for each group should be identical. | X | X |
| Preprocess | Custom | global_padding_method | zero | Enter the padding method: "zero" for zero padding, "mean" for padding with the group mean value, "same" for padding with the earliest and latest values in the group. | X | X |
| Preprocess | Custom | global_time_index_begin | None | Enter the minimum time index, if available. If blank, it defaults to the minimum time index in all groups. Must match the time_format in readiness. | X | X |
| Preprocess | Custom | global_time_index_end | None | Enter the maximum time index, if available. If blank, it defaults to the maximum time index in all groups. Must match the time_format in readiness. | X | X |
| Preprocess | Custom | outlier_smoothing | False | If True, detects outliers in x covariates for each group using the isolationforest method and replaces them with the previous values. Recommended only if outliers affect prediction. | X | X |
| Preprocess | Custom | isolationforest_contamination | 0.001 | The proportion of outliers in the entire time series for the isolationforest model. Typically, values between 0 and 0.3 are used. | X | X |
| Preprocess | Custom | expand_features | False | If True, generates features for x covariates using the tsfresh package. Recommended for machine learning models, depending on resource availability. | X | X |
| Preprocess | Custom | expand_method | minimal | Enter the feature generation method in the tsfresh package: "minimal" for statistical features only, "comprehensive" for all features. | X | X |
| Preprocess | Custom | ensure_stationarity | False | If True, checks the stationarity of x covariates and transforms them by taking the square root and first difference if not stationary. Recommended for machine learning models. | X | X |
| Train | Required | forecaster_name | nbeats | Select the model to use for forecasting. Available value: nbeats. | O | X |
| Train | Custom | do_validation | True | Whether to divide evaluation data for performance evaluation. Select False if there are too many group keys. | X | X |
| Train | Custom | cv_numbers | 1 | The number of divisions for cross-validation. Recommended to set to 1 for experiments. | X | X |
| Train | Custom | full_train | True | If do_validation is True, whether to train the final model on the entire data. Set to True to reflect the latest trends in the final model. | X | X |
| Train | Custom | optimize_parameters | False | Whether to run hyper-parameter optimization. Recommended to set to False considering running time if there is a lot of data. | X | X |
| Train | Custom | use_gpu | False | Whether to use GPU. Recommended to set to True if there is a lot of data and GPU is available. | X | X |
| Train | Custom | memory_check | False | Whether to check memory usage during training and inference. | X | X |
| Train | Custom | runtime_check | False | Whether to check the execution time during training. Memory checking affects runtime, so set memory_check to False when this is enabled. | X | X |
| Train | Custom | metric_to_compare | mae | Evaluation metric. Available values: mae, mape, smape, mse, rmse, r2_score | X | X |
| Train | Custom | model_parameters | {nbeats: {"n_epochs": 2, "batch_size": 800,...}} | Parameters related to model training. If not set, the model is trained with default parameters. See the detailed parameter explanation below. | X | X |

Detailed Explanation of User Arguments

Input Asset

file_type

Enter the file extension of the input data.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • csv (default)
  • Usage:
    • file_type : csv
  • ui_args: X

encoding

Enter the encoding type of the input data.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • utf-8 (default)
  • Usage:
    • encoding : utf-8
  • ui_args: X

Readiness Asset

y_column

Enter the name of the target column to predict.

  • Argument type: Required
  • Input type: string
  • Available values:
    • '' (default)
  • Usage:
    • y_column : target
  • ui_args: O

time_column

Enter the name of the column containing time information.

  • Argument type: Required
  • Input type: string
  • Available values:
    • '' (default)
  • Usage:
    • time_column : time
  • ui_args: O

time_format

Enter the format of the time information.

  • Argument type: Required
  • Input type: string
  • Available values:
    • "%Y-%m-%d" (default)
  • Usage:
    • time_format : "%Y-%m-%d"
  • ui_args: O
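The default value "%Y-%m-%d" looks like a Python strftime/strptime format string. Assuming FCST interprets time_format with those directives (an assumption, not something this page states), a quick way to check that a candidate format matches your data before running the pipeline is sketched below. The helper name matches_time_format is hypothetical and is not part of FCST.

```python
from datetime import datetime


def matches_time_format(value: str, time_format: str) -> bool:
    """Return True if `value` parses completely with `time_format`.

    Hypothetical helper for checking your own data; not part of FCST.
    """
    try:
        datetime.strptime(value, time_format)
        return True
    except ValueError:
        # Raised when the string does not match or has leftover characters.
        return False


# A date-only value matches the default format.
print(matches_time_format("2024-05-05", "%Y-%m-%d"))  # True
# A different layout does not.
print(matches_time_format("05/05/2024", "%Y-%m-%d"))  # False
# Hourly data would need a format that also covers the time of day.
print(matches_time_format("2024-05-05 13:00:00", "%Y-%m-%d %H:%M:%S"))  # True
```

Running a check like this over a few values from time_column can catch a mismatched format early.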

sample_frequency

Enter the frequency of the time information.

  • Argument type: Required
  • Input type: string
  • Available values:
    • yearly
    • monthly
    • weekly
    • daily (default)
    • hourly
    • minutely
    • secondly
  • Usage:
    • sample_frequency : daily
  • ui_args: O

input_chunk_length

Enter the length of the input time series for the model. Please enter the value based on the unit set in sample_frequency.

  • Argument type: Required
  • Input type: integer
  • Available values:
    • 6 (default)
  • Usage:
    • input_chunk_length : 6
  • ui_args: O

forecast_periods

Enter the length of the time series to predict for the model. Please enter the value based on the unit set in sample_frequency.

  • Argument type: Required
  • Input type: integer
  • Available values:
    • 3 (default)
  • Usage:
    • forecast_periods : 3
  • ui_args: O
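To make the relationship between input_chunk_length and forecast_periods concrete, the sketch below slices a series into (input, target) pairs using the default values 6 and 3. It only illustrates what the two lengths mean; how FCST or the underlying forecaster actually builds its training samples is not described on this page, and the function name make_windows is hypothetical.

```python
def make_windows(series, input_chunk_length, forecast_periods):
    """Slice a series into (input, target) pairs.

    Illustrative only: each pair holds `input_chunk_length` past values
    followed by the next `forecast_periods` values to predict.
    """
    windows = []
    last_start = len(series) - input_chunk_length - forecast_periods
    for start in range(last_start + 1):
        split = start + input_chunk_length
        windows.append((series[start:split], series[split:split + forecast_periods]))
    return windows


# 12 observations; with sample_frequency "daily" these would be 12 days.
daily_values = list(range(1, 13))
windows = make_windows(daily_values, input_chunk_length=6, forecast_periods=3)

print(len(windows))  # 4
print(windows[0])    # ([1, 2, 3, 4, 5, 6], [7, 8, 9])
```

With daily data, these settings mean six days of history are used to predict the following three days. A series shorter than input_chunk_length + forecast_periods yields no windows in this sketch.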

groupkey_column

Enter the name of the column containing group key information, if available.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • None (default)
  • Usage:
    • groupkey_column : region
  • ui_args: X

x_covariates

Enter a list of names of x columns that change over time, if available.

  • Argument type: Custom
  • Input type: list
  • Available values:
    • [] (default)
  • Usage:
    • x_covariates : []
  • ui_args: X

static_covariates

Enter a list of names of columns containing unique information for each group, such as franchise names or equipment types, if available.

  • Argument type: Custom
  • Input type: list
  • Available values:
    • [] (default)
  • Usage:
    • static_covariates : []
  • ui_args: X

static_cov_unify_method

If the static_covariates values are not the same within a group, they are unified into one value. Choices are "oldest" (earliest value), "latest" (most recent value), and "most_common" (most frequent value).

  • Argument type: Custom
  • Input type: string
  • Available values:
    • latest (default)
    • oldest
    • most_common
  • Usage:
    • static_cov_unify_method : latest
  • ui_args: X
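The three unification choices can be illustrated with a small sketch. It assumes the values of one static covariate within a group are already sorted by time; the function unify_static_covariate is hypothetical and only mirrors the descriptions above, not FCST's internal code.

```python
from collections import Counter


def unify_static_covariate(values, method="latest"):
    """Collapse time-ordered values of one static covariate into one value.

    Hypothetical illustration of "oldest", "latest", and "most_common".
    """
    if not values:
        raise ValueError("values must not be empty")
    if method == "oldest":
        return values[0]
    if method == "latest":
        return values[-1]
    if method == "most_common":
        return Counter(values).most_common(1)[0][0]
    raise ValueError(f"unknown method: {method}")


# Equipment type recorded for one group over time (oldest first).
equipment_types = ["A", "B", "B", "C"]

print(unify_static_covariate(equipment_types, "oldest"))       # A
print(unify_static_covariate(equipment_types, "latest"))       # C
print(unify_static_covariate(equipment_types, "most_common"))  # B
```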

Preprocess Asset

normalizing_method

Enter the data normalization method.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • minmax (default)
    • z-norm
  • Usage:
    • normalizing_method : minmax
  • ui_args: X
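As a reminder of what the two options usually mean, the sketch below applies the textbook min-max and z-score formulas to a short list. This is an assumption about the general techniques the names refer to; the exact variant FCST uses (for example, population versus sample standard deviation) is not stated on this page.

```python
from statistics import fmean, pstdev


def minmax_scale(values):
    """Rescale values to the range [0, 1]: (v - min) / (max - min)."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        return [0.0 for _ in values]  # constant series: avoid division by zero
    return [(v - low) / span for v in values]


def z_norm_scale(values):
    """Standardize values: (v - mean) / std, using the population std."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


values = [10.0, 20.0, 30.0, 40.0]
print(minmax_scale(values))
print([round(z, 4) for z in z_norm_scale(values)])
```

Min-max scaling bounds the output between 0 and 1, while z-score scaling centers it at 0 with unit variance, which is typically less sensitive to a single extreme value setting the range.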

encoding_method

Enter the encoding method for categorical variables.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • onehot (default)
    • label
  • Usage:
    • encoding_method : onehot
  • ui_args: X

linear_interpolation

If True, linear interpolation will fill in any missing values within the time series data for each group. Recommended only if missing values in the middle of data are a problem.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • linear_interpolation : False
  • ui_args: X
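The effect of linear interpolation on gaps in the middle of a series can be sketched as follows. The example assumes equally spaced time steps and marks missing values with None; it is a minimal illustration of the technique, not FCST's implementation, and the function name is hypothetical.

```python
def interpolate_linear(values):
    """Fill None gaps that lie between two known values.

    Assumes equally spaced time steps. Leading and trailing gaps are left
    as None because there is no value on both sides to interpolate from.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            result[i] = result[left] + weight * (result[right] - result[left])
    return result


series = [1.0, None, None, 4.0, None]
print(interpolate_linear(series))
```

Here the two missing values between 1.0 and 4.0 are filled on a straight line, while the trailing gap stays missing.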

global_padding_interpolation

If True, pads the time series data for each group to match the minimum and maximum time indices. Recommended only if the start and end times for each group should be identical.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • global_padding_interpolation : False
  • ui_args: X

global_padding_method

Enter the padding method: "zero" for zero padding, "mean" for padding with the group mean value, "same" for padding with the earliest and latest values in the group.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • zero (default)
    • mean
    • same
  • Usage:
    • global_padding_method : zero
  • ui_args: X
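The three padding methods can be compared on a toy group that starts two steps after the global minimum time index and ends one step before the global maximum. The sketch follows the descriptions above; the function pad_group and its parameters n_before and n_after are hypothetical names used only for illustration.

```python
from statistics import fmean


def pad_group(values, n_before, n_after, method="zero"):
    """Pad one group's series at both ends.

    Hypothetical illustration: `n_before` and `n_after` are the number of
    missing time steps before and after the group's observed range.
    """
    if method == "zero":
        head = tail = 0.0
    elif method == "mean":
        head = tail = fmean(values)
    elif method == "same":
        head, tail = values[0], values[-1]  # earliest and latest values
    else:
        raise ValueError(f"unknown method: {method}")
    return [head] * n_before + list(values) + [tail] * n_after


group = [2.0, 4.0, 6.0]
print(pad_group(group, 2, 1, "zero"))  # [0.0, 0.0, 2.0, 4.0, 6.0, 0.0]
print(pad_group(group, 2, 1, "mean"))  # [4.0, 4.0, 2.0, 4.0, 6.0, 4.0]
print(pad_group(group, 2, 1, "same"))  # [2.0, 2.0, 2.0, 4.0, 6.0, 6.0]
```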

global_time_index_begin

Enter the minimum time index, if available. If blank, it defaults to the minimum time index in all groups. Must match the time_format in readiness.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • None (default)
  • Usage:
    • global_time_index_begin : None
  • ui_args: X

global_time_index_end

Enter the maximum time index, if available. If blank, it defaults to the maximum time index in all groups. Must match the time_format in readiness.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • None (default)
  • Usage:
    • global_time_index_end : None
  • ui_args: X

outlier_smoothing

If True, detects outliers in x covariates for each group using the isolationforest method and replaces them with the previous values. Recommended only if outliers affect prediction.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • outlier_smoothing : False
  • ui_args: X

isolationforest_contamination

The proportion of outliers in the entire time series for the isolationforest model. Typically, values between 0 and 0.3 are used.

  • Argument type: Custom
  • Input type: float
  • Available values:
    • 0.001 (default)
  • Usage:
    • isolationforest_contamination : 0.001
  • ui_args: X

expand_features

If True, generates features for x covariates using the tsfresh package. Recommended for machine learning models, depending on resource availability.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • expand_features : False
  • ui_args: X

expand_method

Enter the feature generation method in the tsfresh package: "minimal" for statistical features only, "comprehensive" for all features.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • minimal (default)
    • comprehensive
  • Usage:
    • expand_method : minimal
  • ui_args: X

ensure_stationarity

If True, checks the stationarity of x covariates and transforms them by taking the square root and first difference if not stationary. Recommended for machine learning models.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • ensure_stationarity : False
  • ui_args: X
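The transformation named above (square root, then first difference) can be sketched on a toy series. The stationarity check itself is omitted because this page does not say which test FCST uses; the function name sqrt_then_difference is hypothetical.

```python
import math


def sqrt_then_difference(values):
    """Take the square root of each value, then the first difference.

    Illustrative only. The output is one element shorter than the input
    because differencing consumes the first observation.
    """
    if any(v < 0 for v in values):
        raise ValueError("square root requires non-negative values")
    roots = [math.sqrt(v) for v in values]
    return [later - earlier for earlier, later in zip(roots, roots[1:])]


# A quadratically growing covariate becomes constant after the transform.
covariate = [1.0, 4.0, 9.0, 16.0]
print(sqrt_then_difference(covariate))  # [1.0, 1.0, 1.0]
```

The square root dampens growing variance and the first difference removes the trend, which is why the pair is a common way to push a series toward stationarity.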

Train Asset

forecaster_name

Select the model to use for forecasting. More models supported by Darts will be added in the future.

  • Argument type: Required
  • Input type: string
  • Available values:
    • nbeats (default)
  • Usage:
    • forecaster_name : nbeats
  • ui_args: X

do_validation

Whether to set aside part of the data as evaluation data for performance evaluation. Select False if there are too many group keys.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • True (default)
    • False
  • Usage:
    • do_validation : True
  • ui_args: X

cv_numbers

The number of divisions for cross-validation. Recommended to set to 1 for experiments.

  • Argument type: Custom
  • Input type: integer
  • Available values:
    • 1 (default)
  • Usage:
    • cv_numbers : 1
  • ui_args: X

full_train

If do_validation is True, whether to train the final model on the entire data. Set to True to reflect the latest trends in the final model.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • True (default)
    • False
  • Usage:
    • full_train : True
  • ui_args: X

optimize_parameters

Whether to run hyper-parameter optimization. Recommended to set to False considering running time if there is a lot of data.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • optimize_parameters : False
  • ui_args: X

use_gpu

Whether to use GPU. Recommended to set to True if there is a lot of data and GPU is available.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • use_gpu : False
  • ui_args: X

memory_check

Whether to check memory usage during training and inference.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • memory_check : False
  • ui_args: X

runtime_check

Whether to check the execution time during training. Memory checking affects runtime, so set memory_check to False when measuring execution time.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • False (default)
    • True
  • Usage:
    • runtime_check : False
  • ui_args: X

metric_to_compare

Enter the evaluation metric used to compare model performance.

  • Argument type: Custom
  • Input type: string
  • Available values:
    • mae (default)
    • mape
    • smape
    • mse
    • rmse
    • r2_score
  • Usage:
    • metric_to_compare : mae
  • ui_args: X
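For reference, the sketch below computes the six listed metrics with their common textbook definitions. These formulas are an assumption: FCST may scale or define some of them differently (sMAPE in particular has several variants), so treat the numbers as illustrative.

```python
import math


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    # Percentage error relative to the actual value; undefined if an actual is 0.
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    # Symmetric variant: error relative to the average magnitude of both values.
    terms = [2 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)]
    return 100 * sum(terms) / len(actual)


def r2_score(actual, predicted):
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
for metric in (mae, mape, smape, mse, rmse, r2_score):
    print(metric.__name__, round(metric(actual, predicted), 4))
```

Lower is better for every metric here except r2_score, where values closer to 1 indicate a better fit. mape and smape are scale-free, which can matter when groups differ widely in magnitude.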

model_parameters

Parameters related to model training. If not set, the model is trained with default parameters. See the detailed parameter explanation below.

  • Argument type: Custom
  • Input type: dictionary
  • Available values:
    • {nbeats: {"n_epochs": 2, "batch_size": 800,...}} (default)
  • Usage:
    • model_parameters : {nbeats: {"n_epochs": 2, "batch_size": 800, "dropout": 0, "generic_architecture": True, "num_stacks": 30, "num_blocks": 1, "num_layers": 4, "layer_widths": 256, "activation": "ReLU", "expansion_coefficient_dim": 128, "trend_polynomial_degree": 4}}
  • ui_args: X

FCST Version: 2.1.0