
FCST Parameter

Updated 2024.05.05

Overview of experimental_plan.yaml

To apply AI Contents to your data, you need to write the data information and the functions of the Contents you want to use in the experimental_plan.yaml file. When you install AI Contents in the solution folder, you can find the pre-written experimental_plan.yaml file under the contents folder. By entering 'data information' and modifying/adding 'user arguments' provided by each asset in this yaml file, you can generate a data analysis model with the desired settings using ALO.

Structure of experimental_plan.yaml

The experimental_plan.yaml contains various settings needed to run ALO. By modifying the 'data path' and 'user arguments' parts of these settings, you can use AI Contents immediately.

Entering Data Paths (`external_path`)

The external_path parameters specify the paths from which files are loaded and to which they are saved. If save_train_artifacts_path and save_inference_artifacts_path are left empty, the modeling artifacts are saved to the default train_artifacts and inference_artifacts folders.

external_path:
    - load_train_data_path: ./solution/sample_data/train
    - load_inference_data_path:  ./solution/sample_data/test
    - save_train_artifacts_path:
    - save_inference_artifacts_path:

Parameter Name	DEFAULT	Description and Options
load_train_data_path	./sample_data/train/	Path to the folder containing training data (no file names)
load_inference_data_path	./sample_data/test/	Path to the folder containing inference data (no file names)
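
As a hedged illustration of the path settings described above, the fragment below fills in explicit save paths instead of relying on the default folders. The two ./artifacts/... folders are hypothetical names chosen for this sketch; the load paths are the sample values shown above.

```yaml
external_path:
    - load_train_data_path: ./solution/sample_data/train
    - load_inference_data_path: ./solution/sample_data/test
    # Hypothetical folders for this sketch. Leave both keys empty to fall
    # back to the default train_artifacts / inference_artifacts folders.
    - save_train_artifacts_path: ./artifacts/train
    - save_inference_artifacts_path: ./artifacts/inference
```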

User Parameters (`user_parameters`)

The step under user_parameters represents the asset name. Below, step: input indicates the input asset step.
args represents the user arguments for the input asset (step: input). User arguments are data analysis-related settings provided by each asset. Refer to the User arguments explanation below for details.

user_parameters:
    - train_pipeline:
        - step: input
          args:
            - file_type
            ...
          ui_args:
            ...
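
A minimal sketch of a filled-in input step follows. It assumes that the arguments under args are written as key-value pairs inside a single list item; that layout is an assumption, so follow the pre-written experimental_plan.yaml if it differs. The values are the defaults listed in the summary table.

```yaml
user_parameters:
    - train_pipeline:
        - step: input
          args:
            # Assumed layout: one list item holding all key-value pairs.
            - file_type: csv      # default from the summary table
              encoding: utf-8     # default from the summary table
```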

Explanation of User Arguments

What are User Arguments?

User arguments are parameters that control each asset's operation, written under args in each asset step of experimental_plan.yaml. AI Contents provide user arguments so that various functions can be applied to the data. Users can modify and add user arguments to perform modeling that suits their data. User arguments are divided into "Required arguments," which are pre-written in experimental_plan.yaml, and "Custom arguments," which users can add by referring to the guide provided by each asset.

Required Arguments

Required arguments are basic arguments that are immediately visible in experimental_plan.yaml. Most Required arguments have default values. When a default value exists, the argument operates with that value and the user does not need to set it separately.
Among the Required arguments in experimental_plan.yaml, data-related arguments must be set by the user (e.g., y_column, time_column).

Custom Arguments

Custom arguments are not written in experimental_plan.yaml but are functionalities provided by the asset. Users can add these arguments under 'args' of each asset in experimental_plan.yaml.

The FCST pipeline consists of Input - Readiness - Preprocess - Modeling(train/inference) - Output assets, with user arguments structured differently according to each asset's function. First, use the Required user arguments written in experimental_plan.yaml, and then add user arguments to create an FCST model tailored to your data!
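
The following sketch shows one way a Custom argument might be added next to the Required arguments. The step name readiness and the column name store_id are assumptions made for illustration; the Required values are the defaults from the summary table.

```yaml
- step: readiness                # assumed step name for the Readiness asset
  args:
    # Required arguments, pre-written in experimental_plan.yaml
    - y_column: target
      time_column: time
      time_format: "%Y-%m-%d"
      sample_frequency: daily
      input_chunk_length: 6
      forecast_periods: 3
      # Custom argument added by the user (hypothetical column name)
      groupkey_column: store_id
```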

Summary of User Arguments

Below is a summary of the user arguments for FCST. Each argument is described in detail under its own name in the 'Detailed Explanation of User Arguments' section.

Default

The 'Default' field indicates the default value of the respective user argument.
If there is no default value, it is indicated as '-'.
If the default is determined by logic, it is indicated as 'Refer to the explanation'; see the detailed explanation of that argument.

ui_args

The 'ui_args' column in the table below indicates whether the ui_args function, which allows changing argument values in the AI Conductor UI, is supported.
O: If you enter the argument name under ui_args in experimental_plan.yaml, you can change the argument values in the AI Conductor UI.
X: ui_args functionality is not supported.
For a detailed explanation of ui_args, refer to the guide 'Write UI Parameter'.
The FCST experimental_plan.yaml pre-writes all potential ui_args for user arguments under ui_args_detail.
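
As a rough sketch of the 'O' case described above, the fragment below lists two argument names under ui_args so that they could be changed in the AI Conductor UI. The step name readiness and the plain-list form of ui_args are assumptions; the ui_args_detail entries that accompany them are not shown.

```yaml
- step: readiness          # assumed step name for the Readiness asset
  args:
    - y_column: target
      forecast_periods: 3
  ui_args:                 # assumed form: a list of argument names
    - y_column
    - forecast_periods
```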

Required User Settings

The 'Required User Settings' column in the table below indicates whether the user must check and change the user argument for the AI Contents to operate.
O: These are generally arguments related to the task and data that the user must check before modeling.
X: If the user does not change the value, modeling proceeds with the default value.

Asset Name	Argument Type	Argument Name	Default	Description	Required User Settings	ui_args
Input	Custom	file_type	csv	Enter the file extension of the input data.	O	X
Input	Custom	encoding	utf-8	Enter the encoding type of the input data.	X	X
Readiness	Required	y_column	target	Enter the name of the target column to predict.	O	O
Readiness	Required	time_column	time	Enter the name of the column containing time information.	O	O
Readiness	Required	time_format	"%Y-%m-%d"	Enter the format of the time information.	O	O
Readiness	Required	sample_frequency	daily	Enter the frequency of the time information. Available values: yearly, monthly, weekly, daily, hourly, minutely, secondly	O	O
Readiness	Required	input_chunk_length	6	Enter the length of the input time series for the model. Please enter the value based on the unit set in sample_frequency.	O	O
Readiness	Required	forecast_periods	3	Enter the length of the time series to predict for the model. Please enter the value based on the unit set in sample_frequency.	O	O
Readiness	Custom	groupkey_column	None	Enter the name of the column containing group key information, if available.	X	X
Readiness	Custom	x_covariates	[]	Enter a list of names of x columns that change over time, if available.	X	X
Readiness	Custom	static_covariates	[]	Enter a list of names of columns containing unique information for each group, such as franchise names or equipment types, if available.	X	X
Readiness	Custom	static_cov_unify_method	latest	If static_covariates are not the same within a group, unify them into one value. Choices are "oldest" (earliest value), "latest" (most recent value), "most_common" (most frequent value).	X	X
Preprocess	Custom	normalizing_method	minmax	Enter the data normalization method.	X	X
Preprocess	Custom	encoding_method	onehot	Enter the encoding method for categorical variables.	X	X
Preprocess	Custom	linear_interpolation	False	If True, linear interpolation will fill in any missing values within the time series data for each group. Recommended only if missing values in the middle of data are a problem.	X	X
Preprocess	Custom	global_padding_interpolation	False	If True, pads the time series data for each group to match the minimum and maximum time indices. Recommended only if the start and end times for each group should be identical.	X	X
Preprocess	Custom	global_padding_method	zero	Enter the padding method: "zero" for zero padding, "mean" for padding with the group mean value, "same" for padding with the earliest and latest values in the group.	X	X
Preprocess	Custom	global_time_index_begin	None	Enter the minimum time index, if available. If blank, it defaults to the minimum time index in all groups. Must match the time_format in readiness.	X	X
Preprocess	Custom	global_time_index_end	None	Enter the maximum time index, if available. If blank, it defaults to the maximum time index in all groups. Must match the time_format in readiness.	X	X
Preprocess	Custom	outlier_smoothing	False	If True, detects outliers in x covariates for each group using the isolationforest method and replaces them with the previous values. Recommended only if outliers affect prediction.	X	X
Preprocess	Custom	isolationforest_contamination	0.001	The proportion of outliers in the entire time series for the isolationforest model. Typically, values between 0 and 0.3 are used.	X	X
Preprocess	Custom	expand_features	False	If True, generates features for x covariates using the tsfresh package. Recommended for machine learning models, depending on resource availability.	X	X
Preprocess	Custom	expand_method	minimal	Enter the feature generation method in the tsfresh package: "minimal" for statistical features only, "comprehensive" for all features.	X	X
Preprocess	Custom	ensure_stationarity	False	If True, checks the stationarity of x covariates and transforms them by taking the square root and first difference if not stationary. Recommended for machine learning models.	X	X
Train	Required	forecaster_name	nbeats	Select the model to use for forecasting. Available value: nbeats.	O	X
Train	Custom	do_validation	True	Whether to split off validation data for performance evaluation. Select False if there are too many group keys.	X	X
Train	Custom	cv_numbers	1	The number of divisions for cross-validation. Recommended to set to 1 for experiments.	X	X
Train	Custom	full_train	True	If do_validation is True, whether to train the final model on the entire data. Set to True to reflect the latest trends in the final model.	X	X
Train	Custom	optimize_parameters	False	Whether to run hyper-parameter optimization. Recommended to set to False considering running time if there is a lot of data.	X	X
Train	Custom	use_gpu	False	Whether to use GPU. Recommended to set to True if there is a lot of data and GPU is available.	X	X
Train	Custom	memory_check	False	Whether to check memory usage during training and inference.	X	X
Train	Custom	runtime_check	False	Whether to measure the execution time during training. Memory checking itself affects runtime, so set memory_check to False when measuring.	X	X
Train	Custom	metric_to_compare	mae	Evaluation metric. Available values: mae, mape, smape, mse, rmse, r2_score	X	X
Train	Custom	model_parameters	{nbeats: {"n_epochs": 2, "batch_size": 800,...}}	Parameters related to model training. If not set, the model is trained with default parameters. See the detailed parameter explanation below.	X	X

Detailed Explanation of User Arguments

Input Asset

file_type

Enter the file extension of the input data.

Argument type: Custom
Input type: string
Available values:
- csv (default)
Usage:
- file_type : csv
ui_args: X

encoding

Enter the encoding type of the input data.

Argument type: Custom
Input type: string
Available values:
- utf-8 (default)
Usage:
- encoding : utf-8
ui_args: X

Readiness Asset

y_column

Enter the name of the target column to predict.

Argument type: Required
Input type: string
Available values:
- '' (default)
Usage:
- y_column : target
ui_args: O

time_column

Enter the name of the column containing time information.

Argument type: Required
Input type: string
Available values:
- '' (default)
Usage:
- time_column : time
ui_args: O

time_format

Enter the format of the time information.

Argument type: Required
Input type: string
Available values:
- "%Y-%m-%d" (default)
Usage:
- time_format : "%Y-%m-%d"
ui_args: O

sample_frequency

Enter the frequency of the time information.

Argument type: Required
Input type: string
Available values:
- daily (default)
- yearly
- monthly
- weekly
- hourly
- minutely
- secondly
Usage:
- sample_frequency : daily
ui_args: O
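
For instance, assuming the time column holds hourly timestamps such as 2024-05-05 13:00:00, and assuming time_format accepts the strftime-style codes that the default value suggests, the three time-related arguments might be set together as follows. The column name timestamp is hypothetical.

```yaml
args:
  - time_column: timestamp                 # hypothetical column name
    time_format: "%Y-%m-%d %H:%M:%S"       # quoted so YAML keeps it a string
    sample_frequency: hourly
```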

input_chunk_length

Enter the length of the input time series for the model. Please enter the value based on the unit set in sample_frequency.

Argument type: Required
Input type: integer
Available values:
- 6 (default)
Usage:
- input_chunk_length : 6
ui_args: O

forecast_periods

Enter the length of the time series to predict for the model. Please enter the value based on the unit set in sample_frequency.

Argument type: Required
Input type: integer
Available values:
- 3 (default)
Usage:
- forecast_periods : 3
ui_args: O
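
Because both lengths are counted in sample_frequency units, the same numbers mean different time spans at different frequencies. A small sketch with illustrative values (not recommendations):

```yaml
args:
  - sample_frequency: weekly
    input_chunk_length: 12    # the model looks back over 12 weeks
    forecast_periods: 4       # and predicts the following 4 weeks
```

With sample_frequency set to daily, the same values would instead mean 12 days of input and 4 days of forecast.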

groupkey_column

Enter the name of the column containing group key information, if available.

Argument type: Custom
Input type: string
Available values:
- None (default)
Usage:
- groupkey_column : region
ui_args: X

x_covariates

Enter a list of names of x columns that change over time, if available.

Argument type: Custom
Input type: list
Available values:
- [] (default)
Usage:
- x_covariates : []
ui_args: X

static_covariates

Enter a list of names of columns containing unique information for each group, such as franchise names or equipment types, if available.

Argument type: Custom
Input type: list
Available values:
- [] (default)
Usage:
- static_covariates : []
ui_args: X

static_cov_unify_method

If static_covariates are not the same within a group, unify them into one value. Choices are "oldest" (earliest value), "latest" (most recent value), "most_common" (most frequent value).

Argument type: Custom
Input type: string
Available values:
- latest (default)
Usage:
- static_cov_unify_method : latest
ui_args: X
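
The group-related arguments are typically set together. The sketch below assumes a hypothetical dataset with one time series per store; every column name in it is invented for illustration.

```yaml
args:
  - groupkey_column: store_id                  # one series per store (hypothetical)
    x_covariates: [price, promotion]           # columns that change over time
    static_covariates: [region, store_type]    # fixed information per group
    # If a store has conflicting region/store_type values,
    # keep the most frequent one.
    static_cov_unify_method: most_common
```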

Preprocess Asset

normalizing_method

Enter the data normalization method.

Argument type: Custom
Input type: string
Available values:
- minmax (default)
- z-norm
Usage:
- normalizing_method : minmax
ui_args: X

encoding_method

Enter the encoding method for categorical variables.

Argument type: Custom
Input type: string
Available values:
- onehot (default)
- label
Usage:
- encoding_method : onehot
ui_args: X

linear_interpolation

If True, linear interpolation will fill in any missing values within the time series data for each group. Recommended only if missing values in the middle of data are a problem.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- linear_interpolation : False
ui_args: X

global_padding_interpolation

If True, pads the time series data for each group to match the minimum and maximum time indices. Recommended only if the start and end times for each group should be identical.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- global_padding_interpolation : False
ui_args: X

global_padding_method

Enter the padding method: "zero" for zero padding, "mean" for padding with the group mean value, "same" for padding with the earliest and latest values in the group.

Argument type: Custom
Input type: string
Available values:
- zero (default)
- mean
- same
Usage:
- global_padding_method : zero
ui_args: X

global_time_index_begin

Enter the minimum time index, if available. If blank, it defaults to the minimum time index in all groups. Must match the time_format in readiness.

Argument type: Custom
Input type: string
Available values:
- None (default)
Usage:
- global_time_index_begin : None
ui_args: X

global_time_index_end

Enter the maximum time index, if available. If blank, it defaults to the maximum time index in all groups. Must match the time_format in readiness.

Argument type: Custom
Input type: string
Available values:
- None (default)
Usage:
- global_time_index_end : None
ui_args: X
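
The four padding arguments work as a set. The following sketch, with an assumed step name and illustrative dates, pads every group to a common calendar year using the group mean. The dates are quoted so that YAML keeps them as strings in the same form as the default time_format.

```yaml
- step: preprocess                          # assumed step name for the Preprocess asset
  args:
    - global_padding_interpolation: True
      global_padding_method: mean           # zero | mean | same
      global_time_index_begin: "2023-01-01" # must match time_format ("%Y-%m-%d")
      global_time_index_end: "2023-12-31"
```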

outlier_smoothing

If True, detects outliers in x covariates for each group using the isolationforest method and replaces them with the previous values. Recommended only if outliers affect prediction.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- outlier_smoothing : False
ui_args: X

isolationforest_contamination

The proportion of outliers in the entire time series for the isolationforest model. Typically, values between 0 and 0.3 are used.

Argument type: Custom
Input type: float
Available values:
- 0.001 (default)
Usage:
- isolationforest_contamination : 0.001
ui_args: X
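
As an illustration of how the two outlier arguments relate, the fragment below enables smoothing and raises the expected outlier share from the default 0.1% to 1%. The value 0.01 is only an example, not a recommendation.

```yaml
args:
  - outlier_smoothing: True
    isolationforest_contamination: 0.01   # expect about 1% of points to be outliers
```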

expand_features

If True, generates features for x covariates using the tsfresh package. Recommended for machine learning models, depending on resource availability.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- expand_features : False
ui_args: X

expand_method

Enter the feature generation method in the tsfresh package: "minimal" for statistical features only, "comprehensive" for all features.

Argument type: Custom
Input type: string
Available values:
- minimal (default)
- comprehensive
Usage:
- expand_method : minimal
ui_args: X

ensure_stationarity

If True, checks the stationarity of x covariates and transforms them by taking the square root and first difference if not stationary. Recommended for machine learning models.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- ensure_stationarity : False
ui_args: X
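
The feature-related options above might be combined as in the sketch below. The combination is illustrative only; whether it helps depends on the data and on the available resources.

```yaml
args:
  - expand_features: True
    expand_method: minimal        # statistical features only; "comprehensive" generates all features
    ensure_stationarity: True     # transform non-stationary x covariates
```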

Train Asset

forecaster_name

Select the model to use for forecasting. More models supported by Darts will be added in the future.

Argument type: Required
Input type: string
Available values:
- nbeats (default)
Usage:
- forecaster_name : nbeats
ui_args: X

do_validation

Whether to split off validation data for performance evaluation. Select False if there are too many group keys.

Argument type: Custom
Input type: string
Available values:
- True (default)
- False
Usage:
- do_validation : True
ui_args: X

cv_numbers

The number of divisions for cross-validation. Recommended to set to 1 for experiments.

Argument type: Custom
Input type: integer
Available values:
- 1 (default)
Usage:
- cv_numbers : 1
ui_args: X

full_train

If do_validation is True, whether to train the final model on the entire data. Set to True to reflect the latest trends in the final model.

Argument type: Custom
Input type: string
Available values:
- True (default)
- False
Usage:
- full_train : True
ui_args: X

optimize_parameters

Whether to run hyper-parameter optimization. Recommended to set to False considering running time if there is a lot of data.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- optimize_parameters : False
ui_args: X

use_gpu

Whether to use GPU. Recommended to set to True if there is a lot of data and GPU is available.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- use_gpu : False
ui_args: X

memory_check

Whether to check memory usage during training and inference.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- memory_check : False
ui_args: X

runtime_check

Whether to measure the execution time during training. Memory checking itself affects runtime, so set memory_check to False when measuring execution time.

Argument type: Custom
Input type: string
Available values:
- False (default)
- True
Usage:
- runtime_check : False
ui_args: X
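
Following the note above, a timing run might be configured as in this short sketch:

```yaml
args:
  - memory_check: False    # kept off so memory tracking does not distort the timings
    runtime_check: True
```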

metric_to_compare

Evaluation metric used for performance comparison.

Argument type: Custom
Input type: string
Available values:
- mae (default)
- mape
- smape
- mse
- rmse
- r2_score
Usage:
- metric_to_compare : mae
ui_args: X
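
The validation-related arguments of the Train asset could be combined as sketched below. The choice of smape is illustrative; the other values are the documented defaults.

```yaml
args:
  - do_validation: True       # hold out validation data for evaluation
    cv_numbers: 1             # a single split, as recommended for experiments
    full_train: True          # retrain the final model on all data afterwards
    metric_to_compare: smape  # one of: mae, mape, smape, mse, rmse, r2_score
```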

model_parameters

Parameters related to model training. If not set, the model is trained with default parameters. See the detailed parameter explanation below.

Argument type: Custom
Input type: dictionary
Available values:
- {nbeats: {"n_epochs": 2, "batch_size": 800,...}} (default)
Usage:
- model_parameters : {nbeats: {"n_epochs": 2, "batch_size": 800, "dropout": 0, "generic_architecture": True, "num_stacks": 30, "num_blocks": 1, "num_layers": 4, "layer_widths": 256, "activation": "ReLU", "expansion_coefficient_dim": 128, "trend_polynomial_degree": 4}}
ui_args: X
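
The inline dictionary in the usage line can be hard to read. Since YAML flow and block mappings are equivalent, the same values could be written in block style as in the sketch below. The step name train is an assumption, and the values are copied from the usage line above.

```yaml
- step: train                     # assumed step name for the Train asset
  args:
    - forecaster_name: nbeats
      model_parameters:
        nbeats:
          n_epochs: 2
          batch_size: 800
          dropout: 0
          generic_architecture: True
          num_stacks: 30
          num_blocks: 1
          num_layers: 4
          layer_widths: 256
          activation: ReLU
          expansion_coefficient_dim: 128
          trend_polynomial_degree: 4
```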

FCST Version: 2.1.0
