Building the Pipeline
To register user-written modeling code as an AI Solution, the code must first be converted into the ALO format. This conversion requires seven major modifications.
Follow the guide below to convert the modeling code for creating an AI Solution.
Developing the Pipeline
This section explains the process of converting user-written modeling code into the ALO format. Users do not need a deep understanding of ALO v3; the conversion requires only minimal modifications to the logic code.
1. Add a pipeline argument to the function definition
Contents such as the data storage path are provided through the pipeline, along with logger functionality.
Example)
def preprocess(): → def preprocess(pipeline: dict): # Add pipeline to all functions
The logger functionality provided by ALO can be used as follows.
Example) The string "train" below can be replaced with any message the user wants to log.
...
logger = pipeline['logger'] # ALO syntax
logger.debug("train") # "train" can be modified
...
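The step above can be sketched end to end. This is a minimal, hedged example: the real pipeline dict is built by ALO at runtime, so the simulated dict here (carrying a standard Python logger under the 'logger' key) is an assumption for local illustration only.

```python
import logging

def preprocess(pipeline: dict):
    # ALO provides the logger through the pipeline dict
    logger = pipeline['logger']
    logger.debug("preprocess")  # the message text is up to the user
    return {"output": "preprocess is done"}

# Simulated pipeline for local testing; ALO builds the real one at runtime
pipeline = {'logger': logging.getLogger("alo_demo")}
result = preprocess(pipeline)
```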
2. Replace hard-coded file paths in the existing code with paths received through the pipeline
Example: Existing code → ALO format code
def train(pipeline: dict):
...
pd.read_csv('a.csv') → pd.read_csv(pipeline['dataset']['workspace'] + "/a.csv")
...
def inference(pipeline: dict):
...
pd.read_csv('b.csv') → pd.read_csv(pipeline['dataset']['workspace'] + "/b.csv")
...
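A small sketch of this path handling, assuming a simulated pipeline dict (ALO supplies the real workspace path at runtime). Using os.path.join avoids missing-slash bugs when concatenating the workspace and file name.

```python
import os

def train(pipeline: dict):
    # Build the data path from the workspace ALO provides instead of hard-coding it
    csv_path = os.path.join(pipeline['dataset']['workspace'], 'a.csv')
    return csv_path  # real code would call pd.read_csv(csv_path)

# Simulated pipeline; the workspace value here is a placeholder
pipeline = {'dataset': {'workspace': '/tmp/alo/train/dataset'}}
csv_path = train(pipeline)
```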
3. To pass values between functions, use return values
Example) A value returned by function a is read in function b as pipeline['a']['result']. Supported return types include dict, str, int, etc.
def preprocess(pipeline: dict):
logger = pipeline['logger']
logger.info("preprocess.")
logger.info(".")
return {"output": "preprocess is done"}
def train(pipeline: dict, x_columns=[], y_column=None, n_estimators=100):
logger = pipeline['logger']
logger.debug("train")
...
preprocess_check = pipeline['preprocess']['result'] # {"output": "preprocess is done"}
...
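The hand-off above can be simulated locally. This sketch assumes ALO stores each function's return value under pipeline[&lt;function name&gt;]['result'], as described in the example; the manual bookkeeping below stands in for what ALO does between steps.

```python
def preprocess(pipeline: dict):
    return {"output": "preprocess is done"}

def train(pipeline: dict):
    # Read what preprocess returned; ALO exposes it under pipeline['preprocess']['result']
    preprocess_check = pipeline['preprocess']['result']
    return {"previous_step": preprocess_check["output"]}

pipeline = {}
# Simulate what ALO does after each function runs: store its return value
pipeline['preprocess'] = {'result': preprocess(pipeline)}
train_result = train(pipeline)
```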
4. Modify the part where the model is saved and loaded
Example) The method depends on the type of model.
### If the model is not pickle ###
def train(pipeline: dict):
...
# model save
model_path = pipeline['model']['workspace']
tf.saved_model.save(model, model_path) # e.g., for a TensorFlow model
...
def inference(pipeline: dict):
...
# model load
model_path = pipeline['model']['workspace']
model = tf.saved_model.load(model_path)
### If the model is pickle ###
def train(pipeline: dict):
...
# model save
pipeline['model']['file_name'] = model
...
def inference(pipeline: dict):
...
# model load
model = pipeline['model']['file_name']
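As a hedged sketch of the save/load round trip, the example below pickles a stand-in model object into a workspace directory. The temporary directory and the 'model.pkl' file name are illustration-only assumptions; in ALO, the workspace path comes from pipeline['model']['workspace'] and the pickle case is handled through pipeline['model'] as shown above.

```python
import os
import pickle
import tempfile

def train(pipeline: dict):
    model = {"weights": [0.1, 0.2]}  # stand-in for a real model object
    model_path = os.path.join(pipeline['model']['workspace'], 'model.pkl')
    with open(model_path, 'wb') as f:
        pickle.dump(model, f)

def inference(pipeline: dict):
    model_path = os.path.join(pipeline['model']['workspace'], 'model.pkl')
    with open(model_path, 'rb') as f:
        return pickle.load(f)

workspace = tempfile.mkdtemp()  # simulated model workspace
pipeline = {'model': {'workspace': workspace}}
train(pipeline)
restored = inference(pipeline)
```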
5. Return the variables you want to save in the result
Example) inference function return
return {
'extraOutput': '',
'summary': {
'result': f"#survived:{num_survived} / #total:{num_total}",
'score': round(survival_ratio, 3),
'note': "Score means titanic survival ratio",
'probability': {"dead": avg_proba_dead, "survived": avg_proba_survived}
}
}
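Building that return value can be sketched as a small helper. The field names follow the structure above; the numeric inputs (342 survivors of 891, and the average probabilities) are made-up sample values for illustration.

```python
def build_summary(num_survived, num_total, avg_proba_dead, avg_proba_survived):
    # Assemble the inference result in the shape ALO expects
    survival_ratio = num_survived / num_total
    return {
        'extraOutput': '',
        'summary': {
            'result': f"#survived:{num_survived} / #total:{num_total}",
            'score': round(survival_ratio, 3),
            'note': "Score means titanic survival ratio",
            'probability': {"dead": avg_proba_dead, "survived": avg_proba_survived},
        },
    }

out = build_summary(342, 891, 0.62, 0.38)
```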
6. (Optional) Arguments written in the experimental_plan.yaml file can be used in the function
Example) Usage of x_columns
### General usage ###
## titanic.py ##
def train(pipeline: dict, x_columns=[], y_column=None, n_estimators=100):
...
X = pd.get_dummies(df[x_columns])
...
### Using x_columns written in yaml file ###
## experimental_plan.yaml ##
train:
def: titanic.train
argument:
x_columns: [ 'Pclass', 'Sex', 'SibSp', 'Parch']
y_column: Survived
## titanic.py ##
def train(pipeline: dict, x_columns=[], y_column=None, n_estimators=100):
...
X = pd.get_dummies(df[pipeline['train']['argument']['x_columns']])
...
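The argument lookup can be sketched without pandas. This assumes the yaml arguments are exposed at pipeline[&lt;step&gt;]['argument'], as in the example above; the single-row dataset below is a stand-in for the real DataFrame.

```python
def train(pipeline: dict, x_columns=[], y_column=None):
    # Read x_columns from the yaml-declared arguments instead of the default
    x_columns = pipeline['train']['argument']['x_columns']
    row = {'Pclass': 3, 'Sex': 'male', 'SibSp': 1, 'Parch': 0, 'Survived': 0}
    # Real code would do pd.get_dummies(df[x_columns]); here we just select columns
    return {c: row[c] for c in x_columns}

# Simulated pipeline mirroring the experimental_plan.yaml snippet above
pipeline = {'train': {'argument': {'x_columns': ['Pclass', 'Sex', 'SibSp', 'Parch']}}}
selected = train(pipeline)
```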
7. Other cases where a file save path is required (equivalent to get_extra_output_path() in ALO v2)
Example)
def train(pipeline: dict, x_columns=[], y_column=None, n_estimators=100):
...
pipeline['train']['external_path'] = 'True'
...
Once the conversion is complete, users can execute the AI Solution directly through the 'alo run' CLI command.