
Creating Assets

Updated 2024.05.05

Once you have a good understanding of experimental_plan.yaml, the next step is to understand the assets that appear in the assets folder after you install the Titanic example solution and run ALO's main.py. Follow the guide below to create these assets from scratch.

Developing Assets

For example, to connect just two steps, input and train, into a train pipeline, create input and train folders under the assets folder and create asset_input.py and asset_train.py in them, respectively.
Also create a requirements.txt in each of the input and train folders listing the dependencies needed by that step.

./{solution_name}/assets/
├─ input/
│   ├─ asset_input.py
│   └─ requirements.txt
└─ train/
    ├─ asset_train.py
    └─ requirements.txt
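For reference, each requirements.txt is an ordinary pip requirements file listing only what that step imports. The packages and versions below are illustrative placeholders, not requirements of ALO itself:

```text
# assets/input/requirements.txt (illustrative -- list whatever the step imports)
pandas==1.5.3
```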

Next, copy the template below into asset_input.py and write the data-reading code in the run function. As in the Titanic example, the input step is the first step in the ML Pipeline to read data, so it must not call the self.asset.load_data() API.
For the ALO APIs available while writing the run function, refer to the Appendix: What is ALO API page.
Warning: The last folder name in the path returned by self.asset.get_input_path() may differ between the solution development environment and the actual operating environment, so do not hardcode it. Instead, read files by traversing the path returned by self.asset.get_input_path() with os.listdir() or similar.

# asset_[step_name].py

# -*- coding: utf-8 -*-
import os
import sys
from alolib.asset import Asset

sys.path.append(os.path.dirname(os.path.abspath(__file__)))

#--------------------------------------------------------------------------------------------------------------------------
# CLASS
#--------------------------------------------------------------------------------------------------------------------------
class UserAsset(Asset):
    def __init__(self, asset_structure):
        super().__init__(asset_structure)
        ## Load the user_parameters for this Asset written in experimental_plan.yaml as a dict
        self.args = self.asset.load_args()
        ## Load the config information passed from the previous Asset as a dict
        self.config = self.asset.load_config()

    @Asset.decorator_run
    def run(self):
        ## Get the path to read input data from
        input_path = self.asset.get_input_path()
        ##### Add code to read data from input_path into output_data ####
        ## Your Code ##
        ##################################################################

        ## Pass data and config to the next Asset.
        ## This example passes the read data on and forwards the config unchanged.
        output_config = self.config

        self.asset.save_data(output_data)
        self.asset.save_config(output_config)

#--------------------------------------------------------------------------------------------------------------------------
# MAIN
#--------------------------------------------------------------------------------------------------------------------------
if __name__ == "__main__":
    ua = UserAsset(envs={}, argv={}, data={}, config={})
    ua.run()
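As a concrete sketch of the warning above: the helper below (a hypothetical name, not part of the ALO API) traverses whatever folder structure sits under the path returned by self.asset.get_input_path(), so the code keeps working even if the last folder name changes between environments.

```python
import os

def collect_data_files(input_path, extensions=(".csv",)):
    """Return every file under input_path whose name ends with one of the
    given extensions, regardless of how the intermediate folders are named.
    A hypothetical helper for the body of run(); not part of the ALO API."""
    found = []
    for root, _dirs, files in os.walk(input_path):
        for name in files:
            if name.lower().endswith(tuple(extensions)):
                found.append(os.path.join(root, name))
    return sorted(found)
```

Inside run(), output_data could then be built from the returned paths, for example by reading each CSV file into a table before calling self.asset.save_data().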


For Assets that come after the first step of the ML Pipeline (e.g., asset_train.py), copy the template below and develop the run function for the purpose of the Asset, working from the data loaded via self.asset.load_data().

# asset_[step_name].py

# -*- coding: utf-8 -*-
import os
import sys
from alolib.asset import Asset

sys.path.append(os.path.dirname(os.path.abspath(__file__)))

#--------------------------------------------------------------------------------------------------------------------------
# CLASS
#--------------------------------------------------------------------------------------------------------------------------
class UserAsset(Asset):
    def __init__(self, asset_structure):
        super().__init__(asset_structure)
        ## Load the user_parameters for this Asset written in experimental_plan.yaml as a dict
        self.args = self.asset.load_args()
        ## Load the config information passed from the previous Asset as a dict
        self.config = self.asset.load_config()
        ## Load the data passed from the previous Asset as a dict
        self.data = self.asset.load_data()

    @Asset.decorator_run
    def run(self):
        ##### Add code to train an AI model based on the loaded self.data ####
        ## Your Code ##
        #######################################################################

        self.asset.save_data(output_data)
        self.asset.save_config(output_config)

#--------------------------------------------------------------------------------------------------------------------------
# MAIN
#--------------------------------------------------------------------------------------------------------------------------
if __name__ == "__main__":
    ua = UserAsset(envs={}, argv={}, data={}, config={})
    ua.run()
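To make the run body concrete, here is a minimal, purely illustrative stand-in for model training: a majority-class baseline written with only the standard library. The function name, the rows argument, and the label_key column name are assumptions for this sketch; in a real train asset you would replace it with actual model-fitting code and pass the results on via self.asset.save_data() and self.asset.save_config() as in the template.

```python
from collections import Counter

def train_majority_baseline(rows, label_key="label"):
    """Toy stand-in for real training: 'learns' the most frequent label
    in the training rows. rows is a list of dicts; label_key names the
    target column. Replace with real model-fitting code."""
    labels = [row[label_key] for row in rows]
    most_common_label, _count = Counter(labels).most_common(1)[0]
    return {"majority_label": most_common_label}
```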

Once the Asset code for input, train, etc. is complete, develop the inference pipeline in the same way. The input asset can be shared between the train pipeline and the inference pipeline if both read the same data format.
Then write the args and requirements for each step in the user_parameters and asset_source sections of solution/experimental_plan.yaml and run python main.py to verify that training and inference work correctly.
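While experimenting locally, the asset_source entries point at the local asset folders via code: local, which is what gets switched to a git address later in this guide. The sketch below is an abbreviated, illustrative excerpt whose structure mirrors the asset_source example later on this page:

```yaml
# experimental_plan.yaml -- illustrative local setup before pushing to git
asset_source:
    - train_pipeline:
        - step: input
          source:
            code: local
            requirements:
              - requirements.txt
        - step: train
          source:
            code: local
            requirements:
              - requirements.txt
```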
Through repeated experiments, minimize the args exposed to AI Solution users and optimize the code with scalability in mind. After completing the experiments, create a git repository for each asset, navigate to each asset folder, and push to git as follows:

cd {solution_name}/assets/[step_name]
git init
git add ./
git commit -m "Initial Commit"
git remote add origin [my_git_url]
git checkout -b v1.0.0
git push origin v1.0.0

cd ..
rm -rf [step_name]


Because rm -rf [step_name] was executed under the {solution_name}/assets/ path, the asset folder has been removed from the current working path.
Once you have pushed all assets to git in this way, change the code entry under source in the asset_source section of solution/experimental_plan.yaml from the local path to the git address, and set the git branch name.

#experimental_plan.yaml

asset_source:
    ...
    - inference_pipeline:
        ...
        - step: output
          source:
            code: [my_git_url]
            branch: v1.0.0
            requirements:
              - requirements.txt

Finally, create a git repository for the AI Solution in advance, navigate to the {solution_name}/solution/ path where experimental_plan.yaml is located, and push the contents of the solution folder to git just as you did for each Asset.

cd {solution_name}/solution/
git init
git add ./
git commit -m "Initial Commit"
git remote add origin [my_solution_git_url]
git checkout -b v1.0.0
git push origin v1.0.0


Once this is complete, AI Solution users can use it right away by cloning the AI Solution git repository into the same path as ALO's main.py, as shown below, and running python main.py.

git clone -b {my_branch} {my_git_url} solution