Version: docs v25.02

Building a Pipeline

Updated 2024.05.05

Once you understand the overall flow of the Titanic example, follow the experimental_plan.yaml guide below to build a pipeline for an AI Contents similar to Titanic.

Topics

Creating a Pipeline

Creating a Pipeline

Building the training pipeline and building the inference pipeline are largely the same process, so a solid understanding of experimental_plan.yaml as described below is enough to build both.

Create the ./solution folder and write experimental_plan.yaml

With ALO installed (see Installing ALO), create the solution folder and write experimental_plan.yaml from the terminal. Start by copying the experimental_plan.yaml template.

mkdir solution
cd solution
vi experimental_plan.yaml  ## Any editor other than vim works as well.
## Copy the experimental_plan_template below into the file and save it.

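Copy and use the experimental_plan.yaml template below. In this example, load_train_data_path and load_inference_data_path point to subdirectories of the solution folder, so move the data needed for training and inference to those locations. If the data lives elsewhere, change the paths. Both absolute and relative local paths are supported, as are S3 paths; refer to the manual (Appendix : What is Experimental Plan) and write them to suit your situation.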
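The three path styles mentioned above could look like the following. This is only a sketch: the absolute path and the bucket name my-bucket are hypothetical placeholders, and which keys you point at local or S3 storage depends on your environment.

```yaml
# Illustrative values only; the absolute path and the S3 bucket are hypothetical.
external_path:
    # Relative local path, as used in this example
    - load_train_data_path: ./solution/sample_data/train_data/
    # Absolute local path (hypothetical location)
    - load_inference_data_path: /home/user/data/inference_data/
    # S3 path (hypothetical bucket); requires an AWS profile with access rights
    - save_train_artifacts_path: s3://my-bucket/artifacts/train/
    - save_inference_artifacts_path:
    - load_model_path:
```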

Enter the name and version of the new AI Contents. They are shown later in the AI Conductor UI and are used for history management. Note that S3 access requires AWS permissions: set up an access key with aws configure --profile {PROFILE NAME}, then enter that {PROFILE NAME} in aws_key_profile in experimental_plan.yaml.
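As a sketch, assuming a profile was created with aws configure --profile my-profile (the profile name my-profile is a hypothetical placeholder), the top of the plan might be filled in as follows:

```yaml
# Name and version of the AI Contents being built
name: My-Solution
version: 1.0.0

external_path_permission:
    # Must match the profile name passed to `aws configure --profile`
    # (my-profile is a hypothetical name)
    - aws_key_profile: my-profile
```

If no S3 path is used in external_path, aws_key_profile can stay empty as in the template.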

In user_parameters, write the step names of train_pipeline and inference_pipeline together with the args (arguments) of each step. Later, when you develop the asset code, the self.asset.load_args() API returns the args entered for each step as a dict. In asset_source, set code to local, because a new AI Contents is still under development. Once all development is complete, push each asset to its own git repository and enter that git address and branch name.
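A filled-in fragment for a Titanic-like dataset might look like the sketch below. The column names come from the public Titanic dataset and are assumptions about your data; n_estimators, the git URL, and the branch name are hypothetical placeholders. Based on the description above, load_args() in the input step would be expected to return something like {'x_columns': ['Pclass', 'Sex', 'Age', 'Fare'], 'y_column': 'Survived'}.

```yaml
user_parameters:
    - train_pipeline:
      - step: input
        args:
          - x_columns: [Pclass, Sex, Age, Fare]   # assumed feature columns
            y_column: Survived                    # assumed label column
        ui_args:
      - step: train
        args:
          - n_estimators: 100                     # hypothetical argument
        ui_args:

asset_source:
    - train_pipeline:
      - step: input
        source:
          # During development: code: local
          # After development, point to the asset's own repository (hypothetical URL)
          code: https://github.com/example-org/input-asset.git
          branch: main                            # hypothetical branch name
          requirements:
            - pandas==1.5.3
```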

ui_args lists the parameters whose values users can adjust in the EdgeConductor UI once the solution has been registered. For details on how to write them, see the Write UI Parameter page.
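As a minimal sketch, exposing a single argument to the UI means listing its name under ui_args of the step that owns it. The argument n_estimators is a hypothetical example, and the matching ui_args_detail entry, whose format is described on the Write UI Parameter page, is not shown here.

```yaml
user_parameters:
    - train_pipeline:
      - step: train
        args:
          - n_estimators: 100   # hypothetical argument
        ui_args:
          - n_estimators        # name of the argument to expose in the UI
```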

## experimental_plan_template ....
name: My-Solution
version: 1.0.0

external_path:
    - load_train_data_path: ./solution/sample_data/train_data/
    - load_inference_data_path: ./solution/sample_data/inference_data/
    - save_train_artifacts_path:
    - save_inference_artifacts_path:
    - load_model_path:

external_path_permission:
    - aws_key_profile:

user_parameters:
    - train_pipeline:
      - step: input
        args:
          - x_columns:
            y_column:
        ui_args:
      - step: train
        args:
        ui_args:
## If an argument is described in ui_args_detail below and you want to expose it as user-editable in the EdgeConductor UI, list its name here
#          - arg1
#          - arg2
    - inference_pipeline:
      - step: input
        args:
          - x_columns:
            y_column:
        ui_args:
      - step: inference
        args:
        ui_args:
      - step: output
        args:
        ui_args:

asset_source:
    - train_pipeline:
      - step: input
        source:
          code: local # or {input Asset git address}
          branch: # {git branch name}
          requirements:
            - pandas==1.5.3

      - step: train
        source:
          code: local
          branch: # {git branch name}
          requirements:
            - scikit-learn
            - requirements.txt

    - inference_pipeline:
      - step: input
        source:
          code: local # or {input Asset git address}
          branch: # {git branch name}
          requirements:
            - pandas==1.5.3 --force-reinstall

      - step: inference
        source:
          code: local
          branch: # {git branch name}
          requirements: []

      - step: output
        source:
          code: local
          branch: # {git branch name}
          requirements: []

control:
    - get_asset_source: once
    - backup_artifacts: True
    - backup_log: True
    - backup_size: 1000
    - interface_mode: memory
    - check_resource: False
    - save_inference_format: tar.gz

ui_args_detail:
## If users need to change a parameter from the EdgeConductor UI, describe that parameter below
#    - train_pipeline:
#        - step:
#          args:
#    - inference_pipeline:
#        - step:
#          args:
