Tabular Classification/Regression (TCR) for ALO-ML v3
What is Tabular Classification/Regression?
TCR (Tabular Classification/Regression) is an AI content for solving classification and regression problems on tabular data. It provides a variety of machine learning models, chosen using the know-how that data analysts have built up over years of solving such problems, and it finds and applies the best hyperparameters for each model. TCR then compares the performance of the tuned models and selects the best one.
One of TCR's strengths is its ease of use. By entering a few parameters in TCR's experimental_plan.yaml file and running ALO, you can generate the desired classification or regression model for your input data.
In addition to modeling, TCR has a variety of features for tabular data. The TCR pipeline first checks whether the data is suitable for modeling, then automatically preprocesses the data that passes the check, and finally runs classification or regression modeling to produce the final model and prediction results. Because the inspection, preprocessing, and HPO (hyperparameter optimization) functions are automated and need no additional settings, users can build models easily without setting many parameters. TCR also provides a template file for model experiments, which makes it easy to add a separate machine learning model to TCR and run HPO on it together with the existing models.
When to use Tabular Classification/Regression?
TCR can be used for a variety of classification and regression modeling using tabular data. Regardless of the field, if there are multiple variables and label columns in the tabular data, TCR can be applied. TCR can be used in the following areas:
- Finance: Used to classify a customer's credit rating, forecast a company's default, and more. For example, if a customer's personal information, transaction history, credit history, and a label that represents the customer's credit rating exist, you can create a model to classify the customer's credit rating. Alternatively, you can create a regression model that analyzes the company's financial information, market trends, and more to predict default.
- Healthcare: Used to create models that classify the presence or absence of a specific disease (e.g., cancer or diabetes) using a patient's medical history, genetic information, and vital signs as inputs. Such models can support early detection and treatment of diseases.
- Marketing: Used for customer segmentation, customer churn prediction, ad effectiveness prediction, and more. For example, if you have a customer's purchase history, website visit history, and personal information as variables, together with a group label for each customer, you can create a model to classify the customer's group. This can be used to build customer management and marketing strategies.
- Public sector: Used for crime forecasts, traffic forecasts, election outcome forecasts, and more. For example, you can use a region's demographics, past crime history, and economic conditions as inputs to create a model that classifies the likelihood of crime in a particular area.
Below are two of the many real-world examples of TCR in action.
Bolt Fastening Inspection
Bolt Fastening Inspection is a solution that analyzes torque and angle data generated during the bolting process to determine whether a bolt is properly tightened. It was developed with TCR and currently runs on 18 processes of the bolting line at the LG Magna Ramos plant, where the risk of bolt mix-ups is high.
Customer index development
TCR is used to develop various customer indices. The 'Customer Satisfaction Index' and 'Customer Experience Delivery Index', which are used to discover complaints from potential customers, and the 'ThinQ Home Customer Index', which is used to improve the experience of customers using ThinQ, were all developed with TCR. TCR is also embedded as a basic model in the customer index platform, where it is actively used to develop customer indices.
Key Features
Automate modeling with AutoML features
Data analysts no longer have to worry about modeling. TCR automates the steps required for modeling and provides AutoML capabilities so that data analysts can focus on data discovery and analysis. With AutoML, TCR tunes the hyperparameters of each model to the data and then selects the best-performing model. TCR includes the five machine learning models most often used by data analysts, along with a set of hyperparameters for each model. With TCR, data analysts can easily apply frequently used models to their data and select the best model through performance comparison.
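The tune-then-compare loop described above can be sketched in plain Python. This is only an illustrative sketch, not TCR's implementation: the two toy models (a one-feature k-nearest-neighbour classifier and a majority-class baseline), the hyperparameter grids, the data, and every function name are assumptions made for the example.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest training rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def majority_predict(train, x, _unused=None):
    """Baseline: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]

def accuracy(predict, train, valid, param):
    """Fraction of validation rows that the model labels correctly."""
    hits = sum(predict(train, x, param) == y for x, y in valid)
    return hits / len(valid)

# Hypothetical candidate models, each paired with its own hyperparameter grid.
CANDIDATES = {
    "knn": (knn_predict, [1, 3, 5]),
    "majority": (majority_predict, [None]),
}

def select_best(train, valid):
    """Tune each model over its grid, then compare the tuned models."""
    results = {}
    for name, (predict, grid) in CANDIDATES.items():
        scored = [(accuracy(predict, train, valid, p), p) for p in grid]
        results[name] = max(scored, key=lambda s: s[0])  # (best score, best param)
    best_model = max(results, key=lambda n: results[n][0])
    return best_model, results

# Toy data: (feature, label) pairs, split into train and validation sets.
train = [(0.1, "ok"), (0.2, "ok"), (0.3, "ok"), (0.8, "ng"), (0.9, "ng"), (1.0, "ng")]
valid = [(0.15, "ok"), (0.25, "ok"), (0.85, "ng"), (0.95, "ng")]

best_model, results = select_best(train, valid)
print(best_model, results)
```

The inner `max` stands in for hyperparameter optimization within one model, and the outer `max` stands in for the comparison between models. A real setup would use cross-validation and a richer search strategy instead of a single hold-out split and a fixed grid.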
Equipped with class imbalance processing and data preprocessing know-how
TCR is equipped with data inspection and missing value processing functions developed from the practical know-how of data analysts. You do not have to specify the preprocessing methodology needed for modeling: TCR determines the column types and the percentage of missing data, and applies the appropriate preprocessing methodology for your modeling. Users only need to give TCR information about the data, and the data analyst's know-how is then applied to data inspection, preprocessing, and modeling.
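As a rough illustration of choosing a preprocessing step from the column type and the missing ratio, consider the sketch below. It does not reproduce TCR's actual rules: the missing-value markers, the 50% drop threshold, the median/mode imputation choices, and the function names are all assumptions made for the example.

```python
from collections import Counter
from statistics import median

MISSING = {"", "NA", "NaN", None}  # assumed markers for a missing value
DROP_THRESHOLD = 0.5               # assumed: drop columns with more than 50% missing

def is_number(value):
    """Return True when the raw value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False

def profile_column(values):
    """Return (column_type, missing_ratio) for one non-empty column of raw values."""
    present = [v for v in values if v not in MISSING]
    missing_ratio = 1 - len(present) / len(values)
    numeric = bool(present) and all(is_number(v) for v in present)
    return ("numeric" if numeric else "categorical"), missing_ratio

def impute_column(values):
    """Fill missing entries with the median (numeric) or the mode (categorical).

    Returns None when the column has too many missing values and should be dropped.
    """
    col_type, missing_ratio = profile_column(values)
    if missing_ratio > DROP_THRESHOLD:
        return None
    present = [v for v in values if v not in MISSING]
    if col_type == "numeric":
        fill = median(float(v) for v in present)
        return [fill if v in MISSING else float(v) for v in values]
    fill = Counter(present).most_common(1)[0][0]
    return [fill if v in MISSING else v for v in values]

torque = ["1.5", "", "2.5", "3.5"]  # numeric column, 25% missing
grade = ["A", "B", "NA", "A"]       # categorical column, 25% missing
sparse = ["", "", "NA", "7"]        # 75% missing, so it is dropped

print(profile_column(torque))
print(impute_column(torque))
print(impute_column(grade))
print(impute_column(sparse))
```

The point of the sketch is that the user supplies only the data, and the preprocessing choice follows from what the inspection step finds.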
Rich modeling experiments without coding
TCR provides experimental parameters selected by experienced data analysts for a variety of modeling experiments. Users can refer to TCR's user arguments guide, enter the desired preprocessing and modeling conditions in experimental_plan.yaml, and run TCR with those settings. Simply by writing the experiment parameters to the YAML file, you can create test cases that cover everything from preprocessing to modeling. Modeling experiments no longer require coding. Find the best model for your data by setting the parameters of TCR.
Quick Start
Installation
- Install ALO. Read More: Use AI Contents
- Install the content using the git address below.
- git url: https://github.com/mellerikat-aicontents/Tabular-Classification-Regression.git
- Installation code: git clone https://github.com/mellerikat-aicontents/Tabular-Classification-Regression.git -b v3.0.0 solution (run inside the ALO installation folder)
Required parameter settings
- In 'solution/experimental_plan.yaml', change the data paths below to the paths of your own data.
train:
  dataset_uri: [train_dataset/]  # Change to user data path
inference:
  dataset_uri: [inference_dataset/]  # Change to user data path
- Enter the 'x_columns' and 'y_column' of the train data under 'argument' of 'function: readiness'.
readiness:
  def: pipeline.readiness
  argument:
    x_columns: [input_x0, input_x1, input_x2, input_x3]
    y_column: target
- Setting only the two items above and running ALO is enough to create a classification or regression model. To set advanced parameters for a model that is better tailored to your data, refer to the parameter guide. Read More: TCR AI Parameter
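Before running ALO, it can help to confirm that the column names entered for 'x_columns' and 'y_column' actually appear in the training file. The sketch below is a hypothetical pre-check written with the Python standard library; it is not TCR's readiness function, and the helper name `check_columns` and the sample CSV are assumptions made for the example.

```python
import csv
import io

def check_columns(csv_text, x_columns, y_column):
    """Return the configured column names that are missing from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    required = list(x_columns) + [y_column]
    return [col for col in required if col not in header]

# Sample data whose header matches the settings shown above.
sample = "input_x0,input_x1,input_x2,input_x3,target\n1,2,3,4,0\n"
x_columns = ["input_x0", "input_x1", "input_x2", "input_x3"]

missing = check_columns(sample, x_columns, "target")
print(missing)  # an empty list means every configured column was found
```

For a file on disk, the same check can be made by reading the first row with `csv.reader` on the opened file instead of an in-memory string.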
Execution
- In a terminal, go to the path where ALO is installed and run the 'alo run' command. Read More: Use AI Contents
TCR Version: 3.0.0 & ALO Version: 3.0.0