Version: Next

VAD Features

Updated 2025.05.12

Feature Overview

VAD pipeline

An AI Contents pipeline is a combination of functions, each of which is a functional unit. The train pipeline consists of 3 functions, and the inference pipeline consists of 4 functions.

Train pipeline

Input - Readiness - Train

Inference pipeline

Input - Readiness - Inference

input

During training, VAD works from tabular data called Ground Truth data, which consists of image paths and the correct-answer label for each image. VAD can be trained with normal images alone, so if you have only normal images, label them with any name that indicates normal (e.g., OK, good, normal). The input function reads the Ground Truth data and passes it to the next function. In the inference stage, if no file with information about the images exists, the input function generates one containing only the image paths and passes it to the following function for inference. As the diagram above shows, the data is passed as 'train' in the training pipeline and as 'inference' in the inference pipeline.
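As a rough illustration of the input step, the sketch below parses a small Ground Truth table and, when no table is supplied, builds one from image paths alone. The column names `image_path` and `label`, the sample file names, and the helpers `read_ground_truth` and `paths_only_table` are assumptions made for this example, not VAD's actual code.

```python
import csv
import io


def read_ground_truth(csv_text):
    """Parse Ground Truth CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def paths_only_table(image_paths):
    """Inference fallback: build rows that carry only the image path."""
    return [{"image_path": path} for path in image_paths]


# Hypothetical Ground Truth content that contains normal images only.
gt_text = "image_path,label\nimages/a.png,OK\nimages/b.png,OK\n"

train_rows = read_ground_truth(gt_text)
inference_rows = paths_only_table(["images/c.png"])
```

In this sketch both tables share the `image_path` key, so a later step can read images the same way in either pipeline.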

readiness

The readiness function inspects the quality of the Ground Truth data. If the image path column or the label column contains a missing value, a warning is displayed and the affected rows are excluded before training proceeds. It also checks that the argument settings suit the composition of the data and changes them in some cases.
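The missing-value handling described here could look roughly like the following sketch. It assumes the rows are dictionaries with `image_path` and `label` keys; the helper `drop_missing` and the warning text are hypothetical, not part of VAD.

```python
import warnings


def drop_missing(rows, path_column="image_path", label_column="label"):
    """Keep rows whose path and label are both present; warn about the rest."""
    kept = [r for r in rows if r.get(path_column) and r.get(label_column)]
    dropped = len(rows) - len(kept)
    if dropped:
        warnings.warn(f"{dropped} row(s) with missing values were excluded")
    return kept


rows = [
    {"image_path": "images/a.png", "label": "OK"},
    {"image_path": "", "label": "OK"},              # missing image path
    {"image_path": "images/c.png", "label": None},  # missing label
]

# Capture the warning so the example runs quietly and can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clean_rows = drop_missing(rows)
```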

modeling(train)

In VAD, the 'train function' reads the Ground Truth data passed from 'input' and loads the training data and validation data into PyTorch dataloaders. It then creates and trains a FastFlow or PatchCore model, the two models VAD supports, according to the model name the user set in advance in 'experimental_plan.yaml'. When training is completed, it saves the 'output.csv' file containing the prediction results and, if the validation data includes abnormal data, the confusion matrix, the classification report, and a table and graph showing how performance changes as 'anomaly_threshold' changes. Because you may need a model that maximizes a specific metric depending on the AIOps operating situation, you can consult the performance table for each 'anomaly_threshold' and set an appropriate threshold value for inference.

modeling(inference)

The inference function reads each image from the path in the inference data passed from the input function, loads the best model saved in the preceding training step, infers the labels, and stores them. If you are experimenting and the file contains the answer labels, it also saves the performance results. It additionally saves the 'inference_summary.yaml' file for operations.

The information required may vary by task. Currently, when inference is run on a single image, result stores the inferred label and score stores the model's certainty (1 - Shannon entropy). When inference covers multiple images, result stores the count for each label type and score stores the average of the model's certainty. result currently holds up to 32 characters.
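To make the certainty score concrete, here is a minimal sketch of "1 - Shannon entropy". It assumes the model yields a two-class probability pair (normal, abnormal) and that the entropy is taken in base 2, so the score falls between 0 and 1. How VAD actually derives the probabilities is not stated here, and the function names are hypothetical.

```python
import math


def certainty(probabilities):
    """Return 1 minus the base-2 Shannon entropy of a probability list."""
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return 1.0 - entropy


def average_certainty(probability_lists):
    """Mean certainty over several images."""
    values = [certainty(p) for p in probability_lists]
    return sum(values) / len(values)


confident = certainty([1.0, 0.0])   # no uncertainty at all
undecided = certainty([0.5, 0.5])   # maximum uncertainty for two classes
batch_score = average_certainty([[1.0, 0.0], [0.5, 0.5]])
```

Under these assumptions a fully confident prediction scores 1 and an evenly split prediction scores 0.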



Tips for using

Preparing Ground Truth Files and Image Data

Put your Ground Truth (GT) data and images in the same folder. This lets the GT files and images be copied from ALO to the execution environment in one step. GT data must contain an image path (image_path) column and a label column. If you want to specify the training/validation split yourself, add a column that marks each row as 'train' or 'valid' and enter that column's name in 'train_validate_column'. Any other columns in the GT data are not used for analysis. The image size should preferably be the same for all data. Images of different sizes may reduce performance, so bring them to a uniform size in advance by cropping or resizing if possible. The data should consist of normal data only, or mostly normal data plus a small number of anomalies, and several kinds of anomalies may be included. For example, to distinguish cats from other animals, you could prepare about 90% cat images and make up the remaining 10% with anomalies of clearly different types, such as dogs, rats, and stuffed cats. Preparing abnormal data that may actually occur in the operating environment helps prevent performance degradation in operation.

Choose how to set the anomaly detection threshold

VAD can be trained with normal data alone, but a small number of anomalies allows a more rigorous anomaly detection model. Strictly speaking, abnormal data is not used for training; it is used to tune 'anomaly_threshold', the value that decides whether an image is abnormal. Even if you do not set the method explicitly, VAD automatically changes it when a serious impact on performance is expected, but setting it yourself lets you pick the method that suits your operating situation. If you have only normal data, set 'threshold_method' to 'Percentile'. If you have very few anomalies, set it to 'Percentile' as well. The exact number depends on how many normal images you have, but with at least 5 or so abnormal images you can use the 'F1' method to find a suitable 'anomaly_threshold'. With the 'Percentile' method, the 'percentile' training argument determines the anomaly_score level at which an image is judged abnormal when only normal images are available.
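As an illustration of a percentile-based threshold, the sketch below picks the anomaly score at a given percentile of the scores measured on normal images. The linear interpolation, the sample scores, and the helper name `percentile_threshold` are assumptions for this example; VAD's exact computation may differ.

```python
def percentile_threshold(normal_scores, percentile):
    """Return the score at the given percentile, using linear interpolation."""
    ordered = sorted(normal_scores)
    if not ordered:
        raise ValueError("at least one score is required")
    rank = (len(ordered) - 1) * percentile / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical anomaly scores measured on normal validation images.
normal_scores = [0.10, 0.20, 0.30, 0.40, 0.50]

strict = percentile_threshold(normal_scores, 50)    # flags more images
lenient = percentile_threshold(normal_scores, 100)  # flags only scores above every normal image
```

A lower percentile makes the detector more sensitive at the cost of more false alarms on normal images.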

Choose a model

VAD supports two models: FastFlow and PatchCore. PatchCore typically performs better and trains faster than FastFlow, but its inference takes longer and it requires slightly more RAM. Choose according to your operational infrastructure.

Setting learning conditions

There are three main training conditions that can be set in VAD: image size ('img_size'), batch size ('batch_size'), and maximum number of training epochs ('max_epochs'). For fast and lightweight inference, VAD resizes images to the image size you entered. For FastFlow, the image shape required by the model is fixed at 256 pixels wide and 256 pixels high. For PatchCore, a rectangular size can be entered because the model works on patches cut internally from the image. A larger batch size makes training faster and more stable, but consumes more RAM on the training infrastructure. The maximum number of epochs is an argument required only by FastFlow. Deep learning algorithms repeat training for a preset number of epochs; if that number is too small, under-fitting occurs and sufficient performance cannot be secured. A larger value lengthens training time, but because there is an EarlyStopping function, training finishes early if performance is secured before the specified number of epochs.
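The interaction between 'max_epochs' and early stopping can be sketched as follows. This is a simplified stand-in, not VAD's training loop: the per-epoch validation scores are supplied as a list, and the `patience` parameter (how many epochs without improvement are tolerated) is an assumption that does not appear among the arguments described above.

```python
def epochs_until_stop(validation_scores, max_epochs, patience=3):
    """Return how many epochs run before early stopping or max_epochs ends training."""
    best = float("-inf")
    stale = 0          # consecutive epochs without improvement
    epochs_run = 0
    for epoch in range(min(max_epochs, len(validation_scores))):
        epochs_run = epoch + 1
        score = validation_scores[epoch]
        if score > best:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # performance has stopped improving
    return epochs_run


# Hypothetical validation metric after each epoch.
scores = [0.60, 0.70, 0.75, 0.75, 0.74, 0.75, 0.76]

stopped_early = epochs_until_stop(scores, max_epochs=7)
capped = epochs_until_stop(scores, max_epochs=2)
```

In this sketch training halts after the sixth epoch because three epochs in a row fail to beat the best score, while a small 'max_epochs' cuts training off before the metric has plateaued.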



Function Details

This is a detailed explanation of how to utilize the various features of Visual Anomaly Detection.

Train pipeline: input function

Read Ground Truth file

The 'input function' reads the Ground Truth file and identifies the image path and label columns.

Train pipeline: readiness function

Check data quality

The readiness function reads the Ground Truth file, checks the column names entered as arguments, and removes missing values. It also checks that each image actually exists at the image path given in the Ground Truth data and checks the labels used for normal images. VAD splits the data into training and validation datasets and uses the validation dataset to optimize training. If there is no column that separates the training/validation dataset, readiness creates one; if there is, it checks that the split meets VAD's requirements. VAD can be used when only normal images are available or when the data consists of a large number of normal images and a very small number of abnormal images. Based on the normal/abnormal ratio of the validation data and the minimum number of abnormal samples, 'readiness' changes the user arguments ('threshold_method' and 'monitor_metric') to appropriate values.
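The automatic adjustment of 'threshold_method' might be pictured like this. The rule shown (fall back to 'Percentile' when fewer than 5 abnormal validation samples exist) is an assumption loosely based on the guidance in the tips section; the actual criteria, the set of normal label names, and the helper `choose_threshold_method` are hypothetical.

```python
def choose_threshold_method(validation_labels, requested,
                            normal_labels=("OK", "good", "normal"),
                            min_abnormal=5):
    """Return the threshold method to use given the validation labels."""
    abnormal = sum(1 for label in validation_labels if label not in normal_labels)
    if requested == "F1" and abnormal < min_abnormal:
        # Too few anomalies to tune on F1, so fall back to the percentile method.
        return "Percentile"
    return requested


few_anomalies = ["OK"] * 20 + ["scratch"] * 2
enough_anomalies = ["OK"] * 20 + ["scratch"] * 3 + ["dent"] * 3

method_a = choose_threshold_method(few_anomalies, "F1")
method_b = choose_threshold_method(enough_anomalies, "F1")
```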

Train pipeline: modeling(train) function

Load image file

Read each image from the path stored in the "image_path" column of the Ground Truth data.

Training parameters

If the validation data contains abnormal data, the 'anomaly_threshold' that decides whether an image is abnormal is learned and set to the value that maximizes the F1-score.
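One simple way to find an F1-maximizing threshold is to try every observed score as a candidate, as in the sketch below. It assumes an image is flagged abnormal when its anomaly score is greater than or equal to the threshold. The scores, labels, and helper names are hypothetical, and VAD's own search procedure may differ.

```python
def f1_at(scores, is_abnormal, threshold):
    """F1-score for the abnormal class when scores >= threshold are flagged."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, a in zip(predicted, is_abnormal) if p and a)
    fp = sum(1 for p, a in zip(predicted, is_abnormal) if p and not a)
    fn = sum(1 for p, a in zip(predicted, is_abnormal) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(scores, is_abnormal):
    """Return the candidate threshold with the highest F1-score."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(scores, is_abnormal, t))


# Hypothetical validation anomaly scores and their true labels.
scores = [0.10, 0.20, 0.30, 0.80, 0.90]
is_abnormal = [False, False, False, True, True]

threshold = best_f1_threshold(scores, is_abnormal)
```

With these sample values the two abnormal images are cleanly separated, so the lowest abnormal score is selected as the threshold.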