VC Input and Artifacts
Data Preparation
Preparing Training Data
- Prepare image data in .png or .jpg format with consistent shape (e.g., 1024x1024 pixels, 3 channels).
- Create a ground truth file in tabular form containing image paths and corresponding labels.
- Ensure each label type has at least 100 images to build a stable model.
- Currently, multi-class classification is supported, but multi-label classification (one image with multiple labels) is not.
- The ground truth file should be formatted as shown below.
- Place the ground truth file and image files in the same folder.
- For inference data, prepare the ground truth file similarly to the training data, or place the images in a single folder. (If there is no ground truth file for inference data, the path to the images will be used to generate an internal file.)
Example of GroundTruth.csv Training Dataset
| label | image_path |
|---|---|
| label1 | ./image1.png |
| label2 | ./image1.jpeg |
| label1 | ./image2.jpeg |
| ... | ... |
Example of Input Data Directory Structure
- Ground truth files can be multiple files but must have consistent column names.
./{train_folder}/
└ train_data1.csv
└ train_data2.csv
└ train_data3.csv
└ image1.png
└ image1.jpeg
└ image2.jpeg
./{inference_folder}/
└ data_a.csv
└ data_b.csv
└ image_test1.png
└ image_test1.jpeg
└ /{folder1}/
└ image_test2.jpeg
Example of Input Data Directory Structure (When there is no CSV file for inference data)
./{inference_folder}/
└ image_test1.png
└ image_test1.jpeg
└ /{inference_folder}/
└ image_test2.jpeg
Data Requirements
Mandatory Requirements
The input data must meet the following conditions:
| Index | Item | Spec. |
|---|---|---|
| 1 | Single label per image | Yes |
| 2 | Adherence to class folder format | Yes |
| 3 | Number of images per class (without duplicates) | 10-10,000 |
| 4 | Total number of classes | 0-20 |
| 5 | Fixed number of channels: 3 (RGB) or 1 (grayscale) | Yes |
| 6 | Image resolution | 32x32-1024x1024 pixels |
| 7 | Inference interval for new images | ≥ 10 seconds |
| 8 | Training interval | ≥ 12 hours |
Additional Requirements
These conditions ensure minimal performance. If not met, the algorithm will still run, but performance may be affected:
| Index | Item | Spec. |
|---|---|---|
| 1 | Unique image names regardless of class | Yes |
| 2 | Number of images per class (without duplicates) | 100-800 |
| 3 | Total number of classes | 0-20 |
| 4 | Image resolution | 224x224-1280x720 pixels |
| 5 | Region of Interest (ROI) size | ≥ 10x10 pixels (224x224 basis) |
| 6 | Fixed product position/orientation/distance from camera | Rotation ± 3°, Translation ≤ 1 pixel (224x224 basis) |
| 7 | Sharp image focus | Yes |
| 8 | No new untrained classes during AI operation | Yes |
| 9 | Consistent image capture environment during training/validation/operation (brightness, lighting, background) | Yes |
Artifacts
Running the training/inference pipeline generates the following artifacts:
Train Pipeline
./cv/train_artifacts/
└ models/train/
└ model.h5
└ model.tflite
└ params.json
└ output/
└ prediction.csv
└ extra_output/train/
└ eval.json
└ confusion.csv
Inference Pipeline
./cv/inference_artifacts/
└ output/inference/
└ prediction.csv
└ {imagefilename}.png OR {imagefilename}_xai_class{predicted_class}.png
└ extra_output/inference/
└ eval.json
└ confusion.csv
└ xai_result/
└ {imagefilename}_xai_class{predicted_class}.png
└ score/
└ inference_summary.yaml
The detailed descriptions of the artifacts are as follows:
model.h5
The trained Keras model file.
model.tflite
The trained TensorFlow Lite model suitable for embedded environments.
params.json
A JSON file containing parameters used during training.
prediction.csv
Contains the model's predictions for the training data, with the following columns:
- label: Ground Truth label
- pred_label: Predicted label
- prob_{label}: Probability for each label
confusion.csv
A confusion matrix (Wikipedia) of the Ground Truth and predicted labels, saved if label information is present in the inference pipeline.
eval.json
A classification report (scikit-learn) representing the training performance, saved if label information is present in the inference pipeline.
{imagefilename}.png OR {imagefilename}_xai_class{predicted_class}.png
If do_xai is True, saves the image with highlighted XAI areas. If do_xai is False, saves the original image. Viewable in the edge viewer.
inference_summary.yaml
A summary of the inference results displayed in the Mellerikat's edge viewer. Includes fields like date, file_path, note, probability, result, score, and version. In VC, result indicates the predicted label and score indicates the confidence level (0-1).
VC Version: 1.5.1