This project provides a scalable workflow for training YOLOv8 on a custom dataset. It also compares different YOLOv8 models (nano, small, medium), with the results tracked in ClearML. The project can be broken down into the following main components/procedures:
- Augmenting images and annotations to scale up the dataset
- Converting the dataset from Pascal VOC format to YOLO format
- Splitting the dataset into train, val, and test sets
- Visualizing the dataset images
- Base-dataset Collection: Collect a small-sized dataset (e.g., 25 images) for generating the augmented training dataset.
- Annotation in Pascal VOC format: Annotate the target objects (ones, half pounds) in the collected dataset.
- Augment the images and annotations: Use `ImageAugmentor.py` to augment the images and their annotations.
- Visualize the dataset with the bounding boxes: Use `ImageVisualizer.py` to visualize the images and bounding boxes.
- Convert the dataset to YOLO format: Use `YOLO_format_generator.py` to prepare the dataset for YOLO models. This script also generates `dataset.yaml`, which contains all the needed information about the dataset.
- Generate the test data: Use `InferenceDataGenerator.py`.
- Train the models: Use `custom_model_training.ipynb`. Train three models (nano, small, medium) and log the results to ClearML. The models were trained with different configurations.
- Test on inference data: Use `custom_model_validation_prediction.ipynb` to test the models.
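
The conversion step from Pascal VOC to YOLO format comes down to turning pixel corner coordinates into a normalized center/size box. The sketch below illustrates that arithmetic only; it is not the code in `YOLO_format_generator.py`. The function names, the sample XML, and the label strings `ones` and `half_pounds` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a Pascal VOC box (pixel corners) to a YOLO box.

    Returns (x_center, y_center, width, height), each normalized to [0, 1].
    """
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def voc_xml_to_yolo_lines(xml_text, class_names):
    """Turn one Pascal VOC annotation (XML string) into YOLO label lines.

    class_names is a hypothetical ordered list of labels; a label's position
    in the list becomes its YOLO class id.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.find("name").text)
        box = obj.find("bndbox")
        corners = [float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        x, y, w, h = voc_box_to_yolo(*corners, img_w, img_h)
        lines.append(f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines


# Hypothetical annotation used only to demonstrate the functions above.
SAMPLE_XML = """
<annotation>
  <size><width>400</width><height>200</height></size>
  <object>
    <name>ones</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""

sample_lines = voc_xml_to_yolo_lines(SAMPLE_XML, ["ones", "half_pounds"])
```

Each returned line follows the usual YOLO label layout of `class_id x_center y_center width height`, one line per object, which is what a per-image `.txt` label file would contain.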
Because the repository does not contain all of the project's folders or the dataset (due to size limitations), here is my project structure for clarification.
- `images` folder: contains the raw images before augmentation or annotation
- `annotated_dataset_voc` folder: contains images after annotation in Pascal VOC format
- `augmented_dataset_voc` folder: contains images after augmentation of the images and annotations
- `yolo_format_dataset` folder: contains images after augmentation of the images and annotations, in YOLO format
- `test_dataset` folder: contains augmented images along with annotations in Pascal VOC format
- `test_dataset_yolo` folder: contains augmented images along with annotations in YOLO format
- `helpers`: contains helper classes and modules for augmenting, preparing, and splitting the dataset
- Notebooks for training and validating the models
First, set up a virtual environment and install the required modules from requirements.txt.
- `ImageAugmentor.py`: Run `python ImageAugmentor.py --help` to see the usage.
- `ImageVisualizer`: Modify the target directory and provide info in the `visualize_images_in_directory` function.
- `YOLO_format_generator` and `InferenceDataGenerator`: Modify `source_directory` and `output_directory`.
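
For readers who want to reproduce the train/val/test split without the helper modules, a reproducible split can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name, the 20% and 10% fractions, and the seed are all assumptions.

```python
import random


def split_dataset(items, val_fraction=0.2, test_fraction=0.1, seed=42):
    """Split a list of file names into train/val/test subsets.

    The fractions and the seed are hypothetical defaults. Sorting before
    shuffling makes the result independent of the input order.
    """
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    return {
        "test": shuffled[:n_test],
        "val": shuffled[n_test:n_test + n_val],
        "train": shuffled[n_test + n_val:],
    }


# Hypothetical file names standing in for the augmented images.
image_names = [f"img_{i:03d}.jpg" for i in range(100)]
subsets = split_dataset(image_names)
```

Using a fixed seed keeps the subsets stable across runs, so every model variant is evaluated on the same validation and test images.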
Provide the path for dataset.yaml to load the model and initialize ClearML if needed.
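
A training run for one variant might look roughly like the sketch below. It uses the Ultralytics `YOLO` class and ClearML's `Task.init`, with the epochs and batch sizes this README reports for each model. The weight file names, the ClearML task naming, and the helper names are assumptions, and the image size is left as an argument because the value used in the notebook is not stated here.

```python
# Epochs and batch sizes follow the results reported in this README;
# the pretrained weight file names are assumed, not taken from the notebook.
RUNS = {
    "nano": {"weights": "yolov8n.pt", "epochs": 30, "batch": 8},
    "small": {"weights": "yolov8s.pt", "epochs": 30, "batch": 8},
    "medium": {"weights": "yolov8m.pt", "epochs": 5, "batch": 2},
}


def build_train_kwargs(variant, data_yaml, imgsz):
    """Collect the keyword arguments passed to model.train() for one variant."""
    run = RUNS[variant]
    return {
        "data": data_yaml,
        "epochs": run["epochs"],
        "batch": run["batch"],
        "imgsz": imgsz,
    }


def train_variant(variant, data_yaml, imgsz, clearml_project=None):
    """Train one YOLOv8 variant, optionally logging the run to ClearML."""
    # Imported here so the config helpers above work without these packages.
    from ultralytics import YOLO

    if clearml_project is not None:
        from clearml import Task

        # Hypothetical task naming scheme.
        Task.init(project_name=clearml_project, task_name=f"yolov8-{variant}")

    model = YOLO(RUNS[variant]["weights"])
    return model.train(**build_train_kwargs(variant, data_yaml, imgsz))
```

Calling `train_variant("nano", "path/to/dataset.yaml", imgsz=...)` with the path to the generated `dataset.yaml` would start a run; passing `clearml_project` is only needed when ClearML logging is wanted.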
- YOLOv8 nano: final box loss 0.2616, mAP of 0.995
- YOLOv8 small: final box loss 0.2608, mAP of 0.995
- YOLOv8 medium: final box loss 0.583, mAP of 0.958
The nano and small models performed well despite being trained for only 30 epochs with a batch size of 8 and a low image size. The medium model performed somewhat worse, but it was trained for only 5 epochs with a batch size of 2, also at a low image size.
The epoch time for the medium model is significantly longer than for the nano and small models. This is reasonable given the differences in complexity, number of layers, and depth between the models.

#### Parameters

Comparison of the parameter count for each model. The differences are significant: the nano model has fewer than 5 million parameters, the small model fewer than 15 million, and the medium model more than 25 million.

Comparison of the models' precision.
- Nano model: Speed: 0.6ms preprocess, 106.7ms inference, 0.0ms loss, 0.2ms postprocess per image
- Small model: Speed: 0.9ms preprocess, 278.5ms inference, 0.0ms loss, 0.2ms postprocess per image
- Medium model: Speed: 0.7ms preprocess, 320.8ms inference, 0.0ms loss, 0.7ms postprocess per image
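
To make the speed figures easier to compare, the per-stage timings can be summed into a total per-image latency and expressed relative to the nano model. The sketch below only does arithmetic on the numbers listed above; the function and key names are illustrative.

```python
# Per-image timings in milliseconds, copied from the list above.
SPEED_MS = {
    "nano": {"preprocess": 0.6, "inference": 106.7, "loss": 0.0, "postprocess": 0.2},
    "small": {"preprocess": 0.9, "inference": 278.5, "loss": 0.0, "postprocess": 0.2},
    "medium": {"preprocess": 0.7, "inference": 320.8, "loss": 0.0, "postprocess": 0.7},
}


def summarize_speed(speed_ms, baseline="nano"):
    """Return total latency, throughput, and slowdown versus the baseline model."""
    base_total = sum(speed_ms[baseline].values())
    summary = {}
    for name, stages in speed_ms.items():
        total = sum(stages.values())
        summary[name] = {
            "total_ms": round(total, 1),
            "images_per_s": round(1000.0 / total, 2),
            "slowdown": round(total / base_total, 2),
        }
    return summary


speed_summary = summarize_speed(SPEED_MS)
for name, stats in speed_summary.items():
    print(name, stats)
```

The totals come to about 107.5 ms (nano), 279.6 ms (small), and 322.2 ms (medium) per image, so the small and medium models are roughly 2.6 and 3.0 times slower than nano on the hardware these measurements came from.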




