YOLO v7 training #3

Since my last YOLO v7 training seemed too short in terms of epochs, I took the final network weights as the initial weights for a continued training procedure. This procedure was very similar to the last one, with the number of epochs being the only change (a sketch of the 80/20 split follows the list):

  • Training: 56 000 images
  • Validation: 14 000 images
  • Total: 70 000 images
  • Train - Valid Ratio: 80% - 20%

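For reference, this is roughly how an 80/20 split like this can be produced, assuming the usual YOLO convention of train/val .txt files listing image paths (the directory layout and file names below are placeholders, not my exact setup):

```python
import random
from pathlib import Path

IMG_DIR = Path("dataset/images")  # placeholder: folder holding all 70 000 images

# Collect image paths and shuffle them reproducibly
images = sorted(str(p) for p in IMG_DIR.glob("*.jpg"))
random.seed(0)
random.shuffle(images)

# 80% train / 20% validation
split = int(0.8 * len(images))
Path("train.txt").write_text("\n".join(images[:split]) + "\n")
Path("val.txt").write_text("\n".join(images[split:]) + "\n")

print(f"{split} train / {len(images) - split} val")
```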
Trained for 200 more epochs, on top of the 100 trained previously (see the learning-rate sketch after this list):

  • epochs: 300
  • batch size: 64 
  • initial learning rate: 0.01 ( lr0 )
  • final OneCycleLR learning rate ( lrf ): 0.1, i.e. the schedule ends at lr0 * lrf = 0.001

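Since lrf is a multiplier rather than an absolute value, here is a minimal sketch of the cosine one-cycle-style ramp that YOLOv5/YOLOv7-style training code typically applies to get from lr0 down to lr0 * lrf (the exact formula used in this run is an assumption):

```python
import math

lr0, lrf, epochs = 0.01, 0.1, 300

def scheduled_lr(epoch: int) -> float:
    """Cosine ramp from lr0 at epoch 0 down to lr0 * lrf at the final epoch."""
    frac = (1 - math.cos(epoch * math.pi / epochs)) / 2   # goes 0 -> 1 over the run
    return lr0 * ((1 - frac) + frac * lrf)

print(scheduled_lr(0))      # 0.01   (start)
print(scheduled_lr(150))    # 0.0055 (midway)
print(scheduled_lr(300))    # 0.001  (lr0 * lrf)
```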
After more than 46 hours of training, the graphs of the two loss metrics (Box and Objectness) looked like this:



Evaluating the previous graphs, it is clear that splitting the training into two procedures (at epoch 100) caused a large disturbance in the loss curves, driving them to their highest (worst) values.
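For anyone wanting to reproduce these plots, this is roughly how the loss curves can be read back from the run directory, with the restart point marked; it assumes a YOLOv5/YOLOv7-style results.txt where, after splitting on whitespace, columns 2 and 3 hold the training box and objectness losses (the column layout and path are assumptions, check your own file):

```python
import matplotlib.pyplot as plt

box_loss, obj_loss = [], []
with open("runs/train/exp/results.txt") as f:   # path is a placeholder
    for line in f:
        parts = line.split()
        if len(parts) < 4:
            continue                             # skip malformed lines
        box_loss.append(float(parts[2]))         # assumed: training box loss
        obj_loss.append(float(parts[3]))         # assumed: training objectness loss

epochs = range(len(box_loss))
plt.plot(epochs, box_loss, label="box loss")
plt.plot(epochs, obj_loss, label="objectness loss")
plt.axvline(x=100, linestyle="--", color="gray", label="restart at epoch 100")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```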

The same effect can be seen in the Precision curves:


I expected the final, stable Precision and mAP values to be around 0.95 but, as we can see, they tended to converge towards 0.8...

What exactly does this mean? 

Should I try to train YOLOv7 with a known, publicly available dataset? This would clarify whether the "mistake" is on the dataset side or on the training procedure side.

Should I go back to my last successful YOLO training and test it?

