YOLO v7 training #3
Since my last YOLO v7 training seemed too short in terms of epochs, I used its final network weights as the initial weights for a continued training run. The procedure was otherwise the same as the last one, with the number of epochs being the only change:
- Training: 56 000 images
- Validation: 14 000 images
- Total: 70 000 images
- Train - Valid Ratio: 80% - 20%
Trained for 200 more epochs, on top of the 100 previously trained:
- epochs: 300
- batch size: 64
- initial learning rate: 0.01 ( lr0 )
- final OneCycleLR learning rate factor ( lrf ): 0.1 ( final learning rate = lr0*lrf = 0.001 )
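To make the two learning rate settings above more concrete, here is a minimal sketch of a cosine ramp from lr0 down to lr0*lrf, which is how I understand the default YOLOv7 schedule to behave. The helper name `lr_at` is hypothetical, warmup is ignored, and the idea that a run started from saved weights begins a fresh schedule is an assumption, not something I verified in the training logs.

```python
import math

LR0 = 0.01  # initial learning rate (lr0), taken from the list above
LRF = 0.1   # final learning rate factor (lrf), taken from the list above


def lr_at(epoch, total_epochs, lr0=LR0, lrf=LRF):
    """Learning rate at `epoch` on a cosine ramp from lr0 to lr0 * lrf.

    Hypothetical helper for illustration only; warmup is not modelled.
    """
    ramp = (1 - math.cos(epoch * math.pi / total_epochs)) / 2  # goes 0 -> 1
    factor = ramp * (lrf - 1.0) + 1.0                          # goes 1 -> lrf
    return lr0 * factor


# End point of the ramp for the first 100-epoch run.
end_of_first_run = lr_at(100, 100)

# Start of a fresh 200-epoch schedule (assumed behaviour of the second run).
start_of_second_run = lr_at(0, 200)

print(f"end of first run:    {end_of_first_run:.4f}")
print(f"start of second run: {start_of_second_run:.4f}")
print(f"jump factor:         {start_of_second_run / end_of_first_run:.1f}x")
```

Under these assumptions the learning rate would jump by roughly a factor of 10 (from about 0.001 back to 0.01) at the point where the second run starts, which is worth keeping in mind when reading curves around that epoch.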
After more than 46 hours of training, the graphs of the two loss metrics (Box and Objectness) looked like this:
Looking at these graphs, it is clear that the boundary between the two training procedures (epoch 100) caused a strong disturbance in the loss curves, pushing them to their highest (worst) values. The same effect can be seen in the Precision curves:
I expected the final, stable Precision and mAP values to be around 0.95 but, as we can see, they tended to converge towards 0.8...
What exactly does this mean?
Should I try to train YOLOv7 on a well-known, publicly available dataset? This would clarify whether the "mistake" is on the dataset side or on the training procedure side.
Should I go back to my last successful YOLO training and test?