Neural Network Analysis
With the new dataset collected on 1 August 2022 (see this post), we ran a hyperparameter search to find a neural network configuration that performs well.
The dataset contains 2558 samples (640 pull, 642 push, 632 shake, 644 twist): 1790 for training (1253 train, 537 validation) and 768 for testing.
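The counts above are consistent with two successive 70/30 splits (first train+validation vs. test, then train vs. validation). The post does not state the ratio, so the 30% hold-out and the round-up rule below are assumptions inferred from the numbers, and `split_sizes` is a hypothetical helper. A minimal sketch of the arithmetic:

```python
def split_sizes(n_samples, holdout_tenths=3):
    """Return (train, validation, test) counts for two successive splits.

    Assumption: each split holds out 30% and rounds the held-out part up.
    """
    def hold_out(n):
        # Integer ceiling of n * holdout_tenths / 10, avoiding float rounding.
        return (n * holdout_tenths + 9) // 10

    n_test = hold_out(n_samples)
    n_fit = n_samples - n_test      # samples left for training + validation
    n_val = hold_out(n_fit)
    n_train = n_fit - n_val
    return n_train, n_val, n_test


train, val, test = split_sizes(2558)
print(train, val, test)  # 1253 537 768
```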
Searched combinations:
n_hidden_layers = 2 or 3
neurons_in_layer = 32 or 64
dropout = 0.2 or 0.5
activation_function = relu or selu
Fixed values:
model_optimizer = Adam
learning_rate = 0.001
max_epochs = 300
batch_size = 96
model_loss = sparse_categorical_crossentropy
kernel_regularizer_per_layer = L1
early_stopping = val_loss ; patience = 30
model configurations: 160
search time: 16m55s
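The search space above can be enumerated as a plain grid. This is only a sketch of how the listed values combine. The dictionary layout and the `enumerate_configs` helper are assumptions, not the code used for this study. The four two-valued parameters give 16 combinations, and the post does not say how these relate to the 160 model configurations reported.

```python
from itertools import product

# Values taken from the "Searched combinations" list.
search_space = {
    "n_hidden_layers": [2, 3],
    "neurons_in_layer": [32, 64],
    "dropout": [0.2, 0.5],
    "activation_function": ["relu", "selu"],
}

# A subset of the "Fixed values" list, shared by every configuration.
fixed = {
    "model_optimizer": "Adam",
    "learning_rate": 0.001,
    "max_epochs": 300,
    "batch_size": 96,
    "model_loss": "sparse_categorical_crossentropy",
}


def enumerate_configs(space, fixed_values):
    """Yield one dict per combination of searched values, merged with the fixed ones."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield {**fixed_values, **dict(zip(names, values))}


configs = list(enumerate_configs(search_space, fixed))
print(len(configs))  # 16
```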
Best Result
Epoch 218/300:
loss: 0.7502 - accuracy: 0.8859
val_loss: 0.7890 - val_accuracy: 0.8566
Using 1253 samples for training and 537 for validation
Confusion Matrix
Using 768 new samples for predicting
10 best results (optimization)
Analyzing the confusion matrix, we can see that the fourth interaction (TWIST) had many more mismatched predictions than the other three interactions. Because of this, we suspect that this specific movement may be ambiguous.
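One way to put a number on "more mismatched predictions" is per-class recall: the diagonal entry of each confusion-matrix row divided by the row total. The sketch below uses made-up counts purely for illustration; they are not the results of this study, and the class order (pull, push, shake, twist) is an assumption.

```python
LABELS = ["pull", "push", "shake", "twist"]  # assumed row/column order


def per_class_recall(matrix):
    """matrix[i][j] = number of samples of true class i predicted as class j."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


# Made-up counts for illustration only; NOT the confusion matrix from this post.
example = [
    [18, 1, 0, 1],
    [1, 17, 1, 1],
    [0, 1, 19, 0],
    [3, 2, 2, 13],
]

for label, recall in zip(LABELS, per_class_recall(example)):
    print(f"{label}: {recall:.2f}")
```

A class whose recall sits well below the others, like the last row in this toy matrix, is the kind of pattern described above.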
Let's now exclude the TWIST interaction from our study by removing all twist samples from our dataset and creating a new neural network with only 3 output neurons.
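Removing a class could look roughly like the sketch below. The integer label encoding (twist = 3), the list-based data layout, and the `drop_class` helper are assumptions for illustration. The remaining labels are renumbered to 0..2 because sparse_categorical_crossentropy with 3 output neurons expects class ids in that range.

```python
TWIST = 3  # assumed label id; the post only says TWIST is the fourth interaction


def drop_class(samples, labels, excluded):
    """Remove every sample of class `excluded` and renumber remaining labels 0..k-1."""
    kept_ids = sorted(set(labels) - {excluded})
    remap = {old: new for new, old in enumerate(kept_ids)}
    kept_samples, kept_labels = [], []
    for sample, label in zip(samples, labels):
        if label != excluded:
            kept_samples.append(sample)
            kept_labels.append(remap[label])
    return kept_samples, kept_labels


# Tiny placeholder dataset, not real sensor data.
samples = [[0.1], [0.2], [0.3], [0.4], [0.5]]
labels = [0, 3, 1, 2, 3]

x3, y3 = drop_class(samples, labels, TWIST)
print(y3)  # [0, 1, 2]
```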
Study without TWIST
The dataset now contains 1914 samples (640 pull, 642 push, 632 shake): 1338 for training (937 train, 401 validation) and 575 for testing.
Optimization
Searched combinations:
Fixed values:
model_optimizer = Adam
learning_rate = 0.001
max_epochs = 300
model_loss = sparse_categorical_crossentropy
kernel_regularizer_per_layer = L1
early_stopping = val_loss ; patience = 30
model configurations: 1008
search time: 2h59m44s
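The early_stopping setting (monitor val_loss, patience 30) means a run can end before max_epochs. The function below is a simplified, framework-free sketch of that patience rule, not the callback used in this study; `early_stopping_epoch` is a hypothetical name and the loss values are invented.

```python
def early_stopping_epoch(val_losses, patience=30):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs fail to improve on the
    best validation loss seen so far; otherwise it runs through all epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch


# Invented validation losses; patience shortened to 3 to keep the example small.
losses = [1.0, 0.9, 0.8, 0.85, 0.82, 0.81, 0.9]
print(early_stopping_epoch(losses, patience=3))  # (6, 3)
```

With patience = 30 and max_epochs = 300, a run whose best epoch is 270 or earlier can stop before reaching epoch 300.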
Best Result
Epoch 223/300
Using 937 samples for training and 401 for validation
Confusion Matrix
Using 575 new samples for predicting
10 best results (optimization)