Can't use a Validation set when training a sequence-to-sequence BiLSTM Classification model

I am trying to train a sequence-to-sequence classification model using a BiLSTM layer, with data X and labels Y. I am getting the following error:
Error using trainNetwork (line 184)
Training and validation responses must have the same categories. To view the categories of the
responses, use the categories function.
Error in my_DNN_script (line 146)
[net , netInfo] = trainNetwork(X,Y,layers,options);
Caused by:
Error using nnet.internal.cnn.trainNetwork.DLTDataPreprocessor>iAssertClassNamesAreTheSame (line
213)
Training and validation responses must have the same categories. To view the categories of the
responses, use the categories function.
--------------------------------------------------------------------------
I set a breakpoint at the corresponding line in nnet.internal.cnn.trainNetwork.DLTDataPreprocessor>iAssertClassNamesAreTheSame (line 213).
function iAssertClassNamesAreTheSame(trainingCategories, validationCategories)
% iHaveSameClassNames Assert that the class names for the training and
% validation responses are the same.
trainingClassNames = categories(trainingCategories);
validationClassNames = categories(validationCategories);
if ~isequal(trainingClassNames, validationClassNames)
error(message('nnet_cnn:trainNetwork:TrainingAndValidationDifferentClasses'));
end
end
The problem appears to be one of ordering. I printed the variables trainingClassNames and validationClassNames. The number of classes is the same, but the order is different:
>> trainingClassNames =
13×1 cell array
{'2' }
{'3' }
{'4' }
{'5' }
{'6' }
{'7' }
{'8' }
{'9' }
{'10'}
{'11'}
{'12'}
{'1' }
{'0' }
>> validationClassNames
validationClassNames =
13×1 cell array
{'0' }
{'1' }
{'3' }
{'4' }
{'5' }
{'6' }
{'7' }
{'8' }
{'9' }
{'10'}
{'11'}
{'12'}
{'2' }
I modified the function nnet.internal.cnn.trainNetwork.DLTDataPreprocessor>iAssertClassNamesAreTheSame, using reordercats to sort both category lists before they are compared:
function iAssertClassNamesAreTheSame(trainingCategories, validationCategories)
% iHaveSameClassNames Assert that the class names for the training and
% validation responses are the same.
trainingClassNames = categories(reordercats(trainingCategories));
validationClassNames = categories(reordercats(validationCategories));
if ~isequal(trainingClassNames, validationClassNames)
error(message('nnet_cnn:trainNetwork:TrainingAndValidationDifferentClasses'));
end
end
With this modification, the trainNetwork function ran without an error.
Could you please tell me whether this modification is flawless, or whether it could lead to hidden, incorrect training behaviour?
1 Comment
Steve Philbert on 12 Oct 2023
I am training a classification network with k-fold cross-validation and ran into this same error. When the training and validation categories are reordered, as shown above, their values are equal.


Accepted Answer

Ruth on 11 Dec 2023
Hi Mahdi,
As you have noted, the order of the categories in the training and validation data must be the same to avoid the error.
The order must match because it is used when calculating the validation loss and accuracy for each category. Therefore, reordering the categories immediately before the function checks the order will avoid the error but lead to a silent wrong answer.
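To see why the order matters, here is a minimal sketch with made-up labels (the variables a and b are purely illustrative, not taken from your script). The same label values get different underlying integer codes depending on the category order. Also, calling reordercats with a single input sorts the categories alphanumerically on a copy, which is presumably why the patched check passes while the data handed on to training keeps its original order:

```matlab
% Illustrative only: identical label values, different category order.
a = categorical(["2" "0" "1"], ["2" "1" "0"]);   % category order: 2, 1, 0
b = categorical(["2" "0" "1"], ["0" "1" "2"]);   % category order: 0, 1, 2

% The label text is the same in both arrays ...
sameText = isequal(cellstr(a), cellstr(b));

% ... but the integer code behind each element follows the category order.
codesA = double(a);
codesB = double(b);

% reordercats with one input sorts the categories alphanumerically and
% returns a new array; a itself is left unchanged.
sortedCats   = categories(reordercats(a));
originalCats = categories(a);
```

So the modified assertion compares two sorted copies, while the arrays themselves still disagree about which index belongs to which class.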
Instead, please ensure that the category order is the same before training the model.
For example, if your training labels are called "TTrain" and your validation labels are called "TValidation", you can execute:
TValidation = reordercats(TValidation, categories(TTrain));
This makes sure the order of the categories is identical. You must do this before you call the trainingOptions function, since that is where the validation data is specified.
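Since you are doing sequence-to-sequence classification, your labels are probably cell arrays of categorical sequences rather than a single array. A rough sketch for that case follows. YTrain and YValidation are hypothetical stand-ins for your variables, and I am assuming every sequence already contains the same set of categories (reordercats errors otherwise):

```matlab
% Hypothetical toy data: each cell holds one 1-by-T categorical label sequence.
YTrain      = {categorical(["2" "1" "0"], ["2" "1" "0"]); ...
               categorical(["1" "1"],     ["2" "1" "0"])};
YValidation = {categorical(["0" "2"],     ["0" "1" "2"])};

% Use the training category order as the reference.
classNames = categories(YTrain{1});

% Apply that order to every validation sequence. Only the category order
% changes; the label values of each element stay the same.
YValidation = cellfun(@(y) reordercats(y, classNames), YValidation, ...
    'UniformOutput', false);

% Mirrors the comparison in the assertion quoted in the question.
sameOrder = isequal(categories(YValidation{1}), classNames);
```

After this, YValidation can be passed as validation data in trainingOptions as usual.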
Alternatively, you can modify the way you partition the data to make sure the category order is preserved. Without example code, it's hard to say exactly how to do this, but generally if you convert all your labels to a single categorical array before partitioning, that should preserve the order of the categories.
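One way to sketch that idea for sequence data is to define the class list once, build every label sequence against it, and only then split by index. Here rawLabels, classValues and the hold-out indices are invented for illustration, and your real partitioning code will differ:

```matlab
% Hypothetical raw numeric labels, one sequence per cell.
rawLabels   = {[2 3 3 1], [0 0 12], [5 4 4 4 1], [7 7 2]};
classValues = 0:12;                 % all classes, in one fixed order
classNames  = string(classValues);  % category names "0" ... "12"

% Build every sequence against the same value set, so each categorical
% carries all 13 categories in the same order, even the unused ones.
Y = cellfun(@(r) categorical(r, classValues, classNames), rawLabels, ...
    'UniformOutput', false);

% Split by index only after the conversion.
idxValidation = [2 4];
idxTrain      = setdiff(1:numel(Y), idxValidation);
YTrain        = Y(idxTrain);
YValidation   = Y(idxValidation);

sameOrder = isequal(categories(YTrain{1}), categories(YValidation{1}));
```

Because the category list is fixed up front, it no longer depends on which labels happen to appear first in each partition.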
Thanks,
Ruth

Release

R2021a