
Categorical to Numeric problem

Stephen Gray on 8 Jan 2024
Commented: Cris LaPierre on 11 Jan 2024
Hi
I have a table that contains both numeric and categorical items. I have converted the categorical items to numeric using the unique() function, which works very well, and I can then feed the matrix into an NN for training. The problem comes when I feed in new data to get results: I don't know how to make sure the converted categorical data in the new table matches the numbers used in the training data. For example, if a categorical value in the training data was converted to the number 5, how do I make sure that the same value gets assigned the same number when it appears in the new data? I'm beginning to think it may have to be done manually.
SPG

Accepted Answer

Hassaan on 8 Jan 2024
% Example training data (labels stored as a cell array of character vectors)
training_categorical_data = {'cat', 'dog', 'fish', 'dog', 'cat'};
% Convert categorical data to numeric for training
[unique_categories, ~, numeric_categories] = unique(training_categorical_data);
category_to_number_map = containers.Map(unique_categories, num2cell(1:length(unique_categories)));
% values() expects a cell array of keys, which training_categorical_data already is
numeric_training_data = cell2mat(values(category_to_number_map, training_categorical_data));
% Training process with numeric_training_data
% [Your neural network training code goes here]
% Example new data (categorical)
new_categorical_data = {'dog', 'cat', 'bird'};
% Convert new categorical data to numeric using the training mapping
numeric_new_data = zeros(size(new_categorical_data));
for i = 1:length(new_categorical_data)
    if isKey(category_to_number_map, new_categorical_data{i})
        numeric_new_data(i) = category_to_number_map(new_categorical_data{i});
    else
        % Handle unseen categories, e.g., assign a special number or ignore
        numeric_new_data(i) = NaN; % Assign NaN for unseen categories
    end
end
% Now, numeric_new_data is ready for use with the trained model
% [Your prediction or evaluation code goes here]
  • The training data training_categorical_data is a cell array of category labels (character vectors). It is converted to numeric_training_data using a mapping (category_to_number_map) built from the training labels only.
  • The new data new_categorical_data is then converted using the same mapping. Unseen categories (like 'bird' in this example) are handled separately; here, I've assigned NaN to them, but you can choose another method as appropriate.
  • You'll need to insert your specific neural network training and prediction code where indicated. The numeric_training_data and numeric_new_data arrays are what you'd use for training and prediction, respectively.
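The same idea can be written without a loop. The following is only a minimal sketch, not the answer's own method: it assumes the labels are cell arrays of character vectors as in the example above, and the variable name train_categories is introduced here purely for illustration. The second output of ismember gives each label's position in the training category list, and 0 for labels that were not seen during training.

```matlab
% Category list fixed at training time (unique returns it sorted).
training_categorical_data = {'cat', 'dog', 'fish', 'dog', 'cat'};
train_categories = unique(training_categorical_data);   % hypothetical name

% For each element, ismember returns its position in train_categories,
% or 0 when the element does not occur there.
[~, numeric_training_data] = ismember(training_categorical_data, train_categories);

% New data is encoded against the SAME list, so codes stay consistent.
new_categorical_data = {'dog', 'cat', 'bird'};
[is_known, numeric_new_data] = ismember(new_categorical_data, train_categories);

% Mark unseen categories as NaN, matching the convention used above.
numeric_new_data(~is_known) = NaN;
```

With this approach, train_categories would need to be stored alongside the trained network (for example in the same MAT-file) so that later data is encoded against the identical list.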
  4 Comments
Stephen Gray on 10 Jan 2024
OK, I'm using a dictionary instead and it's working so far.
Stephen Gray on 11 Jan 2024
OK. I've got it to work now using dictionaries. Both this answer and the next one helped me get it working. As yours includes how to handle new data too, I'll mark it as the answer. Thanks to both of you for answering.


More Answers (1)

Cris LaPierre on 8 Jan 2024
Moved: Cris LaPierre on 8 Jan 2024
Could you provide more details about your NN? I would think you should be able to pass categorical data into your network without having to convert it to numeric first.
If not, then I'd look into creating a dictionary, where you pass in the categorical value and it returns the numeric value.
A = categorical({'medium' 'large' 'small' 'medium' 'large' 'small'});
names = unique(A)
names = 1×3 categorical array
large medium small
values = (1:length(names));
d = dictionary(names,values)
d =
  dictionary (categorical --> double) with 3 entries:
    large  --> 1
    medium --> 2
    small  --> 3
A(4)
ans = categorical
medium
x = d(A(4))
x = 2
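To apply such a dictionary to new data that may contain unseen labels, one option is to test the keys with isKey before looking them up. This is a rough sketch under some assumptions that are not in the thread: the keys are stored as strings rather than categorical values (to sidestep any question about comparing categorical arrays with different category sets), the array B and the label "tiny" are invented for illustration, and unseen labels are mapped to NaN. It needs a release with dictionary (R2022b or later).

```matlab
% Build the mapping once from the training data.
A = categorical({'medium' 'large' 'small' 'medium' 'large' 'small'});
names = string(categories(A));            % category names as a string column
d = dictionary(names, (1:numel(names))'); % name --> integer code

% Hypothetical new data containing one label not seen in training.
B = ["small" "tiny" "large"];
known = isKey(d, B);                      % logical mask, true where B is a key

codes = NaN(size(B));                     % unseen labels stay NaN
codes(known) = d(B(known));               % look up only the known labels
```

Looking up an unknown key directly with d(key) raises an error, which is why the mask is applied first.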
  4 Comments
Stephen Gray on 9 Jan 2024
Unfortunately not. The code part is
InpsM = table2cell(Inps);
OutsM =table2cell(Outs);
InpsM=InpsM';
OutsM=OutsM';
net=feedforwardnet([96,48,24]);
net.trainFcn = 'trainlm';
net.inputs{1}.processFcns = {'mapstd'};
net=train(net,InpsM,OutsM,'useParallel','yes');
The error I get is
Error using nntraining.setup>setupPerWorker
Inputs X{1,1} is not numeric or logical.
Error in nntraining.setup (line 77)
[net,data,tr,err] = setupPerWorker(net,trainFcn,X,Xi,Ai,T,EW,enableConfigure);
Error in network/train (line 336)
[net,data,tr,err] = nntraining.setup(net,net.trainFcn,X,Xi,Ai,T,EW,enableConfigure,isComposite);
Error in untitled (line 52)
net=train(net,InpsM,OutsM,'useParallel','yes');
SPG
Cris LaPierre on 11 Jan 2024
Found this, albeit on the trainNetwork page and not the train page, but it appears to still be applicable.
"To train a network using categorical features, you must first convert the categorical features to numeric."
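Integer codes are one way to do that conversion; one-hot (dummy) columns are another that is often used when the categories have no natural order. The thread does not say which suits this network, so the following is only an illustrative sketch. It assumes the category list was saved at training time; the names train_categories, codes, and one_hot are hypothetical.

```matlab
% Category list saved from training (assumed; see lead-in).
train_categories = {'cat', 'dog', 'fish'};
new_categorical_data = {'dog', 'cat', 'bird'};

% Integer code of each label within the training list (0 if unseen).
[~, codes] = ismember(new_categorical_data, train_categories);

% One row per observation, one column per training category.
% Comparing a column of codes with the row 1:K uses implicit expansion.
K = numel(train_categories);
one_hot = double(codes(:) == 1:K);
```

An unseen label such as 'bird' produces an all-zero row here. Because train expects one sample per column, one_hot would need to be transposed before being combined with the other inputs.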

Release

R2023b
