How to provide input without a datastore to a multiple-input deep neural network?

I have used the network shown in the figure, which takes two inputs: a video input (a sequence of lip images) and the MFCC of the audio signal for the same frames. I used fileDatastore to store the training and validation data. Could you please guide me on how to provide the training and validation data without a fileDatastore? I already have the data in 4-D arrays.
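A minimal sketch of one way to do this, assuming you have Deep Learning Toolbox R2020b or later: wrap each 4-D array in an `arrayDatastore` (iterating over the 4th, observation dimension) and join them with `combine`, so each read yields `{image, audio, label}`. The variable names below are the arrays built later in your script.

```matlab
% Sketch: feed two 4-D arrays plus labels to trainNetwork without fileDatastore.
% arrayDatastore iterates over dimension 4 (one observation per slice).
dsVid   = arrayDatastore(vid_sequencesTrain,  'IterationDimension',4);
dsAud   = arrayDatastore(audio_sequencesTrain,'IterationDimension',4);
dsLabel = arrayDatastore(vid_labelsTrain(:));
dsTrain = combine(dsVid,dsAud,dsLabel);   % each read returns {image, audio, label}

% Validation data can be wrapped the same way:
dsValidation = combine( ...
    arrayDatastore(vid_sequencesValidation,  'IterationDimension',4), ...
    arrayDatastore(audio_sequencesValidation,'IterationDimension',4), ...
    arrayDatastore(vid_labelsValidation(:)));

% net = trainNetwork(dsTrain,lgraph,options);  % pass dsValidation as 'ValidationData'
```

The column order of the combined datastore must match the order of the network's input layers, with the responses last.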
My aim is to generate MFCCs from lip images. I have trained the network with lip images and the corresponding MFCCs; the outputs of both branches are added together and fed to a third network, as shown in the figure. I trained the network, but I am unable to obtain its output, i.e., the generated MFCC.
Please guide me on how to obtain the MFCC from the network output.
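A sketch of how this is usually done: `predict` returns the output of the final layer (here the softmax scores, since the network ends in a classification layer), while `activations` extracts the output of any named intermediate layer. `'deconv2'` below is one of the layer names from your script; substitute whichever layer carries the generated MFCC in your architecture. Note that a classification output produces class scores, not a regressed MFCC, so for true MFCC generation the last layers would need to be a regression head.

```matlab
% Sketch: query a trained network's outputs (net and dsValidation as in the script).
scores = predict(net,dsValidation);              % final output (softmax scores)
gen    = activations(net,dsValidation,'deconv2');% output of a named intermediate layer
```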
Also, I have combined the frames of all videos together and then supplied the images as input. Instead of that, can I provide the input as a video signal?
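For reference, a single video can be read frame-by-frame into one H-by-W-by-3-by-numFrames array with base MATLAB's `VideoReader`, instead of pre-combining frames from all files; the filename below is one of yours and is only illustrative.

```matlab
% Sketch: read one video into a 4-D uint8 array with VideoReader.
vr = VideoReader('AVDIGITS_S1_0_01.mp4');
video = zeros(vr.Height, vr.Width, 3, 0, 'uint8');  % grows along dim 4
k = 0;
while hasFrame(vr)
    k = k + 1;
    video(:,:,:,k) = readFrame(vr);   % H-by-W-by-3-by-k so far
end
```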
clear all;
close all;
clc;
files={'AVDIGITS_S1_0_01.mp4';'AVDIGITS_S1_0_02.mp4';'AVDIGITS_S1_0_03.mp4';'AVDIGITS_S1_0_04.mp4';'AVDIGITS_S1_0_05.mp4';...
'AVDIGITS_S1_1_02.mp4';'AVDIGITS_S1_1_03.mp4';'AVDIGITS_S1_1_04.mp4';'AVDIGITS_S1_1_05.mp4';
};
mfcc_files={'S1_0_01_mfcc.mp4.avi';'S1_0_02_mfcc.mp4.avi';'S1_0_03_mfcc.mp4.avi';'S1_0_04_mfcc.mp4.avi';'S1_0_05_mfcc.mp4.avi'; ...
'S1_1_02_mfcc.mp4.avi';'S1_1_03_mfcc.mp4.avi';'S1_1_04_mfcc.mp4.avi';'S1_1_05_mfcc.mp4.avi'}
numFiles = numel(files);
index2=1;
for mm = 1:numFiles
    video = readVideo(files{mm});     % readVideo is a helper function (not shown here)
    fprintf("Reading video file %d of %d...\n", mm, numFiles)
    [v1, v2, v3, v4] = size(video);
    audio = readVideo(mfcc_files{mm});
    fprintf("Reading audio file %d of %d...\n", mm, numFiles)
    frame_cnt(mm) = v4;
    for ii = 1:v4
        comb_video = video(:,:,:,ii);
        comb_audio = audio(:,:,ii);
        all_vid_frames(:,:,:,index2)   = uint8(comb_video);
        all_audio_frames(:,:,:,index2) = comb_audio;
        index2 = index2 + 1;
    end
end
labels1=categorical([zeros(1,209) ones(1,196)]);
idxTrain =[1:121 357:405];
for kk = 1:length(idxTrain)
    ind1 = idxTrain(kk);
    vid_sequencesTrain(:,:,:,kk)   = all_vid_frames(:,:,:,ind1);
    vid_labelsTrain(kk)            = labels1(ind1);
    audio_sequencesTrain(:,:,:,kk) = all_audio_frames(:,:,:,ind1);
    audio_labelsTrain(kk)          = labels1(ind1);   % index with kk so every label is kept
end
idxValidation = [122:356];
for kk = 1:length(idxValidation)
    ind2 = idxValidation(kk);
    vid_sequencesValidation(:,:,:,kk)   = all_vid_frames(:,:,:,ind2);
    vid_labelsValidation(kk)            = labels1(ind2);
    audio_sequencesValidation(:,:,:,kk) = all_audio_frames(:,:,:,ind2);
    audio_labelsValidation(kk)          = labels1(ind2);
end
[v1 v2 v3 v4]=size(vid_sequencesTrain)
[a1 a2 a3 a4]=size(audio_sequencesTrain)
imgCells = mat2cell(vid_sequencesTrain,v1,v2,v3,ones(v4,1));
imgCells2 = reshape(imgCells,[v4 1 1]);
audioCells = mat2cell(audio_sequencesTrain,a1,a2,a3,ones(a4,1));
audioCells2 = reshape(audioCells,[a4 1 1]);
labelCells = arrayfun(@(x)x,vid_labelsTrain,'UniformOutput',false);
combinedCells = [imgCells2 audioCells2 labelCells'];
%% validation
[vv1 vv2 vv3 vv4]=size(vid_sequencesValidation)
[aa1 aa2 aa3 aa4]=size(audio_sequencesValidation)
imgCellsvald = mat2cell(vid_sequencesValidation,vv1,vv2,vv3,ones(vv4,1));
imgCells2vald = reshape(imgCellsvald,[vv4 1 1]);
audioCellsvald = mat2cell(audio_sequencesValidation,aa1,aa2,aa3,ones(aa4,1));
audioCells2vald = reshape(audioCellsvald,[aa4 1 1]);
labelCells2vald = arrayfun(@(x)x,audio_labelsValidation,'UniformOutput',false);
combinedCellsvald = [imgCells2vald audioCells2vald labelCells2vald'];
%
save('traingData_10April_2023.mat','combinedCells', 'combinedCellsvald');
filedatastore = fileDatastore('traingData_10April_2023.mat','ReadFcn',@load);
trainingDatastore = transform(filedatastore,@rearrangeData);
layers1 = [
    imageInputLayer([v1 v2 3],'Name','imageinput')
    convolution2dLayer(3,16,'Padding','same','Name','conv_1')
    batchNormalizationLayer('Name','BN_1')
    reluLayer('Name','relu_1')
    fullyConnectedLayer(2,'Name','fc11')
    additionLayer(2,'Name','add')
    transposedConv2dLayer(3,16,'Name','deconv1')
    batchNormalizationLayer('Name','BN_2')
    reluLayer('Name','relu_2')
    transposedConv2dLayer(3,16,'Name','deconv2')
    batchNormalizationLayer('Name','BN_3')
    reluLayer('Name','relu_3')
    averagePooling2dLayer(2,'Stride',2,'Name','avgpool')
    fullyConnectedLayer(2,'Name','fc12')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classOutput')];
lgraph = layerGraph(layers1);
layers2 = [
    imageInputLayer([a1 a2 a3],'Name','vinput')
    fullyConnectedLayer(2,'Name','fc21')];
lgraph = addLayers(lgraph,layers2);
lgraph = connectLayers(lgraph,'fc21','add/in2');
plot(lgraph)
options = trainingOptions('adam', ...
    'InitialLearnRate',0.005, ...
    'LearnRateSchedule','piecewise', ...
    'MaxEpochs',100, ...
    'MiniBatchSize',512, ...
    'Verbose',false, ...
    'Plots','training-progress', ...
    'Shuffle','never', ...
    'ValidationData',trainingDatastore, ... % note: this reuses the training set for validation
    'ValidationFrequency',1);
net = trainNetwork(trainingDatastore,lgraph,options);
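The `transform` call above references a helper named `rearrangeData` that is not shown in the post. A minimal sketch of what it might look like, assuming the MAT-file holds `combinedCells` as an N-by-3 cell array of `{image, audio, label}` rows (the structure built earlier in the script):

```matlab
% Sketch: unpack the struct produced by load so each datastore read
% yields rows of {imageInput, audioInput, label}.
function out = rearrangeData(ds)
    out = ds.combinedCells;
end
```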