denoise speech in deep learning-trainNetwork

Question

neal paze on 26 Aug 2021


Direct link to this question

https://it.mathworks.com/matlabcentral/answers/1440934-denoise-speech-in-deep-learning-trainnetwork

Commented: neal paze on 5 Sep 2021

I have a question about the example Denoise Speech Using Deep Learning Networks (https://ww2.mathworks.cn/help/audio/ug/denoise-speech-using-deep-learning-networks.html?s_tid=srchtitle_denoise%20deep_1).

My question concerns the deep learning part. Before training, we need to reshape the predictors and targets to the dimensions expected by the deep learning networks.

Code:
predictors = reshape(predictors,size(predictors,1),size(predictors,2),1,size(predictors,3));
targets = reshape(targets,1,1,size(targets,1),size(targets,2));

If size(predictors) = [129 8 544] and size(targets) = [129 544], then after the reshape size(predictors) = [129 8 1 544] and size(targets) = [1 1 129 544].
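The sizes quoted above can be checked with a minimal sketch. It uses random placeholder arrays in place of the real STFT features, and the variable names `numObs`, `predictors4d` and `targets4d` are assumptions made for illustration only; they do not appear in the example.

```matlab
% Placeholder data with the sizes quoted in the question (not real audio features).
numFeatures = 129;   % frequency bins per frame
numSegments = 8;     % frames per predictor
numObs      = 544;   % number of observations (hypothetical name)

predictors = rand(numFeatures, numSegments, numObs);
targets    = rand(numFeatures, numObs);

% Same reshape calls as in the question, written to new variables
% so the original arrays stay available for comparison.
predictors4d = reshape(predictors, size(predictors,1), size(predictors,2), 1, size(predictors,3));
targets4d    = reshape(targets, 1, 1, size(targets,1), size(targets,2));

disp(size(predictors4d))   % 129 8 1 544
disp(size(targets4d))      % 1 1 129 544
```

reshape only changes how the dimensions are labeled; the element order in memory is untouched, so targets4d(1,1,k,j) equals targets(k,j).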

The first deep learning method is Fully Connected Layers.

Code:
layers = [
    imageInputLayer([numFeatures,numSegments])
    fullyConnectedLayer(1024)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(1024)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(numFeatures)
    regressionLayer
    ];

denoiseNetFullyConnected = trainNetwork(trainPredictors,trainTargets,layers,options);

The second deep learning method is Fully Convolutional Layers.

Code:
layers = [imageInputLayer([numFeatures,numSegments])
    convolution2dLayer([9 8],18,"Stride",[1 100],"Padding","same")
    batchNormalizationLayer
    reluLayer
    repmat( ...
    [convolution2dLayer([5 1],30,"Stride",[1 100],"Padding","same")
    batchNormalizationLayer
    reluLayer
    convolution2dLayer([9 1],8,"Stride",[1 100],"Padding","same")
    batchNormalizationLayer
    reluLayer
    convolution2dLayer([9 1],18,"Stride",[1 100],"Padding","same")
    batchNormalizationLayer
    reluLayer],4,1)
    convolution2dLayer([5 1],30,"Stride",[1 100],"Padding","same")
    batchNormalizationLayer
    reluLayer
    convolution2dLayer([9 1],8,"Stride",[1 100],"Padding","same")
    batchNormalizationLayer
    reluLayer
    convolution2dLayer([129 1],1,"Stride",[1 100],"Padding","same")
    regressionLayer
    ];

denoiseNetFullyConvolutional = trainNetwork(trainPredictors,permute(trainTargets,[3 1 2 4]),layers,options);

My question is about the two trainNetwork calls: one passes trainTargets, which is [1 1 129 544], and the other passes permute(trainTargets,[3 1 2 4]), which is [129 1 1 544]. I cannot understand the difference.


Answer 1

Shivam Singh on 2 Sep 2021


Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/1440934-denoise-speech-in-deep-learning-trainnetwork#answer_779369


The output predicted by the network and the targets must have the same shape so that the loss can be computed and the learnable parameters updated. In this case (with n training examples):

Shape of trainPredictors is [129, 8, 1, n]

Shape of trainTargets is [1, 1, 129, n]

The shape of the predicted output and of the intermediate layer activations can be inspected with analyzeNetwork(layers). There, check the "Activations" column for the "regressionoutput" layer to see the shape of the predicted output.

For the fully connected network, the predicted output has shape [1, 1, 129, n], so trainTargets is used as is:

denoiseNetFullyConnected = trainNetwork(trainPredictors,trainTargets,layers,options);

For the fully convolutional network, the predicted output has shape [129, 1, 1, n], so trainTargets is permuted to match:

denoiseNetFullyConvolutional = trainNetwork(trainPredictors,permute(trainTargets,[3 1 2 4]),layers,options);
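To make the effect of the permute call concrete, here is a minimal sketch using a random placeholder array in place of the real targets. The names `numObs` and `convTargets` are assumptions for illustration and are not part of the example.

```matlab
% Placeholder targets shaped like trainTargets in the example: [1 1 129 n].
numObs = 544;                                  % hypothetical number of observations
trainTargets = rand(1, 1, 129, numObs);

% [3 1 2 4] means: new dim 1 = old dim 3, new dim 2 = old dim 1,
% new dim 3 = old dim 2, new dim 4 = old dim 4.
convTargets = permute(trainTargets, [3 1 2 4]);

disp(size(trainTargets))   % 1 1 129 544
disp(size(convTargets))    % 129 1 1 544

% The values are identical; only the axis they are stored along has moved.
sameValues = isequal(squeeze(trainTargets), squeeze(convTargets));
disp(sameValues)
```

The 129 frequency bins move from the third (channel) dimension to the first (height) dimension, which is where the final convolution2dLayer([129 1],1,...) leaves them.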


neal paze on 4 Sep 2021

Thanks!

neal paze on 5 Sep 2021

I'm sorry to bother you again. As I studied the official documentation in more detail, I ran into a small problem.

As you mentioned, "Shape of predicted output in case of fully connected network is [1, 1, 129, n]. Thus, the "trainTargets" are used as it is. denoiseNetFullyConnected = trainNetwork(trainPredictors,trainTargets,layers,options);" But the result I actually got when I ran it was n-by-129 (single), which differs from the shape of trainTargets. Can you help me understand why? With denoiseNetFullyConvolutional = trainNetwork(trainPredictors,permute(trainTargets,[3 1 2 4]),layers,options); however, the result matches the targets exactly and is 4-D (double), i.e. [129, 1, 1, n].

Here are the results of analyzeNetwork(layers): the first picture is the fully connected network, the second is the fully convolutional network. Why is the "Activations" entry for "regressionoutput" blank instead of being displayed?

I hope you can take the time to answer it for me. Thank you very much.
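One way to line up the two shapes mentioned in this comment, as a minimal sketch only. It assumes the n-by-129 single result is the network output for n observations (the comment does not say which call produced it), and it uses random placeholder arrays. The names `predicted`, `targets2d`, `delta` and `meanSquaredError` are hypothetical.

```matlab
% Placeholder arrays with the shapes quoted in the comment (not real network output).
n = 544;
trainTargets = rand(1, 1, 129, n);        % targets as passed to trainNetwork
predicted    = single(rand(n, 129));      % stands in for the n-by-129 single result

% squeeze drops the two singleton dimensions: [1 1 129 n] -> [129 n].
% The transpose then gives one row per observation: [n 129].
targets2d = squeeze(trainTargets).';

disp(size(targets2d))                     % 544 129

% With matching shapes, an element-wise comparison becomes possible.
delta = double(predicted) - targets2d;
meanSquaredError = mean(delta(:).^2);
disp(meanSquaredError)
```

The same data is present in both layouts; row j of targets2d holds the 129 values stored at trainTargets(1,1,:,j).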
