I want to use an LSTM-based audio network with live audio

Question

Arslan Munim on 27 Jul 2022

Direct link to this question:

https://it.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio

Commented: Arslan Munim on 28 Sep 2022

Hello MATLAB team,

I am using this example to work with my audio dataset: https://www.mathworks.com/matlabcentral/fileexchange/74611-fault-detection-using-deep-learning-classification#examples_tab The network is trained, but I want to run the application live on a PC. For example, I have a mic and want to build an application that uses my trained model to predict the output.

Can you guide me or help me with that?

Regards,

Arslan Munaim

Answer 1

jibrahim on 27 Jul 2022

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio#answer_1016040

Hi Arslan,

There is a function in that repo (streamingClassifier) that should get the job done in conjunction with an audio device reader:

% Create a microphone object
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Pass to network
    scores = streamingClassifier(frame,M,S);
    % Use the scores any way you want
end
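
The placeholder values M = 0 and S = 1 in the loop above stand in for the normalization statistics computed at training time. As a minimal sketch, assuming the training features are stored in a cell array with one features-by-time matrix per recording (the variable name trainingFeatures and the synthetic data are illustrative assumptions, not taken from the repo), the per-feature mean and standard deviation could be computed along these lines:

```matlab
% Hypothetical training set: 3 recordings, 11 features each,
% with a different number of time steps per recording.
rng(0);
trainingFeatures = {randn(11,40), randn(11,55), randn(11,32)};

% Concatenate all recordings along the time dimension (11-by-127).
allFeatures = cat(2, trainingFeatures{:});

% Per-feature statistics, one value per row (11-by-1 vectors).
M = mean(allFeatures, 2);
S = std(allFeatures, 0, 2);

% Normalizing with these values gives zero mean and unit standard
% deviation for each feature over the training set.
normalized = (allFeatures - M) ./ S;
```

With this layout, M and S are 11-by-1 column vectors, so they need to be oriented to match the feature vector before subtracting and dividing.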

Arslan Munim on 28 Jul 2022

Edited: Arslan Munim on 28 Jul 2022

Hi jibrahim,

Thanks for your reply. I tried using streamingClassifier. However, I am trying to use the extract function instead of the extractFeatures function (because of dependency issues), and with extract I can only use one feature at a time, whereas I trained the network with 11 features.

Can you please explain how I can use the extract function in streamingClassifier? I am attaching the code for your reference:

windowLength = 512;
overlapLength = 0;
aFE = audioFeatureExtractor('SampleRate',44100, ...
    'Window',hamming(windowLength,'periodic'),...
    'OverlapLength',overlapLength,...
    'spectralCentroid',true, ...
    'spectralCrest',true,...
    'spectralDecrease',true, ...
    'spectralEntropy',true,...
    'spectralFlatness',true,...
    'spectralFlux',true,...
    'spectralKurtosis',true,...
    'spectralRolloffPoint',true,...
    'spectralSkewness',true,...
    'spectralSlope',true,...
    'spectralSpread',true);
features = extract(aFE , audioIn)
% features = extractFeatures(audioIn);
% Normalize
features = ((features - M')./S');
[net, scores] = predictAndUpdateState(net,features);

jibrahim on 28 Jul 2022

Hi Arslan,

The extract function should also return 11 features. For example, if you replace the existing function extractFeatures with this modified function, things should work the same:

function featureVector = extractFeatures2(x)
%#codegen
persistent afe
if isempty(afe)
    windowLength = 512;
    overlapLength = 0;
    afe = audioFeatureExtractor('SampleRate',44100, ...
        'Window',hamming(windowLength,'periodic'),...
        'OverlapLength',overlapLength,...
        'spectralCentroid',true, ...
        'spectralCrest',true,...
        'spectralDecrease',true, ...
        'spectralEntropy',true,...
        'spectralFlatness',true,...
        'spectralFlux',true,...
        'spectralKurtosis',true,...
        'spectralRolloffPoint',true,...
        'spectralSkewness',true,...
        'spectralSlope',true,...
        'spectralSpread',true);
end
featureVector = extract(afe,x);
end

The size of featureVector will be 1-by-11, each element in the vector representing one of your features.

Notice I declared afe as persistent. This ensures the audio feature extractor is not recreated every time you call this function in your loop. The extractor goes through some one-time setup computations when you first call it, so there is no need to waste time repeating those.

jibrahim on 2 Aug 2022

Hi Arslan,

Since you trained the network with a sample rate of 16 kHz, you will have to perform sample-rate conversion from 44.1 kHz to 16 kHz. This code is a possible implementation, where you essentially feed the network frames of length 512 sampled at 16 kHz, just like the original code:

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,...
                              Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D; % get as close to desired frame size
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame); 
    % Save to buffer
    write(buff,frame)
    while buff.NumUnreadSamples >= 512
        frame = read(buff,512);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
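
The frame-size arithmetic in the code above can be checked by hand. The following is a small sketch, assuming the converter uses the exact 160/441 rate ratio (that is, no output rate tolerance); it uses only base MATLAB, and the variable names fsIn, fsOut, and outPerFrame are illustrative:

```matlab
fsIn  = 44100;   % microphone sample rate (Hz)
fsOut = 16000;   % sample rate the network was trained on (Hz)

% Reduce fsOut/fsIn to lowest terms: interpolate by L, decimate by D.
g = gcd(fsIn, fsOut);
L = fsOut / g;   % 160
D = fsIn / g;    % 441

% Largest frame not exceeding 22000 samples that is a multiple of D.
frameLength = floor(22000 / D) * D;      % 21609

% Samples produced by the converter per microphone frame.
outPerFrame = frameLength * L / D;       % 7840

% Full 512-sample network frames available after one read; the
% remainder stays in the buffer for the next iteration.
numFrames = floor(outPerFrame / 512);    % 15
leftover  = outPerFrame - 512*numFrames; % 160
```

Each microphone read therefore yields 15 complete network frames, with 160 samples carried over in the buffer, which is why the inner loop is driven by NumUnreadSamples rather than a fixed count.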

Note that you can also potentially feed the network longer frames. That should also work, and is probably more efficient, as the network will run faster if you give it a long input (as opposed to multiple short ones):

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame); 
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end

If you can't change the frame size on the microphone, then you can handle that using another buffer, for example:

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=22000);
buffSRC = dsp.AsyncBuffer;
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    write(buffSRC,frame);
    % Convert every complete block of frameLength samples, so that the
    % leftover samples do not accumulate in buffSRC
    while buffSRC.NumUnreadSamples >= frameLength
        frame = read(buffSRC,frameLength);
        % Convert to 16 kHz
        frame = src(frame);
        % Save to buffer
        write(buff,frame)
    end
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end

Arslan Munim on 9 Aug 2022

Hi jibrahim,

Thank you for your support, it was very helpful.

Now I want to use multiple mics for prediction. Can you please give me some idea of how I can use the streaming classifier with 3 or 4 mics for the prediction?

Thanks and have a nice day.

Regards,

Arslan

Answer 2

jibrahim on 9 Aug 2022

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio#answer_1023635

Hi Arslan,

audioDeviceReader supports multi-mic devices. Use the ChannelMappingSource and ChannelMapping properties to map between device input channels and the output data.

This network was trained on mono data, so, to adapt it to multi-channel data, you either have to retrain your network for multi-channel data, or somehow combine your input channels into one channel (by a weighted sum, or by selecting a particular channel, etc.) and proceed as above.
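
As a rough sketch of the second option, a multi-channel frame (one column per microphone) can be reduced to mono before it is passed to the classifier. The frame contents, the channel count, and the weights below are illustrative assumptions; suitable weights depend on your setup:

```matlab
% Hypothetical 4-channel frame: 512 samples per channel
% (one column per microphone).
rng(0);
multiFrame = randn(512, 4);

% Option 1: weighted sum of the channels. The weights are arbitrary
% here; they sum to 1 so that identical channels pass through unchanged.
w = [0.4; 0.3; 0.2; 0.1];
monoWeighted = multiFrame * w;      % 512-by-1

% Option 2: select a single channel, e.g. the second microphone.
monoSelected = multiFrame(:, 2);    % 512-by-1
```

Either result is a 512-by-1 column, the same shape as a frame read from a single-channel device.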

Arslan Munim on 17 Aug 2022

Edited: Walter Roberson on 19 Aug 2022

Hi jibrahim,

I tried to read data from multiple mics, but I get the error below every time. I am trying to read a frame from each microphone and send that data to the streaming classifier to predict the output, but the error always occurs at frame1 = adr1():

Error using audioDeviceReader/setup
A given audio device may only be opened once.
Error in audioDeviceReader/setupImpl
Error in multipleMic (line 10)
frame1 = adr1()

adr1 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, Device="Microphone (4- USB PnP Sound Device)",BitDepth="16-bit integer");
adr2 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, Device="Microphone (USB PnP Sound Device)",BitDepth="16-bit integer");
% These statistic value should come from your training...
% M = 0;
% S = 1;
while 1
    % Read a frame of data from microphone
    frame1 = adr1()
    frame2 = adr2()  
    % Pass to network
    [class] = streamingClassifier2(frame1,frame2,M,S)
    % Use the scores any way you want
end
function [class] = streamingClassifier2(frame1,frame2,M,S)
% This is a streaming classifier function 
persistent net; 
if isempty(net)
    net = coder.loadDeepLearningNetwork('net.mat');
end
% Extract features using function
%features = extract(aFE , audioIn)
features1 = extractFeatures2(frame1);
features2 = extractFeatures2(frame2);
% Normalize 
features1 = ((features1 - M)./S).';
features2 = ((features2 - M)./S).';
% Classify
[class] = classify(net,{features1,features2});
%[net, scores] = classify(net,feature)
end

jibrahim on 20 Aug 2022

OK, this helps. You will need other hardware (one device, multiple mics) for the system to recognize it. You could also give the UDP idea a shot, see how viable that is.

Arslan Munim on 28 Sep 2022

Hi again,

I am trying to train my network after lowering BitsPerSample to 8 (it was 16 before). Every time I start training the model, it throws the warning given below and terminates.

I tried different sample rates, but it gives the same warning every time. I also tried changing my layer structure and setting 'InitialLearnRate' to 0.001, but I still get the same warning.

Warning: Training stopped at iteration 1 because training loss is NaN. Predictions using the output network might contain NaN values.

Model:

layers = [ ...
    sequenceInputLayer(size(trainingFeatures{1},1))
    lstmLayer(100,"OutputMode","sequence")
    dropoutLayer(0.1)
    lstmLayer(100,"OutputMode","last")
    fullyConnectedLayer(5)
    softmaxLayer
    classificationLayer];

miniBatchSize = 30;
validationFrequency = floor(numel(trainingFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    "MaxEpochs",100, ...
    "MiniBatchSize",miniBatchSize, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",20,...
    'InitialLearnRate',0.001,...
    'ValidationData',{validationFeatures,adsValidation.Labels}, ...
    'ValidationFrequency',validationFrequency);

Regards,

Arslan
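
A NaN loss at the very first iteration is often a sign that the training inputs themselves contain non-finite values, so one possible first check (not a confirmed diagnosis for this data set) is to scan the extracted features before training. The sketch below assumes trainingFeatures is a cell array of features-by-time matrices, as in the model code above; the synthetic data and the injected NaN are purely illustrative:

```matlab
% Hypothetical feature set: the second recording contains a NaN, as
% might happen when a feature is computed on an all-zero frame.
trainingFeatures = {rand(11,40), rand(11,55), rand(11,32)};
trainingFeatures{2}(3,10) = NaN;

% Flag recordings whose features contain NaN or Inf.
hasBadValues = cellfun(@(f) any(~isfinite(f(:))), trainingFeatures);
badRecordings = find(hasBadValues);

% Flag features with zero spread over the finite data, since dividing
% by S = 0 during normalization would also produce non-finite values.
allFeatures = cat(2, trainingFeatures{~hasBadValues});
S = std(allFeatures, 0, 2);
constantFeatures = find(S == 0);
```

If badRecordings or constantFeatures is non-empty for the real data, the corresponding files or features are worth inspecting before changing the network or the training options.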
