Different MFCC obtained from audioFeatureExtractor and MFCC function

Question

Fabiano Guimaraes on 17 Jun 2024

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/2129396-different-mfcc-obtained-from-audiofeatureextractor-and-mfcc-function

Commented: Fabiano Guimaraes on 19 Jun 2024

Accepted Answer: MathWorks Audio Toolbox Team

Hi,

I'm trying to use "audioFeatureExtractor" and the mfcc function to get the MFCC data from an audio sample, but I noticed that the coefficients are different. I assume some default settings differ between the two, but I cannot figure out exactly what the difference is. Could you please help? Below is a simple script that provides more detail. I'm comparing "MFCC1" and "MFCC2". I've tried several .wav and .m4a files, but the MFCCs were never the same, so I'm just using a generic "xxxxxxx" as the file name.

[audioIn,fs] = audioread("xxxxxxx");
win1 = hamming(round(0.03*fs),"periodic");
win2 = round(0.015*fs);
aFE = audioFeatureExtractor(SampleRate=fs,Window=win1,OverlapLength=win2,mfcc=1);
features = extract(aFE,audioIn);
idx = info(aFE);
MFCC1 = features(:,idx.mfcc);
MFCC2 = mfcc(audioIn,fs,"LogEnergy","ignore","Window",win1,"OverlapLength",win2,"NumCoeffs",13);


Answer 1

MathWorks Audio Toolbox Team on 18 Jun 2024

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/2129396-different-mfcc-obtained-from-audiofeatureextractor-and-mfcc-function#answer_1473801

The mfcc function follows the historically popular Auditory Toolbox implementation by Slaney. In that implementation, the mel bandpass filters are spaced linearly up to 1 kHz and logarithmically thereafter, and the first band edge sits at approximately 133.33 Hz. The default spacing of the mel bands in the audioFeatureExtractor object follows the O'Shaughnessy formula. The default audioFeatureExtractor formulation is somewhat more common now, especially for the mel spectrogram intermediate step.
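To make the difference in spacing concrete, here is a minimal sketch in base MATLAB (no toolbox calls) that builds 42 band edges both ways. The Slaney-style constants are the ones used in the supporting function of this answer. The O'Shaughnessy mapping is the commonly cited 2595*log10(1 + f/700) form. The names hz2melOS, mel2hzOS, slaneyEdges, and osEdges are illustrative, and the assumption that audioFeatureExtractor uses exactly this form should be checked against the documentation for your release.

```matlab
% O'Shaughnessy mel mapping and its inverse (commonly cited form).
hz2melOS = @(f) 2595*log10(1 + f/700);
mel2hzOS = @(m) 700*(10.^(m/2595) - 1);

% Slaney-style edges: 13 edges in constant 66.67 Hz steps starting at
% 133.33 Hz, then 29 edges that each grow by a constant factor.
logStep = 1.0711703;
linEdges = 400/3 + (200/3)*(0:12);
logEdges = linEdges(end) * logStep.^(1:29);
slaneyEdges = [linEdges, logEdges];

% O'Shaughnessy-style edges over the same range: equally spaced in mel.
melLimits = hz2melOS(slaneyEdges([1 end]));
osEdges = mel2hzOS(linspace(melLimits(1), melLimits(2), numel(slaneyEdges)));

% Show the first few edges of each layout side by side (Hz).
disp([slaneyEdges(1:6); osEdges(1:6)])
```

Both layouts cover the same frequency range here, but the interior edges fall at different frequencies, so the triangular filters integrate different parts of the spectrum and the resulting coefficients differ.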

What follows is one way to make the two implementations approximately equal. An alternative to the approach below is to modify the mfcc function call by setting its BandEdges option.

Get the band edges of the Slaney implementation that the mfcc function uses.

bE = slaneybandedges();

Define your input and parameters.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");
win1 = hamming(round(0.03*fs),"periodic");
overlapLength = round(0.015*fs);

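Get the default output of the mfcc function. With no Window or OverlapLength specified, mfcc uses a 30 ms periodic Hamming window and a 20 ms overlap, so the overlapLength variable defined above is not used in this comparison.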

mfcc_output = mfcc(audioIn,fs,LogEnergy="ignore");

Create an audioFeatureExtractor object and set its options to extract the same features as the mfcc function.

aFE = audioFeatureExtractor(SampleRate=fs, ...
    Window=hamming(round(0.03*fs),"periodic"), ... % 30 ms window, as in the mfcc default
    OverlapLength=round(0.02*fs), ...              % 20 ms overlap, as in the mfcc default
    mfcc=true, ...
    FFTLength=numel(win1));

setExtractorParameters(aFE,"melSpectrum", ...
    MelStyle="slaney", ...
    SpectrumType="magnitude", ...
    WindowNormalization=false, ...
    FilterBankDesignDomain="linear", ...
    FilterBankNormalization="bandwidth", ...
    NumBands=40, ...
    FrequencyRange=[bE(1),bE(end)]);

features = extract(aFE,audioIn);
idx = info(aFE);
afe_output = features(:,idx.mfcc);

% Overlay one coefficient from each implementation, then report the
% RMS difference over all coefficients.
coeffToInspect = 1;
plot(afe_output(:,coeffToInspect),'bo'),hold on
plot(mfcc_output(:,coeffToInspect),'r*'),hold off
rms(afe_output(:)-mfcc_output(:))

ans = 2.3120e-04

Supporting Function

function bE = slaneybandedges()
% Default band edges as defined by the documentation for the
% Auditory Toolbox.
factor = 133.33333333333333;
bE = zeros(1,42);
for ii = 1:13
    bE(ii) = factor + (factor/2)*(ii-1);
end
for ii = 14:42
    bE(ii) = bE(ii-1)*1.0711703;
end
end
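
For the alternative mentioned earlier, setting the BandEdges option of mfcc instead of reconfiguring audioFeatureExtractor, a starting point could be a vector of band edges spaced equally on the O'Shaughnessy mel scale. The sketch below uses only base MATLAB. The sample rate, the choice of 32 bands, and the 0 to fs/2 range are assumptions for illustration and should be checked against the melSpectrum defaults of your release. The final mfcc call is left as a comment because it has not been verified here. Band edges alone may not be enough to make the outputs agree, since the answer above also changes SpectrumType and WindowNormalization.

```matlab
fs = 44100;             % assumed sample rate; use the fs returned by audioread
numBands = 32;          % assumed number of mel bands
freqRange = [0, fs/2];  % assumed frequency range in Hz

% O'Shaughnessy mel mapping and its inverse (commonly cited form).
hz2melOS = @(f) 2595*log10(1 + f/700);
mel2hzOS = @(m) 700*(10.^(m/2595) - 1);

% numBands triangular filters need numBands+2 edges, equally spaced in mel.
melEdges = linspace(hz2melOS(freqRange(1)), hz2melOS(freqRange(2)), numBands + 2);
bandEdges = mel2hzOS(melEdges);

% Pin the end points so rounding cannot push an edge outside the range.
bandEdges(1) = freqRange(1);
bandEdges(end) = freqRange(2);

% Hypothetical usage (requires Audio Toolbox; not verified here):
% coeffs = mfcc(audioIn,fs,LogEnergy="ignore",BandEdges=bandEdges);
```

Comparing the result against the audioFeatureExtractor output with rms, as in the answer, would show how close this alternative gets for a given file.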

1 Comment

Fabiano Guimaraes on 19 Jun 2024

Thank you very much for the clear answer.
