designAuditoryFilterBank

Design auditory filter bank

Syntax

filterBank = designAuditoryFilterBank(fs)

filterBank = designAuditoryFilterBank(fs,Name=Value)

[filterBank,Fc,BW] = designAuditoryFilterBank(___)

Description

filterBank = designAuditoryFilterBank(fs) returns a frequency-domain auditory filter bank, filterBank, designed for the sample rate fs.

filterBank = designAuditoryFilterBank(fs,Name=Value) specifies options using one or more name-value arguments.

[filterBank,Fc,BW] = designAuditoryFilterBank(___) returns the center frequency and bandwidth of each filter in the filter bank. You can use this output syntax with any of the previous input syntaxes.

Examples

Create Default Auditory Filter Bank

Call designAuditoryFilterBank with a specified sample rate to design the default auditory filter bank.

fs = 44.1e3;
fb = designAuditoryFilterBank(fs);

The default filter bank consists of 32 triangular bandpass filters spaced evenly on the mel scale between 0 and fs/2 Hz.

numBands = size(fb,1)

numBands = 
32

designAuditoryFilterBank is intended for frequency-domain filtering. By default, designAuditoryFilterBank assumes a 1024-point DFT and returns a one-sided frequency-domain filter bank with 513 points (1024/2 + 1).

numPoints = size(fb,2)

numPoints = 
513
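
As a further sketch, the three-output syntax can be combined with a non-default FFTLength to check how the output sizes follow from the inputs. The 2048-point FFT length is an arbitrary value chosen for illustration.

```matlab
fs = 44.1e3;
fftLength = 2048;   % arbitrary illustrative value, not a recommendation

% Request the center frequencies and bandwidths along with the filter bank.
[fb,Fc,BW] = designAuditoryFilterBank(fs,"FFTLength",fftLength);

% One row per band, one column per bin of the one-sided spectrum.
[numBands,numPoints] = size(fb)

% Fc and BW each hold one value per band.
numCenterFrequencies = numel(Fc)
numBandwidths = numel(BW)
```

With an even FFT length, the number of columns is expected to be fftLength/2 + 1, in line with the 513 points returned for the default 1024-point DFT.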

Design Mel-Based Auditory Filter Bank

Read in audio and convert it to a one-sided power spectrogram.

[audioIn,fs] = audioread("Laughter-16-8-mono-4secs.wav");

win = hamming(1024,"periodic");
noverlap = 512;
fftLength = 1024;
[S,F,t] = stft(audioIn,fs, ...
               "Window",win, ...
               "OverlapLength",noverlap, ...
               "FFTLength",fftLength, ...
               "FrequencyRange","onesided");
PowerSpectrum = S.*conj(S);

Design a mel-based auditory filter bank. Plot the filter bank.

numBands = 32;
range = [0,4000];
normalization = "bandwidth";

[fb,cf] = designAuditoryFilterBank(fs, ...
                                   "FFTLength",fftLength, ...
                                   "NumBands",numBands, ...
                                   "FrequencyRange",range, ...
                                   "Normalization",normalization);

plot(F,fb.')
grid on
title("Mel Filter Bank")
xlabel("Frequency (Hz)")

[Figure: Mel Filter Bank. x-axis: Frequency (Hz). 32 lines, one per band.]

To apply frequency-domain filtering, perform a matrix multiplication of the filter bank and the power spectrogram.

X = fb*PowerSpectrum;

Visualize the power-per-band in dB.

XdB = 10*log10(X);

surf(t,cf,XdB,"EdgeColor","none");
xlabel("Time (s)")
ylabel("Frequency (Hz)")
zlabel("Power (dB)")
view([45,60])
title("Mel-Based Spectrogram")
axis tight

[Figure: Mel-Based Spectrogram. x-axis: Time (s). y-axis: Frequency (Hz). Surface plot.]

Design Bark-Based Auditory Filter Bank

Read in stereo audio and convert each channel to a one-sided power spectrogram.

[audioIn,fs] = audioread("RockDrums-44p1-stereo-11secs.mp3");

win = hann(round(0.03*fs),"periodic");
noverlap = round(0.02*fs);
fftLength = 2048;

[S,F,t] = stft(audioIn,fs, ...
               "Window",win, ...
               "OverlapLength",noverlap, ...
               "FFTLength",fftLength, ...
               "FrequencyRange","onesided");
PowerSpectrum = S.*conj(S);

Design a Bark-based auditory filter bank. Plot the filter bank.

numBands = 32;
range = [0,22050];
normalization = "area";
designDomain = "linear";

[fb,cf] = designAuditoryFilterBank(fs, ...
    "FrequencyScale","bark", ...
    "FFTLength",fftLength, ...
    "NumBands",numBands, ...
    "FrequencyRange",range, ...
    "Normalization",normalization, ...
    "FilterBankDesignDomain",designDomain);

plot(F,fb.');
grid on
title("Bark Filter Bank")
xlabel("Frequency (Hz)")

[Figure: Bark Filter Bank. x-axis: Frequency (Hz). 32 lines, one per band.]

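To apply frequency-domain filtering, perform a matrix multiplication of the filter bank and the left and right power spectrograms. Because the audio is stereo, PowerSpectrum is a 3-D array with one page per channel, and pagemtimes applies the filter bank to each page.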

X = pagemtimes(fb,PowerSpectrum);

Visualize the power-per-band in dB.

XLdB = 10*log10(X(:,:,1));
XRdB = 10*log10(X(:,:,2));

surf(t,cf,XLdB,"EdgeColor","none");
xlabel("Time (s)")
ylabel("Frequency (Hz)")
view([0,90])
title("Bark-Based Spectrogram (Left Channel)")
axis tight

[Figure: Bark-Based Spectrogram (Left Channel). x-axis: Time (s). y-axis: Frequency (Hz). Surface plot.]

surf(t,cf,XRdB,"EdgeColor","none");
xlabel("Time (s)")
ylabel("Frequency (Hz)")
view([0,90])
title("Bark-Based Spectrogram (Right Channel)")
axis tight

[Figure: Bark-Based Spectrogram (Right Channel). x-axis: Time (s). y-axis: Frequency (Hz). Surface plot.]

Design ERB-Based Auditory Filter Bank

Read in audio and convert it to a one-sided power spectrogram.

[audioIn,fs] = audioread("NoisySpeech-16-22p5-mono-5secs.wav");

win = hann(round(0.04*fs),"periodic");
noverlap = round(0.02*fs);
fftLength = 1024;

[S,F,t] = stft(audioIn,fs, ...
               "Window",win, ...
               "OverlapLength",noverlap, ...
               "FFTLength",fftLength, ...
               "FrequencyRange","onesided");
PowerSpectrum = S.*conj(S);

Design an ERB-based auditory filter bank. Plot the filter bank.

numBands = 32;
range = [0,11025];
normalization = "bandwidth";

[fb,cf] = designAuditoryFilterBank(fs, ...
    "FrequencyScale","erb", ...
    "FFTLength",fftLength, ...
    "NumBands",numBands, ...
    "FrequencyRange",range, ...
    "Normalization",normalization);

plot(F,fb.');
grid on
title("ERB Filter Bank")
xlabel("Frequency (Hz)")

[Figure: ERB Filter Bank. x-axis: Frequency (Hz). 32 lines, one per band.]

To apply frequency-domain filtering, perform a matrix multiplication of the filter bank and the power spectrogram.

X = fb*PowerSpectrum;

Visualize the power-per-band in dB.

XdB = 10*log10(X);
surf(t,cf,XdB,"EdgeColor","none");
xlabel("Time (s)")
ylabel("Frequency (Hz)")
view([0,90])
title("ERB-Based Spectrogram")
axis tight

[Figure: ERB-Based Spectrogram. x-axis: Time (s). y-axis: Frequency (Hz). Surface plot.]

Input Arguments

`fs` — Sample rate of filter design (Hz)
positive scalar

Sample rate of filter design in Hz, specified as a positive scalar.

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: FrequencyScale="mel"

`FrequencyScale` — Frequency scale
`"mel"` (default) | `"bark"` | `"erb"`

Frequency scale used to design the auditory filter bank, specified as "mel", "bark", or "erb".

Data Types: char | string

`FFTLength` — Number of DFT points
`1024` (default) | positive integer

Number of points used to calculate the DFT, specified as a positive integer.

Data Types: single | double

`NumBands` — Number of bandpass filters
positive integer

Number of bandpass filters, specified as a positive integer. The default number of bandpass filters depends on the FrequencyScale:

If FrequencyScale is set to "bark" or "mel", then NumBands defaults to 32.
If FrequencyScale is set to "erb", then NumBands defaults to ceil(hz2erb(FrequencyRange(2))-hz2erb(FrequencyRange(1))).

Data Types: single | double
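
The ERB default can be checked with a short sketch that evaluates the documented expression and compares it with the number of rows in the returned filter bank. The 16 kHz sample rate is an arbitrary choice for illustration.

```matlab
fs = 16e3;
range = [0 fs/2];   % same as the default FrequencyRange

% Band count predicted by the documented default for the ERB scale.
expectedBands = ceil(hz2erb(range(2)) - hz2erb(range(1)));

% Design an ERB filter bank without specifying NumBands.
fb = designAuditoryFilterBank(fs, ...
    "FrequencyScale","erb", ...
    "FrequencyRange",range);
actualBands = size(fb,1);

% Expected to be true if the default follows the expression above.
isequal(actualBands,expectedBands)
```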

`FrequencyRange` — Frequency range over which to design auditory filter bank (Hz)
`[0 fs/2]` (default) | two-element row vector

Frequency range over which to design auditory filter bank in Hz, specified as a two-element row vector of monotonically increasing values in the range [0, fs/2].

Data Types: single | double
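
The following sketch restricts the design to a narrower band and inspects where the resulting filters sit. The 300 Hz to 3400 Hz range and the 20 bands are illustrative values only, and the bin-frequency vector assumes the default 1024-point FFT.

```matlab
fs = 16e3;
range = [300 3400];   % illustrative band, chosen only for this sketch

[fb,Fc] = designAuditoryFilterBank(fs, ...
    "FrequencyRange",range, ...
    "NumBands",20);

% Lowest and highest center frequencies; both should lie inside the range.
centerLimits = [min(Fc) max(Fc)]

% Frequency of each one-sided DFT bin, assuming the default FFTLength of 1024.
f = (0:size(fb,2)-1)*fs/1024;

% Total weight on bins outside the requested range
% (expected to be zero or negligible).
outsideWeight = sum(fb(:,f < range(1) | f > range(2)),"all")
```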

`Normalization` — Normalize filter bank
`"bandwidth"` (default) | `"area"` | `"none"`

Normalization technique used on the weights of the filter bank, specified as one of these values:

"bandwidth": The weights of each bandpass filter are normalized by the corresponding bandwidth of the filter.
"area": The weights of each bandpass filter are normalized by the corresponding area of the bandpass filter.
"none": The weights of the filters are not normalized.

Data Types: char | string
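
To see how the three options affect the weights, a minimal sketch can print the peak and the summed weight of the lowest and highest bands for each setting. The sample rate is arbitrary, and the printed values are left for inspection rather than stated here.

```matlab
fs = 16e3;
modes = ["bandwidth","area","none"];

for k = 1:numel(modes)
    fb = designAuditoryFilterBank(fs,"Normalization",modes(k));

    % Peak weight and summed weight of the first and last bands.
    peaks = max(fb([1 end],:),[],2).';
    sums  = sum(fb([1 end],:),2).';

    fprintf("%-9s  peak: %.3g %.3g   sum: %.3g %.3g\n", ...
        modes(k),peaks,sums);
end
```

Comparing the first and last bands shows whether a given option equalizes peak height or total weight across narrow and wide filters.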

`OneSided` — Design one-sided or two-sided filter bank
`true` (default) | `false`

Option to design a one-sided filter bank, specified as true or false. Set OneSided to false to design a two-sided filter bank.

Data Types: logical
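
A short sketch comparing the two settings is given below. The 512-point FFT length is arbitrary. The one-sided size follows from the default example; the two-sided size is only displayed, on the assumption that a two-sided design is meant to multiply a full-length spectrum.

```matlab
fs = 16e3;
fftLength = 512;   % arbitrary illustrative value

fbOne = designAuditoryFilterBank(fs, ...
    "FFTLength",fftLength, ...
    "OneSided",true);
fbTwo = designAuditoryFilterBank(fs, ...
    "FFTLength",fftLength, ...
    "OneSided",false);

% Number of frequency bins covered by each design.
binsOneSided = size(fbOne,2)
binsTwoSided = size(fbTwo,2)
```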

`FilterBankDesignDomain` — Domain in which filter bank is designed
`"linear"` (default) | `"warped"`

Domain in which filter bank is designed, specified as either "linear" or "warped". Set the filter bank design domain to "linear" to design the bandpass filters in the linear (Hz) domain. Set the filter bank design domain to "warped" to design the bandpass filters in the warped (mel or Bark) domain.

Dependencies

This parameter only applies if FrequencyScale is set to "mel" (default) or "bark".

Data Types: char | string
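
One way to gauge the effect of the design domain is to design the same mel filter bank in both domains and measure how far the results differ. This is a sketch with an arbitrary sample rate; the size of the differences is left for inspection.

```matlab
fs = 16e3;

[fbLin,FcLin] = designAuditoryFilterBank(fs, ...
    "FilterBankDesignDomain","linear");
[fbWarp,FcWarp] = designAuditoryFilterBank(fs, ...
    "FilterBankDesignDomain","warped");

% Largest absolute difference between the two sets of filter weights.
maxWeightDiff = max(abs(fbLin - fbWarp),[],"all")

% Largest absolute difference between the two sets of center frequencies.
maxFcDiff = max(abs(FcLin - FcWarp))
```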

`MelStyle` — Mel style
`"oshaughnessy"` (default) | `"slaney"`

Style of the mel scale, specified as "oshaughnessy" or "slaney". The O'Shaughnessy style follows [1], and the Slaney style follows [5].

Dependencies

This parameter only applies if FrequencyScale is set to "mel".

Data Types: char | string
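
The two styles can be compared by looking at the center frequencies they produce. This sketch assumes a release that supports MelStyle (R2023b or later) and uses an arbitrary sample rate.

```matlab
fs = 16e3;

[~,FcO] = designAuditoryFilterBank(fs,"MelStyle","oshaughnessy");
[~,FcS] = designAuditoryFilterBank(fs,"MelStyle","slaney");

% Center frequencies in Hz of the first five bands:
% column 1 is O'Shaughnessy style, column 2 is Slaney style.
firstBands = [FcO(1:5); FcS(1:5)].'
```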

Output Arguments

`filterBank` — Auditory filter bank
column vector | matrix

Auditory filter bank, returned as an M-by-N matrix, where M is the number of bands (NumBands), and N is the number of frequency points of a one-sided spectrum (floor(FFTLength/2)+1, which is 513 for the default FFTLength of 1024).

Data Types: double

`Fc` — Center frequencies of bandpass filters (Hz)
row vector

Center frequencies of bandpass filters in Hz, returned as a row vector with NumBands elements.

Data Types: double

`BW` — Bandwidth of bandpass filters (Hz)
row vector

Bandwidth of bandpass filters in Hz, returned as a row vector with NumBands elements.

Data Types: double

Algorithms

The mel filter bank is designed as half-overlapped triangles equally spaced on the mel scale. The mel scale can be in the O'Shaughnessy style, which follows [1], or the Slaney style, which follows [5].

The Bark filter bank is designed as half-overlapped triangles equally spaced on the Bark scale [2].

The ERB filter bank is designed as gammatone filters [4] whose center frequencies are equally spaced on the ERB scale [3].
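
The equal spacing on the mel scale can be illustrated with a short sketch that compares the spacing of the center frequencies in Hz and in mel. The anonymous function hz2melAssumed is a hypothetical helper introduced here; it uses the commonly cited mapping 2595*log10(1 + f/700), which is assumed to correspond to the O'Shaughnessy style.

```matlab
fs = 16e3;
[~,Fc] = designAuditoryFilterBank(fs);   % default: mel scale, 32 bands

% Hypothetical helper: assumed O'Shaughnessy-style mapping from Hz to mel.
hz2melAssumed = @(f) 2595*log10(1 + f/700);

% Spacing between neighboring center frequencies in Hz and in mel.
spacingHz  = diff(Fc);
spacingMel = diff(hz2melAssumed(Fc));

% In Hz the spacing grows with frequency. In mel it should stay close
% to constant if the assumed mapping matches the design.
rangeOfSpacingHz  = [min(spacingHz) max(spacingHz)]
rangeOfSpacingMel = [min(spacingMel) max(spacingMel)]
```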

References

[1] O'Shaughnessy, Douglas. Speech Communication: Human and Machine. Reading, MA: Addison-Wesley Publishing Company, 1987.

[2] Traunmüller, Hartmut. "Analytical Expressions for the Tonotopic Sensory Scale." Journal of the Acoustical Society of America. Vol. 88, Issue 1, 1990, pp. 97–100.

[3] Glasberg, Brian R., and Brian C. J. Moore. "Derivation of Auditory Filter Shapes from Notched-Noise Data." Hearing Research. Vol. 47, Issues 1–2, 1990, pp. 103–138.

[4] Slaney, Malcolm. "An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank." Apple Computer Technical Report 35, 1993.

[5] Slaney, Malcolm. "Auditory Toolbox: A MATLAB Toolbox for Auditory Modeling Work." Technical Report, Version 2, Interval Research Corporation, 1998.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Version History

Introduced in R2019b

R2023b: Support for Slaney-style mel scale

Set the MelStyle name-value argument to "slaney" to use the Slaney-style mel scale.

R2020b: `designAuditoryFilterBank` scaling changed for ERB filter banks

The one-sided ERB filter bank returned from designAuditoryFilterBank is now scaled by 2. This change provides consistent results when applying one-sided or two-sided filtering, without requiring multiplications in the processing loop.
