How to filter breath noise in audio?

wei sun on 12 Jul 2022
Commented: Mathieu NOE on 15 Jul 2022
Attached are the original audio files and the MATLAB filter files I used. I tried low-pass and band-pass filtering, but neither had an obvious effect. The noise is mainly the sound of heavy breathing. How can I filter out this breathing sound while fully preserving the speech (Chinese or English)?
  5 Comments
Jonas on 13 Jul 2022
Do you want to remove it only in this recording, or do you want to do this automatically for multiple files?
wei sun on 13 Jul 2022
Remove or attenuate this noise.


Accepted Answer

Mathieu NOE on 13 Jul 2022
Edited: Mathieu NOE on 13 Jul 2022
Hello,
I opted for a strategy based on the spectrogram content. I noticed that the "breathing" sections are characterized by strong spectrogram output below 100 Hz (red dots), which is not the case for the "speaking" sections.
I worked on channel 1 because channel 2 is clipped (distorted).
So I simply reduced the volume (here by 30 dB) for the segments that run from the local minimum just before each red dot to the local minimum just after it.
(You can also set those segments directly to zero if you prefer; see the options in the code.)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% spectrogram dB scale
spectrogram_dB_scale = 80; % dB range scale (means , the lowest displayed level is XX dB below the max level)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% load signal
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[signal,Fs] = audioread('original.wav');
dt = 1/Fs;
[samples,channels] = size(signal);
% select channel (if needed)
channels = 1;
signal = signal(:,channels);
signal_filtered = signal;
% time vector
time = (0:samples-1)*dt;
%% decimate (if needed)
% NB : decim = 1 will do nothing (output = input)
decim = 40;
if decim>1
signal_decim = decimate(signal,decim);
Fs_decim = Fs/decim;
else
signal_decim = signal; % no decimation requested : keep the input as is
Fs_decim = Fs;
end
samples_decim = length(signal_decim);
time_decim = (0:samples_decim-1)*1/Fs_decim;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% FFT parameters
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
NFFT = 512; %
OVERLAP = 0.75;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% display : time / frequency analysis : spectrogram
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[sg,fsg,tsg] = specgram(signal_decim,NFFT,Fs_decim,hanning(NFFT),floor(NFFT*OVERLAP));
% FFT normalisation and conversion amplitude from linear to dB (peak)
sg_dBpeak = 20*log10(abs(sg))+20*log10(2/length(fsg)); % NB : X=fft(x.*hanning(N))*4/N; % hanning only
% saturation of the dB range :
min_disp_dB = round(max(max(sg_dBpeak))) - spectrogram_dB_scale;
sg_dBpeak(sg_dBpeak<min_disp_dB) = min_disp_dB;
% plots spectrogram
figure(2);
imagesc(tsg,fsg,sg_dBpeak);colormap('jet');
axis('xy');colorbar('vert');grid on
df = fsg(2)-fsg(1); % freq resolution
title(['Spectrogram / Fs = ' num2str(Fs_decim) ' Hz / Delta f = ' num2str(df,3) ' Hz ']); % the spectrogram is computed on the decimated signal
xlabel('Time (s)');ylabel('Frequency (Hz)');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% extract SG (dB) values from 0 to 100 Hz (a loud level in this frequency range
% indicates breath sound)
ind = find(fsg<=100);
fsg_breath = fsg(ind);
sg_dB_breath = sg_dBpeak(ind,:);
max_dB = max(sg_dB_breath,[],1);
max_dB = max_dB-min(max_dB); % shift the dB values so the minimum is 0 (non-negative values for islocalmax)
% select peaks with a prominence of at least 25 dB and the neighboring local mins
% find local maxima
[tf, P] = islocalmax(max_dB,'MinProminence',25);
x_peak = tsg(tf);
y_peak = max_dB(tf);
% find local minima
[tm, P] = islocalmin(max_dB);
x_min = tsg(tm);
y_min = max_dB(tm);
figure(3);plot(tsg,max_dB,x_peak,y_peak,'dr',x_min,y_min,'dk');
title('Spectrogram max dB value vs Time');
xlabel('Time (s)');ylabel('Max dB value');
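% attenuate (or zero) the data between the local mins just before
% and after each high peak
for ck = 1:numel(x_peak)
% signed time distance from every local min to the current peak
dist = x_min - x_peak(ck);
% local min just before the peak (falls back to the signal start if there is none)
ind_bef = find(dist<0,1,'last');
if isempty(ind_bef), x_min_bef = time(1); else, x_min_bef = x_min(ind_bef); end
% local min just after the peak (falls back to the signal end if there is none)
ind_aft = find(dist>0,1,'first');
if isempty(ind_aft), x_min_aft = time(end); else, x_min_aft = x_min(ind_aft); end
% now attenuate the time signal between these two time values
ind = find(time>=x_min_bef & time<=x_min_aft);
% signal_filtered(ind) = 0; % option 1 : zero
signal_filtered(ind) = signal_filtered(ind)*10^(-30/20); % option 2 : 30 dB attenuation
end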
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% display : time domain plot
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
figure(1),
subplot(2,1,1),plot(time,signal,'b');grid on
title(['Time plot / Fs = ' num2str(Fs) ' Hz / raw data ']);
xlabel('Time (s)');ylabel('Amplitude');
subplot(2,1,2),plot(time,signal_filtered,'b');grid on
title(['Time plot / Fs = ' num2str(Fs) ' Hz / filtered data ']);
xlabel('Time (s)');ylabel('Amplitude');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% export signal
audiowrite('filtered.wav',signal_filtered,Fs); % audiowrite(filename,y,Fs,varargin)
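As a side note on the attenuation option: a level change in dB corresponds to a linear amplitude factor of 10^(dB/20), so a plain division by 30 would give about 29.5 dB rather than exactly 30 dB. The following is only a minimal sketch of that conversion; the variable `atten_dB`, the synthetic 50 Hz tone, and the segment limits are assumptions for illustration and are not taken from the original file.

```matlab
atten_dB = 30;                      % desired attenuation in dB (assumed value)
gain = 10^(-atten_dB/20);           % equivalent linear amplitude factor

Fs = 8000;                          % sample rate of the synthetic test tone (assumed)
t = (0:Fs-1)'/Fs;                   % 1 s time vector
x = sin(2*pi*50*t);                 % 50 Hz tone standing in for a breath segment
y = x;
seg = (t >= 0.25 & t <= 0.75);      % logical mask of the segment to attenuate
y(seg) = y(seg)*gain;               % apply the gain only inside the segment

% level change measured inside the segment, in dB
level_change_dB = 20*log10(max(abs(y(seg)))/max(abs(x(seg))));
fprintf('gain = %.4f, level change = %.1f dB\n', gain, level_change_dB);
fprintf('dividing by 30 instead gives %.1f dB\n', 20*log10(1/30));
```

The same `gain` factor can be used wherever a segment is scaled, which keeps the stated dB value and the applied scaling consistent.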
  8 Comments
wei sun on 15 Jul 2022
OK, thank you, I have learned something. The FFT of the entire segment does take up a lot of computing power, and it introduces a lot of irrelevant information.
Mathieu NOE on 15 Jul 2022
The saving in computation is proportional to the applied decimation factor (here 40), so I don't think it's negligible, especially if you want to apply the code to longer wav files.
But of course you can remove the decimation step if you are not comfortable with it.
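To make the effect of the decimation concrete, here is a rough sketch that compares sample and spectrogram frame counts before and after decimating by 40. It is not the code from the answer: the 44.1 kHz sample rate and the noise input are assumptions (the real file's rate is not stated in the thread), and the decimation is split into two stages (8, then 5) because the MATLAB documentation for `decimate` suggests dividing factors greater than 13 into smaller ones. It requires the Signal Processing Toolbox.

```matlab
Fs = 44100;                         % assumed sample rate (not stated in the thread)
x = randn(10*Fs,1);                 % 10 s of noise as a stand-in signal

% decimate by 40 in two stages (8 then 5) instead of a single call
x_d = decimate(decimate(x,8),5);
Fs_d = Fs/40;                       % sample rate after decimation

NFFT = 512;
OVERLAP = 0.75;
noverlap = floor(NFFT*OVERLAP);
hop = NFFT - noverlap;              % samples between successive frames

% number of spectrogram frames for each signal length
frames_full  = fix((numel(x)   - noverlap)/hop);
frames_decim = fix((numel(x_d) - noverlap)/hop);

fprintf('samples: %d -> %d\n', numel(x), numel(x_d));
fprintf('frames : %d -> %d\n', frames_full, frames_decim);
fprintf('bin width: %.1f Hz -> %.2f Hz\n', Fs/NFFT, Fs_d/NFFT);
```

With the same NFFT, decimation also narrows the frequency bins: under the assumed 44.1 kHz rate the bin width drops from about 86 Hz to about 2.2 Hz, so the band below 100 Hz is covered by many bins instead of about two. The actual figures depend on the real sample rate of the file.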

Release

R2021b
