Advice for Audio classifier based on Voice Activity Detection


lsteng on 22 May 2015


Commented: Md Sahidullah on 4 Jun 2015

I am writing a program to classify recorded phone-call audio files (WAV) into two classes: Voice (the file contains at least some human speech) and Non-Voice (only DTMF tones, dial tones, ringtones, or noise). I tried implementing a simple VAD (voice activity detector) using the ZCR (zero-crossing rate) and short-time energy, but with these features, DTMF and dial-tone files get confused with voice files.
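For reference, the two features mentioned above can be computed per frame roughly as follows. This is only a minimal sketch in plain Python: the frame length (30 ms), hop (10 ms), the 8 kHz sample rate, and the helper names `frame_features` and `tone` are assumptions for illustration, not part of the original program. The synthetic signal is the DTMF digit "1" (697 Hz + 1209 Hz).

```python
import math

def frame_features(samples, frame_len=240, hop=80):
    """Return one (energy, zcr) pair per frame of `samples`."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples in the frame.
        energy = sum(s * s for s in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def tone(freqs, sr=8000, dur=0.5, amp=0.5):
    """Hypothetical test signal: an equal mix of sinusoids at `freqs` (Hz)."""
    n = int(sr * dur)
    return [
        amp * sum(math.sin(2 * math.pi * f * i / sr + 0.1) for f in freqs) / len(freqs)
        for i in range(n)
    ]

# DTMF digit "1" is the sum of a 697 Hz and a 1209 Hz tone.
dtmf_1 = tone([697, 1209])
feats = frame_features(dtmf_1)
for energy, zcr in feats[:3]:
    print(f"energy={energy:.4f}  zcr={zcr:.3f}")
```

A steady tone like this has non-trivial energy and a moderate ZCR in every frame, so fixed thresholds on these two features alone will tend to mark it as active, which is consistent with the confusion described above.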

I also tried a machine-learning approach using an SVM (support vector machine) with MFCC features. The results were worse than with the previous approach.

I need some advice on this domain; I have no previous experience in machine learning or AI. I am willing to put a good amount of time into this.

I am comfortable working in MATLAB and Python (SciPy, NumPy, scikit-learn).

Thank you

1 Comment

Md Sahidullah on 4 Jun 2015

Hi! You can try an unsupervised technique. For speech/non-speech discrimination, I have found bi-Gaussian modeling to be very effective, especially in noisy environments, for speaker recognition.

http://arxiv.org/abs/1210.0297
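As a rough illustration of the idea, here is a minimal sketch that assumes "bi-Gaussian modeling" means fitting a two-component Gaussian mixture to per-frame log-energies and treating the higher-mean component as active. It is not necessarily the exact method of the linked paper. The function names, the initialisation, and the synthetic log-energy values are all assumptions made for the example.

```python
import math
import random

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(values, iters=50):
    """Fit a 2-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances), each a list of length 2.
    """
    lo, hi = min(values), max(values)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    var0 = max((hi - lo) ** 2 / 16, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * gauss_pdf(v, means[k], variances[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances

def label_frames(values, weights, means, variances):
    """Mark a frame True when the higher-mean component is the more likely one."""
    hi = 0 if means[0] > means[1] else 1
    lo = 1 - hi
    return [
        weights[hi] * gauss_pdf(v, means[hi], variances[hi])
        > weights[lo] * gauss_pdf(v, means[lo], variances[lo])
        for v in values
    ]

# Synthetic per-frame log-energies: 200 quiet frames and 100 loud frames.
random.seed(0)
log_e = [random.gauss(-8.0, 0.5) for _ in range(200)]
log_e += [random.gauss(-2.0, 0.7) for _ in range(100)]

weights, means, variances = fit_two_gaussians(log_e)
active = label_frames(log_e, weights, means, variances)
print("component means:", [round(m, 2) for m in means])
print("frames marked active:", sum(active))
```

Because the split adapts to each recording, no fixed energy threshold is needed. Note that with energy as the only feature this separates loud frames from quiet ones, so a loud DTMF tone would still fall in the active component; a richer per-frame feature would be needed to tell tones from speech.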

You can also try different clustering approaches, with MFCCs as the front end, to classify your audio segments.
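A clustering step of this kind could look roughly like the sketch below, which uses plain k-means as one possible choice. The feature vectors here are synthetic stand-ins: in practice each would be an MFCC vector for one segment, computed with whatever library you prefer (not shown). The function name `kmeans` and all parameter values are assumptions for illustration.

```python
import random

def kmeans(points, k=2, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length vectors.

    Returns (centers, assign), where assign[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign

# Synthetic stand-ins for per-segment feature vectors (two well-separated groups).
rng = random.Random(1)
group_a = [[rng.gauss(0.0, 0.3) for _ in range(3)] for _ in range(50)]
group_b = [[rng.gauss(5.0, 0.3) for _ in range(3)] for _ in range(50)]
features = group_a + group_b

centers, assign = kmeans(features, k=2)
print("cluster sizes:", assign.count(0), assign.count(1))
```

The cluster indices carry no meaning by themselves, so you would still need to inspect a few segments from each cluster to decide which one corresponds to voice.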

Hope it helps. Thanks, Sahid
