Advice for Audio classifier based on Voice Activity Detection

I am writing a program to classify recorded phone-call audio files (WAV) as either Voice (containing at least some human voice) or Non-Voice (only DTMF, dial tones, ringtones, or noise). I tried implementing a simple VAD (voice activity detector) using ZCR (zero-crossing rate) and energy, but these features confuse DTMF and dial-tone files with voice files.
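For readers unfamiliar with the two features mentioned above, here is a minimal sketch of per-frame energy and ZCR. It uses only the standard library so it runs anywhere (NumPy would vectorize it). The function name `frame_features`, the 25 ms / 10 ms framing, and the 8 kHz sample rate are assumptions for illustration, not details from the original program.

```python
import math

def frame_features(samples, frame_len=200, hop=80):
    """Return one (energy, zcr) pair per frame.

    The defaults are an assumed 25 ms window and 10 ms hop at 8 kHz.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

# Synthetic input: one second of a pure 1 kHz tone at an assumed 8 kHz rate.
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr)]
features = frame_features(tone)
```

A steady tone like this gives almost the same energy and ZCR in every frame, while speech usually varies from frame to frame. Whether that variation is enough to separate the recordings would have to be checked on real data.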
I also tried a machine-learning approach using an SVM (support vector machine) on MFCC features. The results were worse than with the previous approach.
I need someone to advise me a little on this domain; I have no previous experience in machine learning or AI. I am willing to put a good amount of time into it.
I am comfortable working in MATLAB and in Python with SciPy, NumPy, and scikit-learn.
Thank you
1 Comment
Md Sahidullah on 4 Jun 2015
Hi! You could try an unsupervised technique. For speech/non-speech discrimination, I have found bi-Gaussian modeling to be very effective, especially in noisy environments for speaker recognition.
You could also try different clustering approaches, with MFCCs as the front end, to classify your audio segments.
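A rough sketch of the bi-Gaussian idea, under one common reading of it: fit a two-component Gaussian mixture to a single per-frame feature (here, log energy) and assign each frame to the more likely component. This may not be exactly the method meant in the comment. The helper names, the synthetic data, and the choice of log energy are all assumptions. It uses plain EM with the standard library only; scikit-learn's `GaussianMixture` does the same job in practice.

```python
import math
import random

def weighted_pdfs(x, weights, means, variances):
    """Weighted Gaussian density of x under each of the two components."""
    return [
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]

def fit_two_gaussians(values, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    lo, hi = min(values), max(values)
    # Start the two means at the quarter points of the observed range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    variances = [((hi - lo) / 4) ** 2 + 1e-6] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = weighted_pdfs(x, weights, means, variances)
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            variances[k] = (
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
                + 1e-6  # small floor so a variance can never collapse to zero
            )
    return weights, means, variances

def high_component_mask(values, weights, means, variances):
    """True where a value is more likely under the higher-mean component."""
    hi_k = 0 if means[0] > means[1] else 1
    mask = []
    for x in values:
        p = weighted_pdfs(x, weights, means, variances)
        mask.append(p[hi_k] > p[1 - hi_k])
    return mask

# Synthetic stand-in for per-frame log energies (not real call data):
# 300 "background" frames around -8 and 200 louder frames around -2.
random.seed(0)
log_energy = [random.gauss(-8.0, 1.0) for _ in range(300)]
log_energy += [random.gauss(-2.0, 1.0) for _ in range(200)]

weights, means, variances = fit_two_gaussians(log_energy)
mask = high_component_mask(log_energy, weights, means, variances)
```

Note that on energy alone, loud tones such as DTMF would also land in the high-energy component, so for the problem in the question the choice of feature matters more than the mixture fit itself.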
Hope it helps. Thanks, Sahid

Answers (0)