Audio Processing Using Deep Learning

Extend deep learning workflows with audio and speech processing applications

Apply deep learning to audio and speech processing applications by using Deep Learning Toolbox™ together with Audio Toolbox™.

Apps

Audio Labeler

Define and visualize ground-truth labels

Functions

audioDatastore

Datastore for collection of audio files

Topics

Introduction to Deep Learning for Audio Applications (Audio Toolbox)

Learn common tools and workflows to apply deep learning to audio applications.

Classify Sound Using Deep Learning (Audio Toolbox)

Train, validate, and test a simple long short-term memory (LSTM) network to classify sounds.

Speech Command Recognition Using Deep Learning

This example shows how to train a simple deep learning model that detects the presence of speech commands in audio.

Denoise Speech Using Deep Learning Networks

This example shows how to denoise speech signals using deep learning networks.

Classify Gender Using LSTM Networks

This example shows how to classify the gender of a speaker using deep learning.

Voice Activity Detection in Noise Using Deep Learning

This example shows how to detect regions of speech in a low signal-to-noise environment using deep learning.

Spoken Digit Recognition with Wavelet Scattering and Deep Learning

This example shows how to classify spoken digits using two approaches: wavelet time scattering paired with a support vector machine, and a deep convolutional network based on mel-frequency spectrograms.

Cocktail Party Source Separation Using Deep Learning Networks

This example shows how to isolate a speech signal using a deep learning network.

Sequential Feature Selection for Speech Emotion Recognition

This example shows a typical workflow for feature selection applied to the task of speech emotion recognition.

Keyword Spotting in Noise Using MFCC and LSTM Networks

This example shows how to identify a keyword in noisy speech using a deep learning network.

Acoustic Scene Recognition Using Late Fusion

This example shows how to create a multi-model late fusion system for acoustic scene recognition.
