Segmentation

Detect and isolate speech and other sounds

Detect speech and other sounds and locate their start and end times. For streaming applications, use a voice activity detector (VAD) to output the probability that speech is present in a given frame. You can also use speech2text to create time-aligned word labels for speech signals.
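The two workflows described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the file name is an assumption (any mono speech recording on the MATLAB path would do), and the 20 ms frame length is an arbitrary choice made for this example.

```matlab
% Read a mono speech recording. The file name is an assumption;
% substitute any mono speech file available on your path.
[audioIn, fs] = audioread("Counting-16-44p1-mono-15secs.wav");

% Offline: each row of idx holds the first and last sample index
% of one detected speech region.
idx = detectSpeech(audioIn, fs);

% Convert sample indices to start and end times in seconds.
regionTimes = (idx - 1) / fs;

% Isolate the first detected region, if one was found.
if ~isempty(idx)
    firstRegion = audioIn(idx(1,1):idx(1,2));
end

% Streaming: feed fixed-length frames to a voice activity detector
% and collect the per-frame speech probability.
vad = voiceActivityDetector;
frameLength = round(0.02 * fs);   % 20 ms frames (illustrative choice)
numFrames = floor(numel(audioIn) / frameLength);
speechProbability = zeros(numFrames, 1);
for k = 1:numFrames
    frame = audioIn((k-1)*frameLength + (1:frameLength));
    speechProbability(k) = vad(frame);
end
```

In a real streaming application the frames would come from an audio device or network source rather than from a preloaded vector; the loop here only stands in for that frame-by-frame delivery.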

Apps

Signal Labeler - Label signal attributes, regions, and points of interest, and extract features

Objects

voiceActivityDetector - Detect presence of speech in audio signal

Functions

enhanceSpeech - Enhance speech signal (Since R2024a)
separateSpeakers - Separate signal by speakers (Since R2023b)
detectspeechnn - Detect boundaries of speech in audio signal using AI (Since R2023a)
detectSpeech - Detect boundaries of speech in audio signal (Since R2020a)
classifySound - Classify sounds in audio signal (Since R2020b)

Blocks

Voice Activity Detector - Detect presence of speech in audio signal