Type yamnetGraph at the Command Window. If the Audio Toolbox support for YAMNet is not installed, then the function provides a link to the download location. To download the model, click the link. Unzip the file to a location on the MATLAB path.

Alternatively, execute the following commands to download and unzip the YAMNet model to your temporary directory.

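% Download the YAMNet archive to the temporary directory.
downloadFolder = fullfile(tempdir,'YAMNetDownload');
loc = websave(downloadFolder,'https://ssd.mathworks.com/supportfiles/audio/yamnet.zip');
% Unzip the archive and add the extracted yamnet folder to the MATLAB path.
YAMNetLocation = tempdir;
unzip(loc,YAMNetLocation)
addpath(fullfile(YAMNetLocation,'yamnet'))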

Check that the installation is successful by typing yamnetGraph at the Command Window. If the network is installed, then the function returns a digraph object.

yamnetGraph

Identify Major Categories of Ontology

Create a digraph object that describes the AudioSet ontology.

ygraph = yamnetGraph

ygraph = 
  digraph with properties:

    Edges: [670×1 table]
    Nodes: [632×1 table]

Visualize the ontology. The ontology consists of 632 separate classes with 670 connections.

p = plot(ygraph);
layout(p,'layered')

Get the name of each sound class. If the sound class has no predecessors, identify it as a major category of the ontology.

nodeNames = ygraph.Nodes.Name;
topCategories = {};
for index = 1:numel(nodeNames)
    % A class with no incoming edges has no parent class in the ontology.
    pre = predecessors(ygraph,nodeNames{index});
    if isempty(pre)
        topCategories{end+1} = nodeNames{index};
    end
end
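
The loop above tests each node for predecessors one at a time. As an alternative sketch, the digraph function indegree counts the incoming edges of every node in one call, so the nodes with an in-degree of zero are the top-level categories. The toy graph and its node names below are hypothetical stand-ins for the AudioSet ontology, chosen so that the snippet runs without the YAMNet download.

```matlab
% Hypothetical toy ontology: each edge points from a parent class to a child class.
parents  = ["Animal" "Animal" "Pets" "Music"];
children = ["Pets" "Wild animals" "Dog" "Guitar"];
toyGraph = digraph(parents,children);

% Nodes with no incoming edges have no parent class.
isRoot = indegree(toyGraph) == 0;
toyTop = string(toyGraph.Nodes.Name(isRoot));
disp(toyTop)
```

In this toy graph, only "Animal" and "Music" have no parent, so they are the two entries in toyTop.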

Display the categories as an array of strings.

topCategories = string(topCategories)

topCategories = 1×7 string
    "Human sounds"    "Animal"    "Music"    "Natural sounds"    "Sounds of things"    "Source-ambiguous sounds"    "Channel, environment and background"

Highlight and label the top categories on the digraph plot.

highlight(p,topCategories,"NodeColor","red","MarkerSize",8)
labelnode(p,topCategories,topCategories)

Plot Subgraph of Animal Sounds

Create a digraph object that represents the AudioSet ontology.

ygraph = yamnetGraph;

Use dfsearch to perform a depth-first graph search to identify all audio classes under the class Animal.

animalNodes = dfsearch(ygraph,"Animal");

Use subgraph to create a new digraph object that includes only the identified audio classes. Plot the resulting directed graph.

animalGraph = subgraph(ygraph,animalNodes);

p = plot(animalGraph);

% Enlarge the node labels and triple the figure size so that the labels are readable.
p.NodeFontSize = 12;
graphFigure = gcf;
old = graphFigure.Position;
set(graphFigure,'Position',[old(1),old(2),old(3)*3,old(4)*3])
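
To see what dfsearch and subgraph select, it can help to run them on a small graph where the result is easy to check by hand. The following is a minimal sketch on a hypothetical toy ontology (the node names are illustrative and are not taken from the yamnetGraph output). Starting from "Animal", the search returns the start node and every class reachable from it, and the subgraph drops everything else.

```matlab
% Hypothetical toy ontology (not the real AudioSet graph).
parents  = ["Animal" "Animal" "Pets" "Music"];
children = ["Pets" "Wild animals" "Dog" "Guitar"];
toyGraph = digraph(parents,children);

% dfsearch returns the start node followed by every node reachable from it.
reachable = string(dfsearch(toyGraph,"Animal"));

% subgraph keeps only those nodes and the edges between them.
toyAnimal = subgraph(toyGraph,reachable);
fprintf("%d nodes, %d edges\n",numnodes(toyAnimal),numedges(toyAnimal))
```

The "Music" and "Guitar" nodes are unreachable from "Animal", so they do not appear in toyAnimal.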

Use predecessors to determine all predecessors of the Growling sound class.

preIDs = predecessors(animalGraph,"Growling")

preIDs = 4×1 string
    "Dog"
    "Cat"
    "Roaring cats (lions, tigers)"
    "Canidae, dogs, wolves"

Use highlight to mark the Growling node in green and its predecessors in red.

highlight(p,"Growling",'NodeColor','g','MarkerSize',8)
highlight(p,preIDs,'NodeColor','r','MarkerSize',8)
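
Note that predecessors returns only the direct parents of a node, and a class in the ontology can have more than one parent. The sketch below illustrates the difference between direct parents and all ancestors on a hypothetical toy graph. Using nearest with the node count as the distance limit is one possible way to collect every ancestor. It is an illustration, not part of the original example.

```matlab
% Hypothetical toy ontology in which "Growling" has two parent classes.
parents  = ["Pets" "Pets" "Dog" "Cat"];
children = ["Dog" "Cat" "Growling" "Growling"];
toyGraph = digraph(parents,children);

% Direct parents only: nodes with an edge pointing to "Growling".
directParents = string(predecessors(toyGraph,"Growling"));

% All ancestors: every node that can reach "Growling" along incoming edges.
% The node count is a safe upper bound on the path length.
maxDistance = numnodes(toyGraph);
allAncestors = string(nearest(toyGraph,"Growling",maxDistance,'Direction','incoming'));
```

Here directParents contains "Dog" and "Cat", while allAncestors also includes "Pets".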

Visualize Sounds Supported by YAMNet

Create a digraph object that describes the AudioSet ontology. Also return the classes supported by YAMNet. Plot the directed graph.

[ygraph,classes] = yamnetGraph;
p = plot(ygraph);
layout(p,'layered')

YAMNet predicts a subset of the full AudioSet ontology. Display the sound classes that are in the AudioSet ontology but are not possible outputs from the YAMNet network.

audiosetClasses = ygraph.Nodes.Name;
classDiff = setdiff(audiosetClasses,classes)

classDiff = 111×1 string
    "Acoustic environment"
    "Alto saxophone"
    "Background noise"
    "Bass (frequency range)"
    "Bass (instrument role)"
    "Bassline"
    "Bassoon"
    "Battle cry"
    "Bay"
    "Beat"
    "Birthday music"
    "Blare"
    "Booing"
    "Brief tone"
    "Bugle"
    "Cat communication"
    "Cellphone buzz, vibrating alert"
    "Channel, environment and background"
    "Chipmunk"
    "Chord"
    "Clavinet"
    "Clunk"
    "Compact disc"
    "Cornet"
    "Crash cymbal"
    "Cumbia"
    "Deformable shell"
    "Digestive"
    "Domestic sounds, home sounds"
    "Donkey, ass"
      ⋮

Highlight the classes that are not possible outputs from YAMNet.

highlight(p,classDiff,'NodeColor','r')
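
The comparison above relies on setdiff, which returns the sorted, unique values that appear in the first input but not in the second. That is why the classDiff listing is alphabetical. A minimal sketch with hypothetical class lists (stand-ins for the ontology nodes and the YAMNet outputs) shows the behavior, along with ismember as a way to get the same information as a logical mask:

```matlab
% Hypothetical class lists standing in for the ontology and the YAMNet outputs.
allClasses       = ["Dog"; "Chord"; "Cat"; "Bay"];
supportedClasses = ["Cat"; "Dog"];

% setdiff returns the sorted, unique values of the first input
% that are missing from the second.
unsupported = setdiff(allClasses,supportedClasses);

% ismember reports, in the original order, which classes are supported.
isSupported = ismember(allClasses,supportedClasses);
```

The mask form is convenient when you need to keep the original node order, for example to index into a table of node properties.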

Analyze one of the major categories.

categoryToAnalyze = "Channel, environment and background";
subsetNodes = dfsearch(ygraph,categoryToAnalyze);
ygraphSubset = subgraph(ygraph,subsetNodes);
classToHighlight = intersect(classDiff,ygraphSubset.Nodes.Name);
pSub = plot(ygraphSubset);
layout(pSub,'layered')
highlight(pSub,classToHighlight,'NodeColor','r')

Visualize Specificity of Sound Classes

Create a digraph object that describes the AudioSet ontology.

ygraph = yamnetGraph;

Specify a sound class to visualize, along with how many levels of predecessors and successors to include. The available sound classes are only those that YAMNet supports as outputs. The nearest function treats each number as a maximum distance, in edges, from the sound class, so if you request more levels than the ontology contains, only the predecessors and successors that exist are shown.

soundClass = "Growling";
numPredecessors = 3;
numSuccessors = 0;

pred = nearest(ygraph,soundClass,numPredecessors,'Direction','incoming');
suc = nearest(ygraph,soundClass,numSuccessors,'Direction','outgoing');
subClasses = [soundClass;pred;suc];

ygraphSub = subgraph(ygraph,unique(subClasses));
p = plot(ygraphSub);
layout(p,'layered')
highlight(p,soundClass,'Marker','d','NodeColor','red','MarkerSize',6)
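
The distance argument of nearest can be checked on a small graph. The following sketch uses a hypothetical chain of classes (the names are illustrative, not the real ontology paths) to show that a distance of 2 returns the parent and grandparent, and that a distance larger than the depth of the chain returns only the ancestors that exist.

```matlab
% Hypothetical toy chain of classes, from most general to most specific.
chain = ["Sounds" "Animal" "Pets" "Dog" "Growling"];
toyGraph = digraph(chain(1:end-1),chain(2:end));

% Within two incoming edges of "Growling": its parent and grandparent.
twoLevels = string(nearest(toyGraph,"Growling",2,'Direction','incoming'));

% Asking for more levels than exist returns only the ancestors that are present.
tenLevels = string(nearest(toyGraph,"Growling",10,'Direction','incoming'));

fprintf("%d nodes within 2 levels, %d nodes within 10 levels\n", ...
    numel(twoLevels),numel(tenLevels))
```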

Output Arguments

`ygraph` — AudioSet ontology graph with directed edges
`digraph` object

AudioSet ontology graph with directed edges, returned as a digraph object.

`classes` — Classes supported by YAMNet
string array

Classes supported by YAMNet, returned as a string array. The classes supported by YAMNet are a subset of the AudioSet ontology.

Tips

Google® provides a website where you can explore the AudioSet ontology and the corresponding data set: https://research.google.com/audioset/ontology/index.html.

Version History

Introduced in R2020b
