Figure 2. UMAP ordination of the WMD dataset with samples coloured
according to three large taxonomic groups (Mysticete, Odontocete, and
Pinniped). Pinniped sample points were plotted at double size to improve
visualisation.
Within the Mysticete group, only three families contained enough
samples to be considered for further analysis: Balaenopteridae,
Balaenidae, and Eschrichtiidae. In the subsequent UMAP
ordination, Balaenidae samples overlapped almost completely
with Balaenopteridae vocalisations, close to the plot centre
(Fig. 3). Eschrichtiidae samples, the least represented label
(i.e., the minority label) for the Mysticete group, clustered in four
distinct areas of the UMAP plot.
The Odontocete group was dominated by the Physeteridae family, which
represented the majority label for the subset, followed by
Delphinidae and Monodontidae (Fig. 4). Phocoenidae vocalisations
were the minority label and, similarly to Eschrichtiidae, samples
belonging to this family formed small clusters scattered across the
UMAP plot area.