Abstract
Computational bioacoustics is a relatively young research area, yet it has received increasing attention over the last decade because it can be used in a wide range of applications in a cost-effective manner. This work focuses on the problem of detecting novel bird calls and songs associated with various species and individual birds. To this end, variational autoencoders, consisting of deep encoding-decoding networks, are employed. The encoder comprises a series of convolutional layers leading to a smooth high-level abstraction of the log-Mel spectrograms that characterise bird vocalisations. The decoder operates on this latent representation to reconstruct the respective original observation. Novel species/individual detection is carried out by monitoring and thresholding the expected reconstruction probability. We thoroughly evaluate the proposed method on two different data sets, including the vocalisations of 11 North American bird species and 16 Athene noctua individuals.
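The thresholding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal-Gaussian decoder, works with log-probabilities rather than probabilities, and treats `encode` and `decode` as hypothetical stand-ins for the trained networks. The names `reconstruction_log_prob`, `fit_threshold`, and `is_novel`, as well as the percentile rule for choosing the threshold, are illustrative assumptions.

```python
import numpy as np


def reconstruction_log_prob(x, encode, decode, n_samples=16, rng=None):
    """Monte Carlo estimate of E_{z ~ q(z|x)}[log p(x|z)] for one input x.

    Assumed interfaces (hypothetical, not from the paper):
      encode(x) -> (mu_z, log_var_z)   approximate posterior parameters
      decode(z) -> (mu_x, log_var_x)   diagonal-Gaussian output parameters
    """
    rng = np.random.default_rng() if rng is None else rng
    mu_z, log_var_z = encode(x)
    total = 0.0
    for _ in range(n_samples):
        # Reparameterised sample from the approximate posterior.
        z = mu_z + np.exp(0.5 * log_var_z) * rng.standard_normal(mu_z.shape)
        mu_x, log_var_x = decode(z)
        # Diagonal-Gaussian log-likelihood of x under the decoder output.
        total += -0.5 * np.sum(
            np.log(2.0 * np.pi) + log_var_x + (x - mu_x) ** 2 / np.exp(log_var_x)
        )
    return total / n_samples


def fit_threshold(known_scores, percentile=5.0):
    """Choose a threshold so that roughly `percentile`% of known data is flagged."""
    return float(np.percentile(known_scores, percentile))


def is_novel(score, threshold):
    """Flag an observation whose reconstruction score falls below the threshold."""
    return score < threshold
```

In this sketch, scores would be computed on held-out vocalisations of known species or individuals to fit the threshold, and any new observation scoring below it would be reported as novel.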
Funding
This work was carried out within the project automatIc aNalySis of comPlex evolvIng auditoRy scEnes (INSPIRE), funded by the Piano Sostegno alla Ricerca of the University of Milan.