Simple Science

Cutting edge science explained simply

# Quantitative Biology / Quantitative Methods

Understanding Bird Vocalizations Through Sound Analysis

A new method helps estimate bird vocal repertoires using sound analysis.

― 6 min read


Bird Sound Analysis (Methodology): new techniques for analyzing bird vocalizations, unveiled.

Birds make different sounds that create a collection of vocalizations known as their vocal repertoire. Knowing how big this repertoire is can help us learn about their brain size, the area they claim as their own, and how they interact with each other. However, figuring out how many unique sounds a bird can make can be tricky because it involves analyzing a lot of sound recordings, which can be hard to gather and understand.

To start, researchers take recordings of bird songs, break them down into smaller parts called syllables, and organize those syllables into groups. There are two main ways to do this. One is to simply count each unique sound until no new sounds appear. The other uses specific algorithms to make the process more reliable and reproducible. This paper introduces a method that automatically measures the differences between bird song syllables, making it possible to group them without prior knowledge of what they are.

Introduction to Bird Vocalizations

Bird sounds can be grouped into five main types: elements, syllables, phrases, calls, and songs. Elements are the smallest units of sound, while syllables can consist of one or more elements. Syllables usually last a few hundred milliseconds. Phrases are short combinations of syllables, and calls are short sequences of phrases. Songs are the long, complex vocalizations that we often associate with bird singing.

Take the song of the European Greenfinch as an example. This bird's song lasts more than a minute and is made up of various sounds, including tremolos, repeating tonal units, and specific nasal sounds. Researchers have identified four main types of phrases in its song, including a trill and a nasal "tswee," a sound that occurs roughly 10% of the time. Some believe this "tswee" is innate, suggesting it is part of the bird's genetic heritage.

The size of a bird's repertoire can also be estimated by counting phrases. In the greenfinch's song, many phrases are repeated at fairly regular intervals of roughly half a second. This regularity leads researchers to consider machine learning techniques for building a system that can estimate the greenfinch's vocal repertoire.

Gathering Bird Sounds

To find the size of a bird's vocal repertoire, researchers begin by recording bird sounds, usually in their natural habitats. These recordings capture not just the target bird's sounds but also noises from other animals, people, and environmental factors like wind and water. Filtering out these extraneous sounds is essential to isolate the songs for analysis.

Wavelet analysis, which examines a signal at multiple scales, is useful for filtering out noise. In this study, a specific family of wavelets called Daubechies wavelets is used to build high-pass filters that remove background noise.
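The paper does not spell out the exact filter design, but a minimal sketch of this denoising step, assuming the PyWavelets library and illustrative choices for the wavelet order, decomposition level, and threshold rule, could look like this:

```python
# Sketch: wavelet-based high-pass filtering with a Daubechies wavelet.
# The wavelet order, decomposition level and thresholding rule are
# illustrative assumptions, not values taken from the paper.
import numpy as np
import pywt
import soundfile as sf  # one convenient choice for reading recordings

signal, sr = sf.read("greenfinch_recording.wav")  # hypothetical file name
if signal.ndim > 1:                               # mix down to mono if needed
    signal = signal.mean(axis=1)

# Multi-level decomposition: coeffs = [approximation, detail_n, ..., detail_1]
level = 6
coeffs = pywt.wavedec(signal, "db8", level=level)

# High-pass behaviour: drop the low-frequency approximation band entirely,
# which removes rumble such as wind and distant traffic.
coeffs[0] = np.zeros_like(coeffs[0])

# Light soft-thresholding of the detail bands to suppress broadband hiss.
for i in range(1, len(coeffs)):
    sigma = np.median(np.abs(coeffs[i])) / 0.6745
    threshold = sigma * np.sqrt(2 * np.log(len(signal)))
    coeffs[i] = pywt.threshold(coeffs[i], threshold, mode="soft")

filtered = pywt.waverec(coeffs, "db8")[: len(signal)]
```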

Segmentation is the next step: breaking the audio into meaningful parts. Unlike fixed-window methods such as the short-time Fourier transform, wavelet analysis does not force a single window size on the signal, so less information is lost at window boundaries and discontinuities are less of a concern. The process then uses energy detection to find the segments of the recording that contain bird song.
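A rough sketch of such energy-based segmentation, reusing the filtered signal from the previous sketch, with frame length, threshold, and minimum-duration values chosen purely for illustration:

```python
# Sketch: energy-based segmentation of the filtered recording into candidate
# syllables. Frame length, hop and the energy threshold are illustrative
# choices, not values taken from the paper.
import numpy as np

def segment_by_energy(signal, sr, frame_ms=10, threshold_ratio=0.1, min_dur_ms=30):
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)

    energy = (frames ** 2).sum(axis=1)           # short-time energy per frame
    active = energy > threshold_ratio * energy.max()

    # Collect runs of consecutive active frames as (start, end) sample indices.
    segments, start = [], None
    for i, flag in enumerate(np.append(active, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            dur_ms = (i - start) * frame_ms
            if dur_ms >= min_dur_ms:             # discard very short blips
                segments.append((start * frame_len, i * frame_len))
            start = None
    return segments

# `filtered` and `sr` come from the wavelet filtering sketch above.
syllable_bounds = segment_by_energy(filtered, sr)
```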

Extracting Features from the Sounds

After segmenting the audio, the next step is feature extraction: converting the audio data into a simplified representation called a feature vector, which is then used to classify the sounds. The study analyzes short time frames of the audio to extract various characteristics, including energy, syllable duration, and frequency-related properties.

Some features taken from the songs include:

  • Energy (how loud the sound is)
  • Zero Crossing Rate (how often the sound changes from positive to negative)
  • Duration of the Syllable (how long each syllable lasts)
  • Spectral characteristics such as bandwidth and centroid.

Mel-Frequency Cepstral Coefficients (MFCCs) are also used, since they are effective for analyzing tonal, musical sounds.
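One possible sketch of this step, assuming librosa as the analysis library (the paper does not prescribe one) and averaging each frame-wise feature over the whole syllable:

```python
# Sketch: turning each segmented syllable into a fixed-length feature vector.
# The exact feature set and the averaging scheme are assumptions made for
# illustration.
import numpy as np
import librosa

def syllable_features(y, sr, n_mfcc=13):
    duration = len(y) / sr                                  # syllable length in seconds
    energy = float(np.mean(librosa.feature.rms(y=y)))       # average loudness
    zcr = float(np.mean(librosa.feature.zero_crossing_rate(y)))
    centroid = float(np.mean(librosa.feature.spectral_centroid(y=y, sr=sr)))
    bandwidth = float(np.mean(librosa.feature.spectral_bandwidth(y=y, sr=sr)))
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc), axis=1)
    return np.concatenate([[duration, energy, zcr, centroid, bandwidth], mfcc])

# One row per syllable, using the boundaries found during segmentation.
features = np.vstack([
    syllable_features(filtered[a:b], sr) for a, b in syllable_bounds
])
```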

Choosing the Best Features

Selecting the most relevant features is crucial for improving the accuracy of the analysis. Two methods are considered for this: the Variance Threshold and the Laplacian Score. Both help decide which features are most useful for distinguishing different types of bird sounds.

Variance Threshold focuses on eliminating features that do not vary much across samples, while Laplacian Score assesses the relevance of features based on how well they preserve local structures in the data.
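A sketch of both ideas, using scikit-learn's VarianceThreshold and a hand-written Laplacian Score (following He et al., 2005), since scikit-learn has no built-in version; the neighbourhood size, kernel width, and number of features kept are illustrative assumptions:

```python
# Sketch: the two feature-selection ideas described above, shown as
# alternatives. Lower Laplacian Scores mean the feature better preserves
# the local neighbourhood structure of the data.
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.neighbors import kneighbors_graph
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(features)       # `features` from the previous sketch

# 1) Variance Threshold: drop features that barely vary across syllables.
X_var = VarianceThreshold(threshold=1e-3).fit_transform(X)

# 2) Laplacian Score on a k-nearest-neighbour graph with heat-kernel weights.
def laplacian_score(X, n_neighbors=5):
    W = kneighbors_graph(X, n_neighbors, mode="distance", include_self=False)
    W = W.maximum(W.T)                              # symmetrise the kNN graph
    W.data = np.exp(-(W.data ** 2) / (W.data.std() ** 2 + 1e-12))
    d = np.asarray(W.sum(axis=1)).ravel()           # weighted degrees (diagonal of D)
    scores = []
    for r in range(X.shape[1]):
        f = X[:, r]
        f = f - (f @ d) / d.sum()                   # remove the weighted mean
        num = f @ (d * f) - f @ (W @ f)             # f^T L f with L = D - W
        den = f @ (d * f)                           # f^T D f
        scores.append(num / den if den > 0 else np.inf)
    return np.array(scores)

scores = laplacian_score(X)
selected = np.argsort(scores)[:10]                  # keep the 10 best-scoring features
```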

Clustering the Sounds

Once the relevant features are identified, they can be fed into a clustering algorithm, which groups the syllables together based on the similarities in their features. This study uses the DBSCAN algorithm, which is good at identifying core samples and can separate noise from the actual sounds.

The input data is visualized using a technique called t-Distributed Stochastic Neighbor Embedding (t-SNE), which helps in understanding how the data points relate to each other in a two-dimensional space.

The DBSCAN algorithm can determine how many clusters exist in the data and handle unbalanced classes. It identifies large clusters of similar sounds and marks individual sounds as noise when necessary.
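Put together, the embedding and clustering step might look like the sketch below. Whether clustering runs on the raw features or on the two-dimensional t-SNE map, and the eps and min_samples values, are assumptions made here for illustration:

```python
# Sketch: t-SNE embedding of the selected features followed by DBSCAN
# clustering. DBSCAN does not need the number of clusters in advance and
# labels sparse points as -1 (noise).
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import DBSCAN

X_sel = X[:, selected]                              # features kept by the selection step

# 2-D map of the syllables for visual inspection of their structure.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_sel)

labels = DBSCAN(eps=2.0, min_samples=10).fit_predict(embedding)

n_classes = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print(f"estimated syllable classes: {n_classes}, segments marked as noise: {n_noise}")
```

The number of labels other than -1 is then read off as the estimated number of syllable classes.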

Building the Data Set

To develop and validate the system, a data set is built using recordings from a specific online source focused on sharing bird sounds. Only high-quality recordings are selected to ensure good analysis, resulting in a collection of audio files featuring the European Greenfinch.

The data includes details such as the recording location, quality rating, and type of sound, which helps in organizing and analyzing the recordings.

Assessing Cluster Performance

After the sounds are clustered, the performance of the clusters will be evaluated using metrics such as the Silhouette score, which provides insight into how well the sounds are grouped. A high Silhouette score suggests that the sounds are well matched within their clusters and poorly matched to others.

Fine-tuning the parameters of the clustering algorithm is important to ensure accurate results. By adjusting the minimum number of samples required to define clusters, the researchers can find the optimal number of clusters that represent the vocal repertoire.
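One way to do this tuning, sketched under the assumptions that DBSCAN is scored on the same embedding as above and that noise points are left out of the score:

```python
# Sketch: sweep min_samples and keep the setting with the best Silhouette
# score. The swept range and the eps value are illustrative choices.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score

best = (None, -1.0)
for min_samples in range(5, 40, 5):
    labels = DBSCAN(eps=2.0, min_samples=min_samples).fit_predict(embedding)
    mask = labels != -1                             # score only the clustered points
    if len(set(labels[mask])) < 2:                  # silhouette needs >= 2 clusters
        continue
    score = silhouette_score(embedding[mask], labels[mask])
    if score > best[1]:
        best = (min_samples, score)

print(f"best min_samples: {best[0]} (Silhouette = {best[1]:.2f})")
```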

Results and Conclusions

After going through the clustering process, researchers found that the vocal repertoire of the European Greenfinch can be estimated by observing the number of syllable classes identified by the clustering algorithm. These classes correspond to the previously observed types of phrases in the bird's song.

The study also revealed that there were many segments identified as noise, highlighting a challenge in accurately estimating repertoire size. Due to the presence of noise in the data, the results may sometimes underestimate or overestimate the bird's vocal capabilities.

The findings can open new avenues for exploring how geographical differences affect bird songs, as previous research suggests variations in vocalizations across regions. Future research can focus on refining the techniques used for filtering noise and feature extraction to improve accuracy.

Overall, the system developed in this study offers a valuable tool for estimating the size of a bird's vocal repertoire, providing a deeper understanding of bird communication and behavior.

Original Source

Title: Estimating the Repertoire Size in Birds using Unsupervised Clustering techniques

Abstract: Birds produce multiple types of vocalizations that, together, constitute a vocal repertoire. For some species, the repertoire size is of importance because it informs us about their brain capacity, territory size or social behaviour. Estimating the repertoire size is challenging because it requires large amounts of data which can be difficult to obtain and analyse. From birds vocalizations recordings, songs are extracted and segmented as sequences of syllables before being clustered. Segmenting songs in such a way can be done either by simple enumeration, where one counts unique vocalization types until there are no new types detected, or by specific algorithms permitting reproducible studies. In this paper, we present a specific automatic method to compute a syllable distance measure that allows an unsupervised classification of bird song syllables. The results obtained from the segmenting of the bird songs are evaluated using the Silhouette metric score.

Authors: Joachim Poutaraud

Last Update: 2023-03-19

Language: English

Source URL: https://arxiv.org/abs/2303.10678

Source PDF: https://arxiv.org/pdf/2303.10678

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
