Post provided by Julius Juodakis
Common solutions to wind noise don’t work with bioacoustics
Bioacoustics has great potential to help us understand animal communities. We already have strikingly futuristic hardware for capturing natural sounds, such as the autonomous Audiomoth or the 5-gram μMoth recorders, and projects making use of it, such as the live-observation WhaleMap, SAFE Project in Borneo, or the Australian Acoustic Observatory, the latter aiming to collect 2 petabytes of sound. Terrestrial acoustics is particularly common for monitoring birds in New Zealand, where my supervisor Stephen Marsland and I wrote this paper. However, analysing this data – and specifically detecting when a target species is calling – is still a challenge due to noise. Wind in particular changes the background sound levels so rapidly and strongly that sound recognition tools fail or mark huge numbers of false detections, a problem that was very apparent to us in Wellington, the windiest capital in the world.
But why should wind noise still be a problem today, when we have almost-intelligent neural networks and almost-self-driving cars, among other advanced technology? There are indeed many solutions to wind noise in other areas, but soon after starting to work with bioacoustics we realised that none of the solutions really work for us. Wind is commonly removed with physical barriers, starting from simple screens or fluffy microphone covers for TV crews, and ending with tubular interference-based noise removers for infrasonic stations. Many consumer devices rely on multiple microphones to subtract and cancel out noise digitally. These approaches are not applicable to our problem: terrestrial recorders need to be simple, lightweight, portable and inexpensive, and furthermore we still wish to analyse the large collections of recordings from devices already deployed up to this point. Digital denoising with neural networks is also becoming very powerful, and can be applied to this problem. However, here we run into training data limitations – unlike with human sounds, there are very few collections of labelled animal sounds sufficient to train such networks. For conservation and research, we are primarily interested in rare species, but also in different life stages or types of calls, and ensuring that algorithms treat them accurately would require huge, highly detailed, expert-curated training datasets that are simply not yet available.
Classical statistics to the rescue
To deal with wind noise we therefore turned to more classical statistical and signal processing methods. Specifically, we use a wavelet packet transform, which divides the sound into short time periods (<1 s), and then into separate frequency components within each period. In the resulting sound spectrum, wind appears as a smooth slope across the frequencies, while animal sounds typically stand out as peaks in some narrower frequency bands, like this:
We can fit a line (or a smooth curve; red dashes in the graph) to this spectrum, and subtract the line, leaving only the peaks, corresponding to calls. What is more, these denoised wavelets can be converted to audible sound again. Wavelets were popular for signal analysis in the 1990s, and fell out of fashion later, in favour of more complex methods. To us this was a benefit: wavelet operations are relatively simple and well-studied, so we know how they affect different types of sound based on statistical theory. We can then use this understanding to help design the entire analysis workflow, and also feel much safer about using our method with new, untested sound environments or species.
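To make the fit-and-subtract idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation from the paper or from AviaNZ: it assumes the energy of each frequency band has already been computed for one short time window (for instance from wavelet packet nodes), it assumes the fit is done on log energy against log frequency, and it uses an ordinary least-squares line where the paper's estimator may well differ. The function names and the synthetic numbers are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_wind_slope(band_centres_hz, band_energies):
    """Return log-energy residuals after removing a fitted linear trend.

    Assumption: the smooth wind component is roughly a straight line when
    log energy is plotted against log frequency. Bands that carry a call
    should then show up as large positive residuals.
    """
    log_f = [math.log(f) for f in band_centres_hz]
    log_e = [math.log(e) for e in band_energies]
    slope, intercept = fit_line(log_f, log_e)
    return [e - (slope * f + intercept) for f, e in zip(log_f, log_e)]


# Synthetic window: energy falling off as 1/f^2 (a stand-in for wind),
# plus a narrow-band "call" that boosts a single band.
centres = [125.0 * (i + 1) for i in range(16)]  # hypothetical band centres, Hz
energies = [1e6 / f ** 2 for f in centres]
call_band = 9
energies[call_band] *= 50.0

residuals = subtract_wind_slope(centres, energies)
peak_band = max(range(len(residuals)), key=residuals.__getitem__)
print("band with the largest residual:", peak_band)
```

In this toy window the band that contains the call keeps a large positive residual, while the other bands end up close to zero. A plain least-squares line is pulled slightly towards the peak, which is one reason a more robust fit or a smooth curve may be preferable on real recordings.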
Wind noise removal helps monitor birds
One way to show the benefits of our method is to look at spectrograms of the sound before and after denoising, as in this example with a wood thrush calling during a wind gust (noisy recording – top, denoised – bottom):
A more direct and objective way is to perform an actual acoustic monitoring survey as it would be done in practice. We collected recordings from a few nights in the bush, capturing the sounds of kiwi, bittern, and a lot of wind and rain. Bittern in particular are hard to spot visually, and they call repetitively at very low frequencies, so mitigating wind noise is really important for their monitoring. We then processed this data in two ways: either just applying an automatic detector to the raw data, or first denoising with our method and then applying the detector. A human expert then reviewed the detections. We found that without the denoising, detectors produced 2 to 10 times more false positives, and also marked the true calls inaccurately, which creates a lot of work for the human expert to fix. Thus, our method greatly reduces the human effort needed – or, if you only have a fixed amount of time to spend on a survey, it greatly increases how much data you can analyse, and correspondingly improves your estimates of bird populations.
An essential part of the work in our group is making the methods that we develop available for everyone to use, so that they don’t remain merely theoretical suggestions. We therefore developed AviaNZ, a graphical software package for listening to, annotating, and analysing long sound recordings:
Besides automatic detection and wind noise removal, AviaNZ has been used for looking at individual bird differences, transposing bat calls to the audible range, geolocating bird nests, monitoring environmental noise, and other side projects. We have spent a lot of time making sure it provides a user-friendly interface, clear warnings and error messages, and a command line interface for advanced users, so we invite you to try it out as well – use it for your species and join in developing it further. AviaNZ is open source and free for all to download here.
You can read the full article ‘Wind-robust sound event detection and denoising for bioacoustics’ here (https://doi.org/10.1111/2041-210X.13928).