The reduced cost of acoustic recorders and the increase in storage capacity have led, in just a few years, to an exponential growth of Passive Acoustic Monitoring (PAM) across a very wide range of animals [1, 2, 3], opening up a new field of research and responding to an urgent need for biodiversity monitoring. However, the analysis of this emerging big data is now impeded by limited software development and availability [6, 7].
To date, no available software allows researchers to perform PAM on a wide range of taxonomic groups, even though numerous methods for classifying animal vocalizations have been developed in recent years. Here we present a software toolbox that enables users to build an automatic classifier from available sound reference data, and then to classify recorded sound events into known classes. This toolbox was named Tadarida (incidentally a widespread bat genus), for Toolbox for Animal Detection on Acoustic Recordings Integrating Discriminant Analysis; “discriminant analysis” here refers not to Fisher’s LDA but to any type of automatic classification.
This toolbox was initially developed to support the French bat monitoring scheme “Vigie-Chiro”, launched in 2006. Like other PAM schemes during this period, Vigie-Chiro experienced an exponential increase in recorded data. In addition, a large volume of acoustic data has been recorded for other taxa without being identified, especially for bush-crickets, which did not benefit from any monitoring scheme [10, 11].
Until now, all available software for automatic bat identification was commercial and based on a very specific sound event detection process that prevented efficient bush-cricket detection. Tadarida’s development therefore focused on detecting every sound event, even when events are structurally very different (e.g. bats and bush-crickets) or overlap in time or frequency.
This generic detector enables complex delimitation of sound events in time and frequency, so that as little noise as possible, i.e. spectrogram elements that do not contain signal, is included. It then extracts a broad range of features from all detected sound events (DSEs). This first part of the toolbox was named Tadarida-D (for Detection).
Because of a lack of sufficient available reference sound data on bats and bush-crickets, we needed to build our own training dataset for automatic identification (machine learning). However, no available software allowed sound annotation at the scale of our detected sound events. Massive annotation of animal vocalizations was limited to larger time scales (several seconds), so that most sound reference data contained multiple species, with sound events not attributed to a single label with certainty. Although recent developments have significantly improved machine learning on multi-label multi-instance data, we circumvented such limitations by developing a Graphical User Interface, called Tadarida-L (for Labelling), that allows users to generate single-label single-instance training data.
The last part of the toolbox, Tadarida-C (for Classification), uses the features extracted by Tadarida-D and the labels generated through Tadarida-L to build and apply a random forest classifier that handles strong unevenness in class sample sizes.
This modular toolbox has been designed for a community of users sharing two interacting roles (e.g. PAM managers and participants). First, “PAM managers” go through the whole process, from automatic detection and labelling to building an automatic classifier. Then, “PAM participants” use the detection and classification processes to apply the classifier provided by PAM managers. This development resulted in the first open software (1) allowing generic, multi-taxa sound event detection, (2) providing graphical sound labelling at the single-instance level and (3) covering the whole process from sound detection to classification.
The Tadarida toolbox consists of three software modules (see Fig. 1 for the structure):
Tadarida-D and Tadarida-L have been developed in Qt C++, which combines clean, object-oriented code with fast processing.
Tadarida-D runs faster than real time even on a basic configuration and under the most demanding conditions: 500 kHz sample rate .wav files, detecting 100 sound events per second and extracting 269 features.
These constraints sometimes led us to favour solutions that save processing time at the expense of code readability. For example, we used pointers in a number of cases, and arrays instead of vectors. This applies to shapesDetects and detectsParameter2 (see below).
The Qt 5.5 framework was chosen for its portability across operating systems. Initially developed for Windows, Tadarida-D has also been successfully deployed on diverse Linux systems (see below).
Tadarida-D uses two external libraries:
To exploit the potential of multi-core computers, Tadarida-D uses parallel computing, launching one to eight threads according to user settings. These threads simultaneously process a fraction of the .wav files.
Tadarida-D is a non-graphical application executed from the command line in scripts, particularly on servers.
This application handles sound event detection and feature extraction, and is structured around four classes:
DetecLaunch handles the input parameters: directories of .wav files to process, number of threads to execute, etc.
The process first distributes the .wav files among the threads. Next, the variables to be used by the threads are initialized (to manage variables that do not support multithreading), the threads (objects of the Detec class) are created with the necessary settings, and then executed.
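This distribution of files among worker threads can be sketched as follows, transposed to Python for illustration (the actual implementation uses Qt C++ threads of the Detec class); the file names and the per-file processing body are placeholders.

```python
# Illustrative sketch of Tadarida-D's work distribution, NOT the real
# Qt C++ code: each thread processes a fraction of the .wav files.
from concurrent.futures import ThreadPoolExecutor

def process_file(wav_path):
    # Stands in for the detection and feature extraction that
    # DetecTreatment performs on one file.
    return f"{wav_path}: processed"

def run_detection(wav_files, n_threads=4):
    # Tadarida-D launches one to eight threads according to user settings.
    n_threads = max(1, min(8, n_threads))
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(process_file, wav_files))

results = run_detection(["rec1.wav", "rec2.wav", "rec3.wav"], n_threads=2)
```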
The Detec class retrieves the provided settings and creates the object of the DetecTreatment class that handles the output processing (generating feature tables).
The run() method (executed by pdetec[i]->start();) organizes the processing.
For each .wav file to be processed, it calls the treatOneFile method, which in turn launches the CallTreatmentsForOneFile(…) method of the DetecTreatment class (see below).
Note that this division of tasks between the Detec and DetecTreatment classes avoids duplicating DetecTreatment code between the Tadarida-D and Tadarida-L projects (see below).
After initialization, the CallTreatmentsForOneFile method handles the processing of each .wav file by successively calling the following methods:
Note that shapesDetects defines the master point of each DSE as its element with the highest amplitude value; DSEs are then filtered according to the frequency of this master point: 8–250 kHz in HF mode and 0.8–25 kHz in LF mode. The HF range covers every known vocalization of bats and bush-crickets, and the LF range covers most other animal vocalizations.
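The master-point rule can be sketched as follows, in Python rather than the actual C++ shapesDetects code; the DSE representation (a list of (frequency in kHz, time, amplitude) spectrogram elements) is an assumption for illustration.

```python
# Illustrative sketch of the master-point definition and frequency filter.

def master_point(dse):
    # The master point is the spectrogram element with the highest amplitude.
    return max(dse, key=lambda elem: elem[2])

def keep_dse(dse, mode="HF"):
    # HF mode keeps DSEs whose master point lies in 8-250 kHz (covering
    # bats and bush-crickets); LF mode keeps 0.8-25 kHz (most other taxa).
    low, high = (8.0, 250.0) if mode == "HF" else (0.8, 25.0)
    freq = master_point(dse)[0]
    return low <= freq <= high

bat_call = [(45.0, 0.010, 0.2), (52.0, 0.012, 0.9), (60.0, 0.014, 0.4)]
mains_hum = [(0.05, 0.0, 1.0)]
```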
Tadarida-L is an extended version of Tadarida-D that additionally handles sound event labelling and the constitution of a reference sound database (RSDB), on top of the sound event detection and feature extraction performed by Tadarida-D.
This application is structured around four classes (one per .cpp file):
This class generates the main window of the software and displays main user settings.
The left half of the window contains all the input settings of the detection process: the directory path of the .wav files to be processed, and optional settings (Fig. 3). It launches threads of the Detec class to process the .wav files.
The right half of the window displays the labelling interface. It calls the following Fenim class.
Both the Detec and DetecTreatment classes serve the same purpose in Tadarida-L as in Tadarida-D (see above). The DetecTreatment class is strictly identical. The Detec class here has one additional function: it handles the generation of supplementary output files (in extended mode): spectrogram images (.jpg) and DSE data (.da2). These files are used by the Fenim class as input (see Fig. 2).
Fenim class handles the labelling man-machine interface (Fig. 4), involving:
This labelling process allows the user to quickly generate a standardized RSDB that can be shared with other users, and/or used to build their own classifier (Fig. 2).
Note that Tadarida-L has been built in an extensible way, so that both the detection and feature extraction processes can be drastically changed without loss of information in users’ RSDBs.
If the RSDB was defined with an earlier version of the software, Tadarida will propose a full reprocessing of the database to upgrade and homogenize it. New DSEs are matched to previous ones according to their previous and new structural data (.da2); see https://github.com/YvesBas/Tadarida-L/blob/master/Manual_Tadarida-L.odt for details.
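As an illustration only, matching old and new DSEs by proximity of their structural data might look like the following sketch; the tolerances, the matching criterion and the chosen fields are assumptions, and the actual procedure is described in the Tadarida-L manual.

```python
# Hypothetical matching of a reprocessed DSE to its previous version by
# closeness of structural data (here: start time in s, peak frequency in
# kHz, as might be read from the .da2 files). Tolerances are assumptions.

def match_dse(new_dse, old_dses, dt=0.005, df=2.0):
    # Keep old DSEs within the time/frequency tolerances, then return the
    # index of the closest one (sum of absolute differences), or None.
    candidates = [
        (abs(o[0] - new_dse[0]) + abs(o[1] - new_dse[1]), i)
        for i, o in enumerate(old_dses)
        if abs(o[0] - new_dse[0]) <= dt and abs(o[1] - new_dse[1]) <= df
    ]
    return min(candidates)[1] if candidates else None

old = [(0.100, 45.0), (0.350, 22.0)]
```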
Tadarida-C handles the classification of DSEs, based on the features extracted by Tadarida-D and the RSDB collected through Tadarida-L, and thus provides the final output of the Tadarida toolbox.
It contains two R scripts: one for building a classifier (for expert users) and one for applying it (for end users; see Fig. 2). Tadarida-C was developed in R because R allows the use and optimisation of the Random Forests (RF) algorithm, together with numerous built-in statistical tools (summaries, graphs, etc.). Both scripts use the randomForest and data.table packages [14, 15]. The latter package is only used through its “rbindlist” function, which allows fast aggregation of large data frames.
The “buildClassif.r” script should be run each time the user’s RSDB has been significantly improved, or whenever a new classifier is needed (different species list, settings, etc.). It first aggregates and merges label data (.eti files; see Tadarida-L) and feature data (.ta files) from a specified RSDB (Fig. 2).
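The merge of labels with features can be sketched as follows, in Python rather than the actual R script; the dictionary structures and DSE identifiers are hypothetical stand-ins for the .eti and .ta file contents.

```python
# Illustrative sketch of the first step of buildClassif.r: joining label
# data (.eti) with feature data (.ta) into one training table keyed by a
# DSE identifier. Only labelled DSEs enter the training set
# (single-label, single-instance).

def build_training_table(labels, features):
    # labels: {dse_id: species}; features: {dse_id: [feature values]}
    return [
        (species, features[dse_id])
        for dse_id, species in labels.items()
        if dse_id in features
    ]

labels = {"f1_dse3": "Pipistrellus pipistrellus", "f2_dse1": "Tettigonia viridissima"}
features = {"f1_dse3": [45.2, 3.1], "f2_dse1": [14.8, 120.0], "f2_dse7": [20.0, 5.5]}
table = build_training_table(labels, features)
```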
This data frame may then be filtered on a species list (if provided) and is finally used to build a series of 50 RFs of 10 classification trees each. The authors found that combining trees with different subsampling levels better handles the trade-off between classification error rates on common species (i.e. species with many labelled DSEs in the RSDB) and error rates on rare species. A gradient of subsampling is therefore set so that the first trees of the series use most of the available DSEs in the RSDB, whereas the last trees use an equal number of DSEs for most species. The strength of the subsampling can be tuned through two input settings (see https://github.com/YvesBas/Tadarida-C/blob/master/Manual_Tadarida-C.odt for details on all RF parameters).
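The subsampling gradient can be sketched as follows; the linear interpolation between full and equal per-class sample sizes is an assumption for illustration (the actual shape of the gradient is controlled by the two input settings documented in the Tadarida-C manual), and the function and species names are hypothetical.

```python
# Illustrative sketch of a subsampling gradient across a series of
# forests: early forests use (nearly) all available DSEs per class,
# late forests use an equal number per class.

def sampsize_gradient(class_counts, n_forests=50):
    # class_counts: {species: number of labelled DSEs in the RSDB}.
    # Linear interpolation is an assumption, not the actual scheme.
    equal = min(class_counts.values())
    sizes = []
    for i in range(n_forests):
        w = i / (n_forests - 1)  # 0 -> full counts, 1 -> equal counts
        sizes.append({
            sp: round((1 - w) * n + w * equal)
            for sp, n in class_counts.items()
        })
    return sizes

counts = {"common_sp": 10000, "rare_sp": 200}
grad = sampsize_gradient(counts)
```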
The “site” field of the label data is also used for subsampling, each RF using 63% of the available sites. This makes training and testing independent for each tree. To implement this subsampling, the authors modified the randomForest function so that the per-class sample size (sampsize) could be equal to 0.
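A minimal sketch of the site-level subsampling, in Python rather than the modified R randomForest code; the helper name and seed handling are hypothetical. Drawing 63% of sites without replacement mimics the expected ~63.2% coverage of a bootstrap sample while keeping the held-out sites fully independent.

```python
import random

def sample_sites(sites, seed):
    # Each RF in the series trains on 63% of the available recording
    # sites; the remaining sites provide an independent test set.
    rng = random.Random(seed)
    k = round(0.63 * len(sites))
    return rng.sample(sites, k)

sites = [f"site{i}" for i in range(100)]
train_sites = sample_sites(sites, seed=1)
```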
The output file is named “ClassifEspHF3.learner” to be used as input in the following script (Fig. 2).
The “TadaridaC.r” script returns to the user the list of species present in each processed .wav file, together with a confidence index for each automatic identification. For that purpose, it applies the previously built RF classifier (ClassifEspHF3.learner) to any list of .ta files (output from Tadarida-D or Tadarida-L; see Fig. 2).
Using the “predict” function of the randomForest R package, it first converts the feature values extracted for each DSE into a matrix of class probabilities named “ProbEsp0” (i.e. the probability of each DSE belonging to each potential species).
Most users will not be interested in identification at such a fine scale, since many DSEs within a .wav file may come from the same source. The next commands therefore aim at (1) summarizing this information within a .wav file and (2) determining the number of species present within the recording.
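Step (1) can be sketched as follows; taking the per-species maximum over DSEs is an illustrative aggregation choice, not necessarily the exact one used in TadaridaC.r, and the species codes are hypothetical.

```python
# Illustrative sketch: collapsing the per-DSE class-probability matrix
# (ProbEsp0) to one confidence value per species for a given .wav file.

def summarize_file(prob_rows, species):
    # prob_rows: one probability vector per DSE of the file,
    # columns ordered as in `species`. Per-species max over DSEs is an
    # assumed aggregation for illustration.
    return {
        sp: max(row[j] for row in prob_rows)
        for j, sp in enumerate(species)
    }

species = ["Pippip", "Pipkuh", "Tetvir"]
prob_rows = [
    [0.9, 0.05, 0.05],  # DSE 1: very likely first species
    [0.2, 0.1, 0.7],    # DSE 2: likely third species
]
summary = summarize_file(prob_rows, species)
```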
This latter objective is accomplished through a simple and robust algorithm running in a loop:
Some ancillary data are also computed so that end users get summary information on the species likely to be present in their recordings, where in time and frequency the identified vocalizations occur, and how confident each identification is (see https://github.com/YvesBas/Tadarida-C/blob/master/Manual_Tadarida-C.odt for details).
The detection, feature extraction and classification processes have been extensively tested through their automated use on French bat monitoring data since October 2015: 296 participants uploaded 12,952,016 .wav files to a web portal (www.vigiechiro.herokuapp.com; see [16, 17] for source code), and all files were successfully processed by Tadarida-D and Tadarida-C to obtain species identifications for all recorded sound events. The few bugs that remained after the development phase were quickly discovered and fixed.
Moreover, the Tadarida toolbox has been used to build a specific classifier for the Norfolk Bat Survey and then to process its entire dataset, i.e. 1.1 million .wav files. Newson et al. manually checked 328,291 of these files to evaluate classifier performance on bush-crickets, showing variable but sufficient performance for large-scale monitoring. Appendix S1 of that paper further shows the relative importance of the features extracted by Tadarida in species classification. Both Vigie-Chiro and the Norfolk Bat Survey therefore provide valuable proofs of concept that Tadarida fulfils its purpose.
Each upgrade to the detection and/or feature extraction processes resulted in reprocessing the authors’ RSDB (currently 17,773 .wav files and 863,730 labelled DSEs), each time providing a robust test able to detect new bugs before release to users. This reprocessing has happened 8 times since the start of the authors’ RSDB constitution (01/08/2014).
All three modules (Tadarida-D, -L and -C) have thus been intensively tested through this production phase. Moreover, the use cases of the Tadarida HMI are relatively few and thus easily tested before each release. For that purpose, each code repository contains a test plan document describing the manual test scenarios and their associated input and output files.
Tadarida-D and -C have been successfully tested on various versions of Windows (both 32- and 64-bit, and from XP to 8.1) and Linux (Ubuntu 14.04LTS and 14.6, Debian 8.0 and Scientific Linux 6.8).
The source code is the same for both operating systems, except for one line that is disabled in the Windows version (“typedef int64_t __int64;” in detec.h).
Tadarida-L has been successfully tested on various versions of Windows (from Windows XP to Windows 8.1). It has not been built on Linux yet.
C++ Qt (5.4 and 5.5) for Tadarida-D and Tadarida-L.
Both have been successfully built from Qt 4.8.5 to Qt 5.7.
R 3.3.0 for Tadarida-C.
Memory consumption is approximately 150 MB per Tadarida-D thread, but much higher for Tadarida-C. Memory consumption of the “builder” script depends on the RSDB size and the number of defined species: with more than 800,000 DSEs and 91 defined species in the current authors’ RSDB, the builder script consumes approximately 15 GB of memory. The “TadaridaC” script consumes less memory, unless a very large number of DSEs (more than 1 million) is provided as input.
All modules have a dedicated manual that details the install procedure and the requirements.
Tadarida-D and Tadarida-L use the “sndfile” and “fftw3” libraries, while Tadarida-C uses the “randomForest” and “data.table” R packages.
Publisher: Yves Bas
Date published: 07/11/16
Licence: GNU LGPL-3.0 for Tadarida-D and Tadarida-L; GNU GPL-3.0 for Tadarida-C
The Tadarida toolbox was first designed for the specific needs of monitoring French common bats and bush-crickets (243 end users so far), and is also used to analyse Norfolk Bat Survey data [11, 18]. Both schemes have the longer-term objective of monitoring a wide range of other acoustically active taxa (birds, frogs, other insects…). This very generic design thus makes the toolbox useful to any researcher involved in PAM, saving time at several stages of the process (sound event detection, RSDB constitution, etc.) and potentially broadening the range of monitored species.
Tadarida may even be used in other intensive research topics in acoustics, such as multi-source acoustic detection in urban environments.
Tadarida is made up of three software packages: Tadarida-D, -L and -C. This modular architecture further broadens the community of researchers likely to be interested in the toolbox, because each package can be used independently. For example, the prolific development of acoustic indices in ecoacoustics can benefit from Tadarida-D and the way it detects and extracts numerous features from every animal vocalization in a soundscape, thus allowing measures of alpha and beta diversity [5, 20].
Tadarida is not only modular but also interoperable with other software. For example, since sound events detected by Tadarida-D are precisely located in time and frequency (stored in .ta files), they can easily be linked to sound events detected by other software (e.g. SonoBat™). Thus, a researcher wishing to use the feature extraction of such software can nonetheless benefit from the functionalities of Tadarida-L (sound event labelling and RSDB constitution) and/or Tadarida-C (building and using a classifier). To integrate Tadarida-L with other sound detection software, one only needs to link the output tables of both programs through their frequency and time variables, namely Fmin, Fmax, StTime and Dur among the Tadarida-L features. To integrate Tadarida-C with external sound detection software, one needs (1) to take the external software’s feature tables as input (object param3) and (2) to adapt the FormulCrit object to the external software’s feature list.
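Such a linkage can be sketched as follows, using a simple time-and-frequency overlap criterion; the variables Fmin, Fmax (kHz), StTime and Dur (s) are those named above, but the overlap rule itself is an illustrative choice, not a prescribed procedure.

```python
# Illustrative sketch: linking a Tadarida-detected sound event to an
# event from another detector when they overlap in both time and frequency.

def overlaps(a, b):
    # a, b: dicts with Fmin/Fmax (kHz) and StTime/Dur (s).
    time_ok = (a["StTime"] < b["StTime"] + b["Dur"]
               and b["StTime"] < a["StTime"] + a["Dur"])
    freq_ok = a["Fmin"] < b["Fmax"] and b["Fmin"] < a["Fmax"]
    return time_ok and freq_ok

tadarida_evt = {"Fmin": 40.0, "Fmax": 55.0, "StTime": 1.20, "Dur": 0.004}
other_evt = {"Fmin": 42.0, "Fmax": 50.0, "StTime": 1.201, "Dur": 0.003}
```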
The Tadarida code is built and documented so that third-party developers can modify or improve it according to their needs. The detectsParameter2 method is especially designed so that other features can easily be added.
Support is not guaranteed for this code, but users are encouraged to contact the authors to discuss specific application possibilities and issues.
The authors would like to thank an anonymous reviewer for useful comments, Christian Kerbiriou and Grégoire Loïs for advice and help with project management, Stuart Newson, Emmanuel Leblond, Jérôme Landieth and the 251 Vigie-Chiro participants for extensive testing, and finally Stuart Newson for careful rereading of the article.
The authors have no competing interests to declare.
Selby, T H Hart, K M Fujisaki, I Smith, B J Pollock, C J Hillis-Starr, Z et al. (2016). Can you hear me now? Range-testing a submerged passive acoustic receiver array in a Caribbean coral reef habitat. Ecol Evol 6: 4823–35, DOI: https://doi.org/10.1002/ece3.2228
Kalan, A K, Mundry, R, Wagner, O J J, Heinicke, S, Boesch, C and Kuehl, H S (2015). Towards the automated detection and occupancy estimation of primates using passive acoustic monitoring. Ecol Indic 54: 217–26, DOI: https://doi.org/10.1016/j.ecolind.2015.02.023
Froidevaux, J S P, Zellweger, F, Bollmann, K and Obrist, M K (2014). Optimizing passive acoustic sampling of bats in forests. Ecol Evol 4: 4690–700, DOI: https://doi.org/10.1002/ece3.1296
Sueur, J and Farina, A (2015). Ecoacoustics: the Ecological Investigation and Interpretation of Environmental Sound. Biosemiotics 8: 493–502, DOI: https://doi.org/10.1007/s12304-015-9248-x
Harris, S A, Shears, N T and Radford, C A (2016). Ecoacoustic indices as proxies for biodiversity on temperate reefs. Methods Ecol Evol 7: 713–24, DOI: https://doi.org/10.1111/2041-210X.12527
Roch, M A Batchelor, H Baumann-Pickering, S Berchok, C L Cholewiak, D Fujioka, E et al. (2016). Management of acoustic metadata for bioacoustics. Ecol Inform 31: 122–36, DOI: https://doi.org/10.1016/j.ecoinf.2015.12.002
Merchant, N D Fristrup, K M Johnson, M P Tyack, P L Witt, M J Blondel, P et al. (2015). Measuring acoustic habitats. Methods Ecol Evol 6: 257–65, DOI: https://doi.org/10.1111/2041-210X.12330
Stowell, D and Plumbley, M D (2014). Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning. PeerJ 2: e488, DOI: https://doi.org/10.7717/peerj.488
Azam, C, Le Viol, I, Julien, J-F, Bas, Y and Kerbiriou, C (2016). Disentangling the relative effect of light pollution, impervious surfaces and intensive agriculture on bat activity with a national-scale monitoring program. Landsc Ecol: 1–13, DOI: https://doi.org/10.1007/s10980-016-0417-3
Jeliazkov, A, Bas, Y, Kerbiriou, C, Julien, J-F, Penone, C and Le Viol, I (2016). Large-scale semi-automated acoustic monitoring allows to detect temporal decline of bush-crickets. Glob Ecol Conserv 6: 208–218, DOI: https://doi.org/10.1016/j.gecco.2016.02.008
Newson, S, Bas, Y, Murray, A and Gillings, S (2017). Potential for coupling the monitoring of bush-crickets with established large-scale acoustic monitoring of bats. Methods Ecol Evol, DOI: https://doi.org/10.1111/2041-210X.12720
Briggs, F Lakshminarayanan, B Neal, L Fern, X Z Raich, R Hadley, S J K et al. (2012). Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach. J Acoust Soc Am 131: 4640–50, DOI: https://doi.org/10.1121/1.4707424
Breiman, L (2001). Random Forests. Mach Learn 45: 5–32, DOI: https://doi.org/10.1023/A:1010933404324
Leblond, E and Landieth, J (2014–2017). vigiechiro-api. URL https://github.com/Scille/vigiechiro-api.
Landieth, J and Leblond, E (2014–2017). vigiechiro-front. URL: https://github.com/Scille/vigiechiro-front.
Newson, S E, Evans, H E and Gillings, S (2015). A novel citizen science approach for large-scale standardised monitoring of bat activity and distribution, evaluated in eastern England. Biol Conserv 191: 38–49, DOI: https://doi.org/10.1016/j.biocon.2015.06.009
Depraetere, M, Pavoine, S, Jiguet, F, Gasc, A, Duvail, S and Sueur, J (2012). Monitoring animal diversity using acoustic indices: implementation in a temperate woodland. Ecol Indic 13: 46–54, DOI: https://doi.org/10.1016/j.ecolind.2011.05.006