The following is the established format for referencing this article:
Znidersic, E., D. M. Watson, and M. W. Towsey. 2024. A new method to estimate abundance of Australasian Bittern (Botaurus poiciloptilus) from acoustic recordings. Avian Conservation and Ecology 19(1):16.
ABSTRACT
Effective conservation management relies on survey methods that accurately represent the biological communities being monitored. Here, we describe a novel approach using long-duration acoustic recordings to estimate abundance of a threatened wetland bird, the Australasian Bittern (Botaurus poiciloptilus). Whereas acoustic monitoring enables a large increase in effort compared to traditional on-site monitoring, e.g., triangulation surveys, it is difficult to estimate the number of individuals of a target species in acoustic recordings. We describe a semi-automated approach to estimate bittern abundance at four sites in the Barmah-Millewa Forest of southern Australia using single-channel, long-duration recordings. Our approach leveraged several known characteristics of bittern calling behavior. We obtained abundance estimates that are larger than those previously found using triangulation surveys at the same site. This is primarily attributed to our ability to find the peak calling hours in a long-duration recording, which does not require the training of a machine-learning call-recognizer. If the method we describe is performed in a consistent, standardized manner, it can identify population trends, which is an important outcome for a threatened species. Our method should be suitable for other furtive wetland species with a similar call structure or frequency range.
RÉSUMÉ
Une gestion efficace de la conservation repose sur des méthodes de dénombrement qui représentent les communautés biologiques suivies avec précision. Dans la présente étude, nous décrivons une nouvelle approche au moyen d’enregistrements de longue durée pour estimer l’abondance d’un oiseau menacé de milieux humides, le Butor d’Australie (Botaurus poiciloptilus). Alors que le suivi par enregistrements permet d’augmenter considérablement l’effort par rapport au suivi traditionnel sur site, par exemple les relevés par triangulation, il est difficile d’estimer le nombre d’individus d’une espèce cible dans les enregistrements. Nous présentons une approche semi-automatique pour estimer l’abondance du butor à quatre sites dans la forêt de Barmah-Millewa, dans le sud de l’Australie, au moyen d’enregistrements à canal unique et de longue durée. Notre approche s’appuie sur plusieurs caractéristiques connues du comportement de chant du butor. Nous avons obtenu des estimations d’abondance plus élevées que celles obtenues précédemment à partir de relevés par triangulation sur le même site. Cette différence est principalement attribuable à notre capacité à trouver les heures de pointe des chants dans un enregistrement de longue durée, ne nécessitant pas d’apprentissage automatique pour reconnaître les chants. Si elle est appliquée de manière cohérente et standardisée, notre méthode peut permettre de déterminer la tendance des populations, résultat important pour une espèce menacée. Notre méthode devrait convenir à d’autres espèces furtives de milieux humides ayant une structure de chant ou une gamme de fréquences similaires.
INTRODUCTION
Conservation management of any species requires a robust survey method that adequately represents the biological communities being monitored. Establishing baseline population estimates or developing a reliable metric reflecting changes in relative abundance or density is essential to detect population changes and evaluate whether management interventions are having any effect. This is of particular importance when managing a threatened species, where data are often scant and the information on behavior, life history, and habitat use needed to estimate determinants of detectability is unknown.
Wetlands worldwide are declining at a rate that exceeds that of many other ecosystems, and the species they support are rapidly losing their habitat (Finlayson 2012). The logistical complexities of monitoring wetlands can preclude the development of objective monitoring protocols (Conway 2011), resulting in data of varying accuracy that confound comparisons and constrain rigorous inferences. Wetland birds such as rail, crake, and bittern species are a notoriously difficult group to monitor and some are primarily detected from their calls, with detection reliability affected by temporal calling behavior, time of day, survey effort, and environmental factors (Conway and Gibbs 2011).
The Australasian Bittern (Botaurus poiciloptilus) is listed as endangered under the federal Environment Protection and Biodiversity Conservation Act 1999 (see https://www.environment.gov.au/cgi-bin/sprat/public/publicthreatenedlist.pl#birds_endangered); as endangered in New South Wales under the Biodiversity Conservation Act 2016 (see https://legislation.nsw.gov.au/view/html/inforce/current/act-2016-063#sch.1-pt.2); and as critically endangered in Victoria under the Flora and Fauna Guarantee Act 1988 (see https://www.environment.vic.gov.au/__data/assets/pdf_file/0036/698571/FFG_Threatened_List_February_2024.pdf). It is found in southeastern and southwestern mainland Australia, Tasmania, and New Zealand (Marchant and Higgins 1990) with occasional records from New Caledonia (Spaggiari et al. 2006). The Australasian Bittern is listed under the International Union for Conservation of Nature as threatened and, like the American Bittern (B. lentiginosus) and the Eurasian Bittern (B. stellaris), its populations are decreasing. The Australian population is estimated at fewer than 1300 birds and approximately 700 remain in New Zealand, with concerns that suitable habitat is lacking, particularly during drought (Herring et al. 2021). Australasian Bitterns occupy shallow wetlands dominated by reeds and rushes including Phragmites australis, Juncus ingens (Marchant and Higgins 1990), cumbungi (Typha spp.), swamps with shrubs including Melaleuca and Agonis spp. in southwestern Australia, and rice fields (Herring et al. 2019).
Because of the Australasian Bittern’s cryptic behavior in dense reed beds, detection is primarily through the call of the males. Whereas Williams et al. (2019) demonstrated that the peak calling hours at one site occurred in the early hours of the morning, it is suggested that in Australia the peak vocalization times are within two to three hours on either side of sunrise and sunset (Znidersic and Towsey 2022). This understanding may be biased by survey effort, habitat type, or geographical variations, with little information available on calling frequencies during the night.
Bitterns of the genus Botaurus vocalize through resonant air sacs rather than an open beak, which creates resonance conditions that favor low-frequency sounds such as the so-called boom. The pumps/inhalations that precede the boom are made with an open beak as the bittern takes in air, whereas the booms that follow are closed-mouth vocalizations (Riede et al. 2016). The call consists of a sequence of booms collectively known as a boom-train (Gilbert et al. 1994). In this paper we use the term call to refer to an entire boom-train, which may also include associated pumps/inhalations. We use the terms call and boom-train interchangeably.
Acoustic triangulation is commonly applied to detect and estimate abundance of bittern species (Puglisi et al. 1997). The method involves determining the direction of the call from two separate listening points usually spaced 200 m apart (NSW Department of Planning, Industry, and Environment 2022) or 500 m apart (O’Donnell et al. 2013). Individual birds detected by each observer are identified by compass bearing, estimated distance from the observer, time of call, and the number of booms in the boom-train. In the post-survey analysis, both observers’ data are combined to estimate relative abundance at each site.
Another monitoring technique for bitterns is to use passive acoustic monitors. Gilbert et al. (1994, 2002) and McGregor and Byle (1992) manually identified individual Eurasian Bitterns (B. stellaris) by visual identification of their calls in spectrograms, whereas Bardeli et al. (2010) used a sequence of bandpass filters to achieve successful automated detection of boom-trains. The Australasian Bittern has a similar call frequency and boom-train structure to that of the Eurasian Bittern.
Passive acoustic monitoring is becoming increasingly popular in ecological surveys where species such as the Australasian Bittern are detected primarily by their call. Unlike the ecologist conducting a triangulation survey over an hour on one or multiple days, an acoustic sensor, i.e., autonomous recording unit (ARU), can be deployed in the field for days, months, or years, depending on the power source. Although useful for detecting rare events and minimizing observer bias and disturbance, continuous acoustic monitoring yields large datasets. For those species where little is known about calling behavior, long-duration recordings of weeks or months are needed before absences can be reliably inferred, necessitating an automated approach to analysis, because months and years of recording are impossible to listen to effectively. Call recognizers are widely used to detect vocal species, but determining the number of calling individuals requires multiple acoustic sensors with source-localization software (Frommolt and Tauchert 2014), because single acoustic sensors cannot provide directionality information (Rone et al. 2012). Alternatively, it may be possible, in some cases, to identify individual birds by their idiosyncratic call characteristics using deep learning software (Martin et al. 2022).
Here, we present a new semi-automated approach for estimating relative abundance from acoustic data, using recordings obtained in the Barmah-Millewa Forest, Australia. We also investigate temporal patterns in the calling behavior of the Australasian Bittern, identify peak vocalization periods, discuss implications for range-wide monitoring of this threatened species, and consider broader application of our approach for monitoring other species detected primarily by vocalizations.
METHODS
Study sites
The Barmah-Millewa Forest is a 66,600-ha floodplain forest in southern Australia dominated by river red gum (Eucalyptus camaldulensis). Located between the townships of Tocumwal (35.8117°S, 145.5648°E), Deniliquin (35.5347°S, 144.9488°E), and Echuca (36.1404°S, 144.7511°E), it includes the Barmah Forest in Victoria and the Millewa Group of Forests in New South Wales. Flood waters run into the Barmah-Millewa Forest from the Murray River, anabranches, and creek lines controlled by a series of flow-regulating structures, with occasional environmental water releases producing flooding events beyond those expected from normal rainfall. Acoustic recordings were obtained at four sites within the Barmah-Millewa Forest. Note that the site ID numbers used in this paper are derived from a larger study. Distances between sites ranged from 3.2 to 12.3 km. The sites were acoustically isolated because they are separated by river red gum forests with moderate to dense understory, which attenuated sound propagation between sites.
Acoustic recordings - data collection
Recordings were obtained using Frontier Labs (https://www.frontierlabs.com.au/) acoustic sensors (hereafter ARUs) powered by four rechargeable 18650 Li-ion (3400 mAh) batteries. One acoustic sensor was deployed at each site from 18 November to 1 December 2021 (step 1, Fig. 1). The sensors were affixed to trees with a fabric strap or cable ties and programmed to record continuously (24 × 1-hour .wav files per day) in mono, 16-bit .wav format, at a sampling rate of 44.1 kHz and a gain of 50 dB. Recording duration was approximately 10 days per site, depending on battery life.
Acoustic recordings - data analysis
We used the open-access software package Ecoacoustics Analysis Programs (Towsey et al. 2018a) to process the .wav files in one-minute segments. Because the Australasian Bittern calls are low frequency (~140 Hz), the recording segments were down-sampled to 5120 samples per second. Using a frame size of 512 samples, this yielded spectrograms with 256 frequency bins, each with a width of 10 Hz, which offers sufficient spectral and temporal resolution for effective visual interpretation of the calls in standard spectrograms.
The Analysis Programs software outputs a set of spectral indices for each one-minute recording segment (step 2, Fig. 1). A spectral index is a vector, each element of which summarizes some aspect of the distribution of acoustic energy in one frequency bin of the one-minute spectrogram. The dimension of the vector therefore equals the number of frequency bins (in this case 256) or one-half the frame size.
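The spectrogram arithmetic above can be checked with a few lines of Python. This is a minimal sketch only: the helper `bin_range` and its half-open convention (bin k spans k × 10 Hz up to, but not including, (k + 1) × 10 Hz) are illustrative assumptions and are not part of the Analysis Programs software.

```python
SAMPLE_RATE_HZ = 5120   # down-sampled rate stated in the text
FRAME_SIZE = 512        # samples per frame stated in the text

bin_width_hz = SAMPLE_RATE_HZ / FRAME_SIZE   # width of one frequency bin
n_bins = FRAME_SIZE // 2                     # bins between 0 Hz and Nyquist
nyquist_hz = SAMPLE_RATE_HZ / 2              # highest representable frequency


def bin_range(low_hz, high_hz, width_hz=bin_width_hz):
    """Hypothetical helper: indices of the bins covering [low_hz, high_hz)."""
    return list(range(int(low_hz // width_hz), int(high_hz // width_hz)))


print(bin_width_hz, n_bins, nyquist_hz)   # 10.0 256 2560.0
print(bin_range(120, 150))                # [12, 13, 14]
```

Under this convention, a band such as 120–150 Hz around the ~140 Hz call frequency maps to bins 12–14.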
In this study, we calculated four spectral indices for each one-minute segment (Sueur et al. 2014, Towsey et al. 2018b). A three-letter code denotes each spectral index.
1. Acoustic Complexity Index (ACI): A measure of the relative change in acoustic intensity (A) in each frequency bin, f, of the amplitude spectrogram:
$$\mathrm{ACI}[f] = \frac{\sum_{i} \lvert A_{i,f} - A_{i-1,f} \rvert}{\sum_{i} A_{i,f}} \tag{1}$$
where i is an index over all frames in one minute and f is an index over the 256 frequency bins.
2. Temporal Entropy (ENT): A measure of the dispersal of acoustic energy through the frames of each frequency bin. The squared amplitude values in each frequency bin are normalized to unit area and treated as a probability mass function (pmf). The entropy of the pmf vector for frequency bin f is a measure of the energy dispersal through time and is calculated as:
$$H_t[f] = -\frac{\sum_{i} \mathrm{pmf}_{i,f} \, \log_2(\mathrm{pmf}_{i,f})}{\log_2(N)} \tag{2}$$
where i is an index over all frames and N is the number of frames. To obtain a more intuitive index, we convert Ht[f] to energy concentration:
$$\mathrm{ENT}[f] = 1 - H_t[f] \tag{3}$$
3. Event Count (EVN): A measure of the number of acoustic events per minute in each frequency bin, f, of the noise-reduced decibel spectrogram. An event is counted each time the bin’s (noise-reduced) decibel value crosses the 3-dB threshold from below.
4. Power Minus Noise (PMN): The maximum decibel value in each frequency bin of the noise-reduced decibel spectrogram. This index is similar to the signal-to-noise ratio of the entire frequency bin.
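To make the four definitions concrete, the following Python sketch computes each index for a single frequency bin of a one-minute segment. It is an illustration under stated assumptions, not the Analysis Programs implementation: the function names are hypothetical, Eq. 2 is taken to be the usual normalized Shannon entropy, −Σ p log2(p) / log2(N), silent bins are assigned a value of zero, and noise reduction is assumed to have been applied already to the decibel values.

```python
import math


def aci(amplitudes):
    """Acoustic Complexity Index for one frequency bin (Eq. 1)."""
    total = sum(amplitudes)
    if total == 0:
        return 0.0  # assumption: a silent bin scores zero
    diffs = sum(abs(b - a) for a, b in zip(amplitudes, amplitudes[1:]))
    return diffs / total


def ent(amplitudes):
    """Energy concentration: 1 minus normalized temporal entropy (Eqs. 2-3)."""
    energy = [a * a for a in amplitudes]      # squared amplitudes
    total = sum(energy)
    n = len(energy)
    if total == 0 or n < 2:
        return 0.0  # assumption: undefined cases score zero
    pmf = [e / total for e in energy]         # normalize to unit area
    h = -sum(p * math.log2(p) for p in pmf if p > 0) / math.log2(n)
    return 1.0 - h


def evn(decibels, threshold_db=3.0):
    """Event count: upward crossings of the threshold in noise-reduced dB."""
    return sum(1 for a, b in zip(decibels, decibels[1:])
               if a < threshold_db <= b)


def pmn(decibels):
    """Power Minus Noise: maximum noise-reduced decibel value in the bin."""
    return max(decibels)


print(ent([1, 1, 1, 1]))          # 0.0  (energy spread evenly over frames)
print(ent([0, 0, 4, 0]))          # 1.0  (energy concentrated in one frame)
print(evn([0, 5, 1, 4, 4, 0]))    # 2
```

The two ENT examples show why the index responds to bittern calls: a few loud booms within an otherwise quiet minute concentrate the energy of a bin into a small number of frames.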
The ACI, ENT, and EVN indices were used to prepare long-duration, false-color (hereafter LDFC) spectrograms (step 3, Fig. 1, and Fig. 2; after Towsey et al. 2014). In our experience, this combination produces informative spectrograms when ACI, ENT, and EVN are assigned to the red, green, and blue channels, respectively. A 24-hour LDFC spectrogram is 1440 pixels wide, representing the 1440 minutes in a day, starting and ending at midnight, visualizing the structure of an entire soundscape at a given site over 24 hours. Minutes containing bittern calls were easy to detect in the LDFC spectrogram because of green traces in the 120–150 Hz band (red rectangles, Fig. 2). Note that these green traces indicated that the ENT index was the most sensitive in detecting one-minute recording segments that contain bittern calls.
The bittern index
In order to automate the detection of one-minute recording segments containing bittern calls, we calculated an index (hereafter referred to as the AUBI_Index) derived from the ENT and PMN spectral indices (step 4, Fig. 1). Bittern booms in this study had, on average, the highest amplitude in the 120–150 Hz band and a bandwidth of 30–60 Hz depending on the proximity of the bird to the acoustic sensor. Consequently, we used the ENT index to search for the presence of bittern calls in the 120–150 Hz band (we call this the boom-band). However, to discount broadband sounds, which cross the boom-band but are distributed over a larger proportion of the frequency spectrum, we also required an absence of acoustic activity in the bottom and top sidebands, 80–100 Hz and 170–190 Hz respectively. We allowed for variation in the booms by leaving a buffer-band of 20 Hz between the expected boom-band and sidebands. For each one-minute segment, we calculated an AUBI_ENT score equal to the sum of index values in the expected boom-band (frequency bins 12–14) minus the index values in the sidebands as follows:
ENT(BoomBandScore) = ENT_bin12 + ENT_bin13 + ENT_bin14
ENT(SidebandScore) = ENT_bin08 + ENT_bin09 + (ENT_bin17 * 0.1) + ENT_bin18
AUBI_ENT_Score = ENT(BoomBandScore) - ENT(SidebandScore)
Note that ENT_bin17 was down-weighted because booms originating close to the acoustic sensor can have acoustic energy in the upper sideband.
Whereas the ENT index is sensitive to the presence of bittern calls, the entropy calculations normalize and consequently remove amplitude information. We therefore calculated an AUBI_Index equal to the maximum of the decibel values in bins 12, 13, and 14, but only if the AUBI_ENT_Score exceeded a threshold of 0.1:
If (AUBI_ENT_Score > 0.1), AUBI_Index = Max(PMN_bin12, PMN_bin13, PMN_bin14)
Otherwise AUBI_Index = 0.0.
This index has decibel units derived from the PMN index. Its value corresponds to the maximum intensity of any sound occurring only within the boom-band in a recording minute. The AUBI_Index values were plotted on ribbon charts, each 1440 pixels wide and 32 pixels high (see step 5, Fig. 1, and Fig. 3). Ribbon charts were concatenated vertically so that one image displayed the output of an entire deployment, typically six to 12 days. These calculations were performed, and charts prepared, with in-house PowerShell scripts (PS Version 7.1.4).
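The AUBI_Index rule can be restated compactly in code. The sketch below is a Python paraphrase of the formulas given above, not the in-house PowerShell scripts; the function name `aubi_index` and the input layout (two sequences indexed by frequency bin, 10 Hz per bin) are assumptions made for illustration, and the example values are invented.

```python
def aubi_index(ent, pmn, threshold=0.1):
    """Return the AUBI_Index (dB) for one one-minute segment.

    ent, pmn: sequences of ENT and PMN values indexed by frequency bin.
    """
    boom_band_score = ent[12] + ent[13] + ent[14]            # 120-150 Hz
    sideband_score = ent[8] + ent[9] + 0.1 * ent[17] + ent[18]
    aubi_ent_score = boom_band_score - sideband_score
    if aubi_ent_score > threshold:
        return max(pmn[12], pmn[13], pmn[14])
    return 0.0


# Invented example: energy concentrated in the boom-band only.
ent_call = [0.0] * 256
ent_call[12] = ent_call[13] = ent_call[14] = 0.2
pmn_vals = [0.0] * 256
pmn_vals[12], pmn_vals[13], pmn_vals[14] = 8.0, 14.5, 11.0
print(aubi_index(ent_call, pmn_vals))        # 14.5

# Invented example: broadband sound that also fills the sidebands.
ent_broadband = [0.2] * 256
print(aubi_index(ent_broadband, pmn_vals))   # 0.0
```

The second example illustrates the purpose of the sideband term: a sound that crosses the boom-band but also occupies the sidebands yields a negative AUBI_ENT_Score and is discounted.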
Selecting hours of recording for detailed analysis
Because it was impractical to listen to all the recording hours obtained, we used the AUBI_Index charts to inform the selection of two hours of recording from each of the four sites, eight hours in total. Four of the selected hours were at the same date and time, 19:00h to 20:00h on 22 November 2021, one from each site (red rectangles, Fig. 3). This time of day was selected to be within the evening hours required by the Barmah Forest Triangulation Survey Protocol (New South Wales Government. Planning, Industry and Environment. 2022), the period when bittern calling is expected to peak. We refer to these as the four BTSP hours (step 6a, Fig. 1), where BTSP stands for Barmah Forest Triangulation Survey Protocol. The date and hour were selected visually as those for which the AUBI_Index charts indicated high call rates simultaneously at all four sites (red rectangles, Fig. 3).
The remaining four recording hours were visually selected from the peak density vocalization time at each site (step 6b, Fig. 1). Again, these hours were selected by an examination of the AUBI_Index charts (blue rectangles, Fig. 3). We refer to these as the peak density hours. In all cases, these were in the very early morning, well prior to sunrise, when the sites would have been logistically difficult to access because of darkness and flooding.
Identifying and measuring boom-trains in acoustic recordings
For each of the eight hours of recording, every call/boom-train was identified by the lead author, both visually in standard-scale spectrograms and aurally. The following properties were recorded for each call (step 7, Fig. 1): (1) the start-time of the call, (2) the number of booms in the boom-train, and (3) the signal-to-noise ratio (SNR) of the call.
The call SNR was calculated for the second boom in each boom-train, the first if only one boom was present, using the spectral profile facility in the audio package Audacity (https://www.audacityteam.org/). Decibel values were obtained for the local peak maximum in the 120–150 Hz range. An estimate of background noise for each boom-train was obtained from the minimum decibel value at 130 Hz in the one second preceding the first boom of the boom-train. These local background decibel values were subsequently averaged over the hour. The SNR for each boom-train was obtained by subtracting the averaged background noise value from the second boom maximum. Single boom calls, where it appeared that the boom was the start of a failed call, were excluded from the analysis.
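The SNR bookkeeping described above can be sketched as follows. This is a minimal Python illustration, not the authors' workflow (the measurements were taken manually in Audacity); the function name `call_snrs`, the tuple layout, and the decibel values in the example are hypothetical.

```python
from statistics import mean


def call_snrs(calls):
    """Compute one SNR (dB) per boom-train for a single recording hour.

    calls: list of (peak_db, background_db) pairs, where peak_db is the peak
    level of the measured boom in the 120-150 Hz range and background_db is
    the minimum level at 130 Hz in the second preceding the first boom.
    """
    # Local background values are averaged over the hour ...
    hourly_background_db = mean(bg for _, bg in calls)
    # ... and the averaged value is subtracted from each boom maximum.
    return [peak - hourly_background_db for peak, _ in calls]


# Made-up values for three boom-trains in one hour.
example = [(-40.0, -72.0), (-46.0, -70.0), (-52.0, -74.0)]
print(call_snrs(example))   # [32.0, 26.0, 20.0]
```

Averaging the background over the hour means that differences in SNR between calls reflect differences in received boom level rather than moment-to-moment fluctuations in noise.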
Calculating two calling parameters
Our estimates of abundance depended on the calculation of two call parameters: (1) the refractory period between boom-trains, i.e., a recovery time during which an organ is incapable of repeating the same action; and (2) the maximum variation in call amplitude for the same bird over an extended period (step 8, Fig. 1). We derived values for these two parameters by measuring the call properties of two dominant bitterns labeled birds A and B (Table 1). The two birds, both from site 3, were selected because, first, they called for an extended period within a one-hour recording and, second, their calls had distinct characteristics by which they could be clearly identified as coming from the same individual.
The parameter values were derived by reviewing standard-scale spectrograms. Based on these values, we determined the refractory period between consecutive boom-trains to be two minutes. That is, if the time interval between the end of one boom-train and the beginning of the next was less than two minutes, the two calls were counted as coming from different birds. In addition, the allowed variation in call amplitude was determined by measuring the range of call amplitudes in a sequence of calls from birds A and B. If the difference in amplitude between two consecutive boom-trains exceeded 3 dB, they were counted as coming from different birds.
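The two rules can be expressed as a simple test on a pair of consecutive boom-trains. The Python sketch below is illustrative only: the function and constant names are hypothetical, and a result of True means only that neither rule separates the two calls, not that they necessarily come from the same bird.

```python
REFRACTORY_MIN = 2.0          # minimum gap (minutes) between calls of one bird
MAX_AMPLITUDE_DIFF_DB = 3.0   # maximum SNR change between consecutive calls


def could_be_same_bird(prev_end_min, next_start_min, prev_snr_db, next_snr_db):
    """Return False if either rule assigns the two calls to different birds."""
    gap_min = next_start_min - prev_end_min
    if gap_min < REFRACTORY_MIN:
        return False   # second call falls inside the refractory period
    if abs(next_snr_db - prev_snr_db) > MAX_AMPLITUDE_DIFF_DB:
        return False   # amplitude difference exceeds the allowed variation
    return True


print(could_be_same_bird(10.0, 11.5, 30.0, 30.5))   # False (gap of 1.5 min)
print(could_be_same_bird(10.0, 12.5, 30.0, 31.0))   # True
print(could_be_same_bird(10.0, 13.0, 30.0, 34.5))   # False (4.5 dB apart)
```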
Estimating the number of calling bitterns
To assist with analysis of the almost 1000 boom-trains in eight hours of recordings, the boom-trains were graphed onto scatter plots with minutes on the x-axis and call-SNR on the y-axis (step 9, Fig. 1). We visually grouped calls based on their properties: number of booms per call, call amplitude, and the interval between calls. Other distinctive call characteristics, such as regularity of the boom intervals within a boom-train and inhalation or pump components assisted in assigning calls to putative individuals (McGregor and Byle 1992, Gilbert et al. 1994, Gilbert et al. 2002).
We used three different methods to estimate the number of calling bitterns in each one-hour recording (steps 10a, 10b, 10c, Fig. 1). These estimates differed in the assumptions made to obtain them:
1. The two-minute estimate (step 10a, Fig. 1): This estimate was derived from the maximum number of boom-trains in any two-minute segment of the recording hour, with segments advanced in one-minute steps so that each overlaps the previous one. The choice of two-minute intervals was based on the previously calculated call-refractory parameter and therefore assumes that the same bittern does not call twice within a two-minute interval. The amplitude parameter rule is not relevant in this estimate. Note that this method underestimates the true number of calling birds because it is unlikely that all birds will call within the same two-minute interval.
2. The sixty-minute estimate (step 10c, Fig. 1): In this estimate, the total number of call groupings derived from the full 60 minutes of recording was taken as the number of calling birds. This could overestimate the true number if one or more birds moved closer to or farther from the microphone during the hour, creating multiple groupings of calls at different decibel levels. Bitterns are, however, known to stand immobile among reeds for long durations when repeatedly calling (Voisin 1991). Note that the purpose of selecting four BTSP hours at the same time of day was to reduce the likelihood of an overestimate if birds were moving between sites.
3. The ten-minute estimate (step 10b, Fig. 1): The call grouping process (step 9, Fig. 1) was repeated for each of the six consecutive, i.e., non-overlapping, 10-minute segments independently. Thus, six estimates of the number of calling birds were derived from each recording hour and we used the maximum. This estimate reflects a trade-off between two uncertainties: an underestimate if different birds are calling in each two-minute recording segment, and an over-estimate if birds are moving within or between sites over one hour of recording.
RESULTS
Call distribution as determined by the AUBI_Index
Australasian Bitterns were detected at all four sites from the 905 hours of acoustic recordings analyzed. The selection of calling hours for further analysis depended on the AUBI_Index reliably detecting recording minutes containing bittern calls. Like all automated techniques, the bittern index did produce a low background rate of false-positive detections that was site-dependent. These were typically due to distant truck noise and the vagaries of wind. For our sites, a typical background false-positive rate can be observed in Fig. 3. For the six hours between 09:00h and 15:00h on four consecutive days, the error rate, i.e., minutes incorrectly identified as containing bittern calls, was 3%. Our target was, however, to locate one-hour, not one-minute, recording segments, which contained the largest number of bittern calls. In the many thousands of false-color spectrograms and bittern index charts that we have reviewed, a dense cluster of hits (Fig. 3) has always directed us to a one-hour recording with a correspondingly large number of bittern calls as verified by an expert. Likewise, we have never encountered a false-negative hour.
The AUBI_Index revealed peak vocalization periods across all sites in the pre-dawn period between 02:30h and 04:30h and in the post-dusk period between 19:00h and 20:00h (AEST, UTC+10; Fig. 3). Calling was not restricted to those periods, however, and varied greatly between sites and days. Wind and rain were clearly identifiable acoustic events in the LDFC spectrograms and, during these events, bittern calling consistently ceased, even during expected peak calling hours. In total, 447 boom-trains/calls were identified in the standard-scale spectrograms of the BTSP recordings (column 2, Table 2) and 528 calls in the peak density recordings (column 3, Table 2).
Calculation of the number of calling bitterns
The boom-trains for each selected hour at each site were displayed on scatter plots (step 9, Fig. 1). An example from the peak density hour at site 6 is shown in Fig. 4. Because of the high number of calls (171 calls in the example; Fig. 4, Table 2), we further prepared separate plots for each set of calls having the same boom count per call. An example from site 6 showing 69 calls, each with four booms, is shown in Fig. 5. Calls assigned to the same putative individual were grouped using boxes (Figs. 5 and 6) subject to the following constraints: (1) no box should contain two consecutive calls separated by an interval of less than two minutes; (2) the majority of calls in a box should fall within a 3-dB band and all within a 6-dB band; and (3) boxes should overlap by no more than 1.5 dB. Secondary call characteristics were used to assign calls where the above conditions were insufficient. We applied three methods to estimate the number of calling birds from the available call plots (steps 10a, 10b, 10c, Fig. 1).
The two-minute estimate
Each hour of recording was divided into 59 two-minute segments, each overlapping the previous by one minute. The number of complete or partial calls was counted in each segment without regard to number of booms, amplitude, or other call characteristics. The maximum of the 59 values was taken as the bird count. This estimate assumes a refractory period of at least two minutes is required between calls from the same bird. Summing over sites, the two-minute estimates were 30 calling individuals for the BTSP hours and 28 calling individuals for the peak density hours (Table 3).
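The sliding-window count described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used, which relied on call plots: the function name is hypothetical, each call is represented by a (start, end) pair in minutes from the start of the hour (the end times are an assumption needed to detect partial calls), and the example values are invented.

```python
def two_minute_estimate(calls, hour_min=60, window_min=2, step_min=1):
    """Maximum number of calls overlapping any two-minute window.

    calls: list of (start_min, end_min) pairs within one recording hour.
    """
    n_windows = (hour_min - window_min) // step_min + 1   # 59 for the defaults
    best = 0
    for k in range(n_windows):
        w_start = k * step_min
        w_end = w_start + window_min
        # A call counts if any part of it falls inside the window.
        count = sum(1 for start, end in calls
                    if start < w_end and end > w_start)
        best = max(best, count)
    return best


# Invented example: three calls overlap the window covering minutes 0-2.
example_calls = [(0.2, 0.4), (1.1, 1.3), (1.8, 2.1), (30.0, 30.2)]
print(two_minute_estimate(example_calls))   # 3
```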
The 60-minute estimate
Calls having the same number of booms were grouped as they appeared in the entire 60-minute recording. For example, in the case of the peak density hour from site 6, we identified nine groupings equating to nine birds (Fig. 5). These groups were based on the previously calculated parameters for refractory period and maximum call amplitude range. Note that in Fig. 5, the calls in the 45–48 dB range were grouped as two birds because there were differences between the distinctive boom or pump characteristics of the two groups. Likewise, the calls in the 40–44 dB range were grouped as two birds because two consecutive calls occurred within the two-minute refractory threshold. Nine birds may be an overestimate for this recording, because it is possible that some birds moved location during the hour, thus changing the amplitude of their detected calls.
Summing over sites, the 60-minute estimates were 60 calling individuals for the BTSP hours and 73 calling individuals for the peak density hours (Table 3). To accommodate the possibility that these values are over-estimates, we derived a 10-minute estimate.
The 10-minute estimate
The one hour of recording was divided into six non-overlapping 10-minute segments and the calls were grouped independently in each segment. We took the maximum of the six values. See an example in the bottom part of Fig. 6 where the minimum number of call groupings in a 10-minute segment was two and the maximum was six. Summing over sites, the 10-minute estimates were 48 calling individuals for the BTSP hours and 58 calling individuals for the peak density hours (Table 3).
The estimated number of calling birds over all four sites in the BTSP recordings ranged from 30 to 60 birds (Table 3). For the peak density hours, the estimates ranged from 28 to 73 birds (Table 3). The peak density hours yielded higher counts of Australasian Bitterns than the BTSP hours in the 10-minute and 60-minute estimates (58 and 73, respectively).
DISCUSSION
Ecologists worldwide are in a sprint to monitor species threatened by their rapidly changing ecosystems. One solution for vocal species is to work more efficiently at larger scales, using passive recording devices to monitor multiple sites simultaneously for extended periods. In this study we use acoustic recordings to increase the spatial and temporal scale of monitoring effort for the Australasian Bittern. We collected 905 hours of acoustic recordings from four sites at the Barmah-Millewa Forest. On-site surveys such as the triangulation method, by comparison, are limited by the budgetary, logistical, and safety constraints of extended field work.
One of the challenges with acoustic recordings is to determine the number of calling birds and their locations from a sequence of calls. It typically requires three microphones and time-difference-of-arrival analysis to determine the 2D spatial distribution of call sources. But this can be difficult in a noisy environment, particularly in a wetland where multiple frog, bird, and insect species may be calling at one time. The principal innovation in this study was to demonstrate how estimates of relative abundance of calling bitterns can be derived from one-channel acoustic recordings based on a knowledge of their calling behavior. We obtained three estimates of abundance: the two-minute estimate (28–30 calling birds), the 10-minute estimate (48–58 calling birds), and the 60-minute estimate (60–73 calling birds). These estimates relied on the following assumptions:
1. Individual birds do not begin a boom-train within a two-minute interval from the end of their previous call. This is presumed to be a refractory period as the bird recovers from the exertions of booming. The two-minute value was determined empirically by selecting sequences of boom-trains that could be reliably identified as coming from the same bird (Table 1). The sensitivity of our estimates to the value of this parameter is a subject for future work.
2. Individual bitterns have a set number of booms in their boom-train (Gilbert et al. 1994, Poulin and Lefebvre 2003a). This assumption is not required for the two-minute estimate.
3. The SNR of consecutive boom-trains from the same bird does not vary by more than 3 dB. This value was obtained empirically by measuring boom-train sequences attributable to the same bird (Table 1). This is a significant result because it is relevant to the question of how the recorded call amplitude is affected by the direction in which a calling bird faces relative to the microphone. Assuming no change in the call amplitude at source, our results suggest that some 68% (one standard deviation) of call amplitudes fall within a 3-dB band and 95% within a 6-dB band for a series of calls given by the same bird. Note that 6-dB differences are equivalent to a doubling/halving of source distance from the microphone (Morton 1975). Bitterns are known to remain immobile for long durations when repeatedly calling (Voisin 1991). Once again, this assumption is not required for the two-minute segment approach. Furthermore, any aural survey technique for bitterns requires making assumptions about the effect of varying call direction on received call amplitude.
4. Birds do not move a significant distance during any one measurement interval, thereby changing the received call amplitude. Note that this assumption also applies to some on-site methods, e.g., the triangulation method, but does not apply to our two-minute estimate. In practice the triangulation technique for bitterns requires the assumption that differences in received call volume along a given direction imply different calling individuals (Williams et al. 2019).
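The 6-dB figure quoted in assumption 3 can be checked with a short calculation. This is only a sketch: it assumes idealized spherical spreading from a point source in a free field and ignores excess attenuation from vegetation, ground, and atmosphere. The symbols $d_1$, $d_2$, and $\Delta L$ are introduced here for illustration and do not appear elsewhere in the paper.

```latex
% Level difference between two source distances d_1 and d_2 under spherical spreading
\Delta L = 20 \log_{10}\!\left(\frac{d_2}{d_1}\right) \ \text{dB}

% Doubling the distance (d_2 = 2 d_1):
\Delta L = 20 \log_{10} 2 \approx 6.02 \ \text{dB}

% Conversely, a 3-dB difference corresponds to a distance ratio of
\frac{d_2}{d_1} = 10^{3/20} \approx 1.41
```

Under these assumptions, the 3-dB tolerance in assumption 3 is equivalent to a change in source distance of roughly 40%.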
The two-minute estimate (28–30 calling birds) is similar to previous estimates obtained by one-hour triangulation surveys at the same sites, i.e., 22 calling birds in 2021–2022 (Znidersic and Towsey 2022) and 29 calling birds in 2022–2023 (Znidersic and Towsey 2023). As noted above, our two-minute estimate is likely to be an underestimate because not all birds will call in the same two-minute interval. Likewise, the triangulation estimates are likely to be underestimates for the simple reason that, in most cases, they were not obtained at peak calling times because of difficulties in accessing sites. Nevertheless, the comparison is useful because it suggests that, despite the assumptions required, our estimates of abundance derived from acoustic recordings are consistent with previous estimates obtained by triangulation.
We believe that our different estimates of abundance may be advantageous in different circumstances. An advantage of the two-minute estimate is that it is fast to obtain: it does not require the more time-consuming calculations of call amplitude and boom number. In addition, it reduces dependence on the assumption that birds do not move during the count interval. The two-minute estimate may only be useful in a wetland that is densely populated with calling males and where a quick estimate is required. Although we acknowledge that the 60-minute estimate could be an overestimate if birds are moving during the hour, it may be appropriate in a small, isolated wetland where the ecologist believes that bird movement is less significant. It has been suggested that the movement of bitterns is minimal during peak calling times because the birds are defending their territories (Teal 1989, Gilbert et al. 2002). The 10-minute estimate is a trade-off between these two uncertainties: the likelihood that different birds are calling in each two-minute recording segment versus the likelihood that birds move within or between sites over one hour of recording. A 10-minute survey period has previously been used to estimate Great Bittern abundance during on-ground surveys (Poulin and Lefebvre 2003b). By breaking each hour into 10-minute segments, we were able to confirm that bitterns do not call consistently over the hour.
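One way to see how the interval length and the assumptions listed above interact is a minimal Python sketch. It is not the procedure used in this study: the greedy matching rule, the choice of taking the largest per-interval count within the hour, and all names (`BoomTrain`, `could_be_same_bird`, `count_birds`, `estimate`) are assumptions made for illustration, and the example boom-trains are invented.

```python
from __future__ import annotations

from dataclasses import dataclass

REFRACTORY_S = 120.0  # assumption 1: two-minute refractory period
SNR_TOL_DB = 3.0      # assumption 3: same-bird SNR tolerance


@dataclass
class BoomTrain:
    start_s: float  # start time, seconds from the start of the hour
    end_s: float    # end time, seconds from the start of the hour
    booms: int      # number of booms in the train
    snr_db: float   # signal-to-noise ratio in decibels


def could_be_same_bird(prev: BoomTrain, new: BoomTrain) -> bool:
    """True if `new` is compatible with being the next call of the bird that gave `prev`."""
    return (
        new.start_s - prev.end_s >= REFRACTORY_S
        and new.booms == prev.booms
        and abs(new.snr_db - prev.snr_db) <= SNR_TOL_DB
    )


def count_birds(trains: list[BoomTrain]) -> int:
    """Greedy count of distinct callers among the boom-trains of one interval."""
    last_call: list[BoomTrain] = []  # most recent boom-train of each putative bird
    for train in sorted(trains, key=lambda t: t.start_s):
        for i, prev in enumerate(last_call):
            if could_be_same_bird(prev, train):
                last_call[i] = train  # attribute to an already-counted bird
                break
        else:
            last_call.append(train)   # no compatible bird: count a new one
    return len(last_call)


def estimate(trains: list[BoomTrain], interval_s: float, hour_s: float = 3600.0) -> int:
    """Largest per-interval count over the consecutive intervals of one hour."""
    n_intervals = int(hour_s // interval_s)
    return max(
        count_birds([t for t in trains if k * interval_s <= t.start_s < (k + 1) * interval_s])
        for k in range(n_intervals)
    )


# Invented boom-trains, for illustration only (not data from the study).
example = [
    BoomTrain(0, 10, 3, 27.0),
    BoomTrain(30, 42, 4, 33.0),
    BoomTrain(150, 160, 3, 28.0),
    BoomTrain(170, 182, 4, 32.0),
    BoomTrain(200, 208, 3, 20.0),
    BoomTrain(400, 410, 6, 30.0),
    BoomTrain(700, 710, 5, 25.0),
]
for minutes in (2, 10, 60):
    print(minutes, estimate(example, minutes * 60))
```

Within a two-minute interval, the refractory condition can never be met, so every boom-train is counted as a different bird and boom number and SNR play no role. In longer intervals, the count can only grow as birds that call at different times are added, which is consistent with the ordering of the three estimates reported here.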
The standard protocol for on-site monitoring of Australasian Bittern assumes that the birds are most likely to call in the pre-dawn and post-dusk hours. An examination of our AUBI_Index charts reveals that, although this is generally true, actual calling times varied greatly. It is frequently acknowledged that environmental factors such as wind, cloud-cover, rain, and moon phase influence the detection probability of wetland birds (Spear et al. 1999, Conway and Gibbs 2011), including the Australasian Bittern (Williams et al. 2019) and Great Bittern (Lefebvre and Poulin 2003). A more detailed visual examination of our LDFC spectrograms (Fig. 2) and AUBI_Index charts (Fig. 3) revealed that calling ceased or declined significantly in windy conditions. Even within the eight analyzed hours, the calling rate varied.
The triangulation method has been one of the primary monitoring methods applied in Australia and in small New Zealand wetlands for bittern detection along with search counts (O’Donnell et al. 2013, O’Donnell and Williams 2015). The method is simple in theory, but in practice significant systematic errors are likely. Sites with complex and loud biophony can easily saturate an observer’s ability to concentrate and identify bittern calls. In a large open wetland, it can be difficult to determine the direction of a boom because of multiple echo paths and duetting bitterns when calls are closely aligned in time. Our estimate of the call direction error is ±15° in compass bearing, similar to the findings of Lefebvre and Poulin (2003), who estimated the bearing accuracy for Great Bitterns at ±13.6° in windless conditions. The triangulation method is possibly more suited to wetlands with a low abundance of bitterns and the capacity to perform surveys over multiple hours and days. As stated by Williams et al. (2018:6839), “By bad luck, when monitoring species that are difficult to detect (cryptic), sampling designs become dictated by what is feasible rather than what is desired.”
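To give a sense of scale for a bearing error of this size, the following worked equation converts an angular error into a lateral position error. It is an illustration only: the observer-to-bird distance of 500 m is a hypothetical value, not one taken from this study, and the symbols $e$, $d$, and $\theta$ are introduced here.

```latex
% Lateral offset e at distance d for a bearing error \theta
e = d \tan\theta

% Hypothetical example: d = 500 m, \theta = 15^{\circ}
e = 500 \times \tan 15^{\circ} \approx 500 \times 0.268 \approx 134 \ \text{m}
```

Because the offset grows in proportion to distance, the same bearing error translates into a larger positional uncertainty in a large open wetland than in a small one.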
Our method contributes a novel approach for monitoring the Australasian Bittern. A significant advantage of our analysis is that it does not require the preparation of a machine-learning call recognizer. The most obvious challenge of acoustic monitoring is that more sound data can be accumulated than can be listened to, necessitating some form of automated processing. The usual procedure is to train a machine-learning classifier to detect the call(s) of interest. But these methods require the construction of training datasets, and it is easy to underestimate the time, expertise, and thoughtful consideration required to prepare datasets that capture the variability and amplitude range of the calls of interest. And once a good dataset has been obtained, considerable time is required to weed out the inevitable false-positives in the output. Instead, our technique depends on calculating an AUBI_Index, which is, in turn, derived from a set of acoustic indices that are calculated at one-minute resolution. These are sufficiently accurate to identify one-hour segments within many hundreds of hours of recording that have a high density of the target call. This enables the ecologist to focus their attention on the final selection of true-positives, a task that requires a skilled appreciation of call details that is typically beyond the capability of standard machine learning methods. Our approach not only obviates the need for preparing training sets but also the need to score potentially thousands of recognizer hits as true- or false-positives. It should be remembered that the advantage of an automated technique depends on the total time required to prepare, process, and confirm the results being less than the manual alternative. By a visual inspection of the AUBI_Index charts (Fig. 3), we were able to immediately identify peak calling hours, i.e., those having a high density of calls. This reduced the scale of manual analysis from approximately 905 hours to eight, thus concentrating the ecologist’s attention on hours with high call-density.
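The selection of peak hours from a one-minute-resolution index can be sketched in a few lines of Python. This is a stand-in for illustration, not the procedure used here, where peak hours were identified by visual inspection of the charts. The function name `peak_hours`, the ranking by summed index, and the example values are all assumptions; the calculation of the AUBI_Index itself is not reproduced.

```python
from __future__ import annotations


def peak_hours(index_per_minute: list[float], n_hours: int = 8) -> list[int]:
    """Return the positions of the n_hours whole hours with the largest summed index.

    `index_per_minute` holds one index value per recorded minute, in time order.
    Hours are numbered from 0 at the start of the recording; a trailing partial
    hour is ignored.
    """
    n_full = len(index_per_minute) // 60
    totals = [sum(index_per_minute[h * 60:(h + 1) * 60]) for h in range(n_full)]
    ranked = sorted(range(n_full), key=lambda h: totals[h], reverse=True)
    return sorted(ranked[:n_hours])  # report the chosen hours in time order


# Invented five-hour series: calling concentrated in hours 1 and 3.
series = [0.0] * 300
for minute in range(60, 120):
    series[minute] = 0.5
for minute in range(180, 240):
    series[minute] = 0.9

print(peak_hours(series, n_hours=2))
```

A ranking of this kind only shortlists hours for the ecologist; the confirmation of individual calls within each shortlisted hour remains a manual step.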
Another significant advantage of acoustic monitoring is the ability to review the same call where there is doubt. Acoustic recordings also offer the possibility of identifying individual bitterns from their distinctive boom-train characteristics that can be better appreciated after repeated playback (Gilbert et al. 1994, Puglisi et al. 1997). The most time-consuming component of our method was calculation of the boom-train SNRs.
The advantage of long-duration acoustic recordings for monitoring bitterns is that they not only tell us when the birds are calling, i.e., presence, but they also reveal an ever-changing rate of calling. What we have attempted to do in this study is to convert calling rate to an estimate of calling-bird count, by leveraging what is known about the characteristics of bitterns’ calling behavior. Although some uncertainties remain, this is also true for other techniques when studying a highly cryptic species. If the technique we describe is performed in a consistent, standardized manner, it can be used to determine population trends, which is an important outcome for a threatened species.
CONCLUSIONS
Long-duration acoustic monitoring offers the possibility of detecting a species during peak calling times, regardless of when this might be. With climatic uncertainty, we can no longer assume that annual calling patterns will remain consistent over a sequence of years. And, of course, temporal vocalization patterns and their peak calling times are likely to be country- and locality-dependent. Acoustic monitoring allows the researcher to cope with such uncertainty. The approach described in this paper is a step forward in monitoring the Australasian Bittern and other species with similar call attributes from single-channel recordings.
AUTHOR CONTRIBUTIONS
Elizabeth Znidersic made a substantial contribution to the concept and design of the study, the data collection, data analysis and interpretation, drafting the paper, and critical revision of the paper. David Watson contributed to the drafting of the paper and revision. Michael Towsey made a substantial contribution to the concept and design of the study, the data collection, data analysis and interpretation, drafting the paper, and critical revision of the paper.
ACKNOWLEDGMENTS
We acknowledge and thank NSW Parks and Victoria Parks Service for support during this study. We also acknowledge the Traditional Owners of the land this study was conducted on, and their support and guidance.
LITERATURE CITED
Bardeli, R., D. Wolff, F. Kurth, M. Koch, K.-H. Tauchert, and K.-H. Frommolt. 2010. Detecting bird sounds in a complex acoustic environment and application to bioacoustic monitoring. Pattern Recognition Letters 31:1524-1534. https://doi.org/10.1016/j.patrec.2009.09.014
Conway, C. J. 2011. Standardized North American marsh bird monitoring protocol. Waterbirds 34:319-346. https://doi.org/10.1675/063.034.0307
Conway, C. J., and J. P. Gibbs. 2011. Summary of intrinsic and extrinsic factors affecting detection probability of marsh birds. Wetlands 31:403-411. https://doi.org/10.1007/s13157-011-0155-x
Finlayson, C. M. 2012. Forty years of wetland conservation and wise use. Aquatic Conservation: Marine and Freshwater Ecosystems 22:139-143. https://doi.org/10.1002/aqc.2233
Frommolt, K.-H., and K.-H. Tauchert. 2014. Applying bioacoustics methods for long-term monitoring of a nocturnal wetland bird. Ecological Informatics 21:4-12. https://doi.org/10.1016/j.ecoinf.2013.12.009
Gilbert, G., P. McGregor, and G. Tyler. 1994. Vocal individuality as a census tool: practical considerations illustrated by a study of two rare species. Journal of Field Ornithology 65:335-348.
Gilbert, G., G. A. Tyler, and K. W. Smith. 2002. Local annual survival of booming male Great Bittern Botaurus stellaris in Britain, in the period 1990–1999. Ibis 144:51-61. https://doi.org/10.1046/j.0019-1019.2001.00012.x
Herring, M. W., P. Barratt, A. H. Burbidge, M. Carey, A. Clarke, S. Comer, B. Green, R. Pickering, C. Purnell, A. Silcocks, et al. 2021. Australasian Bittern Botaurus poiciloptilus. Pages 222-224 in S. T. Garnett and G. B. Baker, editors. The Action Plan for Australian Birds 2020. CSIRO Publishing, Melbourne, Australia.
Herring, M. W., W. Robinson, K. K. Zander, and S. T. Garnett. 2019. Rice fields support the global stronghold for an endangered waterbird. Agriculture, Ecosystems & Environment 284:106599. https://doi.org/10.1016/j.agee.2019.106599
Lefebvre, G., and B. Poulin. 2003. Accuracy of bittern location by acoustic triangulation. Journal of Field Ornithology 74:305-311. https://doi.org/10.1648/0273-8570-74.3.305
Marchant, S., and P. J. Higgins. 1990. Handbook of Australian, New Zealand, and Antarctic birds. Volume one - Ratites to Ducks. Oxford University Press, Melbourne, Australia.
Martin, K., O. Adam, N. Obin, and V. Dufour. 2022. Rookognise: acoustic detection and identification of individual rooks in field recordings using multi-task neural networks. Ecological Informatics 72:101818. https://doi.org/10.1016/j.ecoinf.2022.101818
McGregor, P. K., and P. Byle. 1992. Individually distinct bittern booms: potential as a census tool. Bioacoustics 4:93-109. https://doi.org/10.1080/09524622.1992.9753210
Morton, E. S. 1975. Ecological sources of selection on avian sounds. American Naturalist 109:17-34. https://doi.org/10.1086/282971
New South Wales (NSW) Department of Planning, Industry, and Environment. 2022. Request for tender, statement of requirements, TLM-intervention monitoring, Australasian and Australian Little Bittern presence and breeding surveys 2022–23. NSW Department of Planning, Industry, and Environment, Parramatta, Australia.
O’Donnell, C. F. J. 2011. Breeding of the Australasian Bittern (Botaurus poiciloptilus) in New Zealand. Emu - Austral Ornithology 111:197-201. https://doi.org/10.1071/MU10059
O’Donnell, C. F. J., and E. M. Williams. 2015. Protocols for the inventory and monitoring of populations of the endangered Australasian Bittern (Botaurus poiciloptilus) in New Zealand. Department of Conservation Technical Series 38, Department of Conservation, Wellington, New Zealand.
O’Donnell, C. F. J., E. M. Williams, and J. Cheyne. 2013. Close approaches and acoustic triangulation: techniques for mapping the distribution of booming Australasian Bittern (Botaurus poiciloptilus) on small wetlands. Notornis 60:279-284.
Poulin, B., and G. Lefebvre. 2003a. Optimal sampling of Booming Bitterns Botaurus stellaris. Ornis Fennica 80:11-20.
Poulin, B., and G. Lefebvre. 2003b. Variation in booming among Great Bitterns Botaurus stellaris in the Camargue, France. Ardea 91:177-181.
Puglisi, L., O. Cima, and N. E. Baldaccini. 1997. A study of the seasonal booming activity of the bittern Botaurus stellaris; what is the biological significance of the booms? Ibis 139:638-645. https://doi.org/10.1111/j.1474-919X.1997.tb04686.x
Riede, T., C. M. Eliason, E. H. Miller, F. Goller, and G. A. Clarke. 2016. Coos, booms, and hoots: the evolution of closed-mouth vocal behavior in birds. Evolution 70:1734-1746. https://doi.org/10.1111/evo.12988
Rone, B. K., C. L. Berchok, J. L. Crance, and P. J. Clapham. 2012. Using air-deployed passive sonobuoys to detect and locate critically endangered North Pacific right whales. Marine Mammal Science 28:E528-E538. https://doi.org/10.1111/j.1748-7692.2012.00573.x
Spaggiari, J., V. Chartendrault, and N. Barre. 2006. Zones importantes pour la conservation des oiseaux de Nouvelle-Calédonie. Societe Caledonienne d’Ornithologie, Nouméa, Nouvelle-Calédonie.
Spear, L. B., S. B. Terrill, C. Lenihan, and P. Delevoryas. 1999. Effects of temporal and environmental factors on the probability of detecting California Black Rails Laterallus jamaicensis coturniculus. Journal of Field Ornithology 70:465-480.
Sueur, J., A. Farina, A. Gasc, N. Pieretti, and S. Pavoine. 2014. Acoustic indices for biodiversity assessment and landscape investigation. Acta Acustica United with Acustica 100:772-781. https://doi.org/10.3813/AAA.918757
Teal, P. J. 1989. Movement, habitat use, and behavior of Australasian Bittern Botaurus poiciloptilus in the lower Waikato wetlands. Thesis. University of Waikato, Hamilton, New Zealand.
Towsey, M., A. Truskinger, M. Cottman-Fields, and P. Roe. 2018a. Ecoacoustics Audio Analysis Software, Version v18.03.0.41. Zenodo, CERN European Organization for Nuclear Research, Geneva, Switzerland.
Towsey, M., E. Znidersic, J. Broken-Brow, K. Indraswari, D. M. Watson, Y. Phillips, A. Truskinger, and P. Roe. 2018b. Long-duration, false-color spectrograms for detecting species in large audio datasets. Journal of Ecoacoustics 2:6. https://doi.org/10.22261/JEA.IUSWUI
Towsey, M., L. Zhang, M. Cottman-Fields, J. Wimmer, J. Zhang, and P. Roe. 2014. Visualization of long-duration acoustic recordings of the environment. Procedia Computer Science 29:703-712. https://doi.org/10.1016/j.procs.2014.05.063
Voisin, C. 1991. The herons of Europe. Poyser, London, UK.
Williams, E. M., D. P. Armstrong, and C. F. J. O’Donnell. 2019. Modeling variation in calling rates to develop a reliable monitoring method for the Australasian Bittern Botaurus poiciloptilus. Ibis 161:260-271.
Williams, E. M., C. F. J. O’Donnell, and D. P. Armstrong. 2018. Cost-benefit analysis of acoustic recorders as a solution to sampling challenges experienced monitoring cryptic species. Ecology and Evolution 8:6839-6848. https://doi.org/10.1002/ece3.4199
Znidersic, E., and M. W. Towsey. 2022. The Living Murray-Intervention Monitoring: Australasian Bittern and Australian Little Bittern presence and breeding surveys, 2021–22. Intervention monitoring in Barmah-Millewa Forest 2021. New South Wales National Parks and Wildlife Service, Parramatta, Australia.
Znidersic, E., and M. W. Towsey. 2023. Australasian Bittern and Australian Little Bittern presence and breeding surveys in Barmah-Millewa Forest 2022–23. Annual triangulation surveys and acoustic monitoring. New South Wales National Parks and Wildlife Service, Parramatta, Australia.
Table 1. The calling parameters (total number of booms, interval between consecutive boom trains, and the average signal-to-noise ratio, i.e., SNR, in decibels) of two Australasian Bitterns (Botaurus poiciloptilus; A and B) from site 3, identified by their unique call characteristics.
Call parameter | Bird A | Bird B
Total number of boom-trains | 17 calls over 38 minutes | 24 calls over 40 minutes
Interval between consecutive boom-trains (minutes) | 2.4 ± 0.5 (min 1.9, max 3.7) | 2.2 ± 0.5 (min 1.3, max 3.0)
SNR average (decibels) | 27 ± 1.6 (min 25, max 32) | 33 ± 1.5 (min 30, max 36)
Table 2. Number of Australasian Bittern (Botaurus poiciloptilus) calls in one hour by site. Column 2: number of bittern calls identified in the four Barmah Triangulation Survey Protocol (BTSP) recordings. Column 3: number of calls identified in the four peak-density recordings.
Wetland site | Number of calls in one hour, Triangulation Survey Protocol hours (22 November 2021, 19:00h–20:00h) | Number of calls in one hour, peak density hours (22, 25, 30 Nov, 1 Dec 2023)
Site 3 | 212 | 135
Site 6 | 171 | 212
Site 10 | 47 | 90
Site 15 | 17 | 91
Total | 447 | 528
Table 3. Estimates of the number of individual Australasian Bitterns (Botaurus poiciloptilus) detected from all sites. Columns 2, 3, and 4 from the Barmah Triangulation Survey Protocol (BTSP) hours (22 Nov 2021, 19:00h–20:00h); columns 5, 6, and 7 from the peak density hours.
Site | BTSP hours: two-minute estimate | BTSP hours: 10-minute estimate | BTSP hours: 60-minute estimate | Peak density hours: two-minute estimate | Peak density hours: 10-minute estimate | Peak density hours: 60-minute estimate
Site 3 | 13 | 20 | 25 | 7 | 15 | 23
Site 6 | 10 | 14 | 20 | 10 | 17 | 22
Site 10 | 5 | 10 | 11 | 6 | 12 | 13
Site 15 | 2 | 4 | 4 | 5 | 14 | 14
Total | 30 | 48 | 60 | 28 | 58 | 73