Avian Conservation and Ecology

Effects of imperfect detection on inferences from bird surveys

Rigby, E. A., and D. H. Johnson. 2025. Effects of imperfect detection on inferences from bird surveys. Avian Conservation and Ecology 20(2):3. https://doi.org/10.5751/ACE-02826-200203
  • Elizabeth A. Rigby
    U.S. Fish and Wildlife Service
  • Douglas H. Johnson
    Department of Fisheries, Wildlife, and Conservation Biology, University of Minnesota, St. Paul

    Key words: abundance; detection; detection probability; indices; point counts; simulation
    Copyright © by the author(s). Published here under license by The Resilience Alliance. This article is under a Creative Commons Attribution 4.0 International License. You may share and adapt the work provided the original author and source are credited, you indicate whether any changes were made, and you include a link to the license.
    Research Paper

    ABSTRACT

    Counts obtained from point count surveys of birds can be treated as an index to bird abundance, but imperfect detectability can complicate inferences about abundance. Detectability-adjusted analysis methods, including double observer, replicated counts, removal, and distance sampling methods, estimate detection as well as abundance but require additional information, with added logistical costs and potentially added sources of error. As a counterpoint to field-based studies, we simulated point counts of birds, modeling birds spatially as moving within territories, modeling song production as an autocorrelated process, and modeling perceptibility as a function of distance to the observer. We simulated counts with parameters reflecting surveys and behavior of Black-throated Blue Warblers (Setophaga caerulescens), analyzed counts using index and detectability-adjusted analysis methods, and then evaluated and compared the performance of analysis methods. Estimates from index methods underestimated true density of birds for all survey types but were highly correlated with true density. Adjusted estimates from distance sampling and removal analysis methods were less biased than index estimates but had reduced correlation with true density. Adjusted estimates from double-observer analysis methods were nearly unchanged from index estimates. Adjusted estimates from replicated-counts analysis methods were susceptible to highly inflated density estimates, resulting in extremely high bias and low correlation with true density. For replicated counts, the maximum count (an index method) produced less biased estimates than N-mixture model estimates. Index methods, while biased, were better correlated with true density than detectability-adjusted methods. If detection is constant and relative abundance is sufficient to meet survey objectives, using an index method is often preferable. For systems with variable detection probability where inference about absolute abundance is necessary or when detection and abundance are both expected to vary across a covariate gradient, practitioners should select detectability-adjusted methods suited to model the source of imperfect detection in their system. Ill-suited detectability-adjusted methods will not improve inference and are no more useful than an index.


    INTRODUCTION

    Point count bird surveys are commonly used to address a variety of objectives, including abundance estimation and population monitoring (Scott and Ralph 1981). Point counts are ubiquitous in avian monitoring, yet there is significant debate regarding how count information can best be used. Of particular importance is the role played by detection probability p,

    $$E(C) = pN \tag{1}$$

    where C is the count of birds obtained during a point count survey and N is the actual but unknown number of birds present.

    If p is constant, or if variation in p is small compared to variation in C (Johnson 2008), C can serve as an index to N, which is the basis for index surveys (Dawson 1981, Conroy 1996). The relationship between N and C can become muddied or totally obscured if heterogeneity in p is great or associated with sites being compared (e.g., different detection probabilities in different habitats [Blomberg and Hagen 2020] or opposing trends in abundance and detection associated with the same covariate [Kéry 2008]). If p and N are not independent, C could provide misleading information about N. Index methods do not directly provide information about p, so both p and N are unknown in field surveys unless additional data are collected. Any inferences about N from index methods must therefore rely on an assumed relationship between N and C.

    To better discuss the factors affecting detection probability, p can be broken into parts, as by Nichols et al. (2009). They address detection of birds within a superpopulation (N*), defined as all birds whose territories or home ranges at least partially overlap the area over which inferences will be made (the area of inference). Nichols et al. (2009) decomposed p from Eq. 1 into four parts: ps, the probability that a bird’s territory at least partially overlaps the surveyed area of a survey site; pp, the probability that a bird is present in the surveyed area at the time of the survey given that its territory at least partially overlaps the surveyed area of a survey site; pa, availability, the probability that a bird is available (e.g., vocalizes or is visible) during a survey, given that it is present; and pd, perceptibility, the probability that a bird is detected, given that it is present in the surveyed area, and available during the survey. The expected value for a count (E(C)) during a survey is thus

    $$E(C) = N^{*}\, p_s\, p_p\, p_a\, p_d \tag{2}$$

    The product of the individual components of detection (i.e., p in Eq. 1) is the proportion of N that is expected to be counted on a given survey, whereas the actual number observed is a random variable reflecting the stochasticity of the sampling process.
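
    As a small illustration of this relationship, the sketch below multiplies a superpopulation size by the four detection components to obtain an expected count, then draws stochastic counts whose average should approach that expectation. The numeric values and the function names are hypothetical, chosen only for illustration, and treating each bird as an independent Bernoulli trial is a simplifying assumption rather than the simulation model described in this paper.

```python
import random

def expected_count(n_star, p_s, p_p, p_a, p_d):
    """Expected count: superpopulation size times the four detection components."""
    return n_star * p_s * p_p * p_a * p_d

def simulate_count(n_star, p_s, p_p, p_a, p_d, rng):
    """One stochastic count: each bird is counted independently with
    probability equal to the product of the four components (an assumption)."""
    p = p_s * p_p * p_a * p_d
    return sum(1 for _ in range(n_star) if rng.random() < p)

# Hypothetical values, for illustration only.
n_star, p_s, p_p, p_a, p_d = 20, 0.9, 0.8, 0.9, 0.7

rng = random.Random(1)
counts = [simulate_count(n_star, p_s, p_p, p_a, p_d, rng) for _ in range(5000)]
mean_count = sum(counts) / len(counts)

print("expected count:", round(expected_count(n_star, p_s, p_p, p_a, p_d), 3))
print("mean of simulated counts:", round(mean_count, 2))
```

    Individual counts scatter around the expectation, which is the sense in which the observed count is a random variable.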

    Detection probability can be affected by a wide variety of factors (Scott and Ralph 1981, Verner 1985), including species (Diefenbach et al. 2003), survey features (length of survey, survey type; Bollinger et al. 1988, Dawson et al. 1995, Cimprich 2009), behavioral factors (singing rate, volume, motion of birds; Wilson and Bart 1985, McShea and Rappole 1997, Alldredge et al. 2007c), environmental factors (precipitation, wind speed, ambient noise, time of day, time within the breeding season, even tides; Robbins 1981a, Wilson and Bart 1985, Zembal and Massey 1987, Rosenberg and Blancher 2005, Pacifici et al. 2008, Rigby and Johnson 2019), and observer effects (hearing ability, skill, distance from the source; Sauer et al. 1994, Alldredge et al. 2007c). Survey methods can be adjusted to reduce variability in detection probability, such as by using only experienced observers (Robbins et al. 1986), training observers (Kepler and Scott 1981), or using a standardized survey length and survey radius (Ralph et al. 1993, 1995, North American Breeding Bird Survey 1998, Matsuoka et al. 2014). Survey methods may also be adjusted to maximize components of detection probability, such as maximizing pa (availability) by conducting surveys when birds are most likely to sing (Robbins 1981b) or maximizing pd (perceptibility) by constricting survey radius such that all available birds can be assumed to be detected (Ralph et al. 1995).

    Two major groups of analysis methods are available for making inferences about abundance: index methods, which assume that counts are a useful index to abundance without explicit estimation of detection, and detectability-adjusted methods, which estimate both detection probability and abundance. The simplest form of an index is the count analysis method, in which the counts from a survey are used as an estimator of abundance. Other indices add information in an effort to reduce bias. A maximum count uses the largest count among repeated visits to a site as an index to abundance. A bounded-count estimator also requires repeated visits, using twice the maximum count minus the second largest count (Robson and Whitlock 1964, Johnson et al. 2007). Common use of indices includes acknowledging their limitations; Seber (1982:458) recommended that users “simply recognize that the estimates are biased and treat them as relative rather than absolute measures of abundance.” Detectability-adjusted methods are not similarly constrained and have developed rapidly over the last two decades (reviewed in Latif et al. 2024) in concert with advances in hierarchical modeling (Royle and Dorazio 2008, Kéry and Royle 2016, 2020).
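
    The two replicated-visit indices described above are simple enough to state in a few lines. The sketch below is only an illustration: the function names and the example counts are hypothetical, and the bounded-count formula is the one given in the text (twice the maximum count minus the second largest count).

```python
def max_count(counts):
    """Maximum-count index: the largest count among repeated visits."""
    return max(counts)

def bounded_count(counts):
    """Bounded-count index: twice the largest count minus the second largest."""
    if len(counts) < 2:
        raise ValueError("the bounded-count index needs at least two visits")
    largest, second = sorted(counts, reverse=True)[:2]
    return 2 * largest - second

visits = [4, 7, 5]  # hypothetical counts from three visits to one site
print(max_count(visits), bounded_count(visits))
```

    When the two largest counts are equal, the bounded count reduces to the maximum count.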

    The motivation to adjust abundance estimation methods is rooted in the desire for precise, unbiased estimators. In some situations, abundance estimates can result in misleading conclusions if not adjusted for detection, such as when abundance and detection are confounded (Ruiz-Gutiérrez and Zipkin 2011). Adjustment methods are not all made equal, however, and use different data to estimate detection. It is also not clear that detectability-adjusted methods invariably outperform indices. We wanted to investigate the performance of these estimators in a simulated setting so that we could quantify estimator bias and correlation with true abundance when true abundance and detection are known.

    Although there are many detectability-adjusted methods, we focused on four methods that have been used frequently in the point count literature and use different aspects of the detection process to estimate detection probability. Double-observer (also referred to as multiple-observer) analysis methods (Cook and Jacobson 1979, Nichols et al. 2000) use the discrepancies in individual detections between two observers to estimate detection probability. Distance sampling analysis methods (Burnham et al. 1980, Buckland et al. 1993) estimate detection probability as a function of distance from the observer, assuming that detection probability at the observer’s location is 100%. Removal (Farnsworth et al. 2002) and time-of-detection (Alldredge et al. 2007a) analysis methods estimate detection probability by comparing initial detections that occur during different time periods of the survey. The replicated counts analysis method (Royle 2004) uses N-mixture models to estimate detection probability across sites with temporally replicated counts. By estimating components of detection probability (Eq. 2), detectability-adjusted methods theoretically reduce the bias of estimators of N, as compared to index analysis methods, which do not estimate p but instead are typically used to assess changes in relative abundance.

    Within an analysis method, one or more estimators may be used to estimate abundance, where an estimator is defined as a statistic (i.e., a function of the data) that is used to infer the value of an unknown parameter. For example, in the count analysis method, the sum of counts across all sites within a year forms the index estimator of the number of birds present at those sites. It is important to note that the meanings of the quantities estimated by detection and abundance estimators differ among detectability-adjusted analysis methods. Recall that for index methods, heterogeneity in detection probability is not distinguished from abundance (Eq. 2), so counts estimate N*ps pp pa pd. Distance sampling and double-observer analysis methods model detection probability using heterogeneity in the observation process, exhibited by heterogeneity of detection due to distance from the observer for distance sampling methods and discrepancies in the detection of individual birds by observers for double-observer methods. The detection estimator for these methods therefore estimates pd, and the abundance estimator estimates N*ps pp pa. Removal or time-of-detection analysis methods (hereafter called removal analysis methods) model detection probability using temporal heterogeneity within surveys, which incorporates heterogeneity both in availability (e.g., song production) and the observation process. The detection estimator for removal methods therefore estimates pa pd and the abundance estimator estimates N*ps pp. For the replicated counts analysis method, detection probability is modeled using heterogeneity among temporally replicated visits to multiple sites, which includes heterogeneity in observation, availability, and movement of birds within their home ranges. The detection estimator for replicated counts methods therefore estimates pp pa pd and the abundance estimator estimates N*ps.

    When comparing abundance estimates across analysis methods, it is often preferable to discuss the density of birds (D) within the surveyed area (A), where

    $$D = \frac{N}{A} \tag{3}$$

    Density estimates account for any differences among methods in area surveyed, whereas abundance estimates do not. For example, fixed-radius surveys set a maximum distance from the observer beyond which birds are not recorded (Ralph et al. 1995), but distance sampling methods customarily determine a maximum distance for observations by truncating a percentage of the most-distant observations (Buckland et al. 2001). If comparing years of surveys with different numbers of survey sites, annual abundance across sites must be summarized as mean site-abundance or converted to density. To avoid confusion when comparing surveys with different radii or number of sites, we used density rather than abundance to discuss quantification of bird populations.

    Although detectability-adjusted analysis methods are conceptually attractive, there can be drawbacks to using them. Double-observer methods (Nichols et al. 2000) require data collected via multiple field observers making simultaneous observations, and replicated count methods (Royle 2004) require data collected with repeat visits to sites. For both analysis methods, if human effort is kept constant, the number of overall sites visited is reduced compared with simply performing counts, resulting in smaller survey extent. Removal methods (Farnsworth et al. 2002, Alldredge et al. 2007a) typically require the observer to spend more time at each site compared with a count survey, at some logistical cost. Birds also have more opportunities to move during the longer survey periods common in removal survey types, which could bias results (Scott and Ramsey 1981, Granholm 1983, Dawson et al. 1995). Johnson (2008:865) argued that detectability-adjusted methods are “an assumption or a consequence of an assumption” and that their use is not universally preferable to index methods. Detectability-adjusted methods may entail additional potential sources of error, such as when distance must be estimated by observers for distance sampling methods (Alldredge et al. 2007b) or consensus must be reached by multiple observers for double-observer methods (Alldredge et al. 2006). How birds are surveyed has the potential to affect inferences about bird density, but these effects have not been quantified nor have analysis methods been thoroughly compared to one another in different situations.

    Many comparisons among analysis methods have been field studies (e.g., Moore et al. 2004, Forcey et al. 2006, Thompson and La Sorte 2008). The drawback of modeling detection probability in field studies is that the true population is unknown, so the accuracy of estimators is also unknown, and models could be subject to misspecification. By simulating counts and comparing analysis methods, it is possible to compare estimates to a known population, but few such simulation studies have been attempted. Efford and Dawson (2009) assessed bias in abundance estimators by simulating counts and including heterogeneity in detection due to distance from the observer. That study provided valuable information on estimator performance but addressed only one component of detection probability (pd) and one source of detection heterogeneity (distance to the observer). Recent simulations have focused on the performance of N-mixture models when model assumptions are violated (Duarte et al. 2018, Link et al. 2018). Monroe et al. (2019) evaluated performance of dynamic N-mixture models for lek counts of sage-grouse under scenarios with confounded trends in abundance and detectability, including scenarios with linear trends in pp or in the product pa pd. Until now, no comprehensive simulation has included heterogeneity in all components of detection (ps, pp, pa, and pd).

    We compared estimators for index and detectability-adjusted analysis methods using counts produced by a model of bird surveys that incorporates heterogeneity in detection at multiple levels. We focused on background heterogeneity in detection and note that we did not model opposing trends in detection and abundance, where indices are most vulnerable. We describe the model generally and evaluate the performance of density estimators using parameters reflecting surveys of Black-throated Blue Warblers (BTBW, Setophaga caerulescens). BTBW offer several advantages for this analysis. First, nesting densities have been shown to be correlated with density of shrubs (Steele 1992, 1993, Holmes et al. 2005), which is easily translatable into a model in which a habitat variable affects abundance. Second, BTBW are relatively well studied, with ample spatial information available (Holmes et al. 2005), so parameters could be estimated from empirical data. Third, BTBW exhibit stable or increasing populations across their range (Holmes et al. 1986, Sauer et al. 2014), making them a non-controversial choice for an example species. Finally, recordings of BTBW songs were used by Pacifici et al. (2008), Alldredge et al. (2007c), and Simons et al. (2007) in their estimations of perceptibility, allowing the parameterization of perceptibility to be based on more empirical data than are available for most species. As a test species, we believe BTBW can represent many passerines and do not present any specific detection challenge.

    METHODS

    Simulation structure

    We simulated surveys for singing male BTBW. The conceptual foundation for this simulation was Eq. 2. Based on this deterministic conceptual model, we developed a stochastic simulation model of the detection process in bird surveys. We used stochastic processes to model heterogeneity in detection at three fundamental levels: spatial arrangement (represented conceptually by ps and pp), availability (pa), and perceptibility (pd). We coded the model in program R (version 3.0.2; R Core Team 2013).

    In discussing bird surveys, we distinguish the survey type (meaning a specific survey scheme of temporal and spatial replication of survey sites and number of observers) from the analysis method. We consider four survey types: (1) double-observer, where counts are conducted with two simultaneous observers; (2) distance sampling, where the observer estimates the distance to each bird counted; (3) removal, where the survey period is split into three time periods, and the observer records the time period in which each bird was first detected; and (4) replicated counts, where counts are conducted at sites visited three times within each season (Table 1). More than one analysis method can often be applied to data from a particular survey type, e.g., count data from a double-observer survey type can be analyzed using an index method or a detectability-adjusted method. Detectability-adjusted methods require data gathered in a specific survey type. For example, to use a replicated counts analysis method (Royle 2004), the data must be collected in a replicated counts survey type, where multiple visits to sites are conducted within a season and during which time the population is assumed to be closed.

    We simulated 30 survey iterations for each survey type. To make reference to the R code easier, we refer to variables by their R object names, in italics. The stochastic model structure included six hierarchical levels: (1) iteration (y = 1, 2, . . . , 30), (2) site, defined as a single point visited and surveyed by the observer(s) (i = 1, 2, . . . , NSurveySites), (3) bird (j = 1, 2, . . . , NBirds.yi), (4) replication, defined as the within-season visit to a survey site (r = 1, 2, . . . , NReps), (5) interval, defined as a 2-second period, akin to the duration of one bird song (k = 1, 2, . . . , NIntervals), and (6) observer (o = 1, 2, . . . , NSimultaneousObservers).

    Simulation parameters

    The number of survey sites, replications, and simultaneous observers, as well as survey length, were survey type-specific and resulted in the same amount of human effort for each survey type (Table 1). We ran 30 simulation iterations for each of the four survey types (double-observer, removal, replicated counts, and distance sampling) for a total of 120 simulations.

    Surveys began on ordinal date 150 (i.e., 30 May). For removal surveys, six surveys occurred per day; for all other methods, seven surveys occurred per day. Simulated survey time and travel time between sites varied by survey type (Table 1), with a combined time of LogisticalSurveyTime minutes. Surveys were planned to begin immediately after travel from the previous survey site, starting at dawn with start times PlannedStartTimesAll. To model variation in travel time, the actual start times for surveys (ActualStartTimesAll) were normally distributed, with mean = PlannedStartTimesAll and SD = 5 minutes. For the removal survey type, PlannedStartTimesAll had a range of 0–150 minutes after sunrise. For all other survey types, PlannedStartTimesAll had a range of 0–138 minutes after sunrise.
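
    The scheduling rule above can be sketched as follows. The spacings of 30 minutes (removal) and 23 minutes (other survey types) are not stated in the text; they are inferred here from the stated ranges (0–150 minutes for six surveys and 0–138 minutes for seven surveys) and should be read as assumptions, as should the function name.

```python
import random

def survey_start_times(n_surveys, spacing_min, rng, sd_min=5.0):
    """Planned and actual start times (minutes after sunrise) for one morning.

    Planned times are evenly spaced from 0; actual times are normal draws
    centred on the planned times with SD = sd_min minutes.
    """
    planned = [k * spacing_min for k in range(n_surveys)]
    actual = [rng.gauss(t, sd_min) for t in planned]
    return planned, actual

rng = random.Random(0)
# Spacings are inferred from the stated ranges, not taken from Table 1.
removal_planned, removal_actual = survey_start_times(6, 30, rng)
other_planned, other_actual = survey_start_times(7, 23, rng)

print(removal_planned)
print(other_planned)
```

    Because the first planned survey starts at sunrise, a normal draw around 0 can produce a slightly negative (pre-dawn) start time.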

    Abundance modeling

    We modeled surveys for each site i in a 2000 m × 2000 m Cartesian grid centered on the observer (total area = 400 ha), selected as a scale sufficiently large that birds could move into and out of the observational range of the observer during the survey. We modeled abundance as a function of habitat available at sites, with a beta-distributed proportion of site i covered by habitat (PercentHabitat.yi). The remainder of each site was covered by a matrix of unsuitable habitat, with a proportion equal to 1 − PercentHabitat.yi. Mean density of birds in habitat (HabitatDensity.y) was greater than or equal to mean density of birds in matrix (MatrixDensity.y).

    Site-specific abundance (NBirds.yi) was Poisson-distributed, modeled as the sum of two Poisson-distributed random variables, BirdsInHabitat.yi and BirdsInMatrix.yi, which were described respectively by parameters LambdaHabitat.yi and LambdaMatrix.yi. LambdaHabitat.yi and LambdaMatrix.yi were each a product of the size of the modeled area around the observer (Area.yi = 400 ha), the density of birds in habitat or matrix (HabitatDensity.y or MatrixDensity.y), and the proportion of the site covered by habitat or matrix (PercentHabitat.yi or 1 − PercentHabitat.yi). Thus,

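    $$\mathit{NBirds.yi} \sim \mathrm{Poisson}\left(\mathit{LambdaHabitat.yi} + \mathit{LambdaMatrix.yi}\right) \tag{4}$$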

    Abundance parameters

    We used an empirical estimate for density of breeding pairs in BTBW habitat (HabitatDensity.y = 0.534 birds/ha; Holmes et al. 1986) and derived the density of birds in matrix (MatrixDensity.y = 0.00305 birds/ha) from additional empirical estimates (see Appendix 1). We assumed an iteration-specific proportion of the study area was covered by habitat (StudyHabitatProportion.y ~ U(0.7,1)). The proportion of site i in iteration y covered by habitat (PercentHabitat.yi) was drawn from a beta distribution with μ = StudyHabitatProportion.y and θ = 8 (alternative parameterization of the beta distribution from Link and Barker 2010:319). As a result, PercentHabitat.yi had mean = 0.85 (SD = 0.14), and the mean density was 0.454 birds/ha.
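
    The abundance model and its parameters can be sketched in a few lines of Python (the study itself used R; this is not the authors' code). The sketch assumes that the beta distribution with mean μ and θ = 8 corresponds to shape parameters μθ and (1 − μ)θ, and that the matrix term uses the proportion 1 − PercentHabitat.yi; the numbers of iterations and sites and the small Poisson sampler are hypothetical helpers added for illustration.

```python
import random

AREA_HA = 400.0           # modeled area around the observer
HABITAT_DENSITY = 0.534   # birds/ha in habitat (from the text)
MATRIX_DENSITY = 0.00305  # birds/ha in matrix (from the text)
THETA = 8.0               # beta "sample size" parameter (from the text)

def poisson(lam, rng):
    """Poisson draw: count unit-rate exponential arrivals before time lam."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def simulate_site(study_habitat_proportion, rng):
    """Return (PercentHabitat, NBirds) for one site."""
    mu = study_habitat_proportion
    # Assumed mapping from (mu, theta) to the usual beta shape parameters.
    percent_habitat = rng.betavariate(mu * THETA, (1.0 - mu) * THETA)
    lambda_habitat = AREA_HA * HABITAT_DENSITY * percent_habitat
    lambda_matrix = AREA_HA * MATRIX_DENSITY * (1.0 - percent_habitat)
    n_birds = poisson(lambda_habitat, rng) + poisson(lambda_matrix, rng)
    return percent_habitat, n_birds

rng = random.Random(42)
sites = []
for _ in range(300):                      # hypothetical number of iterations
    study_prop = rng.uniform(0.7, 1.0)    # StudyHabitatProportion.y
    for _ in range(10):                   # hypothetical number of sites
        sites.append(simulate_site(study_prop, rng))

mean_habitat = sum(p for p, _ in sites) / len(sites)
mean_density = sum(n for _, n in sites) / (len(sites) * AREA_HA)
print(round(mean_habitat, 2), round(mean_density, 3))
```

    Under these assumptions the simulated averages should land near the mean habitat proportion and mean density reported above.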

    Spatial modeling

    We allowed simulated birds to move within elliptical territories throughout the survey, potentially changing their distance from the observer and affecting their probability of being detected. We modeled locations for each bird j using x- and y-coordinates based on a bivariate normal distribution, resulting in elliptical territories (Jennrich and Turner 1969). Spatial parameters (Spatial.yij) were generated for each bird j, including the center of the territory (CenterX.yij, CenterY.yij), the area of a 95% elliptical density contour (Area.yij), the eccentricity of the ellipse (Ecc.yij), and an angle of rotation (Theta.yij; for distributions governing spatial parameters, see Appendix 1).

    We allowed territories to overlap peripherally (see Appendix 1). For each bird j, we generated an interval-specific location (Location.yijrk) for k = 1 from the bivariate normal distribution, with a corresponding distance from the observer(s) (Distance.yijrk). For each interval k > 1, a Bernoulli-distributed random variable DoesBirdMove.yijrk was generated from the parameter PrBirdMoves.yijrk (the mean probability that the bird moved). If DoesBirdMove.yijrk = 1, a new Location.yijrk was generated from the bivariate normal distribution. If DoesBirdMove.yijrk = 0, then the location remained the same (Location.yijrk = Location.yijr(k-1)).
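
    A minimal sketch of this movement process is given below. It assumes that the 95% elliptical contour of a bivariate normal has area π·c·σ₁σ₂ with c = −2 ln(0.05) (the 95% quantile of a chi-square distribution with 2 degrees of freedom), and that eccentricity relates the minor and major axes in the usual way. The territory center, eccentricity, and rotation angle are hypothetical values, since their distributions are given only in Appendix 1, and the function names are illustrative.

```python
import math
import random

# 95% quantile of a chi-square distribution with 2 degrees of freedom (about 5.99).
CHI2_95_2DF = -2.0 * math.log(0.05)

def axis_sds(area_ha, ecc):
    """Return (sd_major, sd_minor) in metres so that the 95% density
    contour of the bivariate normal encloses area_ha hectares."""
    area_m2 = area_ha * 10_000.0
    ratio = math.sqrt(1.0 - ecc ** 2)  # minor/major axis ratio from eccentricity
    sd_major = math.sqrt(area_m2 / (math.pi * CHI2_95_2DF * ratio))
    return sd_major, sd_major * ratio

def draw_location(center, sds, theta, rng):
    """Draw one (x, y) location from the rotated bivariate normal."""
    u = rng.gauss(0.0, sds[0])
    v = rng.gauss(0.0, sds[1])
    x = center[0] + u * math.cos(theta) - v * math.sin(theta)
    y = center[1] + u * math.sin(theta) + v * math.cos(theta)
    return x, y

def simulate_distances(center, sds, theta, n_intervals, pr_move, rng):
    """Distance to an observer at the origin for each interval of one visit."""
    location = draw_location(center, sds, theta, rng)
    distances = [math.hypot(*location)]
    for _ in range(1, n_intervals):
        if rng.random() < pr_move:  # DoesBirdMove = 1: redraw the location
            location = draw_location(center, sds, theta, rng)
        distances.append(math.hypot(*location))  # unchanged if the bird stayed
    return distances

rng = random.Random(3)
sds = axis_sds(area_ha=3.6, ecc=0.6)  # eccentricity is a hypothetical value
distances = simulate_distances((60.0, -40.0), sds, math.pi / 6, 90, 0.005, rng)
print(len(distances), round(min(distances), 1), round(max(distances), 1))
```

    Here 90 intervals of 2 seconds correspond to a 3-minute survey, and the distance series is what a perceptibility function would take as input.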

    Spatial parameters

    We used two empirical estimates to parameterize the model: density of BTBW in habitat HabitatDensity.y = 0.534 birds/ha (Holmes et al. 1986) and mean BTBW territory size MeanTerrArea = 3.6 ha (Sherry and Holmes 1985). We also used standard deviation of territory size SDTerrArea = 1.0 ha. Bird-specific territory size Area.yij was log-normally distributed, where Area.yij ~ lognormal(3.6 ha, 1.0 ha).
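
    If the values 3.6 ha and 1.0 ha are read as the mean and standard deviation of territory size on the natural (hectare) scale, which is an assumption about the notation above, they must be converted to log-scale parameters before sampling. The sketch below shows the standard moment-matching conversion; the function name is hypothetical.

```python
import math
import random

def lognormal_params(mean, sd):
    """Log-scale (mu, sigma) giving a lognormal with the requested mean and SD."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

# MeanTerrArea = 3.6 ha and SDTerrArea = 1.0 ha, assumed to be natural-scale moments.
mu_log, sigma_log = lognormal_params(3.6, 1.0)

rng = random.Random(7)
areas = [rng.lognormvariate(mu_log, sigma_log) for _ in range(20000)]
sample_mean = sum(areas) / len(areas)
sample_sd = math.sqrt(sum((a - sample_mean) ** 2 for a in areas) / (len(areas) - 1))
print(round(sample_mean, 2), round(sample_sd, 2))
```

    A lognormal keeps every simulated territory area positive, which a normal distribution with the same moments would not guarantee.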

    Interval-specific probability of movement (PrBirdMoves.yijrk) was normally distributed but truncated at 0 and 1, PrBirdMoves.yijrk ~ N(0.005, 0.0005). Using these parameters, simulations showed that 36% of birds modeled moved at least once on average during a 3-minute survey, with a mean number of movements = 0.45, and 78% of birds modeled moved at least once on average for a 10-minute survey, with a mean number of movements of 1.5. These movement parameters were chosen to reflect a frequently moving species; for three passerine species, Granholm (1983) found the minimum probabilities of movement within a 10-minute period were 36%, 64%, and 72%.
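
    The reported movement summaries can be checked with a short calculation. The sketch below assumes a constant per-interval movement probability of 0.005 (ignoring its small standard deviation), 2-second intervals, and a movement opportunity in every interval after the first; the function name is hypothetical.

```python
def movement_summary(survey_minutes, pr_move=0.005, interval_seconds=2):
    """Probability of at least one movement and expected number of movements."""
    n_intervals = survey_minutes * 60 // interval_seconds
    chances = n_intervals - 1  # a bird can move in every interval after the first
    p_any = 1.0 - (1.0 - pr_move) ** chances
    mean_moves = pr_move * chances
    return p_any, mean_moves

p_any_3, mean_moves_3 = movement_summary(3)
p_any_10, mean_moves_10 = movement_summary(10)
print(round(p_any_3, 2), round(mean_moves_3, 3))    # 0.36 0.445
print(round(p_any_10, 2), round(mean_moves_10, 3))  # 0.78 1.495
```

    These values agree with the simulated percentages and mean numbers of movements given above.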

    Availability modeling

    We modeled bird availability so that simulated birds would have periods of being in and out of singing mode, a state in which vocalization is frequent and songs occur relatively regularly. We modeled bird song as an interval-specific event Sings.yijrk with two possible states (1 = song, 0 = no song). To produce temporal song patterns reflective of breeding males, we incorporated autocorrelation at two scales. Coarse-scale temporal autocorrelation referred to bird j being in or out of singing mode. If the bird was not in singing mode, then bird j necessarily did not vocalize during interval k and Sings.yijrk = 0. If the bird was in singing mode, then Sings.yijrk could be 0 or 1: these vocalizations were modeled with fine-scale temporal autocorrelation.

    We modeled coarse-scale temporal autocorrelation with a first-order Markov process (Stroock 2005) in which the state of bird j in interval k is related to its state in interval k - 1. There were two possible states, being in singing mode (M) or not being in singing mode (NM). The coarse-scale transition matrix described the probabilities of remaining in a state or switching states, given the previous state (Table 2). For example, for a bird that was in singing mode in the previous interval, P(M|M) is the probability of remaining in singing mode, and P(NM|M) is the probability of switching to non-singing mode. Because the rows of any transition matrix sum to 1,

    $$P(M \mid M) + P(NM \mid M) = 1 \tag{5}$$

    and

    $$P(M \mid NM) + P(NM \mid NM) = 1 \tag{6}$$

    Directly parameterizing the time spent in singing mode was undesirable because singing mode is an approximation of a natural phenomenon and difficult to measure in the field (birds that are in singing mode but not currently singing are indistinguishable from birds not in singing mode). Instead, we used empirical estimates describing the probability that a bird sings at least once in a several-minute period (e.g., Emlen 1977), which we refer to as the singing probability. We assumed that the number of birds in singing mode that never sing is negligible. We created a Markov process that produced a population of simulated birds with the desired singing probability by making some assumptions about the coarse-scale transition matrix. The steady-state vector [q1.coarse q2.coarse] (also called the limiting or stationary distribution) is the vector of the proportion of time spent in each state (singing mode, q1.coarse, or non-singing mode, q2.coarse) in the long run (i.e., after the Markov chain converges to its stationary distribution and is therefore independent of the initial conditions; Stroock 2005). It also represents the average proportion of the population in each state in any given interval, so we used q1.coarse as the probability that a bird was in singing mode at the beginning of a survey. The singing probability Z, or the probability that a bird j is in singing mode at least once in t intervals, is:

    \[ Z = q_{1.coarse} + q_{2.coarse} \left( 1 - P(NM \mid NM)^{t} \right) \tag{7} \]

    Here, P(NM|NM)^t is the probability that a bird that did not begin in singing mode never switches into singing mode during t intervals, so (1 - P(NM|NM)^t) is the probability that it switches at least once. Eq. 7 therefore describes the relationship between our modeling goal (the singing probability) and the transition probabilities that achieved it. (For additional information about modeling singing probability, see Appendix 1.)

    For birds in singing mode, we modeled fine-scale autocorrelation with a Markov chain to produce the desired pattern of songs and pauses. An interval represented a binary opportunity for a song to occur. The Markov chain had four states: singing (S) and three stages of not singing (NS1, NS2, NS3). Singing was always followed by two intervals without singing (P(NS1|S) = P(NS2|NS1) = 1). After the second interval of non-singing, the bird would either sing or not sing, according to the probabilities in the fine-scale transition matrix (Table 3). We used successive approximation to select transition probabilities that produced pauses in the song pattern with the desired pause length mean and standard deviation.
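
    The four-state chain described above can be sketched in a few lines of Python. This is an illustration only: Table 3 is not reproduced here, so the two transition probabilities (p_s_given_ns2, p_s_given_ns3) are hypothetical values, and the assumption that NS3 repeats until the next song is ours, not a statement of the authors' implementation.

```python
import random


def simulate_fine_scale(n_intervals, p_s_given_ns2, p_s_given_ns3, seed=1):
    """Simulate the four-state song chain for a bird in singing mode.

    Returns one state label per 2-second interval.
    """
    rng = random.Random(seed)
    state, states = "S", []
    for _ in range(n_intervals):
        states.append(state)
        if state == "S":
            state = "NS1"            # P(NS1|S) = 1
        elif state == "NS1":
            state = "NS2"            # P(NS2|NS1) = 1
        elif state == "NS2":
            state = "S" if rng.random() < p_s_given_ns2 else "NS3"
        else:                        # NS3 repeats until the next song (assumption)
            state = "S" if rng.random() < p_s_given_ns3 else "NS3"
    return states


def pause_lengths(states):
    """Lengths (in intervals) of completed pauses between consecutive songs."""
    lengths, current, seen_song = [], 0, False
    for s in states:
        if s == "S":
            if seen_song:
                lengths.append(current)
            seen_song, current = True, 0
        else:
            current += 1
    return lengths


# Hypothetical transition probabilities, chosen only for illustration.
states = simulate_fine_scale(20000, p_s_given_ns2=0.40, p_s_given_ns3=0.45)
pauses = pause_lengths(states)
prop_singing = states.count("S") / len(states)
mean_pause_s = 2 * sum(pauses) / len(pauses)   # intervals are 2 seconds long
print(round(prop_singing, 3), round(mean_pause_s, 2))
```

    Successive approximation then amounts to adjusting the two probabilities and rerunning the simulation until the pause mean and standard deviation are close to the targets.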

    Availability parameters

    The parameters for coarse-scale autocorrelation were based on the interval-specific singing probability, PrSing.yijrk, defined as the probability of being in singing mode for at least one 2-second interval within 10 minutes. The maximum singing probability was 0.9, which is representative of warblers during periods of high availability (e.g., Stacier et al. 2006, Robbins et al. 2009), when birds are present on breeding territories (Holmes et al. 2005). PrSing.yijrk was modeled as a function of time of day (Appendix 2). PrSing.yijrk values were based on information from the Breeding Bird Survey (P. Blancher, Environment Canada, personal communication), with range 0.72–0.90 for the period when most simulated surveys occurred (from dawn until surveys completed, approximately 150 minutes later; Appendix 2). PrSing.yijrk values as low as 0.18 were possible for pre-dawn surveys but occurred rarely. Values for the coarse-scale transition matrix (Table 2) were selected via optimization such that Z (Eq. 7) equaled the desired PrSing.yijrk, with constant P(M|M) = 0.98 (Appendix 1, Appendix 3).
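
    As a rough sketch of the optimization step, the code below solves for P(NM|NM) by bisection so that Z matches a target singing probability while P(M|M) is held at 0.98. It assumes Eq. 7 has the form Z = q1.coarse + q2.coarse (1 - P(NM|NM)^t), which is inferred from the surrounding text, and uses t = 300 (10 minutes of 2-second intervals). The function names and the choice of bisection are ours; the authors' optimizer is described in their appendices.

```python
def stationary(p_mm, p_nmnm):
    """Stationary distribution (q1, q2) of the two-state chain (M, NM)."""
    a = 1.0 - p_mm      # P(NM|M): probability of leaving singing mode
    b = 1.0 - p_nmnm    # P(M|NM): probability of entering singing mode
    return b / (a + b), a / (a + b)


def singing_probability(p_mm, p_nmnm, t):
    """Assumed form of Eq. 7: begin in M, or begin in NM and switch within t intervals."""
    q1, q2 = stationary(p_mm, p_nmnm)
    return q1 + q2 * (1.0 - p_nmnm ** t)


def solve_p_nmnm(target, p_mm=0.98, t=300, tol=1e-10):
    """Bisection on P(NM|NM); Z decreases monotonically as P(NM|NM) increases."""
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if singing_probability(p_mm, mid, t) > target:
            lo = mid    # Z too high: birds must stay in non-singing mode longer
        else:
            hi = mid
    return (lo + hi) / 2.0


p_nmnm = solve_p_nmnm(0.90)                  # maximum singing probability in the text
z = singing_probability(0.98, p_nmnm, 300)
print(round(p_nmnm, 4), round(z, 4))
```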

    To parameterize the fine-scale transition matrix, we estimated song length by timing songs and pauses to the nearest second for the first minute of four BTBW recordings from the Cornell Lab of Ornithology's Macaulay Library (1992, catalogue number ML73987; 1994, catalogue number ML76520; 2000, catalogue number ML107391; and 2014, catalogue number ML140091). Mean song length in the recordings was 2.11 seconds (SD = 0.33 seconds), and mean pause length was 6.64 seconds (SD = 1.47 seconds). Assuming an equal ratio of songs and pauses (songs having length SongLength and pauses having length PauseLength, both having units in 2-second intervals), the proportion of time spent singing in the recordings was q1.fine = 0.241.
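
    The arithmetic behind q1.fine can be checked directly. The sketch below assumes the proportion is simply mean song length divided by the mean length of one song plus one pause, with both expressed in 2-second intervals; the variable names are ours.

```python
SONG_S = 2.11      # mean song length (s), from the recordings
PAUSE_S = 6.64     # mean pause length (s), from the recordings
INTERVAL_S = 2.0   # length of one fine-scale interval (s)

song_intervals = SONG_S / INTERVAL_S     # SongLength in 2-second intervals
pause_intervals = PAUSE_S / INTERVAL_S   # PauseLength in 2-second intervals

# One song per pause, so the proportion of time singing is song / (song + pause).
q1_fine = song_intervals / (song_intervals + pause_intervals)
print(round(q1_fine, 3))  # 0.241
```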

    We selected fine-scale availability parameter values to produce a pattern similar to that observed in the four Macaulay Library recordings. Through successive approximation, we determined values for the fine-scale transition matrix (Table 3). Monte Carlo simulations using the fine-scale transition matrix produced song patterns with mean pause length = 6.3 seconds (SD = 1.25 seconds) and a realized proportion of time spent singing while in singing mode = 0.241 (akin to the empirically estimated q1.fine).

    Perceptibility modeling

    Our simulated observers could differ in their observations due to skill or ability, their judgment about whether a bird was within or outside their survey radius, or the random chance of detecting a bird, given its interval-specific probability of detection. Observer-specific perceptibility was modeled as a Bernoulli-distributed event with probability of detection perceptibility.yijrko. If bird j was detected by observer o during interval k, then Detected.yijrko = 1; if not, Detected.yijrko = 0. Modeling of perceptibility.yijrko was based on a logit link related to distance from the observer (X), with covariates for observer (O) and presence of environmental noise (N). Specifically,

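    \[ \operatorname{logit}(\text{perceptibility.yijrko}) = \beta_0 + \beta_1 X + \beta_2 N X + \beta_3 N + \beta_4 O \tag{8} \]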

    Here, β0 and β1 were the intercept and slope, respectively, β2 was the effect of noise on the slope, and β3 and β4 were the effects of noise and observer, respectively, on the intercept.
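
    A minimal sketch of the perceptibility model follows, assuming the linear predictor is β0 + β1 X + β2 N X + β3 N + β4 O as the coefficient descriptions above imply. All coefficient values except the observer effect are hypothetical placeholders (the Pacifici et al. 2008 estimates are not reproduced here), and coding the observer covariate as +1 or -1 is our assumption.

```python
import math
import random


def perceptibility(x, noise, observer, b0, b1, b2, b3, b4):
    """Inverse logit of the assumed linear predictor.

    x: distance to the observer (m); noise: 0 or 1; observer: +1 or -1 (assumed coding).
    """
    eta = b0 + b1 * x + b2 * noise * x + b3 * noise + b4 * observer
    return 1.0 / (1.0 + math.exp(-eta))


def detect(p, rng):
    """Bernoulli draw: 1 if the bird is detected in this interval, else 0."""
    return 1 if rng.random() < p else 0


# Hypothetical coefficients, for illustration only.
COEF = dict(b0=6.0, b1=-0.05, b2=-0.01, b3=-0.5, b4=0.41)

rng = random.Random(42)
p_near = perceptibility(20.0, 0, +1, **COEF)
p_far = perceptibility(150.0, 0, +1, **COEF)
p_far_noise = perceptibility(150.0, 1, +1, **COEF)
detected = detect(p_near, rng)
print(round(p_near, 3), round(p_far, 3), round(p_far_noise, 3), detected)
```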

    Observers’ estimations of the distance between the observer and birds may include observer error (Alldredge et al. 2007b). We simulated observer-estimated distances (ObsEstimatedDistance.yijrko) stochastically using a normal distribution based on the true distance Distance.yijrk.

    Each bird had an observer-specific count status Count.yijrko, where Count.yijrko = 1 indicated that the bird was counted and Count.yijrko = 0 indicated that it was not counted. If a bird was not detected (Detected.yijrko = 0), then Count.yijrko = 0. If a bird was detected (Detected.yijrko = 1), it was counted only if the observer-estimated distance to the bird (ObsEstimatedDistance.yijrko) was within the survey radius (MaxSurveyDistance). For comparison, we also determined what counts would have been if observers’ estimation of distance were perfect (ClosePerfectDistCounted.yijr).

    Perceptibility parameters

    We parameterized perceptibility.yijrko (Fig. 1) using coefficients from Pacifici et al. (2008), estimated for BTBW in mixed pine-hardwood forest with leaves present. We modeled ambient noise as a replication-specific binary condition, where presence of noise indicated an effect on the slope and intercept equal to that of an added 10 dB brown noise from speakers 5 m from observers (Pacifici et al. 2008). The probability of a replication having that level of ambient noise was PrNoise = 0.15. For the observer effect on the intercept, we assumed that Observer 1 was 0.5 SD better than the average observer and Observer 2 was 0.5 SD worse (observer effects = ±0.410; K. Pacifici, personal communication). For survey types with a single observer, the observer was replication-specific (either Observer 1 or 2).

    For error in distance estimation by observers, we used the overall mean error reported for trained observers (7.6 m, SD = 21.4 m; Alldredge et al. 2007b) for distances ≥ 62.3 m (the mean of distances investigated). For distances < 62.3 m, we assumed the mean and standard deviation of the error decreased as linear functions as distance approached 0 m.
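
    The distance-error rule and the counting rule can be sketched as below. The sketch assumes the error is additive (estimated distance = true distance + a normal draw), that its mean and SD scale linearly from 0 at 0 m to their full values at 62.3 m, and that negative estimates are truncated at 0 m; the last point and all function names are our assumptions.

```python
import random

REF_DISTANCE = 62.3   # m, distance at and beyond which the full error applies
MEAN_ERR = 7.6        # m, mean error for trained observers (Alldredge et al. 2007b)
SD_ERR = 21.4         # m, SD of error for trained observers


def error_params(true_distance):
    """Mean and SD of the distance error, scaled linearly below REF_DISTANCE."""
    scale = min(true_distance / REF_DISTANCE, 1.0)
    return MEAN_ERR * scale, SD_ERR * scale


def observer_estimated_distance(true_distance, rng):
    """Draw an observer-estimated distance; truncation at 0 m is an assumption."""
    mean_err, sd_err = error_params(true_distance)
    return max(0.0, true_distance + rng.gauss(mean_err, sd_err))


def count_status(detected, est_distance, max_survey_distance):
    """A bird is counted only if detected and estimated to be within the radius."""
    return int(detected == 1 and est_distance <= max_survey_distance)


rng = random.Random(7)
est = observer_estimated_distance(45.0, rng)
print(round(est, 1), count_status(1, est, 50.0))
```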

    Analysis methods

    For all survey types, we compared estimates obtained with detectability-adjusted methods to those obtained from index methods. For all detectability-adjusted analysis methods, we estimated abundance across all sites within each iteration and survey type by fitting models and comparing them using AIC (Burnham and Anderson 2002). For removal sampling analysis, we fit models using function multinomPois() in Program unmarked (Fiske and Chandler 2011) in Program R. To fit N-mixture models, we used function pcount() in Program unmarked, modeling abundance as a Poisson-distributed latent variable. For removal and N-mixture analysis methods, we compared three models: a null model (model 1), and models with PercentHabitat.yi as a site-specific covariate affecting either abundance (model 2) or detection (model 3).

    We used function multinomPois() in Program unmarked to obtain adjusted abundance estimates for the double-observer survey type, using the independent-observer approach (Alldredge et al. 2006). We compared a null model, a model with observer effect, and models with an observer effect and with PercentHabitat.yi as a site-specific covariate affecting abundance or detection (four models total). We also used the Nichols et al. (2000) estimator to obtain adjusted abundance estimates using the dependent-observer approach.

    We carried out conventional and covariate distance sampling analysis with Program Distance version 6.2 release 1 (Thomas et al. 2010) and hierarchical distance sampling with function distsamp() in Program unmarked. With Program Distance, we fitted nine conventional distance sampling models (all combinations of three key functions: half-normal, hazard rate, and uniform; and three series adjustment terms: cosine, simple polynomial, and Hermite polynomial; Buckland et al. 2001). We also fitted two models with a site-level covariate for PercentHabitat.yi, using hazard-rate and half-normal key functions. With Program unmarked, we tested three null models using each of the key functions and models for all key functions using PercentHabitat.yi as a covariate affecting abundance or detection (nine models total). We used continuous distance values in Program Distance and used integer values between 0 and the truncation distance for Program unmarked models to approximate continuous values. We fitted models and estimated parameters within iteration using Program unmarked (where one iteration was one simulation run). To fit models and estimate parameters in Program Distance, we used iteration as region and obtained density estimates for each iteration. For Program Distance, we compared models using all iterations and we report model-averaged results calculated using one set of model weights. For all other detectability-adjusted analysis methods, we compared models within iteration and report model-averaged results calculated using iteration-specific model weights.
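
    Model averaging with AIC weights can be illustrated with the standard Akaike-weight formula (Burnham and Anderson 2002). The sketch below is generic and not taken from the authors' scripts; the AIC values and density estimates are hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Akaike weights: exp(-delta/2) normalized across the candidate models."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(estimates, aic_values):
    """Weighted mean of model-specific estimates using Akaike weights."""
    weights = akaike_weights(aic_values)
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical density estimates (birds/ha) and AIC values for two models.
weights = akaike_weights([100.0, 102.0])
averaged = model_average([1.0, 2.0], [100.0, 102.0])
print([round(w, 4) for w in weights], round(averaged, 4))
```

    Iteration-specific weights correspond to calling akaike_weights once per iteration, whereas a single set of weights (as for Program Distance) corresponds to one call whose output is reused for every iteration.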

    For non-distance sampling survey types, we used three survey radii (MaxSurveyDistance = 50 m, 100 m, and 150 m). For the distance sampling survey type, we truncated the most distant 10% of observations to prevent model over-fitting, as suggested by Buckland et al. (2001), and used the iteration-specific truncation distance as MaxSurveyDistance. For all adjusted estimates, we estimated abundance and converted it to density by dividing by the survey area A, where A = π × MaxSurveyDistance^2.
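
    The conversion from abundance to density (birds/ha) is shown below for the three fixed radii. The only step added here is the unit conversion from square meters to hectares (1 ha = 10,000 m^2); the abundance value in the example is hypothetical.

```python
import math


def survey_area_ha(max_survey_distance_m):
    """Area of a circular plot, A = pi * r^2, converted from m^2 to ha."""
    return math.pi * max_survey_distance_m ** 2 / 10_000.0


def density_per_ha(abundance, max_survey_distance_m):
    """Density (birds/ha) as abundance divided by the survey area."""
    return abundance / survey_area_ha(max_survey_distance_m)


for radius in (50, 100, 150):
    print(radius, round(survey_area_ha(radius), 3))

print(round(density_per_ha(4, 100), 3))   # hypothetical abundance of 4 birds
```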

    As per Nichols et al. (2009), we defined four abundances and densities that may be useful when providing comparison to estimators. Abundance Ns = N*ps and corresponding density Ds = N*ps / A referred to birds with territories that overlapped the survey radius. We defined site-specific Ns as the number of birds having territories with 95% utilization distributions overlapping the survey radius. Abundance Np = N*ps pp and corresponding density Dp = N*ps pp / A referred to birds present within the survey radius at the beginning of the survey. Abundance Na = N*ps pp pa and corresponding density Da = N*ps pp pa / A referred to birds available, given that they were present within the survey radius at the beginning of the survey. Conceptually, the count C = N*ps pp pa pd referred to birds detected, given that they were available and present within the survey radius at the beginning of the survey. Site-specific C included birds that were detected and estimated by the simulated observer to be within the survey radius at the time of first detection. Thus, C could include birds that moved into the survey radius during the survey or were falsely estimated to be within the survey radius and were not included in Na.
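
    The nested abundances defined above can be written as a simple chain of products. The probabilities in this sketch are hypothetical round numbers chosen to make the arithmetic easy to follow; they are not values from the simulation.

```python
def abundance_cascade(n_star, ps, pp, pa, pd):
    """Expected abundances at each stage of the detection process.

    Ns: territories overlap the radius; Np: present within the radius;
    Na: available given present; C: detected given available and present.
    """
    ns = n_star * ps
    n_present = ns * pp
    n_available = n_present * pa
    count = n_available * pd
    return {"Ns": ns, "Np": n_present, "Na": n_available, "C": count}


# Hypothetical values for illustration.
stages = abundance_cascade(n_star=100, ps=0.5, pp=0.8, pa=0.5, pd=0.5)
print(stages)
```

    Under these conceptual definitions the count can never exceed Na; the site-specific C described above can, because it also includes birds that moved into the radius or were misjudged to be inside it.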

    Assuming a closed population, in which birds do not move among sites and abundance is constant within surveys, the total abundance across sites is Σ Np. For bird surveys for which the objective is to make inference about abundance across sites, we use Np and Dp for comparisons of estimators, with Ds and Da reported for reference. Within each survey type, we report and compare true density, index method density estimates, and detectability-adjusted method density estimates (birds/ha) for the sake of comparison, even though indices are not usually assumed to directly estimate density (e.g., Seber 1982). We calculated true density Dp as true abundance Np divided by the survey area A. We calculated density estimators and true density for each of 30 simulation iterations and report mean and standard deviation calculated across those 30 simulation iterations.

    The primary index method density estimate we used was the index estimator, or the sum of counts across all sites within an iteration, divided by the area surveyed. The index estimator with perfect distance estimation refers to the birds that would have been counted if the observer’s estimation of distance were perfect (i.e., there were no errors in determining if birds were inside or outside the survey radius) and is provided as a comparison to the index estimator to illustrate the effect of distance estimation error on indices. We also report density estimates from two additional indices for the replicated counts survey type. Maximum count density was the sum across sites of all site-specific maximum counts (among the three counts within an iteration) divided by the area surveyed. Bounded count density was the sum of the bounded counts (twice the maximum count, minus the second largest count; Johnson et al. 2007) divided by the area surveyed.
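
    The three index calculations can be sketched as follows. We assume "the area surveyed" means the number of sites multiplied by the per-site survey area; the counts and the 2-ha area are hypothetical.

```python
def index_density(counts, area_ha):
    """Summed counts across sites divided by the total area surveyed."""
    return sum(counts) / (len(counts) * area_ha)


def maximum_count(site_counts):
    """Largest of the replicate counts at one site."""
    return max(site_counts)


def bounded_count(site_counts):
    """Twice the maximum count minus the second-largest count."""
    ordered = sorted(site_counts, reverse=True)
    return 2 * ordered[0] - ordered[1]


# Hypothetical replicated counts: three visits at each of three sites.
visits = [[2, 3, 1], [0, 1, 1], [4, 4, 2]]
area_ha = 2.0

max_density = index_density([maximum_count(v) for v in visits], area_ha)
bounded_density = index_density([bounded_count(v) for v in visits], area_ha)
print(round(max_density, 3), round(bounded_density, 3))
```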

    For all detectability-adjusted methods except Program Distance, we report detection-adjusted density estimates that were model-averaged within each iteration. Program Distance produced one model ranking using data from all 30 iterations, so we report density estimates calculated for each iteration but using model weights for all 30 iterations. For all estimates, we removed outliers and reported the number removed, where outliers were defined as density estimates that were more than three standard deviations away from the mean of remaining estimates.
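
    One way to read the outlier rule is as a leave-one-out check: each estimate is compared with the mean and standard deviation of all the other estimates. The sketch below implements that reading in a single pass; whether the authors applied the rule once or iteratively is not stated here, so treat this as an assumption. The estimates are hypothetical.

```python
import statistics


def remove_outliers(estimates, k=3.0):
    """Split estimates into (kept, removed) using a leave-one-out k-SD rule."""
    kept, removed = [], []
    for i, x in enumerate(estimates):
        others = estimates[:i] + estimates[i + 1:]
        mean = statistics.mean(others)
        sd = statistics.stdev(others)
        if abs(x - mean) > k * sd:
            removed.append(x)
        else:
            kept.append(x)
    return kept, removed


# Hypothetical density estimates with one inflated value.
kept, removed = remove_outliers([0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 9.0])
print(kept, removed)
```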

    To investigate the relationship between estimates and true density, we calculated iteration-specific bias as the difference between estimated density and Dp (negative bias indicated underestimation of true density). We report median bias (with 95% nonparametric bootstrap confidence intervals) to reduce the effect of some density estimates that were inflated. We also calculated Pearson correlation coefficients between density estimators and Ds, Dp, and Da. We calculated bias and correlation coefficients after removing outliers.
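
    The bias and correlation summaries can be sketched with the standard library alone. The percentile bootstrap used here is one common nonparametric bootstrap interval and is our assumption about the variant; the number of resamples, the seed, and the example values are hypothetical.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_median_ci(values, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    lower = medians[int(n_boot * alpha / 2)]
    upper = medians[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical iteration-specific estimates and true densities (birds/ha).
estimated = [0.8, 1.0, 1.2, 0.9]
true_dp = [1.0, 1.2, 1.5, 1.0]

bias = [e - t for e, t in zip(estimated, true_dp)]   # negative = underestimate
median_bias = statistics.median(bias)
ci = bootstrap_median_ci(bias)
r = pearson(estimated, true_dp)
print(round(median_bias, 3), ci, round(r, 3))
```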

    RESULTS

    For one iteration of simulated counts for the double-observer survey type, the Program unmarked model with detection as a function of observer and percentage habitat did not converge for the 150-m survey radius and was removed from the suite of competing models. For one iteration of simulated counts for the removal survey type, the Program unmarked model with detection as a function of percentage habitat did not converge for the 50-m survey radius and was removed from model selection. No more than four estimates (of 30) were removed as outliers for any estimator (Table 4, Appendix 4, Appendix 5).

    Index estimates underestimated true density of birds (Dp) for all survey types and at all survey radii, although bias for index estimates for the removal survey type was lower than for other survey types (Table 4). Index estimates with perfect distance estimation showed similar bias to index estimates, indicating observer error in estimating distance (i.e., errors in determining if birds were inside or outside the survey radius) did not strongly affect index estimator performance.

    For the double-observer survey type, no estimator was less biased than the index estimator (Table 4, Appendix 6, Appendix 7), and the adjusted estimates using the dependent (Nichols et al. 2000) and independent (Alldredge et al. 2006) approaches were nearly unchanged from index estimates. Birds that were available during surveys often sang many times (mean = 9.06 songs, SD = 6.37), so even with the difference modeled in observer skill, a bird detected by one observer was typically detected by both observers (i.e., perceptibility was very high).

    For the removal survey type, the index estimator was less biased than for other survey types (Table 4), and the difference between true Dp and true Da was smaller than for other survey types. Both results indicate a greater proportion of birds within the survey radius were available over the length of the survey as compared to other survey types because of the longer survey period (10 minutes; all other survey types had 3-minute surveys). Adjusted density estimates for the removal survey type were less biased than index estimates for 50-m and 100-m surveys (Table 4, Appendix 4).

    For the replicated counts survey type, adjusted density estimates from N-mixture models were highly inflated for approximately half to two-thirds of simulation iterations (Fig. 2, Appendix 6, Appendix 7). These inflated estimates were common enough that they did not meet the criteria to be removed as outliers (i.e., more than three standard deviations away from the mean of remaining estimates). Inflated density estimates occurred when estimated detection was approximately ≤ 0.06 (Fig. 3). N-mixture model density estimates for 50-m radius surveys were the most inflated; inflation was reduced for surveys with larger radii (Fig. 3, Table 4, Appendix 4). Bounded count and maximum count density estimates generally had less bias than the index estimates for the replicated counts survey type, although the bounded count density estimator overestimated density for 50-m and 100-m radius surveys (Table 4, Appendix 4, Appendix 5).

    For the distance sampling survey type, adjusted density estimates were similar for Program Distance and Program unmarked and slightly less biased than index estimates (Table 4). In Program Distance, the top model as determined by AIC comparison had a uniform key function with a simple polynomial adjustment (Appendix 8). The bulk of the AIC weight was spread across nine models, including seven models with most of the weight and similar estimates and two models with lower weights and higher density estimates (Appendix 8).

    Generally, index estimators were significantly positively correlated with true density. For 100-m and 150-m radius surveys, index estimators were significantly correlated with Ds, Dp, and Da and had the strongest correlation with Da. For 50-m radius surveys, index estimators for all four survey types were significantly correlated only with Dp and Da and had the strongest correlation with Da. A notable exception was the bounded count estimator, which had lower correlation with true densities than the index estimator (Table 4, Appendix 4, Appendix 5).

    Adjusted density estimators had less consistent correlation with true density across survey types than did the index estimator (Table 4). For the double-observer survey type, adjusted density estimates were as correlated with true density as were index estimates (estimates were very similar). For the removal survey type, adjusted density estimates were more weakly correlated with true densities than were index estimates. For the replicated counts survey type, N-mixture model estimates were not significantly correlated with any true density (Ds, Dp, or Da) for 50-m radius surveys and had a significant negative correlation with true density for 100-m and 150-m radius surveys (Table 4, Appendix 4, Appendix 5). This lack of correlation is due to highly inflated N-mixture model density estimates for many simulation iterations (Fig. 2). For the distance sampling survey type, adjusted density estimates were more weakly correlated with true densities than index estimates (Table 4).

    Estimates using the smallest survey radius were generally less biased than estimates using larger radii, but for adjusted estimators (and only for those) they were also more likely to be inflated. Surveys with larger radii generally had stronger correlation with true densities.

    DISCUSSION

    In our simulation, analysis methods used to estimate bird density from simulated BTBW counts varied widely in their performance, and detectability-adjusted analysis methods generally did not outperform index analysis methods. Index estimates were biased, but they were also highly correlated with true density (Dp), particularly for surveys with larger radii. The index estimator would therefore track population changes in our simulated scenario well, providing valuable information for management. Among survey types, the removal survey type showed the least bias, largely because the additional time spent surveying allowed a greater proportion of the population to be available to be detected. This advantage could be diminished in real-world removal surveys if birds were more likely to be double-counted because of the longer survey period (double-counting was not included in our model). Compared to index estimates, adjusted estimates in the distance sampling and removal survey types showed a reduction in bias. The maximum count density estimator for the replicated counts survey type also showed a reduction in bias compared to the index estimator. Adjusted density estimates were less strongly correlated with Dp than were index estimates (Table 4), with the exception of adjusted estimates for the double-observer survey type, which did not differ. Adjusted density estimates using N-mixture models for the replicated counts survey type were prone to inflated estimates and high positive bias. Unadjusted counts, although biased, were better correlated with true abundance and would provide better information about changes in abundance than a detectability-adjusted analysis method, assuming abundance and detection were independent of one another.

    Our simulation of bird survey counts is comprehensive and included heterogeneity in detection due to spatial arrangement, availability, and perceptibility of birds. By incorporating all three components and using empirical data to inform model parameters, this model allowed a more thorough investigation into detection probability than has previously been attempted with simulation. Complexity, however, is a double-edged sword because modeling requires making assumptions about the detection process. We modeled birds spatially as remaining in territories, we modeled song production as an autocorrelated process, and we modeled perceptibility as a function of distance to the observer. To the extent that these assumptions are violated, or that the parameters we selected do not represent a particular species of interest, this model will not accurately reflect counts from real bird surveys. Also, any simulation is a simplification. Our model does not include seasonal variation in singing, double-counting, misidentification, false positives, swamping of observers, or effects of the observer (or other birds) on song production or movement. Also, this simulation is a situation where an index might be expected to perform well because abundance and detection probability are not confounded. Recent studies have examined some of these issues through simulation, particularly related to N-mixture model performance (Link et al. 2018, Monroe et al. 2019). Applying our simulation to additional scenarios with a wider range of parameters affecting detection probability would expand the utility of this study’s conclusions and allow closer comparison to other simulations.

    Simulations can provide insight that field studies cannot. Nichols et al. (2009) recommended that analysis methods for estimating abundance be evaluated for situations where assumptions were likely violated, as capture-recapture models were evaluated in the 1970s and 1980s. Our model represents one such evaluation. Because birds moved throughout our simulated surveys, our model allowed violations of the closure assumption (that there was no change in the population of birds within the sample area during a survey), which is assumed across all analysis methods we considered (Nichols et al. 2009). Birds in this simulation moved often (on average, 78% of birds moved at least once within a 10-minute survey period for the removal survey type, and 36% of birds moved at least once within a 3-minute survey period for all other survey types). With a mean (uncompressed) territory size of 3.6 ha (about half the area surveyed by a 150-m radius point count), many birds moved into or out of the survey area, violating the closure assumption. The removal survey type was longer (10 minutes) than all other survey types (3 minutes), allowing more birds to become available, but also allowing more birds to move into or out of the survey radius. Given the BTBW parameters, indices for removal surveys were less biased than all other survey types, an indication that the benefits of increased availability can outweigh the increase in violations of the closure assumption for longer surveys.

    By including observer error and heterogeneity in estimation of distance to birds, our model allowed violations of the assumption that birds are correctly recorded as inside or outside the survey radius, commonly assumed among analysis methods (Nichols et al. 2009), and the distance sampling analysis method’s assumption that distances to birds are estimated accurately (Thomas et al. 2002). Error in observer-estimated distance had mean 7.6 m for distances ≥ 62.3 m (Alldredge et al. 2007b). The mean error for observations was therefore ≤ 12% of the true distance on average, a relatively mild effect. For index estimators, there was little improvement in bias or correlation with Dp when observer error was omitted (i.e., distance estimation was perfect). At the effect size estimated by Alldredge et al. (2007b), observer error in estimation of distance did not substantially change results.

    For distance sampling and double-observer survey types, the Program unmarked detectability-adjusted estimators were relatively unbiased estimators of Da (Fig. 2). The value of estimating Da, however, may vary situationally. Having good inferences about the density of singing birds is useful only if that density can be related to the total density of birds (i.e., to have an estimate of pa or to be able to make an assumption about pa). Availability is highly variable across time of day, day within season, and mating status (Wilson and Bart 1985, McShea and Rappole 1997, Rosenberg and Blancher 2005), so pa would best be estimated simultaneously with abundance (e.g., Diefenbach et al. 2007) or separately from pd (e.g., Amundson et al. 2014). Doing so may be time-consuming and expensive. Estimators that are better correlated with Dp, such as removal detectability-adjusted estimators, provide clearer inference about the total density of birds.

    For our simulations, where detection was not confounded with abundance, detectability-adjusted methods that were ill-suited to model the detection component responsible for imperfect detection did not improve inference and were no more useful than an index. We found no benefit to using the double-observer detectability-adjusted method because perceptibility was ~1 or ~0 for most birds within the survey period (Fig. 1); only birds within a narrow range of radii had perceptibility such that discrepancies in observation were probable. The double-observer adjusted estimates had similar bias to the index estimates, yet an unaware practitioner might claim that the adjusted estimates were unbiased because he or she used a detectability-adjusted method and detection probability was accounted for. We recommend against assuming that any detectability-adjusted method correctly accounts for detection probability, unless there is evidence or reasoning that the detectability-adjusted method correctly addresses the source(s) of imperfect detection in the system, such as using a removal analysis method for a species with low availability. Combination methods can use more than one detectability-adjusted method to estimate components of detection (e.g., Hostetter et al. 2019), but practitioners should weigh the added logistical costs of implementing such methods, such as adding replicate surveys.

    When selecting an analysis method to estimate abundance, a crucial first step is for practitioners to use their knowledge of the system to consider detection and to consider their objectives. If detectability components can be reasonably assumed to vary independently from abundance and relative abundance is sufficient to meet survey objectives, using an index method is preferable. If detection is low but constant and absolute abundance is of interest, a simple correction factor may be used. For any detection components that are unknown or known to be variable, we recommend that practitioners perform a pilot study to estimate the mean value and variability of each component. For example, if pa were unknown for a species, one could identify a site within hearing range of several territories, then perform long surveys during different times of the day, recording the minimum number of birds heard. If a detection component is highly variable, the best detectability-adjusted method will include that detection component in its detection estimator. However, detectability-adjusted estimators risk over-correcting counts (such as the N-mixture models here) and should be used cautiously if estimated detection is low. An alternative method is to follow Skalski and Robson’s (1992) recommendation to collect information to estimate detection and then to select among competing models that do and do not include adjustment for detection. The downside of this suggestion is that logistical costs of collecting such data will be spent even if an index method is eventually selected. The difficulty of recording the distance to all birds detected, the survey interval in which they are first detected, and/or reconciling observations among observers may not be trivial, especially in multi-species surveys or surveys with high abundance. Still, if the data can be collected accurately and without hindering the counting accuracy of observers, those costs may be worthwhile if detection in a system is not well understood.

    Whenever possible, we recommend standardizing surveys to reduce heterogeneity in detection components. Using longer survey times in our model (as seen in the removal survey type) increased availability and increased the correlation of the index estimator with Dp, although we caution that longer surveys may result in double-counting individuals (which we did not simulate). We recommend performing surveys during times of high availability (e.g., morning surveys during the height of breeding-season singing for most species). If methods such as callback surveys (e.g., Conway and Gibbs 2005, Conway 2011) increase availability, they may increase the correlation of counts with Dp. Callback methods, however, can cause birds to move toward the observer (e.g., Johnson et al. 2014) and should be carefully investigated before use. To the extent possible, we also recommend training and testing observers to increase pd and reduce its variability (Kepler and Scott 1981).

    To varying degrees, adjusted estimators for the replicated counts, removal, and distance sampling survey types were susceptible to inflation due to low estimates of detection (Fig. 3), especially for surveys with the smallest radius. If using detectability-adjusted methods, we recommend removing potential outlier estimates, particularly if an unusually high abundance is estimated for a season with low estimated detection. For the removal and distance sampling survey types, this inflation happened rarely and removing outliers adequately corrected estimates. Inflation was the norm, however, for density estimates from N-mixture models for the replicated counts survey type, a problem also explored by Dennis et al. (2015). Because the replicated counts survey type required repeated visits, the N-mixture model density estimator had a small sample size (20 survey sites) and frequently estimated very low values of p (Fig. 3), resulting in inflated estimates of density. Other studies have reported high bias from N-mixture estimators when detection is low (e.g., Couturier et al. 2013) or when there is unmodeled heterogeneity in abundance or detection (Duarte et al. 2018). Barker et al. (2018) found N-mixture models would only provide inference about absolute abundance when p was assumed to be constant or fully explained by covariates, particularly when data were sparse. They concluded that count data under imperfect detection can only be reliably used as indices. We suggest that future simulations should examine if this detectability-adjusted method performs better with a larger sample size but note that no other analysis method in our study suffered this drawback.

    AUTHOR CONTRIBUTIONS

    D.H.J. conceived the project, provided comments on the simulation structure and analysis, and reviewed the manuscript. E.A.R. developed and coded the simulation, produced and analyzed the data, and wrote the manuscript.

    ACKNOWLEDGMENTS

    Thank you to D. E. Andersen, T. W. Arnold, M. A. Etterson, and J. D. Nichols for comments on design and review of early drafts. Thank you to expert bird surveyors R. M. Dunlap, A. T. Egan, and S. R. Loss, who provided comments on the simulation. Thank you also to M. W. Beck and L. F. Gray, who provided technical advice, and P. J. Blancher and K. Pacifici, who provided parameter estimates from their studies of availability and perceptibility, respectively. We thank the editors and two anonymous reviewers, whose comments improved this manuscript. The findings and conclusions in this article are those of the authors and do not necessarily represent the views of the U.S. Fish and Wildlife Service.

    DATA AVAILABILITY

    R scripts used to generate the data for this manuscript are available via the Harvard Dataverse: https://doi.org/10.7910/DVN/FPC2AF

    LITERATURE CITED

    Alldredge, M. W., K. H. Pollock, and T. R. Simons. 2006. Estimating detection probabilities from multiple-observer point counts. Auk 123:1172-1182. https://doi.org/10.1093/auk/123.4.1172

    Alldredge, M. W., K. H. Pollock, T. R. Simons, J. A. Collazo, and S. A. Shriner. 2007a. Time-of-detection method for estimating abundance from point-count surveys. Auk 124:653-664. https://doi.org/10.1093/auk/124.2.653

    Alldredge, M. W., T. R. Simons, and K. H. Pollock. 2007b. A field evaluation of distance measurement error in auditory avian point count surveys. Journal of Wildlife Management 71:2759-2766. https://doi.org/10.2193/2006-161

    Alldredge, M. W., T. R. Simons, and K. H. Pollock. 2007c. Factors affecting aural detections of songbirds. Ecological Applications 17:948-955. https://doi.org/10.1890/06-0685

    Amundson, C. L., J. A. Royle, and C. M. Handel. 2014. A hierarchical model combining distance sampling and time removal to estimate detection probability during avian point counts. Auk 131:476-494. https://doi.org/10.1642/AUK-14-11.1

    Barker, R. J., M. R. Schofield, W. A. Link, and J. R. Sauer. 2018. On the reliability of N-mixture models for count data. Biometrics 74:369-377. https://doi.org/10.1111/biom.12734

    Blomberg, E. J., and C. A. Hagen. 2020. How many leks does it take? Minimum sample sizes for measuring local-scale conservation outcomes in Greater Sage-Grouse. Avian Conservation and Ecology 15:9. https://doi.org/10.5751/ACE-01517-150109

    Bollinger, E. K., T. A. Gavin, and D. C. McIntyre. 1988. Comparison of transects and circular-plots for estimating Bobolink densities. Journal of Wildlife Management 52:777-786. https://doi.org/10.2307/3800946

    Buckland, S. T., D. R. Anderson, K. P. Burnham, J. L. Laake, D. L. Borchers, and L. Thomas. 2001. Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK. https://doi.org/10.1093/oso/9780198506492.001.0001

    Buckland, S. T., K. P. Burnham, D. R. Anderson, and J. L. Laake. 1993. Distance sampling: estimation of abundance of biological populations. Chapman and Hall, London, UK.

    Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. Springer, New York, New York, USA.

    Burnham, K. P., D. R. Anderson, and J. L. Laake. 1980. Estimation of density from line transect sampling of biological populations. Wildlife Monographs 72:3-202.

    Cimprich, D. A. 2009. Effect of count duration on abundance estimates of Black-capped Vireos. Journal of Field Ornithology 80:94-100. https://doi.org/10.1111/j.1557-9263.2008.00188.x

    Cook, R. D., and J. O. Jacobson. 1979. A design for estimating visibility bias in aerial surveys. Biometrics 35:735-742. https://doi.org/10.2307/2530104

    Conroy, M. J. 1996. Abundance indices. In D. E. Wilson, F. R. Cole, J. D. Nichols, R. Rudran, and M. S. Foster, editors. Measuring and monitoring biological diversity: standard methods for mammals. Smithsonian Institution Press, Washington, D.C., USA.

    Conway, C. J. 2011. Standardized North American marsh bird monitoring protocol. Waterbirds 34:319-346. https://doi.org/10.1675/063.034.0307

    Conway, C. J., and J. P. Gibbs. 2005. Summary of intrinsic and extrinsic factors affecting detection probability of marsh birds. Wetlands 31:403-411. https://doi.org/10.1007/s13157-011-0155-x

    Couturier, T., M. Cheylan, A. Bertolero, G. Astruc, and A. Besnard. 2013. Estimating abundance and population trends when detection is low and highly variable: a comparison of three methods for the Hermann’s tortoise. Journal of Wildlife Management 77:454-462. https://doi.org/10.1002/jwmg.499

    Dawson, D. G. 1981. Counting birds for a relative measure (index) of density. In C. J. Ralph and J. M. Scott, editors. Estimating numbers of terrestrial birds. Studies in Avian Biology 6:12-16. https://sora.unm.edu/sites/default/files/journals/sab/sab_006.pdf

    Dawson, D. K., D. R. Smith, and C. S. Robbins. 1995. Point count length and detection of forest neotropical migrant birds. U.S. Forest Service General Technical Report PSW-GTR-149, Pacific Southwest Research Station, Albany, California, USA. https://research.fs.usda.gov/treesearch/31737

    Dennis, E. B., B. J. T. Morgan, and M. S. Ridout. 2015. Computational aspects of N-mixture models. Biometrics 71:237-246. https://doi.org/10.1111/biom.12246

    Diefenbach, D. R., D. W. Brauning, and J. A. Mattice. 2003. Variability in grassland bird counts related to observer differences and species detection rates. Auk 120:1168-1179. https://academic.oup.com/auk/article/120/4/1168/5562288?login=false7

    Diefenbach, D. R., M. R. Marshall, J. A. Mattice, and D. W. Brauning. 2007. Incorporating availability for detection in estimates of bird abundance. Auk 124:96-106. https://doi.org/10.1093/auk/124.1.96

    Duarte, A., M. J. Adams, and J. T. Peterson. 2018. Fitting N-mixture models to count data with unmodeled heterogeneity: bias, diagnostics, and alternative approaches. Ecological Modelling 374:51-59. https://doi.org/10.1016/j.ecolmodel.2018.02.007

    Efford, M. G., and D. K. Dawson. 2009. Effect of distance-related heterogeneity on population size estimates from point counts. Auk 126:100-111. https://doi.org/10.1525/auk.2009.07197

    Emlen, J. T. 1977. Estimating breeding season bird densities from transect counts. Auk 94:455-468. https://sora.unm.edu/sites/default/files/journals/auk/v094n03/p0455-p0468.pdf

    Farnsworth, G. L., K. H. Pollock, J. D. Nichols, T. R. Simons, J. E. Hines, and J. R. Sauer. 2002. A removal method for estimating detection probabilities from point-count surveys. Auk 119:414-425. https://doi.org/10.1093/auk/119.2.414

    Fiske, I. J., and R. B. Chandler. 2011. unmarked: an R package for fitting hierarchical models of wildlife occurrence and abundance. Journal of Statistical Software 43:1-23. https://doi.org/10.18637/jss.v043.i10

    Forcey, G. M., J. T. Anderson, F. K. Ammer, and R. C. Whitmore. 2006. Comparison of two double-observer point-count approaches for estimating breeding bird abundance. Journal of Wildlife Management 70:1674-1681. https://doi.org/10.2193/0022-541X(2006)70[1674:COTDPA]2.0.CO;2

    Granholm, S. L. 1983. Bias in density estimates due to movement of birds. Condor 85:243-248. https://doi.org/10.2307/1367261

    Holmes, R. T., N. L. Rodenhouse, and T. S. Sillett. 2005. Black-throated Blue Warbler (Setophaga caerulescens). In A. Poole, editor. Birds of North America. Cornell Lab of Ornithology, Ithaca, New York, USA. https://doi.org/10.2173/bna.87

    Holmes, R. T., T. W. Sherry, and F. W. Sturges. 1986. Bird community dynamics in a temperate deciduous forest: long-term trends at Hubbard Brook. Ecological Monographs 56:201-220. https://doi.org/10.2307/2937074

    Hostetter, N. J., B. G. Gardner, T. S. Sillett, K. H. Pollock, and T. R. Simons. 2019. An integrated model decomposing the components of detection probability and abundance in unmarked populations. Ecosphere 10(3):e02586. https://doi.org/10.1002/ecs2.2586

    Jennrich, R. I., and F. B. Turner. 1969. Measurement of non-circular home range. Journal of Theoretical Biology 22:227-237. https://doi.org/10.1016/0022-5193(69)90002-2

    Johnson, D. H. 2008. In defense of indices: the case of bird surveys. Journal of Wildlife Management 72:857-868. https://wildlife.onlinelibrary.wiley.com/doi/abs/10.2193/2007-294/

    Johnson, D. H., C. E. Braun, and M. A. Schroeder. 2007. The bounded-count method for analysis of lek counts. Pages 25-30 in K. P. Reese and R. T. Bowyer, editors. Monitoring populations of sage grouse. College of Natural Resources Experiment Station Bulletin 88, University of Idaho, Moscow, Idaho, USA. https://wdfw.wa.gov/sites/default/files/publications/01277/wdfw01277.pdf

    Johnson, F. A., R. M. Dorazio, T. D. Castellón, J. Martin, J. O. Garcia, and J. D. Nichols. 2014. Tailoring point counts for inference about avian density: dealing with nondetection and availability. Natural Resource Modeling 27:163-177. https://doi.org/10.1111/nrm.12024

    Kepler, C. B., and J. M. Scott. 1981. Reducing bird count variability by training observers. In C. J. Ralph and J. M. Scott, editors. Estimating numbers of terrestrial birds. Studies in Avian Biology 6:366-371. https://sora.unm.edu/sites/default/files/SAB_006_1981_P366-371%20Part%207%20Reducing%20Bird%20Count%20Variability%20by%20Training%20Observers%20Cameron%20B.%20Kepler%20J.%20Michael%20Scott.pdf

    Kéry, M. 2008. Estimating abundance from bird counts: binomial mixture models uncover complex covariate relationships. Auk 125:336-345. https://doi.org/10.1525/auk.2008.06185

    Kéry, M., and J. A. Royle. 2016. Applied hierarchical modeling in ecology: analysis of distribution, abundance and species richness in R and BUGS: volume 1: prelude and static models. Elsevier, London, UK. https://doi.org/10.1016/j.rmb.2017.03.028

    Kéry, M., and J. A. Royle. 2020. Applied hierarchical modeling in ecology: analysis of distribution, abundance and species richness in R and BUGS: volume 2: dynamic and advanced models. Elsevier, London, UK.

    Latif, Q. S., J. J. Valente, A. Johnston, K. L. Davis, F. A. Fogarty, A. W. Green, G. M. Jones, L. Leu, N. L. Michel, D. C. Pavlacky, E. A. Rigby, C. S. Rushing, J. S. Sanderlin, M. W. Tingley, and Q. Zhao. 2024. Designing count-based studies in a world of hierarchical models. Journal of Wildlife Management 88:e22622. https://doi.org/10.1002/jwmg.22622

    Link, W. A., and R. J. Barker. 2010. Bayesian inference with ecological applications. Academic Press, New York, New York, USA.

    Link, W. A., M. R. Schofield, R. J. Barker, and J. R. Sauer. 2018. On the robustness of N-mixture models. Ecology 99:1547-1551. https://doi.org/10.1002/ecy.2362

    Matsuoka, S. M., C. L. Mahon, C. M. Handel, P. Sólymos, E. M. Bayne, P. C. Fontaine, and C. J. Ralph. 2014. Reviving common standards in point-count surveys for broad inference across studies. Condor: Ornithological Applications 116:599-608. https://doi.org/10.1650/CONDOR-14-108.1

    McShea, W. J., and J. H. Rappole. 1997. Variable song rates in three species of passerines and implications for estimating bird populations. Journal of Field Ornithology 68:367-375. https://sora.unm.edu/sites/default/files/journals/jfo/v068n03/p0367-p0375.pdf

    Monroe, A. P., G. T. Wann, C. L. Aldridge, and P. S. Coates. 2019. The importance of simulation assumptions when evaluating detectability in population models. Ecosphere 10(7):e02791. https://doi.org/10.1002/ecs2.2791

    Moore, J. E., D. M. Scheiman, and R. K. Swihart. 2004. Field comparison of removal and modified double-observer modeling for estimating detectability and abundance of birds. Auk 121:865-876. https://academic.oup.com/auk/article/121/3/865/5561910?login=false

    Nichols, J. D., J. E. Hines, J. R. Sauer, F. W. Fallon, J. E. Fallon, and P. J. Heglund. 2000. A double-observer approach for estimating detection probability and abundance from point counts. Auk 117:393-408. https://doi.org/10.1093/auk/117.2.393

    Nichols, J. D., L. Thomas, and P. B. Conn. 2009. Inferences about landbird abundance from count data: recent advances and future directions. Pages 201-235 in D. L. Thomson, E. G. Cooch, and M. J. Conroy, editors. Modeling demographic processes in marked populations. Springer-Verlag, New York, New York, USA.

    North American Breeding Bird Survey. 1998. Instructions for conducting the North American Breeding Bird Survey. U.S. Geological Survey Patuxent Wildlife Research Center, Laurel, Maryland, USA. https://www.pwrc.usgs.gov/bbs/participate/instructions.html

    Pacifici, K., T. R. Simons, and K. H. Pollock. 2008. Effects of vegetation and background noise on the detection process in auditory avian point-count surveys. Auk 125:600-607. https://doi.org/10.1525/auk.2008.07078

    R Core Team. 2023. R: a language and environment for statistical computing, version 3.0.2. R Foundation for Statistical Computing, Vienna, Austria. https://cran-archive.r-project.org/bin/windows/base/old/3.0.2/

    Ralph, C. J., S. Droege, and J. R. Sauer. 1995. Managing and monitoring birds using point counts: standards and applications. Pages 161-168 in C. J. Ralph, J. R. Sauer, and S. Droege, editors. Monitoring bird populations by point count. U.S. Forest Service General Technical Report PSW-GTR-149, Pacific Southwest Research Station, Albany, California, USA. https://research.fs.usda.gov/treesearch/31755

    Ralph, C. J., G. R. Geupel, P. Pyle, T. E. Martin, and D. F. DeSante. 1993. Handbook of field methods for monitoring landbirds. U.S. Forest Service General Technical Report PSW-GTR-144. https://research.fs.usda.gov/treesearch/3639

    Rigby, E. A., and D. H. Johnson. 2019. Factors affecting detection probability, effective area surveyed, and species misidentification in grassland bird point counts. Condor: Ornithological Applications 121:3. https://doi.org/10.1093/condor/duz030

    Robbins, C. S. 1981a. Bird activity levels related to weather. In C. J. Ralph and J. M. Scott, editors. Estimating numbers of terrestrial birds. Studies in Avian Biology 6:301-310. https://sora.unm.edu/sites/default/files/SAB_006_1981_P301-310%20Part%206%20Bird%20Activity%20Levels%20Related%20to%20Weather%20Chandler%20S.%20Robbins.pdf

    Robbins, C. S. 1981b. Effect of time of day on bird activity. In C. J. Ralph and J. M. Scott, editors. Estimating numbers of terrestrial birds. Studies in Avian Biology 6:275-286. https://sora.unm.edu/sites/default/files/SAB_006_1981_P275-286%20Part%206%20Effect%20of%20Time%20of%20Day%20on%20Bird%20Activity%20Chandler%20S.%20Robbins.pdf

    Robbins, C. S., D. Bystrak, and P. H. Geissler. 1986. The breeding bird survey: its first fifteen years, 1965-1979. Fish and Wildlife Service Resource Publication 157, U.S. Department of the Interior, Washington, D.C., USA. http://pubs.usgs.gov/unnumbered/5230189/report.pdf

    Robbins, M. B., A. S. Nyari, M. Papes, and B. W. Benz. 2009. Song rates, mating status, and territory size of Cerulean Warblers in Missouri Ozark riparian forest. Wilson Journal of Ornithology 121:283-289. https://doi.org/10.1676/08-100.1

    Robson, D. S., and J. H. Whitlock. 1964. Estimation of a truncation point. Biometrika 51:33-39. https://doi.org/10.1093/biomet/51.1-2.33

    Rosenberg, K. V., and P. J. Blancher. 2005. Setting numerical population objectives for priority landbird species. U.S. Forest Service General Technical Report PSW-GTR-191. Pacific Southwest Research Station, Albany, California, USA. https://research.fs.usda.gov/treesearch/31476

    Royle, J. A. 2004. N-mixture models for estimating population size from spatially replicated counts. Biometrics 60:108-115. https://doi.org/10.1111/j.0006-341X.2004.00142.x

    Royle, J. A., and R. M. Dorazio. 2008. Hierarchical modeling and inference in ecology. Elsevier Academic Press, Oxford, UK.

    Ruiz-Gutiérrez, V., and E. F. Zipkin. 2011. Detection biases yield misleading patterns of species persistence and colonization in fragmented landscapes. Ecosphere 2(5). https://doi.org/10.1890/ES10-00207.1

    Sauer, J. R., J. E. Hines, J. E. Fallon, K. L. Pardieck, D. J. Ziolkowski, Jr., and W. A. Link. 2014. The North American Breeding Bird Survey, results and analysis 1966-2012, version 02.19.2014. USGS Patuxent Wildlife Research Center, Laurel, Maryland, USA.

    Sauer, J. R., B. G. Peterjohn, and W. A. Link. 1994. Observer differences in the North American Breeding Bird Survey. Auk 111:50-62. https://doi.org/10.2307/4088504

    Scott, J. M., and C. J. Ralph. 1981. Introduction. In C. J. Ralph and J. M. Scott, editors. Estimating numbers of terrestrial birds. Studies in Avian Biology 6:1-2. https://research.fs.usda.gov/treesearch/31147

    Scott, J. M., and F. L. Ramsey. 1981. Length of count period as a possible source of bias in estimating bird densities. In C. J. Ralph and J. M. Scott, editors. Estimating numbers of terrestrial birds. Studies in Avian Biology 6:409-413. https://sora.unm.edu/sites/default/files/SAB_006_1981_P409-413%20Part%208%20Length%20of%20Count%20Period%20as%20a%20Possible%20Source...%20J.%20Michael%20Scott,%20Fred%20L.%20Ramsey.pdf

    Seber, G. A. F. 1982. The estimation of animal abundance and related parameters, Second edition. Charles Griffin and Company Ltd., London, UK.

    Sherry, T. W., and R. T. Holmes. 1985. Dispersion patterns and habitat responses of birds in northern hardwoods forests. Pages 286-309 in M. L. Cody, editor. Habitat selection in birds. Academic Press, Orlando, Florida, USA.

    Simons, T. R., M. W. Alldredge, K. H. Pollock, and J. M. Wettroth. 2007. Experimental analysis of the auditory detection process on avian point counts. Auk 124:986-999. https://doi.org/10.1093/auk/124.3.986

    Skalski, J. R., and D. S. Robson. 1992. Techniques for wildlife investigators: design and analysis of capture data. Academic Press, San Diego, California, USA.

    Staicer, C. A., V. Ingalls, and T. W. Sherry. 2006. Singing behavior varies with breeding status of American Redstarts (Setophaga ruticilla). Wilson Journal of Ornithology 118:439-451. https://doi.org/10.1676/05-056.1

    Steele, B. B. 1992. Habitat selection by breeding Black-throated Blue Warblers at two spatial scales. Ornis Scandinavica 23:33-42. https://doi.org/10.2307/3676425

    Steele, B. B. 1993. Selection of foraging and nesting sites by Black-throated Blue Warblers: their relative influence on habitat choice. Condor 95:568-579. https://sora.unm.edu/sites/default/files/journals/condor/v095n03/p0568-p0579.pdf

    Stroock, D. W. 2005. An introduction to Markov processes. Springer, Berlin, Germany.

    Thomas, L., S. T. Buckland, K. Burnham, D. Anderson, J. Laake, D. Borchers, and S. Strindberg. 2002. Distance sampling. Encyclopedia of Environmetrics. Wiley, West Sussex, UK.

    Thomas, L., S. T. Buckland, E. A. Rexstad, J. L. Laake, S. Strindberg, S. L. Hedley, J. R. B. Bishop, T. A. Marques, and K. P. Burnham. 2010. Distance software: design and analysis of distance sampling surveys for estimating population size. Journal of Applied Ecology 47:5-14. https://doi.org/10.1111/j.1365-2664.2009.01737.x

    Thompson, F. R. III, W. D. Dijak, T. G. Kulowiec, and D. A. Hamilton. 1992. Breeding bird populations in Missouri Ozark forests with and without clearcutting. Journal of Wildlife Management 56:23-30. https://doi.org/10.2307/3808787

    Thompson, F. R. III, and F. A. La Sorte. 2008. Comparison of methods for estimating bird abundance and trends from historical count data. Journal of Wildlife Management 72:1674-1682. https://doi.org/10.2193/2008-135

    Verner, J. 1985. Assessment of counting techniques. Current Ornithology 2:247-302. https://doi.org/10.1007/978-1-4613-2385-3_8

    Wilson, D. M., and J. Bart. 1985. Reliability of singing bird surveys: effects of song phenology during the breeding season. Condor 87:69-73. https://digitalcommons.usf.edu/cgi/viewcontent.cgi?article=12084&context=condor

    Zembal, R., and B. W. Massey. 1987. Seasonality of vocalizations by Light-footed Clapper Rails. Journal of Field Ornithology 58:41-48. https://sora.unm.edu/sites/default/files/journals/jfo/v058n01/p0041-p0048.pdf

    Corresponding author:
    Elizabeth Rigby
    elizabeth_rigby@fws.gov
    Appendix 1
    Appendix 2
    Appendix 3
    Appendix 4
    Appendix 5
    Appendix 6
    Appendix 7
    Appendix 8
    Fig. 1. Observer-specific perceptibility of a single song (a) and perceptibility given nine opportunities for detection (b) as a function of distance to observer in simulated surveys of Black-throated Blue Warblers (Setophaga caerulescens). Mean number of songs produced by birds that sang was 9.06 for double-observer surveys (SD = 6.37 songs). Perceptibility is shown for Observers 1 and 2 when ambient noise was (dashed lines) and was not (solid lines) present during surveys.
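
    The relationship between the two panels of Fig. 1 can be sketched numerically. The caption does not state how panel (b) was derived from panel (a); the example below assumes that each of the nine songs is an independent detection opportunity with the same single-song perceptibility, which may differ from the simulation's actual rule. The function name `cumulative_perceptibility` and the trial probabilities are hypothetical.

```python
def cumulative_perceptibility(p_single: float, n_songs: int = 9) -> float:
    """Probability of perceiving at least one of n_songs songs.

    Assumption: songs are independent detection opportunities that share
    the same single-song perceptibility p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    return 1.0 - (1.0 - p_single) ** n_songs


# Hypothetical single-song perceptibilities, e.g., at increasing distances.
for p_single in (0.8, 0.5, 0.2, 0.05):
    p_nine = cumulative_perceptibility(p_single)
    print(f"single song: {p_single:.2f} -> nine songs: {p_nine:.3f}")
```

    Under this assumption, even a modest single-song perceptibility accumulates quickly over repeated songs, which is consistent with panel (b) showing higher perceptibility than panel (a) at a given distance.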
    Fig. 2. Bias of density estimates as compared to true densities for simulated surveys of Black-throated Blue Warblers (Setophaga caerulescens). Boxplot displays median value (horizontal bar), first and third quartiles (hinges), and the largest and smallest values no further than 1.5 × the inter-quartile range from the hinges (whiskers). We compare index density estimates (Index) to model-averaged adjusted density estimates (Adjusted) for 30 iterations each of the distance sampling, double-observer, replicated counts, and removal survey types. All surveys shown had a survey radius of 100 m except for the distance sampling surveys, where data were right-truncated, eliminating the most distant 10% of observations (Buckland et al. 2001). We display bias as compared to Dp, the density of birds present at the beginning of the survey, and show Da and Ds (the density of available birds present at the beginning of the survey and the density of birds with territories that overlapped the survey radius, respectively) for comparison.
    Fig. 3. Adjusted density estimates from N-mixture models for the replicated counts survey type, as a function of estimated detection (p), for simulated surveys of Black-throated Blue Warblers (Setophaga caerulescens). Density estimates were inflated when estimated detection was < 0.06.
    Table 1. Parameter values for simulated surveys of Black-throated Blue Warblers (Setophaga caerulescens). Human effort (min) was constant across survey types. Survey time (sites × survey length) and logistical time (20 minutes travel to each site per observer) were used to estimate human effort needed to accomplish surveys.

    Survey type        Sites  Replications  Survey length (min)  Simultaneous observers  Survey time (min)  Logistical time (min)  Total (min)
    Double-observer    30     1             3                    2                       180                1200                   1380
    Removal            46     1             10                   1                       460                920                    1380
    Replicated counts  20     3             3                    1                       180                1200                   1380
    Distance sampling  60     1             3                    1                       180                1200                   1380
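
    The effort totals in Table 1 can be reproduced with a few lines of arithmetic. This sketch assumes that both survey time and the 20-minute travel time accrue once per site, per replicate visit, and per observer; that rule is an inference from the table values rather than a formula stated in the caption. The names `DESIGNS` and `human_effort` are hypothetical.

```python
TRAVEL_MIN_PER_VISIT = 20  # travel to each site per observer, from the Table 1 caption

# Design parameters copied from Table 1.
DESIGNS = {
    "Double-observer": dict(sites=30, replications=1, length_min=3, observers=2),
    "Removal": dict(sites=46, replications=1, length_min=10, observers=1),
    "Replicated counts": dict(sites=20, replications=3, length_min=3, observers=1),
    "Distance sampling": dict(sites=60, replications=1, length_min=3, observers=1),
}


def human_effort(sites, replications, length_min, observers):
    """Return (survey, logistical, total) minutes of human effort.

    Assumption: every observer-visit costs the survey length plus one
    travel leg, so both components scale with sites x replications x observers.
    """
    observer_visits = sites * replications * observers
    survey = observer_visits * length_min
    logistical = observer_visits * TRAVEL_MIN_PER_VISIT
    return survey, logistical, survey + logistical


for name, design in DESIGNS.items():
    survey, logistical, total = human_effort(**design)
    print(f"{name}: survey={survey}, logistical={logistical}, total={total}")
```

    With these inputs, every survey type totals 1380 minutes, matching the constant human effort reported in the table.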
    Table 2. The coarse-scale transition matrix was used to model autocorrelation for birds entering and leaving singing mode in simulated surveys of Black-throated Blue Warblers (Setophaga caerulescens). Transition probabilities for birds in singing mode were fixed, with P(M|M) = 0.98 and P(NM|M) = 0.02. Transition probabilities for birds not in singing mode, P(NM|NM) and P(M|NM), varied to produce Z (Eq. 6 and Eq. 7) that equaled the desired PrSing.yijrk (the interval-specific probability of a bird being in singing mode for at least one 2-second interval within 10 minutes).

    State at interval k-1             State at interval k
                                      Bird j in singing mode (M)  Bird j not in singing mode (NM)
    Bird j in singing mode (M)        P(M|M)                      P(NM|M)
    Bird j not in singing mode (NM)   P(M|NM)                     P(NM|NM)
    Table 3. Fine-scale transition probabilities were used to model the autocorrelation of singing for simulated Black-throated Blue Warblers (Setophaga caerulescens) while in singing mode. The pattern of fine-scale singing was produced using one state (S) in which the bird sang (Sings.yijrk = 1) and three states (NS1, NS2, NS3) in which it did not sing (Sings.yijrk = 0). Values represent the probability of transitioning to the given state in interval k, given the previous state in interval k – 1.

    State in interval k-1  State in interval k
                           S     NS1   NS2   NS3
    S                      0     1     0     0
    NS1                    0     0     1     0
    NS2                    0.08  0     0     0.92
    NS3                    0.80  0     0     0.20
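
    The long-run behavior implied by the Table 3 matrix can be explored with a short calculation. The sketch below assumes a bird stays in singing mode indefinitely (an idealization, since the coarse-scale matrix in Table 2 moves birds in and out of singing mode) and approximates the stationary distribution by repeated multiplication. The names `FINE_SCALE`, `STATES`, and `stationary_distribution` are hypothetical and are not taken from the study's scripts.

```python
STATES = ["S", "NS1", "NS2", "NS3"]

# Rows: state in interval k-1; columns: state in interval k (Table 3).
FINE_SCALE = [
    [0.00, 1.0, 0.0, 0.00],  # S   -> always NS1
    [0.00, 0.0, 1.0, 0.00],  # NS1 -> always NS2
    [0.08, 0.0, 0.0, 0.92],  # NS2 -> S or NS3
    [0.80, 0.0, 0.0, 0.20],  # NS3 -> S or stays in NS3
]


def stationary_distribution(matrix, n_steps=2000):
    """Approximate the stationary distribution by power iteration.

    Starts from a uniform distribution and applies the transition matrix
    n_steps times; adequate here because the chain can reach every state
    and NS3 can return to itself, so the iteration settles down.
    """
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(n_steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


pi = stationary_distribution(FINE_SCALE)
for state, prob in zip(STATES, pi):
    print(f"{state}: {prob:.4f}")
```

    Under this idealization, the bird occupies state S in roughly 24% of intervals (1/4.15). If each interval lasts 2 seconds, as the Table 2 caption indicates for the coarse scale, that corresponds to about one song every 8.3 seconds while in singing mode.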
    Table 4. Index method and detection-adjusted density estimates, bias, Pearson correlation coefficients, and true density (birds/ha) for simulated bird surveys of Black-throated Blue Warblers (Setophaga caerulescens). For adjusted-density estimates, we report model-averaged results from the R package unmarked for all survey types; we report additional adjusted-density estimates calculated with Program Distance and the dependent-observer approach (Nichols et al. 2000) for comparison. Results for the double-observer (double), removal, and replicated counts (replicated) survey types are shown for 100-m radius surveys (for 50 and 150-m surveys, see Appendix 1, Appendix 2). Data for the distance sampling (distance) survey type were right-truncated, eliminating the most distant 10% of observations (Buckland et al. 2001), with mean truncation distance = 158 m. Bias was calculated as the difference between the density estimate and the density of birds present at the beginning of the survey, Dp, and is presented as the median difference, the 95% nonparametric bootstrap confidence interval (CI) of the median, and the percentage of Dp. Correlation (r) was calculated for estimators in relation to Dp. Mean and standard deviation for density estimates and true density were calculated across 30 simulation iterations, and four outliers were removed from the adjusted density estimate for the distance sampling survey type. Significant correlation coefficients (according to a t-distribution with 28 degrees of freedom) are noted: *P < 0.05, **P < 0.01, ***P < 0.001.

    Survey type  Estimator                          Estimate      Bias                             r          True density (Dp)
                                                    Mean   SD     Median  Percent  95% CI                     Mean   SD
    Double       Index                              0.222  0.046  -0.202  -48%     -0.223, -0.165  0.388 *    0.437  0.067
    Double       Adjusted density                   0.224  0.047  -0.201  -48%     -0.232, -0.163  0.347      0.437  0.067
    Double       Nichols et al. (2000) density      0.222  0.046  -0.202  -48%     -0.223, -0.165  0.388 *    0.437  0.067
    Removal      Index                              0.314  0.041  -0.111  -26%     -0.125, -0.093  0.583 ***  0.427  0.057
    Removal      Adjusted density                   0.385  0.069  -0.048  -12%     -0.097, -0.014  0.198      0.427  0.057
    Replicated   Index                              0.167  0.037  -0.255  -61%     -0.279, -0.231  0.643 ***  0.421  0.070
    Replicated   Adjusted density                   10.8   9.37   11.1    2295%    2.01, 21.26     -0.358     0.421  0.070
    Replicated   Maximum count density              0.333  0.053  -0.088  -22%     -0.106, -0.061  0.583 ***  0.421  0.070
    Replicated   Bounded count density              0.533  0.083  0.106   26%      0.064, 0.143    0.345      0.421  0.070
    Distance     Index                              0.144  0.022  -0.293  -67%     -0.328, -0.271  0.714 ***  0.431  0.053
    Distance     Adjusted density                   0.213  0.051  -0.226  -53%     -0.266, -0.213  0.555 **   0.431  0.053
    Distance     Program Distance adjusted density  0.237  0.021  -0.205  -45%     -0.252, -0.185  0.587 ***  0.431  0.053
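
    The significance flags in Table 4 can be checked by converting each correlation coefficient to a t statistic with 28 degrees of freedom, as the caption describes. This is a minimal sketch, assuming a two-tailed test and n = 30 iterations for every estimator; the critical values are hard-coded from standard t tables, and the names `t_statistic` and `significance_stars` are hypothetical.

```python
import math

N_ITERATIONS = 30  # simulation iterations per survey type (Table 4 caption)

# Two-tailed critical values of t for 28 degrees of freedom (standard tables).
T_CRITICAL_DF28 = ((3.674, "***"), (2.763, "**"), (2.048, "*"))


def t_statistic(r: float, n: int = N_ITERATIONS) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def significance_stars(r: float) -> str:
    """Return '*', '**', '***', or '' for a correlation based on 30 iterations."""
    t_abs = abs(t_statistic(r))
    for critical, stars in T_CRITICAL_DF28:
        if t_abs > critical:
            return stars
    return ""


# Correlations reported in Table 4.
for r in (0.388, 0.347, 0.583, 0.198, 0.643, -0.358, 0.345, 0.714, 0.555, 0.587):
    print(f"r = {r:+.3f}  t = {t_statistic(r):+.2f}  {significance_stars(r)}")
```

    For example, r = 0.388 gives t of about 2.23, which exceeds the 0.05 critical value but not the 0.01 value, whereas r = 0.347 gives t of about 1.96 and receives no flag.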

    Avian Conservation and Ecology ISSN: 1712-6568