Abstract

Passive acoustic recording is a cost‐effective method for monitoring vocal animals. Within this field, there is an increasing focus on automated detection algorithms for counting calls and estimating call density (in space and time). For accurate interpretation of such results, it is important to understand and correct biases introduced by automated detectors. Often such characterisations are made by having a human observer annotate a portion of a dataset under the assumption that their annotations represent the ground truth. Here, we present a workflow fusing two prior methods on this topic as a step towards more integrated, holistic and robust results. First, we used a closed‐population capture–recapture model to characterise the performance of two or more detectors, regardless of whether they were humans or algorithms, with an adjudicated subset of detections from all of the observers providing a common ground truth. Adjudicated detections were used to estimate false positive rates of detectors, and adjudicated true positive detections were then used to model relationships between probability of detecting calls and signal‐to‐noise ratio. Then we used these models to estimate call densities via a Monte Carlo simulation. To illustrate the workflow, we applied it to a real‐world dataset of Antarctic blue whale calls recorded over a year on their high latitude feeding grounds. A subset of this dataset was annotated by three different human observers, and the full year of data was analysed by two automated detectors. This provided a means to compare results obtained under the assumption that each human observer is the ground truth with results from adjudicated capture–recapture methods. Adjudicated capture–recapture methods provided more consistent call densities across different human observers and automated detectors, compared to those derived assuming detections from individual human observers were the ground truth. Capture–recapture methods also readily accommodate all available information from multiple observers and/or detectors, if more than one of either is available. Whilst not replacing manual annotation for automated detector development, adjudication significantly reduced the amount of labour required for characterisation of multiple automated detectors, offering a more efficient approach for call density estimation from large acoustic datasets.
Miller et al. studied this question.
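To make the capture–recapture idea in the abstract concrete, the following is a minimal sketch of the simplest two-observer case. It is not the model used in the study, which relates detection probability to signal‐to‐noise ratio and estimates call density by Monte Carlo simulation. Here we assume a constant detection probability per observer and use the standard Chapman form of the Lincoln–Petersen estimator. All function names, parameters and example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoObserverSummary:
    n_hat: float  # estimated number of calls actually present
    p1: float     # estimated detection probability of observer 1
    p2: float     # estimated detection probability of observer 2


def chapman_two_observer(n1: int, n2: int, m: int) -> TwoObserverSummary:
    """Chapman estimator for a closed population with two observers.

    n1, n2: adjudicated true positive detections by observers 1 and 2.
    m: calls detected by both observers.
    Assumes constant detection probability (no SNR covariate).
    """
    if m <= 0:
        raise ValueError("need at least one call detected by both observers")
    if m > min(n1, n2):
        raise ValueError("m cannot exceed either observer's count")
    n_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    # Observer 1's detection probability is the fraction of observer 2's
    # calls that observer 1 also found, and vice versa.
    return TwoObserverSummary(n_hat=n_hat, p1=m / n2, p2=m / n1)


def false_positive_rate(n_detections: int, n_true_positive: int) -> float:
    """Share of adjudicated detections that were judged not to be calls."""
    if n_detections <= 0:
        raise ValueError("n_detections must be positive")
    return (n_detections - n_true_positive) / n_detections


def corrected_call_count(n_detections: int, fp_rate: float, p_detect: float) -> float:
    """Raw count corrected for false positives and missed calls."""
    if not 0 < p_detect <= 1:
        raise ValueError("p_detect must be in (0, 1]")
    return n_detections * (1 - fp_rate) / p_detect


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    summary = chapman_two_observer(n1=9, n2=9, m=4)
    print(summary)
    print(corrected_call_count(100, false_positive_rate(100, 80), 0.5))
```

In this simplified form, a detector's raw count is scaled down by its false positive rate and scaled up by its detection probability. The workflow described in the abstract replaces the constant probability with a fitted relationship to signal‐to‐noise ratio.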