Automatic detection and results cleaning
The trained classifier for each focal species was used to scan the entire two-year audio dataset from each site and automatically detect all occurrences of the target vocalizations. Compared with other acoustic clustering software, Kaleidoscope tends to produce false positives (Knight et al. 2017), so all automatic positive detections were manually verified and false positives were removed (Table 1). Because every positive detection was verified by a trained human observer, the accuracy of the detections used in the results was estimated at 100%. To estimate the false negative rate and the recall rate, the presence or absence of each focal species was manually assessed in 1,000 audio files and compared with the classifier outputs (Table 1). The false negative rate was the percentage of audio recordings in which the focal species was present but not detected by the classifier. The recall rate was the proportion of all vocalizations that were successfully detected by the classifier (true positives/(true positives + false negatives)).
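The two validation metrics defined above can be illustrated with a minimal sketch. This is not the authors' analysis code. The function name `validation_rates` and the toy data are hypothetical, and the sketch makes two simplifying assumptions that the text does not confirm: outcomes are scored per audio file rather than per vocalization, and the false negative rate is expressed as a percentage of all manually assessed files.

```python
import math


def validation_rates(manual_present, classifier_detected):
    """Compare manual presence/absence labels with classifier output.

    Both arguments are equal-length sequences of booleans, one entry per
    manually assessed audio file (hypothetical input format).
    """
    if len(manual_present) != len(classifier_detected):
        raise ValueError("label sequences must have the same length")

    n_files = len(manual_present)
    pairs = list(zip(manual_present, classifier_detected))

    # True positive: species present and detected by the classifier.
    tp = sum(1 for m, c in pairs if m and c)
    # False negative: species present but missed by the classifier.
    fn = sum(1 for m, c in pairs if m and not c)

    # Recall = true positives / (true positives + false negatives).
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    # Assumption: false negative rate as a percentage of all assessed files.
    fn_rate_pct = 100.0 * fn / n_files if n_files > 0 else math.nan

    return {
        "true_positives": tp,
        "false_negatives": fn,
        "recall": recall,
        "false_negative_rate_pct": fn_rate_pct,
    }


# Toy example with 10 files (the study assessed 1,000 per species).
manual = [True, True, True, False, False, True, False, False, True, False]
classifier = [True, False, True, False, False, True, False, False, False, False]
rates = validation_rates(manual, classifier)
print(rates)
```

In this toy example the species is present in five files and the classifier detects it in three of them, so recall is 3/(3 + 2) = 0.6 and the false negative rate is 2 of 10 files, or 20%. If the false negative rate were instead computed only over files in which the species was present, it would equal one minus the file-level recall.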