Recogniser evaluation and detection of false positives
Recogniser performance was evaluated on a manually verified, balanced subsample of 1-minute sound recordings that were categorised by each species’ presence or absence. The number of evaluation files varied among species. Any detection returned in a recording where the species was present was treated as a true positive; all other detections were treated as false positives. For each template, we quantified the number of call detections in sound files where the species was present (true positive count; Count TP) and absent (false positive count; Count FP). We also quantified the number of sound files in which the presence/absence (PA) of the species was correctly detected (true presence; PA TP), incorrectly detected (false presence; PA FP), missed (false absence; PA FN) or correctly undetected (true absence; PA TN). Using these values, for each template we calculated precision, recall and ROC value, which are given by the formulas: