Results
Recogniser performance evaluation
Recognisers performed well for most species (Table 3). Performance was high (ROC > 0.8) for templates of L. dumerilii, L. fletcheri, and L. peronii. Performance was also high for N. sudelli and L. raniformis, but the sample sizes of their evaluation files were relatively small (Table 2). Performance was moderately high (ROC > 0.7) for most templates of C. signifera and C. parinsignifera. Conversely, performance was poor for L. tasmaniensis (ROC < 0.6), for which most templates showed low precision and moderate recall.
The L. fletcheri recogniser comprised two templates from two sites. All templates performed well. The first template had very high precision, with only two false positive detections in one sound file. However, it had the greatest number of false negatives (i.e. poorest recall). The highest-performing template had a ROC value of 0.897 and was moderately sensitive, but yielded fewer false positives. The third template was excluded. The L. dumerilii recogniser comprised four templates from two sites. Three of the templates had very high ROC values (0.87-0.81). The fourth template, which displayed the greatest number of false positive detections and the poorest ROC value (0.824), was excluded from the recogniser. The L. peronii recogniser comprised three templates from a single site. Two templates had the same ROC value of 0.847, and all templates had relatively high precision.
The C. signifera recogniser comprised three templates from two sites, two from site S and one from site H (Appendix 1). All three templates performed moderately well, with ROC values between 0.767 and 0.793, modest survey precision and good recall. The C. parinsignifera recognisers were constructed from three templates, stemming from two sites. The template from the first site performed poorly, with a ROC value of 0.671. This template detected over 6700 calls in 58 sound files where the species was absent (low precision). The other templates were more precise and performed moderately well, with ROC values around 0.7-0.73.
The L. tasmaniensis recognisers performed poorly, with ROC values of 0.562, 0.596 and 0.582. One template performed moderately well for survey recall (0.822) but had poor precision and a very high number of detections in sound files where the species was absent. The other templates performed worse.
Two other species had limited validation data. The L. raniformis recogniser comprised three templates from site R. All performed well, with two templates having no false positives (precision of 1.00) and one template having no false negatives (recall of 1.00). All templates had a ROC of 0.917; however, these performance metrics were calculated from only 24 evaluation sound files and should be interpreted cautiously. The N. sudelli recogniser comprised three templates from site M. All templates had a high ROC value of 0.984. All templates had false positive detections in only one file, although the number of detections varied. However, performance was evaluated on only 62 sound files.
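As a minimal illustration of how the precision and recall values reported above relate to false positive and false negative counts, the following Python sketch applies the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). The function name and all counts are hypothetical and are not taken from the evaluation data; the sketch also assumes that detections are scored individually, which may differ from the survey-level metrics used here.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from detection counts (standard definitions)."""
    detected = true_pos + false_pos   # everything the template flagged
    present = true_pos + false_neg    # every call that was actually there
    precision = true_pos / detected if detected else float("nan")
    recall = true_pos / present if present else float("nan")
    return precision, recall


# Hypothetical counts, for illustration only (not the study's data).
no_false_positives = precision_recall(true_pos=10, false_pos=0, false_neg=5)
no_false_negatives = precision_recall(true_pos=10, false_pos=4, false_neg=0)

print(no_false_positives)
print(no_false_negatives)
```

A template with no false positives therefore has a precision of 1.00 regardless of how many calls it misses, and a template with no false negatives has a recall of 1.00 regardless of how many spurious detections it makes, which is why both metrics are reported together.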