3.5. Machine learning-based models to predict important s-OTUs
The overall accuracy of the Random Forest classifier (RFC) model was
0.923 with an accuracy ratio of 1.68, indicating high reliability
(Figure 4a). However, the Gradient Boosting Classifier (GBC) model
showed better prediction accuracy, with a value of 0.967 and an accuracy
ratio of 1.76 (Figure 4b). The accuracy of both RFC and GBC in
predicting important bacterial s-OTUs in the copepod genera ranged from
0.16 to 1 (Figure 4a and 4b). Both supervised machine learning (SML)
classifiers (RFC and GBC) predicted the important s-OTUs in Calanus spp.
and Pleuromamma spp. with high accuracy (1.00), unlike Acartia spp.
(0.5 in RFC and 0.83 in GBC), Temora spp. (0.0 in RFC and 0.66 in GBC)
and Centropages sp. (0.5 in RFC and 0.5 in GBC). The Receiver Operating
Characteristic (ROC) curve values ranged from 0.98 to 1 for both RFC and
GBC (Figure 4c and 4d), indicating a high true-positive rate and a low
false-positive rate for both SML classifiers.
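As an illustration of how an overall accuracy, a per-genus accuracy and a ROC-based score of the kind reported above can be computed, the following is a minimal sketch in plain Python. It is not the pipeline used in this study: the function names, the labels and the scores are hypothetical and invented for the example, and the area under the ROC curve is computed with the rank (Mann-Whitney) formulation for a single one-vs-rest comparison.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted genus matches the true genus."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """For each true genus, the fraction of its samples predicted correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {genus: hits[genus] / totals[genus] for genus in totals}


def binary_auc(scores, labels):
    """Area under the ROC curve for one class against the rest.

    scores: predicted probability of the positive class.
    labels: 1 for the positive class, 0 otherwise.
    Uses the rank formulation: the share of (positive, negative) pairs in
    which the positive sample receives the higher score (ties count 0.5).
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels, for illustration only (not the study's data).
y_true = ["Calanus"] * 4 + ["Pleuromamma"] * 4 + ["Temora"] * 2
y_pred = ["Calanus"] * 4 + ["Pleuromamma"] * 4 + ["Calanus", "Pleuromamma"]

acc = overall_accuracy(y_true, y_pred)
by_genus = per_class_accuracy(y_true, y_pred)

# Hypothetical one-vs-rest scores for a single genus.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 1, 0, 1, 0]
auc = binary_auc(scores, labels)

print(acc, by_genus, auc)
```

In this toy example the overall accuracy stays high even though one genus is never predicted correctly, which is why per-genus values are worth inspecting alongside the overall score.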
RFC predicted 25 bacterial taxa and one archaeal taxon in five copepod
genera as important s-OTUs with differential hierarchical resolutions
ranging from family to species level. Based on the RFC prediction
accuracy values, only the s-OTUs predicted as important for Calanus spp.
and Pleuromamma spp. can be considered, owing to the low prediction
accuracy for Acartia spp., Temora spp. and Centropages sp. The following
s-OTUs were predicted as important by RFC only for Calanus spp.:
Photobacterium, Vibrio shilonii, Acinetobacter johnsonii, Acinetobacter
schindleri, Micrococcus, Micrococcus luteus, Anaerospora, and
Methylobacteriaceae. Specific important s-OTUs for the three other
genera of copepods were not evident (Figure 4e).
In the case of GBC, a total of 28 taxa and one archaeal taxon were
predicted as important s-OTUs for the five copepod genera (Figure 4f).
As with the RFC prediction, the GBC prediction accuracy values indicate
that only the s-OTUs predicted as important for Calanus spp. and
Pleuromamma spp. can be considered further, owing to the low prediction
accuracy for Acartia spp., Temora spp. and Centropages sp. The following
s-OTUs were predicted as important by GBC only for Calanus spp.:
Acinetobacter johnsonii, Vibrio shilonii, Phaeobacter and
Piscirickettsiaceae. The s-OTUs of Marinobacter, Alteromonas,
Pseudoalteromonas, Desulfovibrio, Limnobacter, Sphingomonas,
Methyloversatilis, Enhydrobacter and Coriobacteriaceae were predicted as
important s-OTUs in Pleuromamma spp. [58].