1. Introduction
Copepods (Subphylum Crustacea; Class Hexanauplia; Subclass Copepoda) are an abundant and diverse group of zooplankton in the ocean [1, 2]. They play a key role in energy transfer within the pelagic food web [3] and are well known for their wide-ranging and flexible feeding strategies [4]. Copepods, usually no more than a few millimetres in length, host a wide range of bacterial communities both internally and externally, supported by the organic and inorganic nutrients released during feeding and excretion [1-3]. Bacterial communities are also exchanged between copepods and the water column through feeding [5, 6], and copepods transfer microbes from the photic zone as far as the middle of the twilight zone [3, 7, 8]. Because environmental conditions differ between the copepod and the surrounding water, each favours different bacterial communities [6, 7, 9].
However, feeding also changes the composition of bacterial communities in the copepod gut; for example, Rhodobacteraceae were reported to be more abundant in Acartia sp. with a full gut than in starved individuals [10]. Copepods have mutualistic associations with Pseudoalteromonas spp. (Gammaproteobacteria). Gammaproteobacteria were also found to be more abundant in starved Centropages sp., Acartia sp. [10] and Pleuromamma sp. [11]. Meanwhile, a notable difference in bacterial communities was observed between diapausing and actively feeding Calanus finmarchicus [2]. Flavobacteriaceae, for example, were scarce in copepods during diapause and abundant in their actively feeding counterparts [2]. Datta et al. [2] reported that Marinimicrobium (Alteromonadaceae) was relatively more abundant in deep-dwelling copepods than in their shallow counterparts and concluded that copepods show inter-individual microbiome variation, although the factors driving this variation are still unknown. From these early reports, it is clear that bacterial communities associated with copepods vary with many factors, including feeding, life stage, body size and vertical migration through the water column. Moreover, there may be a specific symbiotic relationship and a natural core microbiome that depends not necessarily on the food but on the host environment [10]. Herein, the term 'bacteriobiome' means the total bacterial composition inhabiting a specific biological niche (for example, copepods), including its genomic content and metabolic products [12]. Host-associated microbial communities are essential for maintaining ecosystems, and shifts in these communities can be detrimental.
Therefore, studying the specific bacterial taxa associated with copepods and their variation, and analysing the potential genes within the copepod-associated bacteria (CAB), will help us to understand their role in host health, the marine food web and biogeochemical cycles.
Until now, only a few studies have sought to identify the core bacteria associated with copepods, using clustering patterns [2] and presence/absence data [1]. From these studies, about eight bacterial orders, such as Actinomycetales, Bacillales, Flavobacteriales, Lactobacillales, Pseudomonadales, Rhizobiales and Vibrionales, were found as core members in Pleuromamma spp. [1], whereas members of the phylum Proteobacteria were identified as core OTUs, along with Actinobacteria and Bacteroidetes, in Calanus finmarchicus [2].
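As a rough illustration of how a presence/absence definition of a core can work, the sketch below flags taxa detected in at least a chosen fraction of samples. It is not the procedure used in the cited studies: the function name `core_taxa`, the sample labels, the read counts and the prevalence thresholds are all hypothetical and chosen only for demonstration.

```python
from typing import Dict, List


def core_taxa(counts: Dict[str, Dict[str, int]],
              min_prevalence: float = 0.8) -> List[str]:
    """Return taxa detected (count > 0) in at least `min_prevalence` of samples.

    `counts` maps a sample name to a {taxon: read count} dictionary.
    The 0.8 default is an arbitrary, illustrative threshold.
    """
    if not counts:
        return []
    n_samples = len(counts)
    # Collect every taxon seen in any sample, in a stable (sorted) order.
    taxa = sorted({taxon for sample in counts.values() for taxon in sample})
    core = []
    for taxon in taxa:
        # Presence/absence: a taxon counts as present if it has any reads.
        present = sum(1 for sample in counts.values() if sample.get(taxon, 0) > 0)
        if present / n_samples >= min_prevalence:
            core.append(taxon)
    return core


# Hypothetical toy read counts for three individual copepods.
toy_counts = {
    "copepod_1": {"Vibrionales": 12, "Flavobacteriales": 3, "Bacillales": 0},
    "copepod_2": {"Vibrionales": 7, "Flavobacteriales": 0, "Bacillales": 1},
    "copepod_3": {"Vibrionales": 20, "Flavobacteriales": 5, "Bacillales": 0},
}

strict_core = core_taxa(toy_counts, min_prevalence=1.0)
relaxed_core = core_taxa(toy_counts, min_prevalence=0.6)
print(strict_core)
print(relaxed_core)
```

In this toy table, only Vibrionales is present in every sample, so it is the sole core order under the strict threshold, while relaxing the threshold also admits Flavobacteriales. The outcome is sensitive to the threshold chosen, which is one reason different studies can report different core members.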
Moreover, the copepod gut has an acidic pH and an oxygen gradient that runs from the anal opening to the metasome region. These conditions may favour colonisation by certain groups of bacteria, which could be specialised in iron dissolution, anaerobic methanogenesis [13], nitrite reduction [14] and anaerobic dinitrogen (N2) fixation [15]. At any given time, the abundance of CAB would be two to three orders of magnitude lower than that of bacteria in seawater; however, if we assume one copepod per litre of seawater, the contribution of CAB to marine biogeochemical cycles would be significant [1]. Various studies have already shown that CAB have a potential role in biogeochemical processes such as nitrogen fixation [15, 16], denitrification [9], sulphur cycling [17] and iron mineralisation [13].
The masking effect of the abundant bacterial communities associated with copepod diet, copepod life stage and environmental conditions has been considered the main hindrance to defining core bacterial operational taxonomic units (OTUs; equivalent to species) specific to copepod genera [2, 10]. Herein, we therefore combined data from previous studies of copepod-associated bacteria and used machine-learning algorithms to identify the core bacteria associated with copepods, at least to the genus level. For this, we analysed 16S rRNA gene sequences (V3-V4 and V4-V5 regions; ~16.5 million reads) of CAB from five copepod genera (Acartia spp., Calanus spp., Centropages sp., Pleuromamma spp. and Temora spp.) using the Quantitative Insights into Microbial Ecology (QIIME2) package [18]. We also hypothesised that, if copepod genera have specific OTUs, then different copepods have distinctive CAB, and the biogeochemical potential of the CAB will differ. To test this hypothesis, we used a Random Forest classifier, a Gradient Boosting classifier, Principal Coordinates Analysis (PCoA), Analysis of Composition of Microbiomes (ANCOM), Principal Component Analysis (PCA) and Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt2) [19]. The present study represents one of the largest CAB-related DNA sequence datasets analysed to date.
2. Materials and Methods