2.3 Bioinformatics, sequence analysis and statistical analysis
To obtain reliable, high-quality valid reads, the raw reads from the Illumina MiSeq platform were preprocessed as follows. Raw reads were demultiplexed, quality-filtered with FASTP (version 0.19.6; Chen, Zhou, Chen, & Gu, 2018), and merged with FLASH (version 1.2.11; Magoc & Salzberg, 2011). High-quality reads were then clustered into operational taxonomic units (OTUs) at a 97% sequence identity threshold with UPARSE (version 7.0; Edgar, 2013), and chimeric sequences were identified and removed. Only OTUs with more than 50 total reads were retained for downstream analysis. The taxonomy of each OTU representative sequence was assigned with RDP Classifier (version 2.11; Wang, Garrity, Tiedje, & Cole, 2007) against the Silva 16S rRNA database (version 138; Quast et al., 2013) at a confidence threshold of 70%.
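As an illustration of the OTU abundance filter, a minimal Python sketch is given below. It is not the pipeline used in this study: the table layout, the sample names, and the function `filter_otus` are hypothetical, and the sketch assumes that the 50-read threshold applies to reads summed across all samples.

```python
def filter_otus(otu_table, min_total_reads=50):
    """Keep OTUs whose reads, summed over all samples, exceed the threshold.

    Assumption: the threshold is applied to the total across samples.
    """
    return {
        otu: counts
        for otu, counts in otu_table.items()
        if sum(counts.values()) > min_total_reads
    }


# Hypothetical OTU table: OTU id -> {sample id -> read count}
otu_table = {
    "OTU1": {"F1": 40, "M1": 30},  # total 70: kept
    "OTU2": {"F1": 20, "M1": 30},  # total 50: dropped (not more than 50)
    "OTU3": {"F1": 0, "M1": 3},    # total 3: dropped
}

filtered = filter_otus(otu_table)
print(sorted(filtered))
```

In this toy table only OTU1 passes, because the comparison is strict and an OTU with exactly 50 reads is excluded.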
Mothur (version 1.30.2) was used to calculate α-diversity indices, including Sobs, Chao1, Shannon, Simpson, and Coverage. Student's t-test was used to compare α-diversity estimates, and a P-value less than 0.05 was considered statistically significant. β-diversity was assessed with Bray-Curtis distances calculated from OTU composition and visualized by principal coordinates analysis (PCoA). An Adonis test (999 permutations) was conducted to assess differences in microbial community structure between the sexes.
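For reference, the indices named above can be sketched in Python for a single sample. This is an illustrative sketch, not the Mothur implementation: it assumes the commonly used definitions (bias-corrected Chao1, Shannon with natural logarithm, Simpson as the sum of n(n-1)/(N(N-1)), and Good's coverage), and the function names and example counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Return Sobs, Chao1, Shannon, Simpson and Good's coverage for one sample."""
    counts = [c for c in counts if c > 0]  # ignore OTUs absent from the sample
    n_total = sum(counts)
    if n_total < 2:
        raise ValueError("at least two reads are needed")
    sobs = len(counts)  # observed OTU richness
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    # Bias-corrected Chao1 richness estimate
    chao1 = sobs + singletons * (singletons - 1) / (2 * (doubletons + 1))
    # Shannon index with natural logarithm
    shannon = -sum((c / n_total) * math.log(c / n_total) for c in counts)
    # Simpson index (higher values indicate lower diversity)
    simpson = sum(c * (c - 1) for c in counts) / (n_total * (n_total - 1))
    # Good's coverage: share of reads not belonging to singleton OTUs
    coverage = 1 - singletons / n_total
    return {
        "sobs": sobs,
        "chao1": chao1,
        "shannon": shannon,
        "simpson": simpson,
        "coverage": coverage,
    }


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical read counts per OTU for one sample
sample = [10, 5, 2, 2, 1, 1, 1, 0]
indices = alpha_diversity(sample)
print(indices)
print(bray_curtis([10, 0, 5], [5, 5, 5]))
```

A pairwise matrix of `bray_curtis` values over all samples is the input that PCoA and the Adonis test would then operate on; those steps are not reproduced here.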
Taxon abundances at the phylum and genus levels were compared between the two sexes with the Wilcoxon rank-sum test, and a two-tailed P-value less than 0.05 was considered significant (bootstrap value, 95%). Biomarkers associated with sex were identified by linear discriminant analysis (LDA) effect size (LEfSe) (Segata et al., 2011). Microbial functions were predicted from the high-quality sequences with Phylogenetic Investigation of Communities by Reconstruction of Unobserved States 2 (PICRUSt2), and an independent-samples t-test (SPSS, version 26.0) was used to assess whether the differences between the two sexes were significant. Figures were prepared with Origin (version 2019).
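The two-tailed Wilcoxon rank-sum comparison can be illustrated with the following Python sketch. It is an assumption-laden illustration rather than the software used in this study: it uses the normal approximation with average ranks for ties and no tie or continuity correction, the bootstrap step is not reproduced, and the function names and relative abundances are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-tailed Wilcoxon rank-sum test (normal approximation).

    Returns the z statistic and the two-tailed P-value.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical relative abundances of one genus in each sex
female = [0.12, 0.15, 0.11, 0.14, 0.13]
male = [0.22, 0.25, 0.21, 0.24, 0.23]

z, p = rank_sum_test(female, male)
print(f"z = {z:.3f}, two-tailed P = {p:.4f}, significant = {p < 0.05}")
```

With small groups an exact test may be preferable to the normal approximation; the approximation is used here only to keep the sketch short.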