2. The diversity and intra-group heterogeneity of the VMB were higher in PCOS patients
At the genus level, the proportion of Lactobacillus was reduced (FDR<0.2) and the proportions of Gardnerella and Ureaplasma were increased (FDR<0.2) in the PCOS group compared with the control group (Figure 1a). At the species level, U. parvum, G. vaginalis, A. baumannii, P. buccalis, P. timonensis, and P. acnes were more abundant in the PCOS group, while L. jensenii, L. iners, B. breve, and L. pontis were significantly depleted (FDR<0.2) (Figure 1b).
The increase in the Shannon index and the decrease in the Simpson index indicated that the VMB in the PCOS group had higher diversity than that of the control group (P<0.05) (Figure 2a, 2b). The Chao1 and Shannon indices of the PD subgroup were also higher than those of the control group (P<0.05) (Figure 2c, 2d). However, there was no significant difference in diversity between the PA subgroup and the control group (P>0.05). PCoA indicated no significant difference in VMB structure between groups (P>0.05) (Figure 3a, 3b). The PCOS group had higher intra-group variation than the control group (P<0.05) (Figure 3c).
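As an illustration of how the three alpha-diversity indices behave, the following minimal sketch computes them from a vector of read counts. It is not the pipeline used in this study. It assumes the Simpson index is reported in its dominance form (sum of squared proportions), which is consistent with the reading above that a lower value means higher diversity, and it uses the bias-corrected form of Chao1. The function names and the two count vectors are hypothetical.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_dominance(counts):
    """Simpson dominance D = sum(p_i^2); a lower D means higher diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical read counts (not data from this study).
dominated = [97, 1, 1, 1]   # one taxon dominates the community
even = [25, 25, 25, 25]     # reads spread evenly over four taxa

for name, counts in (("dominated", dominated), ("even", even)):
    print(name, shannon(counts), simpson_dominance(counts), chao1(counts))
```

In this toy comparison the evenly spread community has the higher Shannon index and the lower Simpson dominance, the same direction of change described for the PCOS group.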
We constructed a random forest model for PCOS. Features confirmed by the Boruta algorithm [23] were selected as the species important for classification accuracy. The model could distinguish PCOS patients from healthy controls, with the area under the receiver operating characteristic curve (AUC) reaching a maximum value of 0.8 (Figure 4).
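To make the AUC figure concrete, the sketch below computes it directly from its probabilistic definition: the chance that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative sample, with ties counted as one half. It is independent of the random forest and Boruta steps, which are not reproduced here. The function name `roc_auc`, the labels, and the scores are hypothetical and are not results from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs: 1 = PCOS, 0 = control.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.6, 0.5, 0.3, 0.2, 0.1]

print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.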
3. Correlations between VMB and clinical indicators
A total of 35 bacterial species, drawn from the overlap between the species that differed between PCOS patients and controls and the species that were significant in the random forest model, were used to analyze correlations with clinical indicators. G. vaginalis was positively correlated with serum levels of AMH, E2, and P (P<0.05). AMH, LH, and T showed the strongest positive correlations with U. parvum and A. baumannii, but were negatively correlated with Prevotella. In addition, HDL and TG levels were associated with the abundance of L. acidophilus, P. buccalis, and U. parvum (Figure 5).
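The text does not state which correlation coefficient was used. As one plausible reading, the sketch below computes a Spearman rank correlation, a common choice for skewed abundance data, between a species' relative abundance and a clinical indicator. The choice of Spearman, the function names, and the values are all assumptions made for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values (not data from this study).
abundance = [0.01, 0.05, 0.20, 0.40, 0.80]   # relative abundance of one species
hormone = [1.2, 2.0, 3.1, 9.5, 30.0]         # a serum indicator, arbitrary units

print(spearman(abundance, hormone))
```

Because only the ranks enter the calculation, the coefficient captures any monotonic association and is insensitive to the heavy right tail typical of abundance data.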
4. Lactobacillus crispatus and Prevotella timonensis drove changes in the PCOS vaginal microbiota co-occurrence network
Two networks were constructed separately for the PCOS and control groups. The topology of the two networks was similar. In both groups, modules 1 and 2 mostly contained L. crispatus and L. iners, which were negatively correlated with each other (Figure 6a, 6b). Module 3 was the largest module in both groups and was mainly composed of potential vaginal pathogens, including G. vaginalis, P. bivia, P. timonensis, P. amnii, P. buccalis, P. disiens, A. vaginae, D. micraerophilus, S. sanguinegens, and