Sequence network
For the comparison of PETase sequences, the profile HMM for PETases was
used to identify the PETase core domain, and the sequences of all core
domains were aligned without considering additional regions at the N- or
C-termini (signal peptides or transport domains, respectively). In
superfamily 1 of the LED, 31,560 sequences were annotated as GX-type,
but only 2,930 of these were identified as PETase homologues by the
profile HMM. At a threshold of 55% sequence similarity, the bacterial
PETase core domains formed a large cluster, mainly originating from
Actinobacteria or Proteobacteria (Figure 1). Most of the
sequences from the PMBD were found in this cluster (Figure S3).
In addition, a connected subgroup of PETase core domains from other
bacterial phyla emerged, such as the PETase proteins from Bacteroidetes
or Planctomycetes. Some homologues of PETase core domains also occurred
in enzymes from extremophiles (Figure S4). The fungal PETase
core domains, such as the PETase homologues from Fusarium, were
separated from the bacterial PETase core domains. At a higher threshold
of 60% sequence similarity, the sequences for PETase core domains from
Bacteroidetes or Planctomycetes emerged as a separate cluster
(Figure S5).
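The effect of the similarity threshold on cluster membership can be illustrated with a minimal sketch. It assumes that pairwise similarities between core domains are already available as fractions between 0 and 1; how they were computed is not specified here. The function name `clusters_at_threshold`, the sequence labels, and the similarity values are hypothetical and chosen only to mimic the qualitative behaviour described above. They are not data from the study. Two sequences are connected when their similarity meets the threshold, and clusters are taken as the connected components of the resulting network.

```python
from collections import defaultdict


def clusters_at_threshold(names, similarity, threshold):
    """Return the connected components of a similarity network, largest first.

    names      -- iterable of sequence identifiers (the nodes)
    similarity -- dict mapping (name_a, name_b) to a similarity in [0, 1]
    threshold  -- an edge is drawn when similarity >= threshold
    """
    # Union-find structure: every node starts as its own cluster.
    parent = {name: name for name in names}

    def find(node):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge the clusters of every pair that meets the threshold.
    for (a, b), value in similarity.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each cluster under its root.
    groups = defaultdict(set)
    for name in names:
        groups[find(name)].add(name)

    # Largest cluster first; ties are broken alphabetically for stable output.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Hypothetical toy data (illustrative values only, not taken from the study).
names = ["actino_1", "proteo_1", "bactero_1", "plancto_1", "fungal_1"]
similarity = {
    ("actino_1", "proteo_1"): 0.70,
    ("proteo_1", "bactero_1"): 0.57,   # links the two bacterial groups only at 55%
    ("bactero_1", "plancto_1"): 0.65,
    ("actino_1", "fungal_1"): 0.30,    # below both thresholds
}

at_55 = clusters_at_threshold(names, similarity, 0.55)
at_60 = clusters_at_threshold(names, similarity, 0.60)
```

In this toy example, the four bacterial sequences form one connected cluster at 0.55, whereas at 0.60 the weak link is removed and the Bacteroidetes/Planctomycetes pair splits off as its own cluster. The fungal sequence remains isolated at both thresholds.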