PFVM for SUMO1 protein
Small ubiquitin-related modifier 1 (SUMO1) is highly flexible in its
N- and C-terminal regions and is involved in a variety of cellular
processes, such as nuclear transport, transcriptional regulation,
apoptosis and protein stability. The sequence of the human protein
SUMO1_HUMAN, with 101 amino acids, is available from the UniProt
database. The PFVM was obtained within a few seconds on a personal
computer (4 x [Intel(R) Core(TM) i5-3337 CPU @ 1.80 GHz], 64-bit
Windows operating system) after the SUMO1_HUMAN sequence was entered.
The PFVM for the SUMO1_HUMAN protein is shown in Table 1. It is evident that
each set of 5 amino acids has its own set of folds. For example,
starting from the N-terminus, the first set of 5 amino acids, “MSDQE”,
has 11 PFSC letters, each representing a different folding shape.
Moving forward by one amino acid, the second set, “SDQEA”, has 13
folding shapes; the third set, “DQEAK”, has 11; the fourth set,
“QEAKP”, has 9; the fifth set, “EAKPS”, has 8; and so on. Not only the
numbers but also the patterns of folding shapes differ from one set of
5 amino acids to the next.
With the PFSC scheme, the distribution of both the numbers and the
types of folding shapes across the entire SUMO1_HUMAN protein can be
observed simultaneously. Also, the pattern of folding variations of
SUMO1_HUMAN in the PFVM agrees overall with the disorder determined
from the PDB, which is displayed at the bottom of Table 1: both the
N-terminus and the C-terminus are more disordered than the central
region. Thus, the PFVM demonstrates the local folding variations of
SUMO1 and shows that they depend on the order of amino acids in its
sequence.
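
The sliding five-residue window described above can be sketched in a
few lines of Python. This is a minimal illustration, not the authors'
PFVM implementation: the function name five_residue_windows is
hypothetical, and only the nine N-terminal residues quoted in the text
are used rather than the full UniProt sequence. The shape counts are
the ones reported above for Table 1.

```python
def five_residue_windows(sequence, width=5):
    """Return every contiguous window of `width` residues, stepping by one."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


# Only the N-terminal residues quoted in the text are used here; the full
# 101-residue SUMO1_HUMAN sequence would have to be taken from UniProt.
n_terminal = "MSDQEAKPS"
windows = five_residue_windows(n_terminal)
print(windows)  # ['MSDQE', 'SDQEA', 'DQEAK', 'QEAKP', 'EAKPS']

# Number of PFSC folding shapes the text reports for these five windows.
reported_counts = dict(zip(windows, [11, 13, 11, 9, 8]))
print(reported_counts["SDQEA"])  # 13

# A sequence of n residues yields n - 5 + 1 windows, so 101 residues give 97.
print(len(five_residue_windows("A" * 101)))  # 97
```

Under this windowing, a 101-residue sequence corresponds to 97
overlapping sets of 5 amino acids, one per PFVM column.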
The PFVM in Table 1 also reveals features of the folding conformation
of the entire SUMO1_HUMAN protein. First, the PFVM exposes the
flexibility or rigidity of the protein folding conformations. In the
middle portion of the sequence, the fragment (45-54) has the favored
conformation “AAAADAAAAA” in PFSC letters in the first row, which is
almost a typical alpha-helical conformation, and the favored
conformation “DDDDADDDDD” in the second row, which resembles an
alpha-helical conformation. Also, the fragment in the middle portion
has fewer options for folding variation than the fragments at the
N-terminus and C-terminus. The folding pattern revealed in the PFVM for
SUMO1_HUMAN generally agrees with the 3D structural data available in
the PDB. Seven 3D structures of the SUMO1_HUMAN protein are displayed
in Table 2, which lists their 3D images together with the PDB ID,
measurement method, resolution, solvent and ligand, interacting
protein, pH and temperature. Because these conditions differ, the 3D
structures may have different folding conformations. For comparison,
the superimposition of 17 folding conformations from the seven PDB
structures is displayed in Figure 3, where the fragments in the middle
region share a common folding pattern while the fragments at the
N-terminus and C-terminus are diverse. Second, the PFVM provides the
information needed to
construct all possible conformations of a protein. Each conformation of
SUMO1_HUMAN can be constructed by taking one PFSC letter from each
column of the PFVM, and is expressed as a PFSC string. An astronomical
number of conformations can therefore be generated from the various
combinations of PFSC letters in the PFVM; this number is calculated by
multiplying together the numbers of PFSC letters in the columns, one
column for each set of 5 amino acids. For SUMO1_HUMAN, the number of
all possible conformations is 1.587 x 10^77. Obviously, it is
impossible to display such a huge number of 3D folding structures.
Thus, the PFVM is an efficient way to present the comprehensive local
folding variations of an entire protein, and it can generate all
possible folding conformations of the protein. Furthermore, the most
probable conformations with stable states can be constructed from the
PFVM. One of the most probable conformations of the SUMO1_HUMAN protein
is composed directly of the PFSC letters in the first row of the PFVM
in Table 1.
Therefore, the PFVM provides significant folding information for
SUMO1_HUMAN.
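
The counting and construction rules described above can be illustrated
with a small Python sketch. It is a toy under stated assumptions rather
than the authors' software: the function names are hypothetical, and
the ten-column matrix holds only the two rows quoted in the text for
fragment 45-54, whereas the real columns in Table 1 may contain more
letters.

```python
from itertools import product
from math import prod

# Toy PFVM for fragment 45-54, built ONLY from the two rows quoted in the
# text. Each column lists its PFSC letters from the first row downwards.
first_row = "AAAADAAAAA"
second_row = "DDDDADDDDD"
columns = [[top, below] for top, below in zip(first_row, second_row)]


def count_conformations(columns):
    """Multiply the number of PFSC letters available in each column."""
    return prod(len(column) for column in columns)


def first_row_conformation(columns):
    """Build the PFSC string that takes the first-row letter of every column."""
    return "".join(column[0] for column in columns)


def enumerate_conformations(columns):
    """Yield every PFSC string formed by picking one letter per column."""
    for letters in product(*columns):
        yield "".join(letters)


print(count_conformations(columns))     # 1024
print(first_row_conformation(columns))  # AAAADAAAAA

# Product of the shape counts reported for the first five windows in Table 1.
print(prod([11, 13, 11, 9, 8]))         # 113256
```

Even this two-row toy over ten columns yields 1024 PFSC strings, and
the first five reported column counts alone multiply to 113256, which
shows how quickly the product grows across the whole matrix.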