Table 6. The PFVM of P53_HUMAN embracing folding conformations of known 3D
structures. Top section: residues 101-200 of the P53_HUMAN protein
sequence with a numeric ruler. Middle section: the PFSC strings for 12
known 3D structures of P53_HUMAN available in the PDB: 2YBG, 4AGP, 2XWR,
2X0U, 2J1X, 2J1Y, 2FEJ, 2BIM, 3D05, 3D06, 3ZME and 5LAP. All PFSC
letters for 2YBG are first marked in yellow. For the other 3D
structures, a PFSC letter is marked in yellow only if it differs from
the 2YBG letter in the same column. Fragments that lack atomic
coordinate data in the PDB are indicated by dots.
Bottom section: the PFVM of the P53_HUMAN protein. A PFSC letter in a
PFVM column is highlighted in yellow if the corresponding local folding
shape for 5 successive amino acids is marked in yellow in the given 3D
structures. The PFSC letters are colored as follows: red for typical
helix folds; blue for typical beta folds; pink and light blue for folds
with partial helix or beta character; black for irregular folds.
Astronomical Number of Conformations
The total number of conformations for a protein can be computed from
the aggregation of local folding variations in the PFVM. It is
generally known that the total number of possible folding conformations
of a protein is large, but the so-called astronomical number of
conformations has remained vague for any given protein. With the PFVM,
however, this number can be calculated explicitly, because the total
number of conformations of a protein is the product of the numbers of
PFSC letters in each column of the PFVM. The total numbers of
all possible conformations for ten sample proteins are listed in
Table 7. These samples represent a wide variety of proteins, with
sequence lengths ranging from 101 to 1,382 amino acids. The results
show that the total numbers of possible conformations for these
proteins are large, ranging from 10^77 to 10^784. Two proteins among
these samples, PDCD1_MOUSE
and PDCD1_HUMAN, are products of the same gene, PDCD1, in mouse and
human, respectively. Although both sequences have the same length of
288 amino acids, alignment shows that only 172 amino acids (about 60%)
are identical between the two sequences. According to the PFVMs of the
two proteins, the total numbers of conformations are 3.028 x 10^151 and
2.827 x 10^147, respectively, and the difference arises from the
sequence differences between the two organisms. The results demonstrate
that the
total number of conformations of a protein is related to the length of
its sequence as well as to its amino acid composition. In summary,
starting from the protein sequence, the PFVM not only exposes the
comprehensive local folding variations and predicts the most probable
conformations, but also provides the actual total number of
conformations for a protein.
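
The product rule described above can be illustrated with a minimal Python sketch. It assumes a PFVM is represented simply as a list of columns, each column being a string of candidate PFSC letters; this representation, the function names and the toy letters are hypothetical and are not taken from the authors' software. Because the products quickly reach hundreds of digits, the sketch also sums base-10 logarithms, which gives the order of magnitude directly.

```python
import math
from typing import Sequence


def total_conformations(pfvm_columns: Sequence[str]) -> int:
    """Product of the number of distinct PFSC letters in each PFVM column.

    `pfvm_columns` is a hypothetical representation: one string of
    candidate PFSC letters per column of the matrix.
    """
    total = 1
    for letters in pfvm_columns:
        if not letters:
            raise ValueError("each PFVM column needs at least one PFSC letter")
        total *= len(set(letters))  # count each letter once per column
    return total


def log10_conformations(pfvm_columns: Sequence[str]) -> float:
    """Base-10 logarithm of the same product, computed as a sum of logs."""
    return sum(math.log10(len(set(letters))) for letters in pfvm_columns)


# Toy matrix with four columns holding 1, 2, 3 and 1 candidate letters.
toy_pfvm = ["A", "AV", "AVJ", "B"]
print(total_conformations(toy_pfvm))   # 1 * 2 * 3 * 1 = 6
print(log10_conformations(toy_pfvm))   # log10(6)
```

Python integers have arbitrary precision, so the exact product can be kept even for long sequences; the logarithmic form is convenient when only the exponent is reported, as in Table 7.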