Figure 5. Protein 3D structures predicted from PFVM-01. The first row shows the 3D structure predicted from the SUMO1_HUMAN PFVM-01 and the second row shows the 3D structure predicted from the P53_HUMAN PFVM-01; on the left, each known 3D structure is compared with the predicted structure (brown). The predicted 3D structures for K4GSD6_9SAUR, C4IXC1_9TELE, A0A851ZE52_9AVES and EP3B_HUMAN are displayed, respectively.
DISCUSSION
Computation and databases for protein folding.
Many computational methodologies and databases for protein folding have been developed [1: Compiani M, Capriotti E. “Computational and theoretical methods for protein folding”. Biochemistry 52(48): 8601–24 (2013)], and the efforts may be divided into two aspects: one is to predict the protein structure with thermodynamic stability, and the other is to investigate the variability of protein conformations.
In the first aspect, structure prediction from a sequence seeks the native folding conformation with thermodynamic stability; the stable structure is mainly controlled by hydrophobic interactions, hydrogen bonds, van der Waals forces and conformational entropy. In general, the methods for protein structure prediction fall into two main categories: template-free modeling and template-based modeling [2: Guo JT, Ellrott K, Xu Y. A historical perspective of template-based protein structure prediction. Methods Mol Biol; 413:3–42, (2008)], [3: Dorn M, E Silva MB, Buriol LS, Lamb LC. Three-dimensional protein structure prediction: methods and computational strategies. Comput Biol Chem; 53PB:251–76, (2014)], [4: Brylinski M. Is the growth rate of Protein Data Bank sufficient to solve the protein structure prediction problem using template-based modeling? Bio-Algorithms and Med-Systems, 11(1):1–7, (2015)]. The template-free methods, i.e., ab initio or de novo approaches, are based on energy functions that are evaluated through molecular dynamics (MD) simulations under various force fields for atomic interactions, or under empirical parameters for interactions between groups of atoms [5: Honig B. Protein folding: from the Levinthal paradox to structure prediction. J Mol Biol; 293:283–93, (1999)], [6: Onuchic JN, Wolynes PG. Theory of protein folding. Curr Opin Struct Biol; 14:70–5, (2004)], [7: Zhang J, Li W, Wang J, Qin M, Wu L, Yan Z, et al. Protein folding simulations: from coarse-grained model to all-atom model. IUBMB Life; 61:627–43, (2009)]. The stable conformation is finally obtained by iterative convergence to lower free energy under a defined force field, such as AMBER [8: Yang, L., Tan, C. H., Hsieh, M. J., Wang, J., Duan, Y., Cieplak, P., Caldwell, J., Kollman, P. A., and Luo, R. New-generation amber united-atom force field. J. Phys. Chem. B 110, 13166–13176, (2006)], CHARMM [9: Brooks, B., et al. CHARMM: The biomolecular simulation program. J. Comput. Chem. 30, 1545–1614, (2009)] or GROMOS [10: Riniker, S., Christ, C. D., Hansen, H. S., Hunenberger, P. H., Oostenbrink, C., Steiner, D., and van Gunsteren, W. F. Calculation of relative free energies for ligand-protein binding, solvation, and conformational transitions using the GROMOS software. J. Phys. Chem. B 115, 13570–13577, (2011)]. The Chemistry at Harvard Macromolecular Mechanics (CHARMM) software [33], [11: Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. “CHARMM: A program for macromolecular energy, minimization, and dynamics calculations”. J Comp Chem. 4 (2): 187–217, (1983)] is one of the most mature programs for molecular dynamics; it minimizes the energy of a protein structure and collects molecular dynamics trajectories, and its force fields include united-atom, all-atom, dihedral-potential-corrected and polarizable variants. The Rosetta software [12: Leaver-Fay, A., Tyka, M., Lewis, S. M., Lange, O. F., Thompson, J., Jacak, R., Kaufman, K., Renfrew, P. D., Smith, C. A., Sheffler, W., Davis, I. W., Cooper, S., Treuille, A., Mandell, D. J., Richter, F., Ban, Y. E., Fleishman, S. J., Corn, J. E., Kim, D. E., Lyskov, S., Berrondo, M., Mentzer, S., Popovic, Z., Havranek, J. J., Karanicolas, J., Das, R., Meiler, J., Kortemme, T., Gray, J. J., Kuhlman, B., Baker, D., and Bradley, P. ROSETTA3: An object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 487, 545–574, (2011)], which runs in distributed form as Rosetta@home on the Berkeley Open Infrastructure for Network Computing (BOINC) platform, is one of the de novo tools for predicting protein structure; it assembles a structure by a Monte Carlo simulated annealing procedure that relies on a library of residue fragments [13: Kroese, D. P.; Brereton, T.; Taimre, T.; Botev, Z. I. “Why the Monte Carlo method is so important today”. WIREs Comput Stat. 6: 386–392, (2014)].
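The fragment-based Monte Carlo simulated annealing idea can be illustrated with a toy example. The following Python sketch is not Rosetta's implementation: the fragment library, the energy function and the cooling schedule are all hypothetical placeholders chosen only to show the fragment insertion move and the Metropolis acceptance rule.

```python
import math
import random

# Hypothetical fragment library: each fragment is three backbone torsion angles (degrees).
FRAGMENT_LIBRARY = [
    (-60.0, -60.0, -60.0),
    (-120.0, -120.0, -120.0),
    (60.0, 60.0, 60.0),
    (-60.0, -120.0, 60.0),
]


def toy_energy(angles):
    """Toy score (not a force field): squared deviation of each angle from -60 degrees."""
    return sum((a + 60.0) ** 2 for a in angles) / 1000.0


def anneal(n_angles=12, n_steps=2000, t_start=5.0, t_end=0.05, seed=0):
    """Return (initial_energy, best_energy, best_angles) for one annealing run."""
    rng = random.Random(seed)
    angles = [rng.choice((-120.0, 60.0)) for _ in range(n_angles)]
    energy = toy_energy(angles)
    initial_energy = energy
    best_angles, best_energy = list(angles), energy
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        # Fragment insertion: overwrite a random 3-angle window with a library fragment.
        start = rng.randrange(n_angles - 2)
        trial = list(angles)
        trial[start:start + 3] = rng.choice(FRAGMENT_LIBRARY)
        delta = toy_energy(trial) - energy
        # Metropolis criterion: always accept downhill moves, sometimes accept uphill ones.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = trial, energy + delta
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
    return initial_energy, best_energy, best_angles


if __name__ == "__main__":
    initial, best, conformation = anneal()
    print(f"initial energy {initial:.2f}, best energy {best:.2f}")
    print(conformation)
```

Uphill moves are accepted with a probability that shrinks as the temperature falls, which is what lets the search escape poor local arrangements early on and settle later.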
In practice, structure prediction is efficient for smaller proteins but requires vast computational resources for larger ones. The template-based methods, such as homology modeling or comparative modeling, align the target sequence to multiple templates from the PDB according to sequence similarity, and then apply energy optimization to predict the protein 3D structure. They assume that homologous sequences have similar folding conformations. Based on homology modeling, the I-TASSER [14: Roy, A., Kucukural, A., and Zhang, Y. I-TASSER: A unified platform for automated protein structure and function prediction. Nat. Protoc. 5, 725–738, (2010)], Robetta [15: Kim, D. E., Chivian, D., and Baker, D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res. 32, W526–W531, (2004)] and MODELLER [16: Eswar, N., Webb, B., Marti-Renom, M. A., Madhusudhan, M. S., Eramian, D., Shen, M. Y., Pieper, U., and Sali, A. Comparative protein structure modeling using MODELLER. Current Protocols in Protein Science, Chapter 2, Unit 2.9, (2007), Wiley, New York], [17: Liu, T., Tang, G. W., and Capriotti, E. Comparative Modeling: The state of the art and protein drug target structure prediction. Comb. Chem. High Throughput Screening 14, 532–537, (2011)] software packages build 3D structures for proteins of unknown structure. If no sufficiently similar sequence is found in the PDB, the template-free approaches supplement the prediction with thermodynamic calculations. Recently, using deep learning, the AlphaFold approach produced the most accurate structure predictions in CASP13 and CASP14 [18: DeepMind’s protein-folding AI has solved a 50-year-old grand challenge of biology. MIT Technology Review. Retrieved (2020)], [19: Sample, Ian (2 December 2018). “Google’s DeepMind predicts 3D shapes of proteins”. The Guardian. Retrieved 30 November (2020)], [20: “DeepMind’s protein-folding AI has solved a 50-year-old grand challenge of biology”. MIT Technology Review. (2020)]. AlphaFold treats the protein structure as a spatial graph with the residues as nodes and the residue–residue connections as edges. The system was trained on the available protein 3D structures in the PDB together with databases of protein sequences of unknown structure. To model the physical interactions within proteins, it uses an attention-based neural network that learns residue–residue and atom–atom relationships together with an internal confidence measure. The structure is refined using a multiple sequence alignment (MSA) of evolutionarily related sequences and a representation of amino acid residue pairs. By iterating this process, AlphaFold predicts the underlying physical structure of the protein and is able to determine highly accurate structures.
In the second aspect, the objective is to investigate the variation of conformations, because proteins are in essence not static structures but conformational ensembles with multiple states. It is generally known that a protein adjusts its folding conformation in different environments or on interaction with a ligand or another protein. Also, intrinsically disordered proteins and regions (IDPs/IDRs) are widely distributed among natural proteins and are associated with many biological processes and diseases [21: Chen J, Guo M, Wang X, et al. A comprehensive review and comparison of different computational methods for protein remote homology detection. Briefings in Bioinformatics (2):2. 1–17, (2017)]. IDPs/IDRs in protein 3D structures can be identified by many experimental techniques [22: van der Lee R, et al. Classification of intrinsically disordered regions and proteins. Chemical Reviews, 114(13):6589, (2014)]. DisProt [23: Piovesan D, Tabaro F, Micetic I, et al. DisProt 7.0: a major update of the database of disordered proteins. Nucleic Acids Res; 45:D1123–4, (2017)], IDEAL [24: Fukuchi S, Sakamoto S, Nobe Y, et al. IDEAL: intrinsically disordered proteins with extensive annotations and literature. Nucleic Acids Res; 40:D507–11, (2012)] and MobiDB [25: Potenza E, Di Domenico T, Walsh I, et al. MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins. Nucleic Acids Res; 43:D315–20, (2015)] are useful databases for IDPs/IDRs, and the PDB also provides examples. Moreover, under physiological conditions, a native protein is able to undergo a reversible transition between disordered and ordered folding conformations. Anfinsen’s Nobel Prize-winning experiments, summarized in his 1973 article [26: Anfinsen CB. Principles that govern the folding of protein chains. Science 181: 223–230, (1973)],
showed that the protein ribonuclease can be reversibly denatured and renatured in a test tube, and since then thousands of other proteins have been shown to fold reversibly as conditions change. Proteins fold reversibly because the free-energy difference between the folded and unfolded populations is small (5 to 15 kcal/mol) [27: Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. Nucleic Acids Res 34:D204–D206, (2006)]. Different computational approaches have been developed that focus on the variability of protein folding. In the 1970s, Karplus and Weaver developed the diffusion-collision (DC) model [28: Karplus, M., and Weaver, D. L. Protein-folding dynamics. Nature 260, 404–406, (1976)], [29: Karplus, M., and Weaver, D. L. Diffusion-collision model for protein folding. Biopolymers 18, 1421–1437, (1979)], [30: Islam, S. A., Karplus, M., and Weaver, D. L. Application of the diffusion-collision model to the folding of three-helix bundle proteins. J. Mol. Biol. 318, 199–215, (2002)], [31: Myers, J. K., and Oas, T. G. Preorganized secondary structure as an important determinant of fast protein folding. Nat. Struct. Biol. 8, 552–558, (2001)], which explores the long-time evolution of the protein and allows large-amplitude changes in the folding dynamics. Later it was modified into the foldon diffusion-collision (FDC) model [32: Fuxreiter, M., Simon, I., Friedrich, P., and Tompa, P. Preformed structural elements feature in partner recognition by intrinsically unstructured proteins. J. Mol. Biol. 338, 1015–1026, (2004)], [33: Compiani, M., Capriotti, E., and Casadio, R. Dynamics of the minimally frustrated helices determine the hierarchical folding of small helical proteins. Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys. 69, 051905, (2004)], [34: Stizza, A., Capriotti, E., and Compiani, M. A minimal model of three-state folding dynamics of helical proteins. J. Phys. Chem. B 109, 4215–4226, (2005)],
which provided a more refined description of folding transitions, including prediction of the native secondary structure and specification of the stability of the foldons themselves. In 1977, the hydrophobic collapse (HC) mechanism [35: Dill, K. A. Theory for the folding and stability of globular proteins. Biochemistry 24, 1501–1509, (1985)], [36: Haran, G. How, when and why proteins collapse: The relation to folding. Curr. Opin. Struct. Biol. 22, 14–20, (2012)] was proposed, which predicts that hydrophobic forces and backbone forces cause the chain to collapse before the elements of secondary structure form. Besides hydrophobic forces, hydrogen bonds and van der Waals forces also steer the unfolded protein toward a collapsed configuration [37: Barbosa, M. A., Garcia, L. G., and Pereira de Araujo, A. F. Entropy reduction effect imposed by hydrogen bond formation on protein folding cooperativity: Evidence from a hydrophobic minimalist model. Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys. 72, 051903, (2005)]. In 2000, the Folding@home project was launched at Stanford University to simulate protein folding using computing resources contributed by volunteers. Because of the huge number of folding conformations, molecular dynamics (MD) simulation is a time-demanding process that relies on parallel supercomputing architectures or clusters of personal computers [38: Zagrovic, B., Snow, C. D., Shirts, M. R., and Pande, V. S. Simulation of folding of a small α-helical protein in atomistic detail using worldwide-distributed computing. J. Mol. Biol. 323, 927–937, (2002)], [39: Adcock, S. A., and McCammon, J. A. Molecular dynamics: survey of methods for simulating the activity of proteins. Chem. Rev. 106, 1589–1615, (2006)], [40: Rizzuti, B., and Daggett, V. Using simulations to provide the framework for experimental protein folding studies. Arch. Biochem. Biophys. 531, 128–135, (2013)], [41: Daggett, V. Protein folding-simulation. Chem. Rev. 106, 1898–1916, (2006)].
Overall, computational approaches that try to cover all possible conformations in order to resolve the protein folding problem thoroughly have been far less successful than was expected in the early days, and this remains one of the challenging subjects in protein physical science. Recently, Google’s DeepMind applied artificial intelligence (AI) to develop the AlphaFold approach, which can regularly predict protein structures with atomic accuracy competitive with experimental structures. In its first version, a neural network was trained to predict the distances between pairs of residues in a protein, and the structure was then realized by a simple gradient descent optimization. With the achievement of AlphaFold, more scientific resources and attention are being focused on resolving the protein folding problem.
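The idea of realizing a structure from inter-residue distances by gradient descent can be illustrated with a toy example. The following Python sketch is not AlphaFold's implementation: it assumes exact target distances taken from a hypothetical five-point reference chain, represents each residue by a single point, and minimizes a plain squared-error loss over Cartesian coordinates. All function names are illustrative.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_loss(coords, target):
    """Sum of squared differences between current and target pair distances."""
    n = len(coords)
    return sum(
        (distance(coords[i], coords[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def gradient_step(coords, target, learning_rate):
    """One gradient descent update of all coordinates."""
    n = len(coords)
    grads = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = distance(coords[i], coords[j])
            if d < 1e-9:
                continue  # gradient is undefined for coincident points; skip the pair
            scale = 2.0 * (d - target[i][j]) / d
            for k in range(3):
                g = scale * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
    return [
        [x - learning_rate * g for x, g in zip(point, grad)]
        for point, grad in zip(coords, grads)
    ]


def realize_structure(target, n_steps=3000, learning_rate=0.01, seed=0):
    """Start from random coordinates and descend; return (coords, initial_loss, final_loss)."""
    rng = random.Random(seed)
    coords = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(len(target))]
    initial_loss = distance_loss(coords, target)
    for _ in range(n_steps):
        coords = gradient_step(coords, target, learning_rate)
    return coords, initial_loss, distance_loss(coords, target)


# Hypothetical reference chain (spacing of 3.8 is only a placeholder value).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8), (0.0, 3.8, 3.8)]
target = [[distance(a, b) for b in reference] for a in reference]

if __name__ == "__main__":
    _, start_loss, end_loss = realize_structure(target)
    print(f"loss before: {start_loss:.3f}, after: {end_loss:.6f}")
```

Because pair distances are unchanged by rotation, translation and reflection, the recovered coordinates match the reference only up to such a transformation.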
Protein folding information can be extracted from protein structure databases. The PDB is the most inclusive repository of protein 3D structures. So far, nearly 190,000 protein 3D structures are available in the PDB, of which approximately 90% were obtained by X-ray crystallography and the remainder by NMR, cryo-EM and other techniques. X-ray crystallography can determine accurate atomic coordinates for a 3D structure, but it represents only one specific static folding state. NMR and cryo-EM reveal protein flexibility, but only as structural fluctuation around an equilibrium state under particular conditions. The Structural Classification of Proteins (SCOP) [42: http://scop.mrc-lmb.cam.ac.uk/scop] database classifies protein structural domains into a hierarchy of Species, Protein, Family, Superfamily, Fold and Class. It defines 1,232 folds, 2,026 superfamilies and 4,919 families. If two protein domains have similar secondary structures with similar topological connections, they belong to the same fold. The Class, Architecture, Topology and Homologous superfamily (CATH) [43: http://www.cathdb.info] database classifies 95 million protein domains into 1,391 topological folds and 6,119 superfamilies. If two proteins have a similar topological fold and sequence together with similar functions, they are assigned to the same category in CATH. The ProTherm [44: http://www.abren.net/protherm/] database is a source for understanding protein folding stability; it holds thermodynamic parameters for 25,830 structures, including numerical data on changes in Gibbs free energy, enthalpy, heat capacity, transition temperature, etc. Nevertheless, the crucial question is whether the protein databases can be directly used to investigate protein folding.
The first question is whether current and future protein structural data are sufficient for fold recognition, and the answer is negative [28]. The second question is whether the defined topological folding patterns (about 1,200 folds in SCOP and nearly 1,400 in CATH) are enough to correlate protein folding with the arrangement of amino acids in the sequence, and again they are insufficient. Nevertheless, the structural data from experimental and computational approaches should help in understanding protein folding to some degree. On the whole, longer fragments are hard to investigate thoroughly for folding patterns, because the larger the folding prototype, the less universal its folding pattern. Therefore, defining a small universal folden as the prototype, such as the backbone of 5 amino acid residues, may overcome these obstacles in probing the folding patterns in protein structure databases.
Here, the protein structure fingerprint approach is shown to be a useful means of describing complete protein folding conformations and of constructing an explicit database for protein folding. In mathematical space, a backbone of 5 connected points is adopted as the universal folden, and the complete folding space is described by 27 PFSC alphabetic letters. In biological space, the possible folds of 5 amino acid residues are limited by constraints, so different combinations of 5 amino acid residues have different numbers and patterns of folds. Thus, a database (5AAPFSC) was created to collect all folding shapes for all combinations of 5 amino acid residues. For a protein, one PFSC string represents a complete folding description, and one PFVM matrix represents the comprehensive folding variation. From the PFVM, not only can all possible folding conformations, astronomical in number, be obtained, but the most probable conformations can be obtained as well. Therefore, the protein structure fingerprint approach covers both aspects: it can predict a stable folding conformation and also reveal the massive number of variations in folding conformation. Furthermore, the digital alphabetic PFSC provides a simplified way to approach the protein folding problem. As a result, the astronomical number of folding conformations can be easily stored in a database for protein folding. Thus, the protein structure fingerprint approach lays a significant foundation for solving the protein folding problem.
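To make the "astronomical number" concrete, the number of letter strings that can be assembled from a PFVM can be estimated by multiplying the number of candidate letters in each column. The following Python sketch is only an illustration under stated assumptions: the matrix is a hypothetical toy with placeholder letters (not real PFSC assignments), each column is assumed to list the candidate letters for one window of 5 residues, and columns are treated as independent, so the product is an upper bound rather than a count of physically compatible conformations.

```python
import math

# Hypothetical PFVM fragment: one column per 5-residue window.
# The letters are placeholders, not PFSC assignments for any real sequence.
toy_pfvm = [
    ["A", "B", "V"],
    ["A", "J"],
    ["A", "B", "H", "P"],
    ["B"],
]


def count_letter_strings(pfvm):
    """Number of strings obtained by choosing one letter from every column."""
    return math.prod(len(column) for column in pfvm)


def log10_count(shapes_per_column):
    """Base-10 logarithm of the same product, safe for very long sequences."""
    return sum(math.log10(k) for k in shapes_per_column)


if __name__ == "__main__":
    print(count_letter_strings(toy_pfvm))
    # A hypothetical protein with 100 columns and 5 candidate shapes in each.
    print(f"about 10^{log10_count([5] * 100):.1f} strings")
```

Working with the logarithm keeps the estimate manageable even when the count itself is far too large to enumerate.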
Image visualization vs. alphabetic description.
Because protein structure is complex, the protein structure fingerprint provides the PFSC alphabetic description for probing large volumes of protein data; it is especially suitable for studying the astronomical number of protein folding conformations. Protein 3D structure data are originally obtained by experimental measurement or computational approaches, which aim to display the protein structure as a 3D image. For a single protein, the 3D structural image is rendered from thousands of lines of atomic coordinates in the protein data file. Although a 3D structure can be perceived directly to understand the folding orientation in space, it is not easy to describe the features of protein folding from it. For comparison of proteins, after structural superposition, the similarity is quantified by the root-mean-square deviation (RMSD) as a score. Nevertheless, the RMSD gives no detail on where and how the proteins are similar or dissimilar, and manual processing can severely affect the outcome. So, it is hard to explain the similarity and dissimilarity between proteins with 3D image visualization [45: Fitzkee NC, Fleming PJ, Gong H, Panasik N, Street TO, Rose GD. Are proteins made from a limited parts list? Trends Biochem Sci; 30:73–80, (2005)], [46: Irving JA, Whisstock JC, Lesk AM. Protein structural alignments and functional genomics. Proteins; 42:378–382, (2001)], [47: Sam V, Tai CH, Garnier J, Gibrat JF, Lee B, Munson PJ. ROC and confusion analysis of structure comparison methods identify the main causes of divergence from manual protein classification. BMC Bioinform; 7:206, (2006)], [48: Yang J. Complete description of protein folding shapes for structural comparison. Proteomics Research Journal, 3(1):1–22, (2012)], [49: Middleton SA, Illuminati J, Kim J. Complete fold annotation of the human proteome using a novel structural feature space. Scientific Reports 7:46321, (2017)].
Furthermore, it is almost unimaginable to construct an astronomical number of 3D conformations for one protein in order to probe the protein folding problem, and even more so for billions of protein sequences. However, the one-dimensional PFSC alphabetic string offers a useful way to overcome these obstacles, because it makes a massive number of protein conformations easy to store and study. The PFSC alphabetic representation not only simplifies the description of a protein conformation but also allows a large number of folding conformations to be aligned for comparison. As a further advantage, the PFSC alphabetic string covers the regular secondary fragments as well as the tertiary fragments, so it is a valuable approach for studying the astronomical number of protein conformations.
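The limitation of a single RMSD score can be seen in a small worked example. The following Python sketch assumes the two coordinate sets are already superposed and have one point per residue; the coordinates are hypothetical values chosen so that two very different patterns of deviation give the same RMSD.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two already superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        sum((x - y) ** 2 for x, y in zip(a, b))
        for a, b in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def per_residue_deviation(coords_a, coords_b):
    """Distance between matched points, showing where the structures differ."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


# Hypothetical one-point-per-residue traces.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
# Differs from structure_a at the last residue only.
structure_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 4.0, 0.0)]
# Differs from structure_a by a moderate amount at every residue.
structure_c = [(0.0, 2.0, 0.0), (3.8, -2.0, 0.0), (7.6, 2.0, 0.0), (11.4, -2.0, 0.0)]

if __name__ == "__main__":
    for name, other in (("b", structure_b), ("c", structure_c)):
        print(name, rmsd(structure_a, other), per_residue_deviation(structure_a, other))
```

Both comparisons give the same overall score, yet one difference is confined to a single residue and the other is spread along the whole chain, which is the kind of local detail a position-by-position description is meant to expose.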
Alphabetic descriptions have been adopted throughout the development of protein structure studies. Beyond labelling the regular secondary motifs of α-helices and β-strands, many methods have been developed to label protein conformation in more detail with alphabetic descriptions. Some methods adopted more letters to distinguish secondary structure motifs in detail, specifying the patterns of hydrogen bonds and geometric criteria such as Cα distances, Cα angles, dihedral angles between Cα atoms, or pairs of φ and ψ dihedral angles around a Cα atom [50: Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers; 22:2577–2637, (1983)], [51: Richards FM, Kundrot CE. Identification of structural motifs from protein coordinate data: secondary structure and first-level supersecondary structure. Proteins; 3:71–84, (1988)], [52: Frishman D, Argos P. Knowledge-based protein secondary structure. Proteins; 23:566–579, (1995)], [53: Sklenar H, Etchebest C, Lavery R. Describing protein structure: a general algorithm yielding complete helicoidal parameters and a unique overall axis. Proteins; 6:46–60, (1989)], [54: Labesse G, Colloc’h N, Pothier J, Mornon JP. P-SEA: a new efficient assignment of secondary structure from C alpha trace of proteins. Comput Appl Biosci; 13:3:291–295, (1997)], [55: Martin J, Letellier G, Marin A, Taly JF, de Brevern AG, Gibrat JF. Protein secondary structure assignment revisited: a detailed analysis of different assignment methods. BMC Struct Biol; 5:17–34, (2005)]. Other methods identified patterns of structural segments from observations on a large number of structures in a training database, extracted certain motifs as folding prototypes by statistical adjustment, and then labelled them with alphabetic letters [56: Fetrow JS, Palumbo MJ, Berg G. Patterns, structures, and amino acid frequencies in structural building blocks, a protein secondary structure classification scheme. Proteins; 27:249–271, (1997)], [57: Zhang X, Fetrow JS, Berg G. Design of an auto-associative neural network with hidden layer activations that were used to reclassify local protein structures. In: Crabb VJ, editor. Advances in Protein Chemistry. San Diego, CA: Academic Press; pp 397–404 (1994)], [58: de Brevern AG, Etchebest C, Hazout S. Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks. Proteins; 41:271–287, (2000)], [59: de Brevern AG, Valadié H, Hazout S, Etchebest C. Extension of a local backbone description using a structural alphabet: a new approach to the sequence-structure relationship. Prot Sci; 11:2871–2886, (2002)], [60: Fourrier L, Benros C, de Brevern AG. Use of a structural alphabet for analysis of short loops connecting repetitive structures. BMC Bioinform; 5:58, (2004)], [61: Joseph AP, Srinivasan N, de Brevern AG. Improvement of protein structure comparison using a structural alphabet. Biochimie, 93(9):1434, (2011)]. So far, most alphabetic methods have adopted 9 to 16 letters to describe various folding prototypes with fragments of different lengths. Nevertheless, none of these methods guarantees complete coverage of all possible folding patterns, because some fragment motifs are ignored, such as irregular loops and coils or uncommon folding shapes that rarely appear in structures. The PFSC overcomes these shortcomings: it provides a set of 27 alphabetic letters that covers all possible folds for 5 successive amino acid residues, and a PFSC string describes the complete folding conformation without gaps from the N-terminus to the C-terminus, including regular secondary fragments and irregular tertiary fragments.
The protein structure fingerprint can describe folding conformations alphabetically, whether or not the protein 3D structure is known. For a protein with a known 3D structure, the folding shape of each set of 5 successive amino acid residues is assigned one PFSC letter according to the atomic coordinates, and the conformation of the entire protein is then expressed as a PFSC string. For a protein without a known 3D structure, the comprehensive folding variations can be observed simultaneously, at a glance, through the PFSC letters in the PFVM. Also, an astronomical number of folding conformations for a protein can be assembled from the various PFSC letters in the PFVM. Furthermore, any PFSC string represents one folding conformation, and it can in turn be converted into a 3D structure.
Alphabetic letters provide a brief description of biological structure in macromolecular systems. DNA uses four letters (C, G, A and T) to describe a strand composed of four deoxyribonucleotides in the genetic code. Proteins use single letters for the 20 amino acids to describe the one-dimensional sequence. Biological structure is embedded in the assembly processes that lead from one-dimensional DNA and mRNA to the protein sequence and finally to protein folding. In the first step, the genetic information stored in the DNA sequence is passed on through transcription and translation into a one-dimensional protein sequence. In the second step, the protein folds from the one-dimensional sequence into a 3D structure to carry out the functions of life. To date, however, the knowledge and understanding of protein folding lag far behind those of DNA and protein sequences. The protein structure fingerprint makes significant progress by applying a set of 27 PFSC letters to describe protein folding. Thus, the PFSC matches the alphabetic description of DNA and protein sequences, which makes it possible to integrate the huge volume of data on protein folding conformations with DNA or mRNA sequences and protein sequences.
Protein folding vs. the order of amino acids in sequence
It is well known that protein folding depends in principle on the order of amino acids in the sequence. Although researchers have confirmed this principle in many biological experiments, it lacks a systematic depiction from the bioinformatics point of view. Also, it is not easy to illustrate clearly how the order of amino acids in the sequence affects the folding changes in a protein. However, through a universal process, the PFVM displays as a whole the correlation between protein folding changes and sequence variations. Generally, different protein sequences have different folding patterns in the PFVM. Even if only one amino acid is substituted, the difference in folding pattern appears in several aspects of the PFVM, including changes in the types of folding shapes as well as in the number of possible folds. Also, the substitution of one amino acid changes the PFSC letters not in a single column only, but in a band of 5 columns in the PFVM. These changes in the PFVM demonstrate well that protein folding depends on the order of amino acids in the sequence.
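The band of 5 columns follows from the sliding window of 5 residues: every residue belongs to up to 5 overlapping windows. The following Python sketch shows this under an assumed indexing convention (0-based positions, with each PFVM column identified by the first residue of its window); the actual column indexing of the PFVM may differ, and the function name is illustrative.

```python
WINDOW = 5  # number of successive residues described by one PFSC letter


def affected_columns(position, sequence_length, window=WINDOW):
    """Return the 0-based start indices of all windows that contain `position`."""
    if not 0 <= position < sequence_length:
        raise ValueError("position outside the sequence")
    first = max(0, position - window + 1)
    last = min(position, sequence_length - window)
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A substitution in the middle of a 20-residue sequence touches 5 windows.
    print(affected_columns(10, 20))
    # Near a terminus fewer windows contain the residue.
    print(affected_columns(0, 20))
```

Residues near either terminus fall into fewer windows, so a substitution there affects a narrower band.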
The PFVM characteristically displays the local folding variations along the sequence. The changes in the number of local folding shapes form a spectrum-like fluctuation curve, which indicates that some portions of the protein are more flexible and other portions less flexible. The fluctuation curves of the numbers of local folding shapes for the proteins PDCD1_MOUSE and PDCD1_HUMAN are shown in Figure 6. First, each curve shows how the folding flexibility follows the order of amino acids in the sequence. Second, the two curves differ because the amino acids in the two sequences differ. At least 4 regions of the curves (residues 35–43, 78–91, 139–151 and 170–179 in the sequence) show opposite trends in the number of local folding variations. Thus, a fluctuation curve from the PFVM indicates concretely how protein folding relates to the order of amino acids in the sequence, and the PFVM is a useful tool for probing protein mutation, protein differentiation, protein design, protein prediction, protein misfolding, etc.