PFVM for SUMO1 protein
Small ubiquitin-related modifier 1 (SUMO1) is highly flexible in its N- and C-terminal regions and is involved in a variety of cellular processes, such as nuclear transport, transcriptional regulation, apoptosis and protein stability. The sequence of the human protein SUMO1_HUMAN, with 101 amino acids, is available from the UniProt database. After the SUMO1_HUMAN sequence was entered, the PFVM was obtained within a few seconds on a personal computer (4 x [Intel(R) Core(TM) i5-3337U CPU @ 1.80 GHz], Windows 64-bit operating system). The PFVM for SUMO1_HUMAN is shown in Table 1. Each set of 5 consecutive amino acids has its own set of folds. For example, starting from the N-terminus, the first set of 5 amino acids “MSDQE” has 11 PFSC letters, each representing a different folding shape. Moving forward by one amino acid, the second set “SDQEA” has 13 PFSC folding shapes; the third set “DQEAK” has 11; the fourth set “QEAKP” has 9; the fifth set “EAKPS” has 8; and so on. Besides the number of folding shapes, the pattern of folding shapes also differs from one set of 5 amino acids to the next. With the PFSC scheme, both the number and the types of folding shapes can be observed simultaneously along the entire SUMO1_HUMAN protein. The pattern of folding variations of SUMO1_HUMAN in the PFVM also agrees overall with the disorder determined from the PDB, which is displayed at the bottom of Table 1: both the N-terminus and the C-terminus are more disordered than the central region. Thus, the PFVM demonstrates the folding variations of SUMO1 and shows that the local folding variations depend on the order of amino acids in its sequence.
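The way the sequence is read in overlapping sets of 5 amino acids, shifted by one residue at a time, can be illustrated with a minimal sketch. The function name `five_residue_windows` is hypothetical and is not part of the PFVM software; the nine-residue string is only the N-terminal stretch implied by the five sets quoted above, not the full SUMO1_HUMAN sequence.

```python
def five_residue_windows(sequence, width=5):
    """Return every overlapping window of `width` residues, shifted by one residue."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


# N-terminal residues implied by the sets quoted in the text (not the full sequence).
n_terminal = "MSDQEAKPS"
windows = five_residue_windows(n_terminal)
print(windows)  # ['MSDQE', 'SDQEA', 'DQEAK', 'QEAKP', 'EAKPS']

# A placeholder string of 101 residues gives 101 - 5 + 1 = 97 windows.
placeholder = "A" * 101
print(len(five_residue_windows(placeholder)))  # 97
```

Under this reading, a sequence of 101 amino acids corresponds to 97 overlapping sets of 5 amino acids.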
The PFVM in Table 1 also reveals features of the folding conformation of the entire SUMO1_HUMAN protein. First, the PFVM exposes the flexibility or rigidity of protein folding conformations. In the middle portion of the sequence, the fragment (45-54) has the favored conformation “AAAADAAAAA” in the first row of PFSC letters, which is almost a typical alpha-helical conformation, and the conformation “DDDDADDDDD” in the second row, which resembles an alpha-helical conformation. The fragment in the middle portion also has fewer options for folding variation than the fragments at the N-terminus and C-terminus. The folding pattern revealed by the PFVM for SUMO1_HUMAN generally agrees with the 3D structural data available in the PDB. Seven 3D structures of SUMO1_HUMAN are displayed in Table 2, which lists their 3D images with PDB ID, measurement method, resolution, solvent and ligand, interacting protein, pH, temperature, etc. Because these conditions differ, the 3D structures may have different folding conformations. For comparison, the superimposition of 17 folding conformations from the seven PDB 3D structures is displayed in Figure 3, where the fragments in the middle region share a common folding pattern while the fragments at the N-terminus and C-terminus are diverse. Second, the PFVM provides the information needed to construct all possible conformations of a protein. Each conformation of SUMO1_HUMAN can be constructed by taking one PFSC letter from each column of the PFVM, and is expressed as a PFSC string. An astronomical number of conformations can be generated from the various combinations of PFSC letters in the PFVM, and this number is calculated by multiplying together the numbers of PFSC letters in the columns, one column for each set of 5 amino acids. For SUMO1_HUMAN, the number of all possible conformations is 1.587 × 10^77. It is impossible to display the 3D folding structures for such a huge number of conformations.
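The counting rule described above, a product of the number of PFSC letters per column, can be sketched as follows. The helper names are hypothetical. The counts 11, 13, 11, 9 and 8 are those reported in the text for the first five sets of 5 amino acids; the remaining columns are in Table 1 and are not reproduced here, so this example does not recompute the total for the whole protein.

```python
import math


def count_conformations(letters_per_column):
    """Multiply the number of PFSC letters available in each column."""
    return math.prod(letters_per_column)


def log10_conformations(letters_per_column):
    """Sum of base-10 logarithms; a convenient scale for very large products."""
    return sum(math.log10(n) for n in letters_per_column)


# Letter counts reported in the text for the first five columns only.
first_five_columns = [11, 13, 11, 9, 8]

partial_count = count_conformations(first_five_columns)
print(partial_count)  # 113256
print(round(log10_conformations(first_five_columns), 2))  # 5.05
```

Already for five columns the product exceeds 100,000, which indicates how the product over all columns can reach an astronomical value.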
Thus, the PFVM is a compact way to present the comprehensive local folding variations of an entire protein, and all possible folding conformations of the protein can be generated from it. Furthermore, the most probable conformations, corresponding to stable states, can be constructed from the PFVM. One of the most probable conformations of SUMO1_HUMAN consists directly of the PFSC letters in the first row of the PFVM in Table 1. Therefore, the PFVM provides significant folding information for SUMO1_HUMAN.
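As an illustration of how conformations may be read from the PFVM, the sketch below builds a toy matrix from only the first two rows quoted in the text for the fragment (45-54). The real columns in Table 1 may contain more letters, and the function names are hypothetical rather than taken from the PFVM software.

```python
from itertools import islice, product

# First and second rows quoted in the text for fragment 45-54.
row1 = "AAAADAAAAA"
row2 = "DDDDADDDDD"

# Each column lists its PFSC letters from the top row downward.
columns = [top + second for top, second in zip(row1, row2)]


def top_row_conformation(cols):
    """Take the first (top-row) PFSC letter of every column."""
    return "".join(col[0] for col in cols)


def iter_conformations(cols):
    """Lazily yield every PFSC string formed by one letter per column."""
    for letters in product(*cols):
        yield "".join(letters)


most_probable = top_row_conformation(columns)
first_three = list(islice(iter_conformations(columns), 3))
total = sum(1 for _ in iter_conformations(columns))

print(most_probable)  # AAAADAAAAA
print(first_three)    # ['AAAADAAAAA', 'AAAADAAAAD', 'AAAADAAADA']
print(total)          # 1024
```

The generator is lazy on purpose: for a full-length PFVM the conformations can only be sampled or counted, not listed exhaustively.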