A numerical model for hydropathy tuning
To investigate hydropathy tuning by Leu, we used a simple model with sequences composed of only four «amino acids»: A, B, C and D. These form sequences of the type AaBbCcDd, where the lowercase letters give the number of each amino acid. A and B were modeled on Ile and Leu, with «a» and «b» corresponding to the occurrences of these two amino acids within the TMD sequences of class A GPCRs. The hydropathy of Ile was assigned to A (hA = −0.81 kcal/mol) and that of Leu to B (hB = −0.69 kcal/mol). C and D, with their counts «c» and «d», were modeled to reflect all other amino acids with hydropathies smaller than zero and all amino acids with hydropathies larger than zero, respectively. C and D are thus generic amino acids that represent the averages of all hydrophobic (except Ile and Leu) and all hydrophilic amino acids, respectively. For C and D, the average hydropathies of the amino acids they represent were used (hC = −0.36 kcal/mol, hD = 0.8175 kcal/mol).
Amino acid compositions for the simulated sequences were created by generating Gaussian-distributed random numbers for a-d based on the amino acid occurrences in the class A GPCR TMD sequences (mean ± SD; A: 8.8 % ± 3.0 %, B: 15.2 % ± 3.4 %, C: 28.0 % ± 3.0 %, D: 48.0 % ± 2.8 %). The generated random numbers a-d were then multiplied by 220 and rounded to the nearest integer to obtain sequence lengths comparable to those of the TMD sequences. To test for statistical features, 1,500 sequences were generated in each of 10,000 runs.
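The sampling step described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the simulations: the names `FRACTIONS`, `SCALE`, `sample_composition` and `simulate_run` are hypothetical, and clamping negative draws to zero is an assumption, since the handling of such draws is not specified.

```python
import random

# Mean and SD of each residue fraction, as reported for class A GPCR TMDs.
FRACTIONS = {
    "A": (0.088, 0.030),
    "B": (0.152, 0.034),
    "C": (0.280, 0.030),
    "D": (0.480, 0.028),
}
SCALE = 220  # multiplier that converts fractions into residue counts


def sample_composition(rng):
    """Draw one composition as integer residue counts for A, B, C and D."""
    counts = {}
    for residue, (mean, sd) in FRACTIONS.items():
        fraction = rng.gauss(mean, sd)
        # Assumption: negative draws are clamped to zero (not stated in the text).
        counts[residue] = max(0, round(SCALE * fraction))
    return counts


def simulate_run(n_sequences=1500, seed=0):
    """Generate the compositions of one run with a reproducible seed."""
    rng = random.Random(seed)
    return [sample_composition(rng) for _ in range(n_sequences)]


run = simulate_run()
mean_length = sum(sum(counts.values()) for counts in run) / len(run)
```

Because the four mean fractions sum to 100 %, `mean_length` should come out close to 220 residues per simulated sequence.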
Driver residues were introduced to drive hydropathies towards a defined optimum value hopt, which was set to −1.5 kcal/mol to resemble the mean hydropathy of the TMD sequences (−1.47 kcal/mol). With B as the driver, «a», «c» and «d» were drawn from Gaussian distributions as described above. Then «b» was determined as shown by the equation below, with g(B) being a Gaussian-distributed random number and hB being the hydropathy of B. The first term calculates the difference between the optimal hydropathy and the hydropathy already contributed by A, C and D, and divides it by the hydropathy of B, yielding the value of «b» needed to reach the optimal hydropathy. A defined degree of noise was introduced using fdrive, which determines the amount of drive towards the optimum value hopt, with the remainder (1 − fdrive) being determined randomly by the Gaussian distribution g(B). The value of fdrive used was 0.25. This does not mean, however, that 25 % of the final value of «b» drives the hydropathy towards the desired value, since this fraction additionally depends on the value of hopt. Interestingly, the variances and correlations were identical between runs with different values for hopt, indicating that the actual value of hopt is not important for observing the effects of tuning towards it.
\begin{equation} b=f_{\text{drive}}\times\frac{h_{\text{opt}}-(a\times h_{A}+c\times h_{C}+d\times h_{D})}{h_{B}}+(1-f_{\text{drive}})\times g(B)\nonumber \end{equation}
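As a purely illustrative calculation, consider a = 19, c = 62, d = 106 and a Gaussian draw g(B) = 33. These counts are hypothetical: they were chosen close to the mean composition scaled by 220 and are not taken from the simulations. With fdrive = 0.25 and hopt = −1.5 kcal/mol, the hydropathy contributed by A, C and D and the resulting value of «b» are:

\begin{equation} a\times h_{A}+c\times h_{C}+d\times h_{D}=19\times(-0.81)+62\times(-0.36)+106\times 0.8175=48.945\nonumber \end{equation}
\begin{equation} b=0.25\times\frac{-1.5-48.945}{-0.69}+0.75\times 33\approx 18.28+24.75\approx 43.0\nonumber \end{equation}

In this example the drive term raises «b» from the random draw of 33 to about 43. Whether «b» was subsequently rounded to an integer is not stated, so the value is left unrounded here.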
Two different models were tested (Fig. 2). In the first model, all amino acids were modeled independently of the resulting hydropathies by generating a-d based on Gaussian distributions alone. This simulates a case in which TMD hydropathy is not optimized (Fig. 2A-2C). In the second model, «a», «c» and «d» were generated based on Gaussian distributions, whereas «b» was chosen based on the equation shown above. This simulates the case in which Leu would be the driving force for adjusting the hydropathy of the TMDs (Fig. 2D-2F).
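The two models can be compared with a short Python sketch. It rests on several assumptions that are not stated explicitly in the text: the total hydropathy of a sequence is taken as the count-weighted sum of the residue hydropathies (as the equation for «b» implies), g(B) is drawn on the same count scale as «a», «c» and «d», and the driven «b» is left unrounded. All function and variable names are hypothetical, and the sketch does not reproduce the analysis shown in Fig. 2.

```python
import random
import statistics

H = {"A": -0.81, "B": -0.69, "C": -0.36, "D": 0.8175}  # hydropathies, kcal/mol
FRACTIONS = {"A": (0.088, 0.030), "B": (0.152, 0.034),
             "C": (0.280, 0.030), "D": (0.480, 0.028)}  # mean, SD
SCALE, F_DRIVE, H_OPT = 220, 0.25, -1.5


def draw(rng, residue):
    """Gaussian draw for one residue, scaled to a count and rounded."""
    mean, sd = FRACTIONS[residue]
    return round(SCALE * rng.gauss(mean, sd))


def total_hydropathy(rng, driven):
    """Total hydropathy of one simulated sequence (assumed: weighted sum)."""
    a, c, d = draw(rng, "A"), draw(rng, "C"), draw(rng, "D")
    b = draw(rng, "B")  # plays the role of g(B)
    if driven:
        # Second model: b follows the driver equation instead of g(B) alone.
        rest = a * H["A"] + c * H["C"] + d * H["D"]
        b = F_DRIVE * (H_OPT - rest) / H["B"] + (1 - F_DRIVE) * b
    return a * H["A"] + b * H["B"] + c * H["C"] + d * H["D"]


def hydropathy_variance(driven, n=1500, seed=1):
    """Sample variance of the total hydropathy over n simulated sequences."""
    rng = random.Random(seed)
    values = [total_hydropathy(rng, driven) for _ in range(n)]
    return statistics.variance(values)


var_independent = hydropathy_variance(driven=False)
var_driven = hydropathy_variance(driven=True)
ratio = var_driven / var_independent
```

Under these assumptions, substituting the driver equation into the weighted sum shows that the driven total equals (1 − fdrive) times the undriven total plus fdrive × hopt. The variance of the total hydropathy is therefore scaled by (1 − fdrive)² = 0.5625 regardless of hopt, which is in line with the observation that the variances did not depend on the chosen optimum.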