Abstract
The entry of SARS-CoV-2 into host cells proceeds by a two-step
proteolysis process, which involves the lysosomal peptidase cathepsin L.
Inhibition of cathepsin L is therefore considered an effective way to
prevent virus internalization. A structure-function analysis indicates
that cathepsin L inhibitory
proteins/peptides found in food share specific features: multiple
disulfide crosslinks (buried in the protein core), absent or low
α-helix content (small helices), and high surface hydrophobicity.
Lactoferrin can inhibit cathepsin L, but not cathepsins B and H. This
selective inhibition might be useful in fine targeting of cathepsin L.
Molecular docking indicated that only the carboxyl-terminal lobe of
lactoferrin interacts with cathepsin L and that the active site cleft of
cathepsin L is largely overlaid by lactoferrin. Food protein-derived
peptides might also show cathepsin L inhibitory activity.
The ongoing coronavirus disease 2019 (COVID-19) is caused by severe
acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The coronavirus
binds through its spike glycoprotein (S) to human angiotensin-converting
enzyme 2 (hACE2) on the host cell membrane for endocytosis. The S protein is
trimeric, and each monomer contains two subunits: S1, which mediates
attachment, and S2, which enables membrane fusion. The fusion potential
of S protein is activated by a two-step proteolysis process: priming
cleavage between S1 and S2 (amino acids 682-685, RRAR), and activating
cleavage at the S2′ site. Cathepsin B is involved in infection by some
coronaviruses; however, a study using a SARS-CoV-2 S protein
pseudovirus system indicated that cathepsin L, not B, is critical for
SARS-CoV-2 S protein activation, similar to SARS-CoV and human
immunodeficiency virus (HIV) (Ou et al., 2020). After endocytosis of
SARS-CoV, cathepsin L cleaves S into S1 and S2. The cleavage enables
fusion of the viral membrane, which is attached to hACE2, with the
endosomal membrane (Fig. 1A). Then the viral genome is released into the
host cell for replication (Adedeji et al., 2013).
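As a small illustration of the residue numbering quoted above for the priming site, the following sketch locates the RRAR motif in a short stretch of the S protein. The fragment and its starting residue number (678) are assumptions supplied here for illustration and should be checked against a reference sequence; the helper name `find_motif` is hypothetical.

```python
# Illustrative sketch only: locate the S1/S2 priming motif in a spike fragment.
# ASSUMPTION: FRAGMENT and FRAGMENT_START are supplied for illustration and
# are not taken from the article; verify them against a reference sequence.
FRAGMENT = "TNSPRRARSVAS"
FRAGMENT_START = 678  # assumed residue number of the first letter


def find_motif(fragment, start, motif):
    """Return (first, last) residue numbers for every occurrence of motif."""
    hits = []
    pos = fragment.find(motif)
    while pos != -1:
        first = start + pos
        hits.append((first, first + len(motif) - 1))
        pos = fragment.find(motif, pos + 1)
    return hits


print(find_motif(FRAGMENT, FRAGMENT_START, "RRAR"))  # [(682, 685)]
```

Under these assumptions the motif maps to residues 682-685, in agreement with the numbering given in the text.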
Cathepsin L (220 amino acid residues) (Fujishima et al., 1997) is a
lysosomal cysteine peptidase and has a two-chain form (L and R). The L
domain contains three α-helices (one is the longest central helix) and
the R domain is a β-barrel, which is closed at the bottom by an α-helix
(Fig. 1B). The active-site cleft comprises histidine (His163)
located at the top of the barrel and cysteine (Cys25) located at the
N-terminus of the L domain central helix (Turk et al., 2012). Cathepsin
L contributes to protein turnover and apoptosis in cells. When a
clinically proven serine protease inhibitor was used to inhibit cellular
serine protease TMPRSS2, which is exploited by SARS-CoV-2 for S protein
priming, viral infection was incompletely inhibited. Full inhibition was
accomplished by the concurrent employment of two drugs, one for
inhibition of TMPRSS2 and the other for cathepsin L/B, suggesting the
necessity of cathepsin L inhibition for prevention of SARS-CoV-2
membrane fusion (Hoffmann et al., 2020).
The overexpression of cathepsins L and H in cancer cells, in contrast to
cathepsins B, C, S, and X/Z, has made cathepsin L a target for anticancer
strategies (Lankelma et al., 2010). The enzyme is also implicated in
other pathologies such as osteoporosis and periodontal diseases
(Fujishima et al., 1997). Consequently, a plethora of cathepsin L
inhibitory drugs is available for assessment. Some of these drugs, e.g., E-64
(Fujishima et al., 1997), inhibit several cathepsins including cathepsin
L, whereas a few, such as CLIK-148 (an epoxysuccinyl derivative) and iCL
(an aldehyde derivative), exclusively inhibit cathepsin L (Lankelma et
al., 2010). It has been shown that the inhibition of cathepsin L, rather
than cathepsin B, for instance by SSAA09E1
{[(Z)-1-thiophen-2-ylethylideneamino]thiourea}, holds
promise for suppression of SARS-CoV (Adedeji et al., 2013) and
SARS-CoV-2 endocytosis. Many of the cathepsin L inhibitory drugs, such as
the dipeptide epoxyketones used for SARS-CoV suppression (Adedeji
et al., 2013) and iCL (napsul-Ile-Trp-CHO) (Lankelma et al., 2010), are
essentially peptidomimetic (Adedeji et al., 2013).
Given that food contains naturally occurring bioactive peptides and
proteins, we recently outlined how food protein-derived peptides (small
fragments, usually <3 kDa) that influence the
renin-angiotensin system might have implications for SARS-CoV-2
endocytosis and pulmonary function in COVID-19 patients. The presumed
mechanisms of action were overexpression of hACE2 and the receptor
Mas, as well as blockade of the angiotensin type I receptor on the cell
surface (Goudarzi et al., 2020). However, food proteins/peptides afford
further opportunities for inhibition of SARS-CoV-2. Certain foods
contain protein/peptide-based cathepsin L inhibitory compounds. This
characteristic opens a new perspective on the exploration and use of
biomolecules that can inhibit SARS-CoV-2 entry into host cells.
Whereas some naturally occurring bioactive peptides in food such as
carnosine (226 Da, β-alanyl-L-histidine, in meat and chicken) do not
affect cathepsin activity (Bonner et al., 1995), several cathepsin
inhibitory molecules have been identified in plant and animal foods.
Oryzacystatin-I (11.4 kDa, 102 residues) in rice is known to inhibit
cathepsin L (inhibition constant, Ki: 7.3 × 10⁻¹⁰ M), but also
cathepsin B (Ki: 7.9 × 10⁻⁸ M) and cathepsin H (Ki: 1.0 × 10⁻⁶ M)
(Hellinger and Gruber, 2019). A hydrophobic cluster between the α-helix
and the five-stranded antiparallel β-sheet structures (Fig. 1C)
stabilizes the helix architecture of oryzacystatin-I (Nagata et al.,
2000) and the N-terminal 21 amino acid residues are most probably not
essential for cathepsin inhibitory activity of oryzacystatin-I (Abe et
al., 1988). It is known that hydrophobic interactions are crucial for
the inhibition of cathepsins, likely associated with the hydrophobic
wedge-shaped structure of their active-site cleft (Turk et al., 2012).
Corn cystatin-I also inhibits cathepsin L (Ki: 1.7 × 10⁻⁸ M) and
cathepsin H (Ki: 5.7 × 10⁻⁹ M) (Hellinger and Gruber, 2019).
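The Ki values quoted above can be turned into simple selectivity ratios to make the comparison between cathepsins explicit. The following is a minimal sketch that uses only the figures cited in the text; it assumes that the cathepsin H value for oryzacystatin-I is in mol/L like the others, and the function name `selectivity_for_L` is a hypothetical helper rather than an established method.

```python
# Ki values (mol/L) as quoted in the text from Hellinger and Gruber (2019).
# ASSUMPTION: the oryzacystatin-I value for cathepsin H is in mol/L.
KI = {
    "oryzacystatin-I": {"L": 7.3e-10, "B": 7.9e-8, "H": 1.0e-6},
    "corn cystatin-I": {"L": 1.7e-8, "H": 5.7e-9},
}


def selectivity_for_L(ki_by_cathepsin):
    """Return Ki(other) / Ki(L) for each cathepsin other than L.

    A ratio above 1 means the inhibitor binds cathepsin L more tightly
    than the other cathepsin; a ratio below 1 means the opposite.
    """
    ki_l = ki_by_cathepsin["L"]
    return {
        name: ki / ki_l
        for name, ki in ki_by_cathepsin.items()
        if name != "L"
    }


for inhibitor, values in KI.items():
    for other, ratio in selectivity_for_L(values).items():
        print(f"{inhibitor}: Ki({other})/Ki(L) = {ratio:.3g}")
```

On these figures, oryzacystatin-I favors cathepsin L over cathepsin B by roughly two orders of magnitude and over cathepsin H by roughly three, whereas the ratio for corn cystatin-I is below 1, meaning that it binds cathepsin H more tightly than cathepsin L.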
In contrast to cystatin, bromelain inhibitor VI [BI-VI; 5.89 kDa;
heavy chain (H, 41 amino acid residues) and light chain (L, 11 amino
acid residues)] (Fig. 1D), a peptide present in pineapple stem,
selectively inhibits cathepsin L (Ki: 0.2 × 10⁻⁶ M) and, to a lesser
extent, trypsin (Polya, 2003).
The primary and secondary structures of BI-VI are remarkably distinct
from those of cystatin. Hen egg white cystatin consists of two α-helices
and a five-stranded antiparallel β-sheet, but the secondary structure of
BI-VI lacks α-helix structure. It is composed of two domains, A and B,
each of which is formed by a three-stranded antiparallel β-sheet (Hatano et
al., 1995). BI-VI contains ten cysteine residues, and five
intra/inter-chain disulfide bonds (Cys3L-Cys7H, Cys6L-Cys39H,
Cys8L-Cys5H, Cys14H-Cys21H, and Cys18H-Cys30H, where L and H denote
the light and heavy chains) crosslink the protein.
The disulfide bonds form the protein core, which is unusual because a
protein core is typically occupied by bulky hydrophobic side
chains. This arrangement is homologous to that of the Bowman-Birk
trypsin/chymotrypsin inhibitor from soybean (BBI-I), a typical serine
protease inhibitor (Hatano et al., 1996). The heavily S-S crosslinked
double chain conformation might be important for the inhibition
selectivity towards cathepsin L. This feature is shared with
lactoferrin, which is discussed later in this communication.
Notably, trypsin, like cathepsin L, can efficiently activate the
SARS-CoV-2 S protein and enable syncytium formation, although it is not
strictly required (Ou et al., 2020). Therefore, the co-inhibition
of trypsin and cathepsin L by BI-VI is advantageous.
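The disulfide pattern of BI-VI described above can be written as a small data structure, which makes it easy to check that five bonds account for all ten cysteines and to separate inter-chain from intra-chain links. This is an illustrative sketch based only on the pairs listed in the text; the label format and the helper names `parse_cys` and `classify` are assumptions made here.

```python
import re
from collections import Counter

# Disulfide pairs of BI-VI as listed in the text.
# Suffix L = light chain, H = heavy chain.
BONDS = [
    "Cys3L-Cys7H",
    "Cys6L-Cys39H",
    "Cys8L-Cys5H",
    "Cys14H-Cys21H",
    "Cys18H-Cys30H",
]


def parse_cys(label):
    """Split a label such as 'Cys39H' into (39, 'H')."""
    match = re.fullmatch(r"Cys(\d+)([LH])", label)
    if match is None:
        raise ValueError(f"unrecognized cysteine label: {label!r}")
    return int(match.group(1)), match.group(2)


def classify(bonds):
    """Count inter- and intra-chain bonds and collect the cysteines involved."""
    kinds = Counter()
    cysteines = set()
    for bond in bonds:
        first, second = (parse_cys(part) for part in bond.split("-"))
        cysteines.update([first, second])
        same_chain = first[1] == second[1]
        kinds["intra-chain" if same_chain else "inter-chain"] += 1
    return kinds, cysteines


kinds, cysteines = classify(BONDS)
print(dict(kinds), len(cysteines))
```

With the pairs as listed, three bonds link the light and heavy chains, two lie within the heavy chain, and ten distinct cysteines are involved, consistent with the count given in the text.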
Like plant sources, foods of animal origin contain cathepsin
L inhibitors. Although the proteins and peptides with cathepsin L
inhibitory properties in foods of animal origin have been scarcely
investigated, some evidence is available in the literature. Mammalian
milk has significant contents of cysteine protease inhibitors, such as
lactoferrin, β-casein and β-lactoglobulin. An inhibitory effect of
β-lactoglobulin B and, to a lesser extent, β-lactoglobulin A on
cathepsins K and L has been observed (Ogawa et al., 2009).
Beta-lactoglobulin is a member of the lipocalin family and binds to
hydrophobic molecules. In addition to a single cysteine residue buried
in the protein and protected by an α-helix, β-lactoglobulin has two internal
disulfide linkages that stabilize the protein structure. At
physiological pH, β-lactoglobulin exists as a supramolecular dimer. The
secondary structure of β-lactoglobulin consists of nine antiparallel
β-sheets, three helical turns and only one short α-helix (Ragona et al.,
2000). Beta-lactoglobulin A has an additional negative charge
compared to β-lactoglobulin B. A shared feature between BI-VI and
β-lactoglobulin B is the existence of internal disulfide bonds.
For β-casein, an allosteric-type mechanism is believed to
underlie cathepsin inhibition (Sano et al., 2005). Beta-casein
(23.98 kDa) is a single-chain polypeptide and the most hydrophobic
casein. Essentially all of the net charge and α-helix structure of β-casein
are located in the N-terminal portion of the molecule (Kumosinski et
al., 1993). It lacks cysteine. Considerable hydrophobicity and low
α-helix content (<20%) are features shared
between β-casein and β-lactoglobulin B. Taking hydrophobicity into
account, I speculate that, as with oryzacystatin-I (Abe et al.,
1988), the hydrophilic N-terminal portion of β-casein may not be
essential for cathepsin inhibitory activity. Rather, the C-terminal
hydrophobic portion, which is also poor in α-helix structure, probably
causes the inhibition. In fact, inhibition of cathepsin L may occur as a
consequence of the interaction between the carbonyl (carboxyl) group of
an inhibitor and the electrophilic oxyanion hole of cathepsin L, which
consists of the side chains of Gln16, Trp189 and His163 and the main chain of
Cys25 (Fujishima et al., 1997).
Lactoferrin is present at much higher contents in human milk than in cow’s
milk. It strongly inhibits cathepsin L. Cathepsin L inhibition by
lactoferrin takes place at 10⁻⁷ M, whereas a synthetic
peptide that targets the cathepsin L active site inhibited the enzyme at
10⁻⁵ M. Notably, lactoferrin does not inhibit
cathepsin B and cathepsin H (Sano et al., 2005). This may enable fine
targeting of cathepsin L to obstruct SARS-CoV-2 internalization while
avoiding possible harm to cells. Bovine lactoferrin is a single
polypeptide consisting of 689 residues. It is folded into two lobes
(N-terminal and C-terminal halves) joined by a small helix (3-turn)
(Moore et al., 1997) (Fig. 1E). The crystal structures of lactoferrin
and cathepsin L were taken from the RCSB Protein Data Bank. Molecular
docking using MDsrv (Tiemann et al., 2017) indicates that the C-lobe of
lactoferrin interacts with cathepsin L and overlays its active-site
cleft (Fig. 1F). The
α-helix content of bovine lactoferrin varies between 7% and 16%
depending on the pH (Sreedhara et al., 2010). This protein contains 17
disulfide bonds (Wang et al., 2019) and the interactions between the two
lobes are mostly hydrophobic caused by packing of nonpolar surfaces on
the lobes (Moore et al., 1997). Low α-helix content, small helices, and
hydrophobic surfaces are features shared by lactoferrin,
β-lactoglobulin and β-casein. It is worth noting that both lactoferrin
and BI-VI, which show high selectivity towards cathepsin L, include
several disulfide bonds positioned in the protein core. As mentioned
earlier, cathepsin L inhibition proceeds by thiol chemistry.
Moreover, both lactoferrin and BI-VI possess a double chain/lobe
conformation.
In addition to the proteins and peptides with cathepsin L inhibitory
activity that are naturally present in foods, some peptides encrypted in
food proteins might be able to inhibit cathepsin L and hence assist COVID-19
prevention. Food proteins release diverse biologically active peptides
once they are hydrolyzed by enzymes present in the gastrointestinal
tract or produced by microbes. At present, evidence of cathepsin L
inhibition by food protein-derived peptides is scarce. Two cathepsin B
inhibitory peptides derived from β-casein were identified in a
pancreatic digest of casein (Lee and Lee, 2000), and peptides
from the digestion of β-lactoglobulin B by an endopeptidase could
inhibit cathepsin K. It remains to be explored whether food
protein-derived peptides can inhibit cathepsin L. Isolation of peptides
that inhibit cathepsin L exclusively (or much more strongly than
the other cathepsins) would be highly advantageous.
The structural and biofunctional characteristics of many food proteins
have been extensively studied. This is especially true for milk
proteins, but also for egg white proteins. For example, hen
egg white riboflavin-binding protein is known to be highly crosslinked
by nine disulfide bonds. Moreover, food proteins are generally recognized
as safe (GRAS) and inexpensive. Hundreds of biologically active peptide
sequences derived from food proteins are known in the literature. The
hydrolysis conditions and the separation and purification procedures for these
peptides are established. In silico examination of the cathepsin
L inhibitory property of known peptide sequences can accelerate
drug discovery.
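Because hydrophobicity recurs above as a shared feature of cathepsin L inhibitors, one conceivable first step of such an in silico examination is to rank known peptide sequences by average hydropathy before any docking is attempted. The sketch below uses the Kyte-Doolittle scale for this purpose; this is a technique swapped in here for illustration, not a method reported in this communication. The candidate sequences are arbitrary placeholders and are not claimed to inhibit cathepsin L, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if not sequence or unknown:
        raise ValueError(f"empty or non-standard sequence: {sequence!r}")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def rank_by_hydrophobicity(peptides):
    """Sort peptides from most to least hydrophobic."""
    return sorted(peptides, key=gravy, reverse=True)


# ASSUMPTION: placeholder sequences for illustration only.
CANDIDATES = ["VPP", "KEDG", "IPP"]

for peptide in rank_by_hydrophobicity(CANDIDATES):
    print(f"{peptide}: GRAVY = {gravy(peptide):.2f}")
```

A ranking of this kind would only narrow the list of candidates; whether a given peptide actually binds the active-site cleft of cathepsin L would still have to be assessed by docking and confirmed experimentally.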