On the evolutionary conservation of hydrogen bonds made by buried polar amino acids: the hidden joists, braces and trusses of protein architecture
© Worth and Blundell; licensee BioMed Central Ltd. 2010
Received: 2 September 2009
Accepted: 31 May 2010
Published: 31 May 2010
The hydrogen bond patterns between mainchain atoms in protein structures not only give rise to regular secondary structures but also satisfy mainchain hydrogen bond potential. However, not all mainchain atoms can be satisfied through hydrogen bond interactions that arise in regular secondary structures; in some locations sidechain-to-mainchain hydrogen bonds are required to provide polar group satisfaction. Buried polar residues that are hydrogen-bonded to mainchain amide atoms tend to be highly conserved within protein families, confirming that mainchain architecture is a critical restraint on the evolution of proteins. We have investigated the stabilizing roles of buried polar sidechains on the backbones of protein structures by performing an analysis of solvent inaccessible residues that are entirely conserved within protein families and superfamilies and hydrogen bonded to an equivalent mainchain atom in each family member.
We show that polar and sometimes charged sidechains form hydrogen bonds to mainchain atoms in the cores of proteins in a manner that has been conserved in evolution. Although particular motifs have previously been identified where buried polar residues have conserved roles in stabilizing protein structure, for example in helix capping, we demonstrate that such interactions occur in a range of architectures and highlight those polar amino acid types that fulfil these roles. We show that these buried polar residues often span elements of secondary structure and provide stabilizing interactions of the overall protein architecture.
Conservation of buried polar residues and the hydrogen-bond interactions that they form implies an important role for maintaining protein structure, contributing strong restraints on amino acid substitutions during divergent protein evolution. Our analysis sheds light on the important stabilizing roles of these residues in protein architecture and provides further insight into factors influencing the evolution of protein families and superfamilies.
As Pauling and Corey realised, satisfaction of hydrogen bonding potential of polypeptide mainchain functions is one of the major factors that give rise to the β-strand and α-helix [1, 2]. These regular elements of secondary structure give their names to the main features of protein structure: classical β-sheets, α-helical bundles, αβ-Rossmann fold, αβ-barrel and many others. Hydrogen bonding also plays important roles in the intricate and sometimes elaborate arches and turns which link α-helices and β-strands [3–5].
Water molecules or sidechains can usually satisfy the hydrogen bonding potential of mainchain functions that are at the protein surface in a variety of ways and so the residues are often substituted in evolution. However, in the smaller proportion of functions that must be satisfied from the core of the protein, this is achieved by buried sidechains of polar residues.
Previous in silico analyses of the stabilizing roles that polar sidechains have on the backbone of protein structures have tended to focus on a particular architectural context [13, 23, 24]. Bordo and Argos identified recurring patterns and amino acid types involved in sidechain-to-sidechain and sidechain-to-mainchain interactions. However, the conservation of polar residues and the three-dimensional (3D) arrangements of the sidechain-to-mainchain hydrogen bonds were not considered. What then are the features of sidechain-to-mainchain hydrogen bonds formed by polar sidechains? Which amino acids are involved? What kinds of structures do these buried polar residues maintain? Are they local to a secondary structure or do they link between different helices and strands, stabilizing tertiary structure?
Results and Discussion
Buried polar residues stabilizing protein architecture through conserved interactions
Caption: frequency of occurrence of each polar amino acid in the 233 conserved positions.
Table 2. Propensity of polar residues forming sidechain hydrogen bonds to mainchain atoms in various architectural contexts (values not recoverable). Contexts listed: within helix N-termini; within helix C-termini; within edge strands; from edge strands; within central strands; from central strands; within 3₁₀ helices; within polyproline helices; within coil regions.
Interactions with the N-terminal regions of α-helices
For conserved and buried polar residues making hydrogen bonds to mainchain NH functions in the N-terminal regions of α-helices, cysteine has the highest propensity to form such interactions, followed by negatively charged aspartate, histidine and glutamate (Table 2 and see Additional file 3, Figure S1A - grey bars); surprisingly, neutral residues such as serine, threonine and asparagine have higher propensities when solvent accessible positions are considered (Table 2 and see Additional file 3, Figure S1A - white bars) [8, 27, 28]. This may reflect the importance of the charged hydrogen bond in regions of low dielectric strength, as well as its interaction with the helix dipole.
Interactions with the C-terminal regions of α-helices
Interactions with edge strands
Interactions from within edge strands
Interactions with centre strands
Interactions from within centre strands
Of conserved, buried polar residues within centre strands forming hydrogen bonds to mainchain atoms, tyrosine has the highest propensity to form such interactions, followed closely by arginine, asparagine, serine, aspartate and glutamate (Table 2 and see Additional file 3, Figure S2D - grey bars). We see a different pattern, however, when we consider all polar amino acids in centre strands that form hydrogen bonds to mainchain atoms: arginine has the highest propensity to form this type of interaction, followed by cysteine, tyrosine, threonine and asparagine (Table 2 and see Additional file 3, Figure S2D - white bars). Asparagine, aspartate, glutamate, serine and tyrosine are more commonly found to form hydrogen bonds to mainchain atoms from within centre strands when conservation and solvent accessibility are considered, whereas threonine and cysteine are less common.
Interactions to residues within 3₁₀ helices
Interactions with beta hairpins
In β-hairpins, mainchain atoms that are hydrogen-bonded to conserved and buried sidechains have a high propensity to interact with aspartate, cysteine, tryptophan and serine (Table 2 and see Additional file 3, Figure S4 - grey bars). We see a similar pattern when we consider all polar amino acids forming hydrogen bonds to mainchain atoms in β-hairpins; asparagine has the highest propensity to form this type of interaction followed by aspartate, arginine, serine and threonine (Table 2 and see Additional file 3, Figure S4 - white bars). Therefore, although asparagine, arginine and threonine often form hydrogen bonds to mainchain atoms within β-hairpins, these interactions tend not to be conserved in buried positions.
Interactions with polyproline
From the set of conserved, buried polar residues hydrogen-bonded to mainchain atoms of polyproline-type helices, arginine is most common, followed by histidine, tyrosine and tryptophan (Table 2 and see Additional file 3, Figure S5 - grey bars). Arginine also has the highest propensity to form this interaction when we consider all residues forming this type of interaction, followed by glutamine, asparagine and histidine (Table 2 and see Additional file 3, Figure S5 - white bars). A similar result has previously been observed where hydrogen bonds from sidechains to mainchains in polyproline were most frequently formed by arginine followed by glutamine, asparagine, serine and threonine.
Interactions with coil regions
Cysteine and aspartate clearly have the highest propensity to form hydrogen bonds to coil regions out of buried conserved polar residues (Table 2 and see Additional file 3, Figure S6 - grey bars). However, arginine has the highest propensity to perform this role when all positions are considered, followed by asparagine and aspartate (Table 2 and see Additional file 3, Figure S6 - white bars). A previous analysis of intra-coil sidechain-to-mainchain hydrogen bonds revealed that aspartate, serine, asparagine and threonine are the polar residues that most commonly form this type of interaction, with 80% of these cases being at solvent-exposed sites.
We have previously demonstrated that buried polar residues, although small in number, tend to be more conserved when their hydrogen-bonding potential is satisfied or where they form hydrogen bonds to mainchain atoms. Conservation of these residues and the interactions that they form implies that they are important for maintaining protein structure and hence provide restraints on amino acid substitutions during divergent evolution. We have shown that conserved, buried polar residues have conserved roles in stabilizing the tertiary structure of proteins by forming hydrogen bonds to mainchain atoms. The conservation of these sidechain-to-mainchain hydrogen bonds implies that mainchain architecture is a crucial restraint on the evolution of proteins and that the interactions are retained as an essential part of the protein fold. The structural motifs that we have examined have been shown to have particular propensities for polar residues which form hydrogen bonds with mainchain atoms. Although local sidechain-to-mainchain interactions have been the focus of most previous studies, the propensity for sidechain-to-mainchain hydrogen bond formation is often met by distant interactions. For example, we observe that arginine frequently caps the C-termini of α-helices through a distant interaction. We have shown that buried polar residues maintain 3D relationships between secondary structures where mainchain-to-mainchain hydrogen bonds cannot play a role and that similar stabilizing structures recur in different architectures. The key roles of these stabilizing interactions in maintaining protein structures have been previously demonstrated in a few cases, for example in the tyrosine corner, but we have shown here that there are many others important for maintaining protein stability.
Although it is generally unfavourable to bury hydrophilic amino acids in the core of proteins, this is counterbalanced by the need to satisfy mainchain atom hydrogen-bond potential. The interactions that the polar residues form when providing these supporting roles are often quite complex and can be thought of as analogous to features in our own built 3D environment. Many form joists, bridging between the elements of secondary structure (for example, Figures 3B, 4D-F, 5B-C, 7A-E), analogous to those that bridge columns and support structures above them in man-made buildings (Figure 2A). Other sidechains act as braces, tethering two strands at the point at which they diverge (Figure 7F and Figure 2B). Buried hydrogen bonded polar sidechains often maintain triangulated structures, supporting distorted helices and complex loop structures (Figures 3I, 6A,C, 8A-C, 11A-B): these provide a striking parallel with the trusses supporting the roofs of buildings (Figure 2C-E). Remarkably, these structural features have been highly conserved in their respective architectural histories, despite the variation in surface structures. Both are hidden from view and remain unappreciated, except by the cognoscenti. We hope that this paper will help bring understanding of these important structural features of protein architecture to a wider audience.
Protein families containing five or more members were selected from HOMSTRAD where the family alignment contained a conserved, buried polar residue and where the sidechain of the polar residue forms a hydrogen bond to a mainchain atom in each family member. The JOY alignment of each family within HOMSTRAD was used to identify families that met these criteria. JOY's default relative accessibility cut-off (7% or less) was used to define solvent inaccessible (buried) residues. In order to avoid redundancy, where protein families overlapped, the family with the highest sequence coverage was chosen for the analysis.
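The selection criteria above can be illustrated with a minimal Python sketch. It is not the JOY or HOMSTRAD pipeline: the `Residue` record, the column-wise alignment layout, the function name and the one-letter set treated as polar are all assumptions made for illustration. Only the family-size threshold (five members) and the 7% relative accessibility cut-off come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: one-letter codes treated as polar for this illustration only.
POLAR = set("CDEHKNQRSTWY")
BURIED_CUTOFF = 7.0      # percent relative accessibility (from the text)
MIN_FAMILY_SIZE = 5      # minimum number of family members (from the text)


@dataclass
class Residue:
    """Hypothetical per-residue record for one family member."""
    aa: str                       # one-letter amino acid code
    rel_accessibility: float      # relative sidechain accessibility, percent
    sidechain_hbond_to_mainchain: bool


def conserved_buried_polar_positions(
    alignment: List[List[Optional[Residue]]],
) -> List[int]:
    """Return indices of alignment columns meeting all criteria.

    `alignment` is a list of columns; each column holds one entry per
    family member, with None standing for a gap.
    """
    if not alignment or len(alignment[0]) < MIN_FAMILY_SIZE:
        return []
    hits = []
    for index, column in enumerate(alignment):
        if any(res is None for res in column):
            continue  # a gap means the residue is not present in every member
        types = {res.aa for res in column}
        if len(types) != 1 or not types <= POLAR:
            continue  # not entirely conserved, or not a polar residue
        if all(
            res.rel_accessibility <= BURIED_CUTOFF
            and res.sidechain_hbond_to_mainchain
            for res in column
        ):
            hits.append(index)
    return hits
```

A column is reported only if every member carries the same polar residue, buried and hydrogen-bonded to a mainchain atom; checking that the bond goes to an equivalent mainchain atom in each member would need extra bookkeeping that this sketch omits.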
Identification of hydrogen bond partners
Hydrogen bond partner(s) to the conserved, buried polar residues were identified using the program HBOND (J. Overington, unpublished). HBOND identifies all possible hydrogen bonds based on a distance criterion (3.5 Å between donor and acceptor).
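As a rough illustration of a purely distance-based criterion, the sketch below lists donor-acceptor pairs lying within 3.5 Å of each other. It is not the HBOND program, which is unpublished; the function names and the dictionary-of-coordinates input are hypothetical, and we assume the cut-off is an upper bound on the donor-acceptor distance.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
HBOND_MAX_DISTANCE = 3.5  # Å, donor-acceptor cut-off quoted in the text


def within_hbond_distance(donor: Coord, acceptor: Coord,
                          cutoff: float = HBOND_MAX_DISTANCE) -> bool:
    """True if the donor and acceptor atoms are no further apart than cutoff."""
    return math.dist(donor, acceptor) <= cutoff


def candidate_hbonds(donors: Dict[str, Coord], acceptors: Dict[str, Coord],
                     cutoff: float = HBOND_MAX_DISTANCE
                     ) -> List[Tuple[str, str, float]]:
    """List (donor name, acceptor name, distance) for every pair within cutoff."""
    pairs = []
    for d_name, d_xyz in donors.items():
        for a_name, a_xyz in acceptors.items():
            distance = math.dist(d_xyz, a_xyz)
            if distance <= cutoff:
                pairs.append((d_name, a_name, distance))
    return pairs
```

A distance-only test of this kind ignores bond angles, so it should be read as listing possible hydrogen bonds, in keeping with the wording above.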
Identification of structural motifs
α-helices - N-terminal and C-terminal residues were identified based on the following positional criteria: N-(N+1) to N-(N+3) for N-terminal residues and N-3 to N+1 for C-terminal residues (where N is the length of the helix).
β-strands - edge strands were distinguished from centre strands by the number of hydrogen-bonding partner strands. Strands with more than one hydrogen-bonding partner strand were defined as centre strands and all others as edge strands.
We also identified polyproline helices using the program SEGNO.
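The edge/centre rule for β-strands can be sketched as follows. The input format (a list of strand identifiers plus a list of hydrogen-bonded strand pairs) and the function name are assumptions for illustration; how strand pairings were actually extracted is not described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def classify_strands(strand_ids: Iterable[str],
                     pairings: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Label each strand 'centre' (more than one partner strand) or 'edge'."""
    partners = defaultdict(set)
    for first, second in pairings:
        partners[first].add(second)
        partners[second].add(first)
    # Strands with zero or one partner fall under "all others" and are edge.
    return {
        strand: "centre" if len(partners[strand]) > 1 else "edge"
        for strand in strand_ids
    }
```

For a three-stranded sheet A-B-C, only B has two partner strands and is labelled centre; a strand with no recorded partner is labelled edge.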
Calculation of residue propensities
Propensities were calculated as P(x) = [narch(x) / N(x)] / [narch(total) / N(total)], where narch(x) is the number of residues of type x forming hydrogen bonds to mainchain atoms in a particular architectural context, N(x) is the number of residues of type x in the dataset of 131 families, narch(total) is the total number of residues forming hydrogen bonds to mainchain atoms in a particular architectural context and N(total) is the total number of residues in the dataset of 131 families.
Polar residues which are entirely conserved, buried in each family member and forming a hydrogen bond to a mainchain atom group in each family member. These numbers were therefore derived from the 233 alignment positions identified in the 66 families.
All polar residues in the 131 family set, regardless of solvent accessibility and conservation but where the polar residue forms a hydrogen bond to a mainchain atom group.
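A small Python sketch of the propensity calculation follows. It assumes the usual ratio-of-frequencies form implied by the four quantities defined under "Calculation of residue propensities": the fraction of residues of type x found in a context, divided by the fraction of all residues found in that context. The function name and the counts in the usage note are hypothetical and are not taken from the dataset.

```python
def propensity(n_arch_x: int, n_x: int, n_arch_total: int, n_total: int) -> float:
    """Propensity of residue type x for an architectural context.

    n_arch_x      residues of type x hydrogen-bonded to mainchain in the context
    n_x           residues of type x in the whole dataset
    n_arch_total  all residues hydrogen-bonded to mainchain in the context
    n_total       all residues in the whole dataset
    """
    if n_x <= 0 or n_arch_total <= 0 or n_total <= 0:
        raise ValueError("denominator counts must be positive")
    return (n_arch_x / n_x) / (n_arch_total / n_total)
```

With made-up counts, `propensity(10, 100, 50, 1000)` gives a value of about 2, meaning the residue type would occur in that context twice as often as expected from its overall frequency; values below 1 indicate under-representation.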
This work was supported by a BBSRC studentship to CLW. TLB is supported by the Wellcome Trust.
- Pauling L, Corey RB: Configurations of Polypeptide Chains With Favored Orientations Around Single Bonds: Two New Pleated Sheets. Proc Natl Acad Sci USA. 1951, 37 (11): 729-740. 10.1073/pnas.37.11.729.
- Pauling L, Corey RB, Branson HR: The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain. Proc Natl Acad Sci USA. 1951, 37 (4): 205-211. 10.1073/pnas.37.4.205.
- Hutchinson EG, Thornton JM: A revised set of potentials for beta-turn formation in proteins. Protein Sci. 1994, 3 (12): 2207-2216. 10.1002/pro.5560031206.
- Wilmot CM, Thornton JM: Analysis and prediction of the different types of beta-turn in proteins. J Mol Biol. 1988, 203 (1): 221-232. 10.1016/0022-2836(88)90103-9.
- Sibanda BL, Blundell TL, Thornton JM: Conformation of beta-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J Mol Biol. 1989, 206 (4): 759-777. 10.1016/0022-2836(89)90583-4.
- Baker EN, Hubbard RE: Hydrogen bonding in globular proteins. Prog Biophys Mol Biol. 1984, 44 (2): 97-179. 10.1016/0079-6107(84)90007-5.
- Presta LG, Rose GD: Helix signals in proteins. Science. 1988, 240 (4859): 1632-1641. 10.1126/science.2837824.
- Richardson JS, Richardson DC: Amino acid preferences for specific locations at the ends of alpha helices. Science. 1988, 240 (4859): 1648-1652. 10.1126/science.3381086.
- Wan WY, Milner-White EJ: A recurring two-hydrogen-bond motif incorporating a serine or threonine residue is found both at alpha-helical N termini and in other situations. J Mol Biol. 1999, 286 (5): 1651-1662. 10.1006/jmbi.1999.2551.
- Wan WY, Milner-White EJ: A natural grouping of motifs with an aspartate or asparagine residue forming two hydrogen bonds to residues ahead in sequence: their occurrence at alpha-helical N termini and in other situations. J Mol Biol. 1999, 286 (5): 1633-1649. 10.1006/jmbi.1999.2552.
- Chan AWE, Hutchinson EG, Thornton JM: Identification, classification, and analysis of beta-bulges in proteins. Protein Sci. 1993, 2: 1574-1590. 10.1002/pro.5560021004.
- Richardson JS, Getzoff ED, Richardson DC: The beta bulge: a common small unit of nonrepetitive protein structure. Proc Natl Acad Sci USA. 1978, 75 (6): 2574-2578. 10.1073/pnas.75.6.2574.
- Eswar N, Ramakrishnan C: Secondary structures without backbone: an analysis of backbone mimicry by polar side chains in protein structures. Protein Eng. 1999, 12 (6): 447-455. 10.1093/protein/12.6.447.
- Barlow DJ, Thornton JM: Helix geometry in proteins. J Mol Biol. 1988, 201 (3): 601-619. 10.1016/0022-2836(88)90641-9.
- Cubellis MV, Caillez F, Blundell TL, Lovell SC: Properties of polyproline II, a secondary structure element implicated in protein-protein interactions. Proteins. 2005, 58 (4): 880-892. 10.1002/prot.20327.
- Stapley BJ, Creamer TP: A survey of left-handed polyproline II helices. Protein Sci. 1999, 8 (3): 587-595. 10.1110/ps.8.3.587.
- Milner-White E, Ross BM, Ismail R, Belhadj-Mostefa K, Poet R: One type of gamma-turn, rather than the other gives rise to chain-reversal in proteins. J Mol Biol. 1988, 204 (3): 777-782. 10.1016/0022-2836(88)90368-3.
- Milner-White EJ: Beta-bulges within loops as recurring features of protein structure. Biochim Biophys Acta. 1987, 911 (2): 261-265.
- Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL: Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1992, 1 (2): 216-226. 10.1002/pro.5560010203.
- Overington J, Johnson MS, Sali A, Blundell TL: Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction. Proc Biol Sci. 1990, 241 (1301): 132-145. 10.1098/rspb.1990.0077.
- Worth CL, Blundell TL: Satisfaction of hydrogen-bonding potential influences the conservation of polar sidechains. Proteins. 2009, 75 (2): 413-429. 10.1002/prot.22248.
- Slingsby C, Driessen HP, Mahadevan D, Bax B, Blundell TL: Evolutionary and functional relationships between the basic and acidic beta-crystallins. Exp Eye Res. 1988, 46 (3): 375-403. 10.1016/S0014-4835(88)80027-7.
- Eswar N, Ramakrishnan C: Deterministic features of side-chain main-chain hydrogen bonds in globular protein structures. Protein Eng. 2000, 13 (4): 227-238. 10.1093/protein/13.4.227.
- Vijayakumar M, Qian H, Zhou HX: Hydrogen bonds between short polar side chains and peptide backbone: prevalence in proteins and effects on helix-forming propensities. Proteins. 1999, 34 (4): 497-507. 10.1002/(SICI)1097-0134(19990301)34:4<497::AID-PROT9>3.0.CO;2-G.
- Bordo D, Argos P: The role of side-chain hydrogen bonds in the formation and stabilization of secondary structure in soluble proteins. J Mol Biol. 1994, 243 (3): 504-519. 10.1006/jmbi.1994.1676.
- Mizuguchi K, Deane CM, Blundell TL, Overington JP: HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci. 1998, 7 (11): 2469-2471. 10.1002/pro.5560071126.
- Harper ET, Rose GD: Helix stop signals in proteins and peptides: the capping box. Biochemistry. 1993, 32 (30): 7605-7609. 10.1021/bi00081a001.
- Serrano L, Sancho J, Hirshberg M, Fersht AR: Alpha-helix stability in proteins. I. Empirical correlations concerning substitution of side-chains at the N and C-caps and the replacement of alanine by glycine or serine at solvent-exposed surfaces. J Mol Biol. 1992, 227 (2): 544-559. 10.1016/0022-2836(92)90906-Z.
- Nicholson H, Anderson DE, Dao-pin S, Matthews BW: Analysis of the interaction between charged side chains and the alpha-helix dipole using designed thermostable mutants of phage T4 lysozyme. Biochemistry. 1991, 30 (41): 9816-9828. 10.1021/bi00105a002.
- Adzhubei AA, Sternberg MJ: Conservation of polyproline II helices in homologous proteins: implications for structure prediction by model building. Protein Sci. 1994, 3 (12): 2395-2410. 10.1002/pro.5560031223.
- Hamill SJ, Cota E, Chothia C, Clarke J: Conservation of folding and stability within a protein family: the tyrosine corner as an evolutionary cul-de-sac. J Mol Biol. 2000, 295: 641-649. 10.1006/jmbi.1999.3360.
- Mizuguchi K, Deane CM, Blundell TL, Johnson MS, Overington JP: JOY: protein sequence-structure representation and analysis. Bioinformatics. 1998, 14 (7): 617-623. 10.1093/bioinformatics/14.7.617.
- Hutchinson EG, Thornton JM: PROMOTIF--a program to identify and analyze structural motifs in proteins. Protein Sci. 1996, 5 (2): 212-220. 10.1002/pro.5560050204.
- Cubellis MV, Cailliez F, Lovell SC: Secondary structure assignment that accurately reflects physical and evolutionary characteristics. BMC Bioinformatics. 2005, 6 (Suppl 4): S8. 10.1186/1471-2105-6-S4-S8.
- DeLano WL: The PyMOL Molecular Graphics System. 2002, Palo Alto, CA, USA: DeLano Scientific
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.