Describing the structural robustness landscape of bacterial small RNAs
© Rodrigo and Fares; licensee BioMed Central Ltd. 2012
Received: 27 February 2012
Accepted: 13 April 2012
Published: 13 April 2012
Skip to main content
© Rodrigo and Fares; licensee BioMed Central Ltd. 2012
Received: 27 February 2012
Accepted: 13 April 2012
Published: 13 April 2012
The potential role of RNA molecules as gene expression regulators has led to a new perspective on the intracellular control and genome organization. Because secondary structures are crucial for their regulatory role, we sought to investigate their robustness to mutations and environmental changes.
Here, we dissected the structural robustness landscape of the small non-coding RNAs (sncRNAs) encoded in the genome of the bacterium Escherichia coli. We found that bacterial sncRNAs are not significantly robust to both mutational and environmental perturbations when compared against artificial, unbiased sequences. However, we found that, on average, bacterial sncRNAs tend to be significantly plastic, and that mutational and environmental robustness strongly correlate. We further found that, on average, epistasis in bacterial sncRNAs is significantly antagonistic, and positively correlates with plasticity. Moreover, the evolution of robustness is likely dependent upon the environmental stability of the cell, with more fluctuating environments leading to the emergence and fixation of more robust molecules. Mutational robustness also appears to be correlated with structural functionality and complexity.
Our study provides a deep characterization of the structural robustness landscape of bacterial sncRNAs, suggesting that evolvability could be evolved as a consequence of selection for more plastic molecules. It also supports that environmental fluctuations could promote mutational robustness. As a result, plasticity emerges to link robustness, functionality and evolvability.
The discovery of the regulatory role of RNA has revolutionized our understanding of the molecular control and genome organization of living cells [1, 2]. Small non-coding RNAs (sncRNAs) have been shown, both in prokaryotes and eukaryotes, to exert a tight control on gene expression. Of relevance, a particular secondary structure can confer a regulatory ability of translation , a catalytic activity , or an interfering ability to silence gene expression . Importantly, a unique secondary structure is underlying all these mechanisms that, while preventing the degradation of the sncRNA, allows the interaction with and subsequent modification of other sncRNAs, mRNAs, or proteins. In summary, structures are fundamental in determining the potential roles of sncRNA and are, consequently, a fundamental component of the fitness of these molecules . In an attempt to proof this point, many research groups have pursued identifying the footprints of natural selection on secondary structures of sncRNAs, although this remains elusive. In this study we test the hypothesis that selection indeed operates at this level, driving the evolution of sequences to codify structures that present beneficial traits for the organism.
Early studies tackled the structural robustness of RNA molecules [7, 8], considering that robustness would be a beneficial trait. These approaches took advantage of a physicochemical model  that allows predicting the resulting phenotype (structure) from a given genotype (sequence). Recent computational studies have been mainly focused on precursors of miRNAs [10–13] and on viruses [14–16], and have suggested that natural RNA sequences are robust to mutations. However, as we show in this study, the statistical significance of the results depends on the choice of the reference sample of sequences. Moreover, whether robustness evolves driven directly by selection or is the by-product of the selection for another related magnitude remains highly controversial . Despite their biological relevance, however, very little is known about the structural robustness of bacterial sncRNAs. Here, we propose a new definition of environmental robustness that better allows studying its relationship with mutational effects. In addition, we explore and describe the robustness landscape of bacterial sncRNAs and link it to functionality and evolvability.
Robustness to environmental perturbations is the cornerstone of biological adaptation and diversification. In bacteria, adaptation to environment requires of fundamental changes at the molecular level (i.e., mutations). These changes may lead to the functional divergence of proteins or RNAs that mediate the adaptation to the environment. Indeed, bacteria have the ability to rapidly accumulate beneficial mutations when growing in new environments . If most of such functional mutations are destabilizing, owing to the fact they compromise ancestral functions, robustness to these mutations may fuel biological evolvability . However, a strong robustness may buffer the accumulation of beneficial mutations. Hence, determining how robust are proteins or RNAs to environmental and genetic perturbations may unearth the rules of evolvability . Our study reveals that plasticity evolved in natural sncRNAs, conferring evolvability to bacteria , and it also reveals that this magnitude modulates robustness.
Here, we define structural robustness as the sensitivity of an RNA molecule to perturbations: the greater the robustness of an RNA molecule, the more insensitive is to perturbations. To understand how RNA molecules respond to perturbations, we measured two types of robustness. First, environmental robustness (R e ) accounts for the robustness to perturbations in the environment where the sncRNA lives. These perturbations come mostly from extra-cellular factors. We assumed that environmental perturbations alter the physicochemical properties for RNA folding. Under this assumption, we computationally induced environmental perturbations by altering the energetic parameters implemented in the thermodynamic model for the base-pairs interactions . Alterations in the conformation of the sncRNA structure resulting from such perturbations were used to calculate R e . This assumption is justified because the thermodynamic model assumes a mathematical expression by decomposition, whose parameterization must be done against experimental data. Different sets of energetic parameters have been proposed , each of them being a relatively good approximation for making predictions. However, the model is certainly a simplification of the reality (effective model) and many more equations and parameters would be needed for a much more accurate calculation of free energies and RNA structures. Therefore, it is indebted to think that environmental conditions (e.g., concentration of ions) modulate the energetic parameters of this effective model, and that environmental robustness would be achieved by being insensitive to perturbations in those parameters . Second, mutational robustness (R m ) accounts for the robustness of structures to single point mutations in the sncRNA sequence. We provide formal mathematical definitions of these variables in section Methods. To perform the computation over RNA secondary structures we used Vienna RNA package .
The conclusion that bacterial sncRNAs are significantly plastic could entail important evolutionary and functional implications (Figure S3). First, plasticity could serve as a mechanism to diversify the functions of molecules, since a single genotype could yield multiple phenotypes (large thermodynamic ensemble of structures), even sncRNAs can adopt multistable states . Second, plastic molecules have greater evolvability , which could lead to functional innovation (i.e., by increasing plasticity, the time of adaptation could be reduced). Third, the greater the plasticity, the larger structural changes can be after mutational or environmental perturbations (t-tests, P-values < 0.0001 for R m and R e , using the average of P to construct two subsets). Recently, it has been shown that robustness can correlate with evolvability but in a way strongly modulated by plasticity: intermediate robustness levels are optimal for fueling evolvability, where higher plasticity induces lower optimal robustness levels .
We also investigated epistasis (E) , the interaction of mutations, and its relationship with plasticity. In terms of population, we found that double mutations in bacterial sncRNAs tend to be antagonistic (E > 0) (t-test, P-value = 5·10-5). 70.9% of the sequences display E > 0, although with respect to sample III the statistical significance is reduced (U-test, P-value = 0.16). Antagonistic epistasis indicates that the effects of the first mutation at a nucleotide site provoke a disruption of the structure that is more severe than the one provoked by the effects of a second mutation at another site . Accordingly, synergistic epistasis entails that single mutations will have a moderate impact on the structure. In fact, sncRNAs with synergistic epistasis displayed higher levels of mutational robustness (t-test, P-value < 0.0001, using E = 0 to construct two subsets). In principle, epistasis would tend to 0 when the two mutations fall down in the sequence with sufficient separation so that their effects are uncorrelated. Nucleotide sites that were detected to interact epistatically, both synergistically and antagonistically, were on average closer in the structure than expected by chance (Figure S4). In addition, we found a positive correlation between plasticity and epistasis (Figure 3b). Antagonistic epistasis thus comes from the fact that more plastic molecules are less robust. In this scenario, each individual is more sensitive to mutations (i.e., the deleterious mutants are quickly diluted while beneficial ones are fixed) and the population tends to accumulate beneficial genetic variability (Figure S3). Hence, our results are in tune with the suggestion that antagonistic epistasis would promote evolvability , and that evolvability and mutational robustness are inversely correlated, at least in the short term .
Free-living bacteria are subjected to fluctuations in their environment. These perturbations affect the available resources that bacteria use for their development and reproduction, but also affect variables such as temperature, pressure, oxygen, metals, and concentration of ions. Changes in these variables may affect RNA folding, among other effects. It is then expected that molecules of free-living bacteria (which live in highly fluctuating environments) have evolved towards higher robustness to these environmental changes. It follows that environmental fluctuations may promote the evolution of mechanisms that confer robustness to such fluctuations. Afterwards, environmental robustness would provide sncRNAs with robustness to mutations, which is an inherent property of the molecule. In theory, direct selection for mutational robustness would only occur in organisms presenting high mutation rates such as viruses . Thus, in populations with lower variability, mutational robustness could certainly be a side effect of selection for mechanisms that mitigate the effects of environmental perturbations . In addition, the energetic features of the molecule manage its structural robustness to both mutational and environmental perturbations [22, 27], and this explains the strong correlation between environmental and mutational robustness.
An illustrative test of this hypothesis would be the analysis of the robustness of sncRNAs of different bacteria, each subjected to different environmental fluctuations. Here we included in the analysis the endosymbiotic bacterium of aphid insects (Buchnera aphidicola), which lives in highly stable environments (i.e., devoid of fluctuations), among other free-living bacteria. However, these endosymbionts (also Blochmannia floridanus) have an extremely reduced genome  and hence very few or even none reported sncRNAs. Among all sncRNAs studied here, the gene codifying for one RNA component of the signal recognition particle, ffs , is highly conserved in many bacteria including B. aphidicola. Then we decided to focus our study on just this gene, observing that in B. aphidicola Ffs is significantly less robust than their Ffs homologs in other bacteria, which live in more fluctuating environments (Figure S8). Although these initial results do not constitute an exhaustive analysis to point out that evolution of robustness negatively correlates with environmental stability, they show that robustness can be compared among species and not only against artificial sncRNAs.
Of special interest are those molecules that are both plastic and environmentally robust. In principle, as we have shown, these two variables are negatively correlated. However, we observed that 17 bacterial sncRNAs presented this pattern. Among them, we highlight GcvB, IsrB, GlmZ, RseX, and RyhB. Interestingly, these sncRNAs present a high connectivity degree, especially GcvB. This could suggest that hub elements, in addition to increased degree of functionality, require high levels of plasticity to operate (P and V do not correlate, Figure S10). However, DicF and IsrA, which also establish many connections, do not exhibit this feature. Because the interaction network was inferred, these results should be interpreted with caution. Further studies are needed to address the important issue of linking robustness with functionality.
Of course, the use of the secondary structure as a fitness magnitude is a simplification. Future work could aim to determine the robustness to changes in gene expression by accounting for the interactions between sncRNAs and mRNAs , and also to assess the optimality of the natural riboregulators exploiting computational design methods . Furthermore, the use of secondary structures to evaluate plasticity and robustness results in a limited framework. Certainly, a more accurate model, although at a high computational cost, would be the three-dimensional structure of the molecule, as we know that different types of interactions (not only Watson-Crick) exist . In that way, packages such as iFoldRNA  could be exploited to carry out such robustness analyses.
Another important aspect corresponds to the uncertainty coming from transcription termination (a sncRNA also encodes a transcription terminator, usually the last hairpin of the structure is followed by a poly(U) tail). This process of transcription termination produces a population of sncRNAs with different lengths. The extra nucleotides in the transcript due to an inefficient termination (or the lack of nucleotides due to a premature termination) may influence the folding of the global structure . Therefore, we could analyze the robustness of bacterial sncRNAs to this consequence, gaining accuracy with predictors of transcription termination .
In addition to compare natural RNAs against structural analogs, we could generate random sequences by adapting the nucleotide composition of the pool [47, 48]. We could also use structural variables to complement sequence alignments in the detection of functional RNAs . On the other hand, randomly generated sequences of sncRNAs could be a non-appropriate null model, because the evolution of natural sequences usually comes from shorter sequence distances . To overcome this issue, we can analyze sncRNAs among different bacterial organisms. Although an sncRNA could not be significantly robust with respect to artificial sequences, it could be so with respect to analogs from other organisms (e.g., Ffs from B. aphidicola was significantly less robust than its analogs from free-living bacteria). This comparison indeed accounts for the short evolutionary distance and phylogenetic dependence.
In this work, we used a computational approach to dissect the structural robustness landscape of the sncRNAs encoded in the genome of the bacterium E. coli. We identified that natural sncRNAs are not significantly robust to both mutational and environmental perturbations when compared against artificial, unbiased sequences. We also showed the dependence of the robustness analyses on different sets of artificial sequences. However, using the appropriate null model, we found a significant enrichment of plasticity in natural sequences. In contrast, previous studies claimed for significant robustness of natural pre-miRNAs [10–12], but this could reflect a caveat of the reference set of artificial sequences. By further applying our methodology to pre-miRNAs of C. elegans, we found that they are not so robust as claimed before but are significantly plastic. This is in tune with the results here presented for bacterial sncRNAs. Indeed, both bacterial sncRNAs and nematode pre-miRNAs appear as significantly more plastic on average, a trait that could promote evolvability . This enhances the idea that evolvability, or the ability of finding beneficial or innovative mutations, could be a selected trait in bacterial sncRNAs.
In addition, our results can strengthen the understanding of the evolution of robustness and plasticity, concepts that have fueled enormous interest in the latest literature owing to their direct link with the promotion of adaptive evolution . On the one hand, more functional (complex) structures would permit a larger number of RNA-RNA interactions and we have shown these structures display higher robustness levels. On the other hand, plasticity would promote evolvability and we have shown it is negatively correlated with robustness. The observation that plasticity positively correlates with epistasis (on average, significantly antagonistic in bacterial sncRNAs) supports the positive relationship between plasticity and evolvability, since antagonistic epistasis would promote evolvability . This reflects a given modulator effect of plasticity on both robustness and evolvability. All in all, our study provides a quantitative, deep characterization of the complex map linking robustness, functionality and evolvability in bacterial sncRNAs.
For a given sncRNA sequence (of length L), there is a thermodynamic ensemble (Ω) that contains the different suboptimal structures, each with a given free energy (G i ) . Thus, the partition function reads , and the free energy of the ensemble is G = -kT ln(Z). Then, the probability that the sncRNA folds into the structure i is given by . We assumed T = 37°C, then kT = 0.616 Kcal/mol. In this work, instead of comparing the MFE structures to analyze two different sequences, we compared the two ensembles of structures corresponding to the sequences. We introduced the base-pair distance between two structures (d BP ), which is more accurate than the Hamming distance, to evaluate the difference between two structures . The base-pair distance (d BP ) between the different structures of Ω (S i denotes structure i), referred as intrinsic distance, is given by (doubly probabilistically averaged). This magnitude accounts for the structural variability within Ω of a given sequence, and then allows defining plasticity (P). Lower values of d 0 indicate that Ω is dominated by the MFE structure, while higher values correspond to ensembles with more suboptimal structures within a given energy gap. More plastic is a sequence when it presents more structural fluctuations at the equilibrium. Therefore, we defined plasticity as . This magnitude can then be used to distinguish very stable RNAs.
To compute R m we need to compare different mutant sequences. The average distance between structural ensembles after one single point mutation (d 1 ) follows (where Ω1 is the ensemble of mutants and Π j is calculated using the partition function of Ω1, denoted by Z 1 ). Since d 1 only accounts for one mutant, we need to average several calculations. Here 〈•〉 indicates average for perturbations and Δ• standard deviation. Hence, 〈d 1〉 is the average structural distance after 1 single point mutation (L calculations). Then, we defined mutational robustness as . As in the definition of P, we rescaled by L/2 to have an absolute value, since 〈d i 〉 scales with L and because the number of base-pairs of a structure is bounded by L/2 (i.e., d BP between certain structure and the unfolded state is bounded by L/2). Certainly, the lower the distance, closer to 1 (maximum) should be the robustness. Here we considered that robustness follows a linear trend with the relative structural distance, although quadratic expressions could also be employed . Analogously, we calculated the distance between structural ensembles after 2 single point mutations (d 2 ), and the distance between ensembles after one environmental perturbation (d e ), which was simulated as a random variation over the value of all the energetic parameters that define the model for base-pair interactions. For that, all parameters determining the energies for base pairing and stacking are perturbed simultaneously . More in detail, to perform environmental perturbations, we took variations up to 20% of the nominal values following normal random distributions, i.e., being β 0 the nominal value of an energetic parameter, the perturbed value reads β = (1+0.2ξ)β 0, where ξ ~ Ν(0,1). Hence, 〈d 2〉 is the average structural distance after 2 single point mutations (10 L calculations), and 〈d e 〉 is the average structural distance after an environmental perturbation (1,000 calculations). Then, we defined environmental robustness as . We further defined epistasis as , which measures the interference between mutations. E > 0 means antagonistic epistasis (i.e., 〈d 2〉 < 2〈d 1〉, resulting in compensatory effects), while E < 0 synergistic epistasis (i.e., 〈d 2〉 > 2〈d 1〉, resulting in enhancement effects) .
In addition, for each sncRNA we computed its degree of functionality (V), given by (probabilistically averaged), where V i is the number of times that a motif involving consecutively three free nucleotides and three bound nucleotides (in the 5' sense or in the 3') appears in the structure i of the ensemble. Two overlapping motifs were counted as a single event. While V i is a magnitude that corresponds to one structure, V corresponds to a sequence. We call this magnitude functionality because it quantifies the number of different mechanisms for potential interactions with further RNA molecules [2, 42]. In other words, the degree of functionality accounts for the number of regions that may provide accessibility for RNA-RNA interactions. Moreover, V i is roughly proportional to the number of hairpins of the structure, and that metric of functionality also accounts for the complexity of the molecule.
Structural robustness was tested for significance by comparing it to a distribution of robustness values generated from a large set of artificially originated sequences. Artificial sequences shared the property of yielding the same MFE structures as the real sequences. For each sncRNA, we generated 69 random sequences, resulting in a population of 5,451 elements. Results were primary maintained when using smaller random populations. We constructed three different random samples. Sample I was obtained by iteratively solving the corresponding inverse folding problems using different initial sequences  with Vienna RNA package (default energetic parameters, dangles = 2, MFE objective) [23, 51]. Notably, this allows sharing the MFE structure, but the thermodynamic ensembles may differ. Subsequently, sample II was obtained by combining inverse folding and neutral evolution, introducing mutations that do not change the MFE structure , thereby minimizing the bias introduced by the optimization method itself. For that, we performed L mutations. This process, nevertheless still produces biased sequences because mutations would be accumulated in regions with unpaired nucleotides (e.g., loops or tails). By definition, mutations affecting paired nucleotides are not neutral, with the exception of G-U/G-C paired regions. To counterbalance this bias, we constructed a sample III by which sequences were subjected to a neutral evolution process accounting for potential compensatory mutations (also L mutations). This process was based, in the case of paired nucleotides, on mutating the complementary nucleotides as well. Following this procedure, the simulated neutral evolution process accounts for both neutral single-point mutations and neutral base-pair mutations. This allowed enlarging considerably the sequence space and avoid more efficiently the bias produced by inverse folding methods.
We thank S. F. Elena for useful comments. G.R. is supported by an EMBO long-term fellowship co-funded by Marie Curie actions (ALTF-1177-2011). M.A.F. is supported by the Spanish Ministerio de Ciencia e Inovación grant BFU2009-12022.