I tested for evidence of selective constraint acting on mRNA secondary structures in protein-coding yeast genes. Predicted secondary structures differ greatly according to the prediction method used and between species. Nevertheless, for all predicted secondary structures, paired sites show significantly fewer conserved optimal codons and consistently fewer synonymous substitutions than unpaired sites. The results of this study are consistent with purifying selection on mRNA secondary structures.
Similar tendencies of codon use have been reported for Drosophila and humans: mRNA stability seems high when optimal codon use is low in Drosophila, and paired sites contain an excess of rare codons in humans. Note that in this study, the comparison of optimal codon use is restricted to conserved sites. Besides being a methodological requirement of ALIfold (see Material and Methods), the restriction to conserved sites limits the analysis to sites potentially under strong selection. For RNAfold structures, results become non-significant when the data are not restricted to these conserved sites (data not presented). Strong conflicting selection pressures seem to act on certain sites, while the remaining sites seem less constrained for structure. Selection on local rather than global structures may explain these results and contribute to the low structural similarity across species. Selection on local mRNA structures in coding regions of eukaryotic genes has been suggested before. Besides the low structural similarity, compensatory substitutions may also contribute to the non-significant results when comparing substitution numbers at paired and unpaired sites.
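The comparison of substitution numbers at paired and unpaired sites can be illustrated with a minimal sketch. It assumes two aligned sequences of equal length, a predicted structure in dot-bracket notation, and a caller-supplied list of site indices to examine (for instance, synonymous sites). The function name `compare_paired_unpaired` and the input layout are hypothetical and are not taken from this study's pipeline.

```python
def compare_paired_unpaired(seq_a, seq_b, dot_bracket, sites):
    """Count differences between two aligned sequences at paired and unpaired sites.

    dot_bracket: predicted structure, '(' or ')' marks a paired site, '.' an unpaired one.
    sites: indices to examine (hypothetical input, e.g. synonymous sites only).
    Returns {class: (differences, sites, proportion)}.
    """
    if not (len(seq_a) == len(seq_b) == len(dot_bracket)):
        raise ValueError("sequences and structure must have the same length")

    counts = {"paired": [0, 0], "unpaired": [0, 0]}  # [differences, sites examined]
    for i in sites:
        key = "paired" if dot_bracket[i] in "()" else "unpaired"
        counts[key][1] += 1
        if seq_a[i] != seq_b[i]:
            counts[key][0] += 1

    return {
        key: (diffs, n, diffs / n if n else float("nan"))
        for key, (diffs, n) in counts.items()
    }


if __name__ == "__main__":
    # Toy alignment and structure, for illustration only.
    result = compare_paired_unpaired("GGGAAACCC", "GGAAATCCT", "(((...)))", range(9))
    for site_class, (diffs, n, prop) in result.items():
        print(site_class, diffs, n, round(prop, 3))
```

The resulting two-by-two table (substituted versus unchanged, paired versus unpaired) could then be passed to a contingency test of the reader's choice. A compensatory substitution restores a base pair but is counted here as two differences at paired sites, which is one way such a comparison can lose power.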
Previous bioinformatic studies that focussed on whether the thermodynamic stability of mRNA structures of various organisms is selected for or against [20–23, 49, 55] led to partly inconsistent results and to controversies about the appropriate randomization procedure. In these studies, the observed minimum free energy (MFE) is compared to the expected MFE, which is estimated as the mean MFE of randomized versions of the same sequence, and a significant deviation is taken as evidence for selection for or against thermodynamic stability of the structure. The randomization of sequences can be performed in a number of different ways, holding various properties of the sequence constant while randomizing others. The choice of properties matters biologically: variables that are affected by forces other than selection for mRNA structure, for example the amino acid sequence, should be fixed. Which variables should remain free to vary, however, may not always be obvious, although the results are very sensitive to this choice. Chamary and Hurst argue, for example, that di-nucleotide content might be selected for its effect on stability and should therefore be allowed to vary in randomized sequences. However, di-nucleotides might well be affected by mutation bias, or selected for some other reason, in which case di-nucleotide content should be kept fixed. Controlling for di-nucleotide content in fact renders significant results non-significant [20–23, 55].
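The observed-versus-expected MFE comparison described above can be sketched as follows. This is one possible randomization among several, not the procedure of any particular cited study: synonymous codons are permuted among positions encoding the same amino acid, which keeps the amino acid sequence and codon usage fixed but lets di-nucleotide content vary. The names `shuffle_synonymous`, `mfe_z_score` and `toy_energy` are hypothetical, and `toy_energy` is only a placeholder for a real folding-energy prediction such as the MFE reported by a program like RNAfold.

```python
import random
from collections import defaultdict
from statistics import mean, stdev

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame coding sequence (DNA alphabet) to amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def shuffle_synonymous(cds, rng):
    """Permute codons among positions that encode the same amino acid.

    Keeps amino acid sequence and codon usage fixed; di-nucleotide
    content across codon boundaries is free to vary.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    positions = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(idx)

    shuffled = codons[:]
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)


def mfe_z_score(cds, energy_fn, n=100, seed=0):
    """Z-score of the observed energy against n randomized sequences."""
    rng = random.Random(seed)
    observed = energy_fn(cds)
    null = [energy_fn(shuffle_synonymous(cds, rng)) for _ in range(n)]
    sd = stdev(null)
    if sd == 0:
        return 0.0  # null distribution has no spread; no deviation measurable
    return (observed - mean(null)) / sd


def toy_energy(seq):
    """PLACEHOLDER, not a folding energy: more GC/CG steps give a lower value."""
    return -(seq.count("GC") + seq.count("CG"))


if __name__ == "__main__":
    cds = "ATGGCTGCCGCAGCGTTATTGCTTCTCAAAAAGTAA"
    print(mfe_z_score(cds, toy_energy, n=200, seed=1))
```

A negative z-score would indicate an observed energy below the randomized mean, i.e. a structure more stable than expected under this particular null model. Replacing the shuffle with one that also preserves di-nucleotide content changes the null distribution, which is the sensitivity at issue in the text.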
Note that, in contrast to comparing observed and expected MFE values, the comparison of constraint at paired and unpaired sites does not indicate that selection acts for or against the thermodynamic stability of the structure, but that the predicted structure itself is under selection. With respect to selection for or against stability of structures, ALIfold results indicate that the thermodynamically most stable global structure is not conserved across the four yeasts: the ALIfold consensus energy value is much higher, i.e. less stable, than the average energy value of the single sequences [see also Washietl et al.  for approach]. This is consistent with the results of Babak et al., which support selection against stability of structures in coding regions. It is reasonable to expect that selection on mRNA structures may act against overly stable structures, because overly stable and inflexible mRNA structures may interfere, for instance, with translation, and the flexibility of some mRNAs may allow them to form specific and dynamic complexes with other factors. mRNAs lead a complex life and, besides thermodynamic stability, selection on mRNA structure may also exist to maintain specific local or global mRNA structures that allow binding and interaction with other factors and thus affect biological functioning. Not only structural targets may matter; the accessibility of sequence targets may also depend on global or local mRNA structures.
While the results of this study are consistent with selection upon mRNA structures in coding regions and support laboratory studies reporting that synonymous substitutions are functionally important with respect to mRNA structure and translation in humans [52–54], two considerations should be made. First, we do not know whether thermodynamic mRNA structure predictions predict the mRNA structures that are formed in the cell. mRNAs are generally associated with other factors, and effects of mRNA-associated microRNAs and proteins on the structure are hard to predict. Also, kinetics of mRNA folding and pseudo-knots are not considered here. Even with the comparative method, predicted mRNA structures may remain at best approximations of the real mRNA structures in the cell. Secondly, the predicted and also the real structure will be affected by certain DNA patterns; however, whether the respective DNA patterns are selected for mRNA structure or for another reason may be hard to judge. There are several DNA patterns one may consider. (i) Di-nucleotide content of naturally occurring sequences leads to higher than expected thermodynamic stability [e.g. 21, 23, 55]. Di-nucleotide content may be selected for its effect on mRNA structure, but it may also be affected by mutation bias, or selected for some other reason, for example for nucleosome positioning [56–58] or transcription pause sites. (ii) The frequency of polypurine tracts is increased in exons and may affect thermodynamic structure. Again, polypurine tracts may be selected with respect to mRNA structure but also for other reasons, such as enhancing splicing. (iii) Translational protein folding into alpha-helices and beta-sheets may affect synonymous codon use, and the resulting periodic DNA patterns may affect mRNA structure.
If thermodynamic predictions correspond to any other force, such as selection on nucleosome positioning or on transcriptional or co-translational pause sites, the observed patterns may be a consequence of that force, and the inference of selection acting directly upon mRNA structure may be incorrect.
Alternative selection upon mRNA structures (or any other selective target) may counterbalance translational selection and explain why the bias towards translationally optimal codons is never complete and why non-optimal codons are used even in highly expressed genes. Alternative selection may also contribute to the discrepancy between expected and observed codon bias, and may lead to systematic underestimates of selection strength for optimal codons. As selection for mRNA structures may act more strongly on GC-ending codons, in organisms in which potential translationally optimal codons are biased towards GC, such as Drosophila, mammals and C. elegans, the strength of selection for optimal codons may instead be overestimated. It will be worth considering the effects of alternative selection and disentangling the different targets of selection.