The NCS family includes members with very different evolutionary rates, which provide insights into their diversification in both structure and function. Among these calcium sensors, Frq/NCS-1 appears to be one of the most conserved subfamilies, not only at the amino acid sequence level but also in its very low propensity to retain gene duplications: no other fixed duplication has been detected in this subfamily except in D. rerio. For that reason, the discovery of a fixed and long-term stable (estimated as > 100 Mya) duplication in Drosophila, together with the rapid and heterogeneous accumulation of amino acid changes between duplicates, constitutes a very unusual feature for this family. Understanding the basis of this finding may contribute to a better understanding of the molecular evolution of NCS proteins, as well as of broader questions such as the origin and fate of duplicated genes and of functional innovation.
Gene duplications arise initially in a single individual and can become fixed in the population by natural selection or by random genetic drift. Initially, duplications generate functional redundancy, a situation that is generally non-advantageous; hence, even if duplicates have been fixed by chance, the accumulation of mutations will disrupt the structure and function of one of the duplicates, which becomes a pseudogene (i.e. a non-functional gene). The nucleotide and amino acid variation analyses performed here clearly demonstrate that the amino acid changes observed in Drosophila Frequenins are not the result of this pseudogenization process. The coding sequence of frq1 shows strong functional constraints across the Drosophila genus as well as within a D. melanogaster population (unpublished results), and none of the nine amino acid substitutions affecting the encoded protein was disruptive, pointing to purifying selection as the main force acting on this gene. In addition, analysis of loss-of-function and over-expression phenotypes also points to a role of both Frq1 and Frq2 in synaptic transmission and nerve terminal growth [9, 11].
Several theoretical models have been proposed to explain the maintenance of duplicated genes, which consider different mechanisms of preservation and subsequent optimization [20–25]. The main differences among models lie in which part of the gene is involved (coding or non-coding sequences) and in the relative roles of natural selection and genetic drift in determining the outcome of duplication. With regard to protein sequence evolution, two duplicates can be preserved without the action of positive selection simply by the selectively neutral division of different subsets of the original functions between daughter copies (product subfunctionalization). Nevertheless, this situation is not compatible with the present data because: i) the distribution of mutations after gene duplication is significantly different from the one expected under neutrality, and ii) strong selective constraints act on the amino acid positions affected by these mutations across the Drosophila genus. Under subfunctionalization, we would instead expect positions carrying degenerative mutations that remove ancestral functions to evolve in a completely neutral way.
The amino acid substitution pattern observed in Frq1 might result from the fixation of nearly neutral mutations (and so be governed largely by genetic drift) during an initial period of relaxation, just after gene duplication, followed by a subsequent increase of functional constraints in Drosophila. The environmental conditions responsible for these changes in constraint would have affected frq1 and frq2 differently, as suggested by the highly biased distribution of amino acid substitutions detected between these paralogues.
Accordingly, some functional or regulatory diversification from the native state between Drosophila Frequenins would be needed. In this context, some models of structural evolution, such as compensatory mutations or conformational epistasis, might generate the observed evolutionary pattern. In the first, some variants are fixed by positive selection to compensate for deleterious mutations at other epistatically interacting positions. Under this model, some of the amino acid substitutions in Frq1 ought to have been selected to maintain protein structure or function rather than being adaptive. Under the conformational epistasis hypothesis, most of the Frq1 mutations should have been slightly deleterious or permissive (i.e. small-effect) substitutions that stabilized specific structural elements in this protein, allowing further positively selected mutations which, in the absence of the previous small-effect mutations, would destabilize the protein. These mutations could have conferred a new function and then increased selective constraints in Drosophila. Although we have found evidence of molecular co-evolution across the Frq/NCS-1 subfamily, none of the predicted coevolving amino acid pairs involves two positions with replacements in Frq1. This finding would argue against structural evolution as the main explanation for the rapid accumulation of amino acid substitutions between duplicates. Nevertheless, the probability of observing more double substitutions than expected by chance largely depends on the presence of relatively strong epistatic interactions between mutations. If both Frequenins have very few potentially permissive substitutions, the probability of observing repeated pathways across the subfamily should be very low. In this situation, we would have very little power to detect signals of co-evolution between amino acid sites across the alignment.
An in-depth experimental study would be needed to analyze the putative contribution of these structural evolution models in generating the pattern observed in Drosophila Frequenins.
The recurrent fixation of advantageous mutations might also account for the excess of amino acid substitutions in Frq1. The ML estimate of the dN/dS ratio (ω) in the internal branch leading to Drosophila Frq1 sequences is actually higher than in the rest of the branches of the Drosophila Frq phylogeny. Although the estimate is considerably lower than 1, as well as lower than the genomic averages reported for the Drosophila genus, this result does not exclude the possibility that positive selection acted in the fixation of certain Frq1 changes. It has been widely demonstrated that positive selection commonly acts on few amino acid positions in a protein and, therefore, the present estimates based on the complete coding sequence could be too conservative. Frequenin is a highly conserved protein, with very low ω estimates across Bilateria (the pair-wise ω ratios calculated between vertebrate NCS-1 sequences range from 0 to 0.0065; dN/dS data from Ensembl Genome Browser) and, therefore, even a significant increase in the number of amino acid changes might not be strongly reflected in the average ω value calculated from the entire protein. In fact, when we applied a much more powerful branch-site approach to the data, results were marginally significant (results not shown). Even so, we have to interpret this result with caution because it has been reported that this method often generates false positives under certain conditions [31, 32].
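The dilution argument above can be made concrete with a minimal numerical sketch. It assumes, purely for illustration, that the whole-protein ω behaves like a site-weighted mean over two classes of codons; this is a simplification and not the way ML methods actually estimate ω. The function name, the protein length of 187 codons, and the ω values assigned to each class are hypothetical choices, not values estimated in this study.

```python
def average_omega(n_sites, n_selected, omega_selected, omega_background):
    """Site-weighted mean omega for a protein with two classes of sites.

    Simplifying assumption: the gene-wide omega is approximated by the
    weighted mean of the per-class omega values.
    """
    if not 0 <= n_selected <= n_sites:
        raise ValueError("n_selected must lie between 0 and n_sites")
    p = n_selected / n_sites  # fraction of sites in the selected class
    return p * omega_selected + (1 - p) * omega_background


# Hypothetical scenario: 9 of 187 codons evolve with omega = 2.0
# (positive selection), the rest with omega = 0.005 (strong constraint).
avg = average_omega(187, 9, 2.0, 0.005)
print(f"gene-wide omega = {avg:.3f}")
```

Under these assumed values the gene-wide ω is about 0.10, well below 1 even though a subset of sites would be evolving under positive selection, which is why branch-site methods that allow ω to vary among sites may be more informative than a single average.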
All simple models for the preservation of frq1 and frq2 in Drosophila considered above that are compatible with the existing data require the action of positive selection. Thus, the key question that remains unresolved is whether natural selection promoted a functional change between Frq1 and Frq2 in Drosophila. We investigated whether positions differing between paralogues are involved in functional diversification between other members of the NCS family. The fact that many of these positions are candidates to participate in significant amino acid substitution rate shifts between NCS subfamilies suggests that Frq1 and Frq2 might have diverged (at least to some extent) in their functions. It is difficult to determine, however, the specific functional features that could have diverged between these two proteins. Most of the positions with changes between Drosophila Frequenins are among the most probable functionally diverged residues in many subfamily comparisons. The putative specialized roles of Drosophila Frequenins in neural function might result from differences in Ca2+ affinity, sub-cellular location or target proteins. It has been demonstrated that sites in the C-terminal part of human NCS-1 interact with target proteins. Also, we have shown that the last 33 amino acids of Drosophila Frq1 and Frq2 act as dominant negative peptides, with effects on synaptic transmission and terminal morphology. Thus, the two candidate C-terminal positions, 161 and 162 (and perhaps position 187), might have promoted some diversification in the interaction with target proteins. On the other hand, the replacements observed at these two sites might also have produced changes in Ca2+ binding, either directly, because they are located in the fourth EF-hand, or by producing structural changes affecting protein thermostability and the Ca2+ affinity of the other EF-hands [34–36].
Position 79 is one of the coordinating residues of the second EF-hand and, therefore, replacements at this site might also be related to Ca2+ affinity differences between paralogues. The amino acid fixed in Frq1 at this position is hydrophilic and highly exposed, in contrast to the hydrophobic and buried ancestral residue. This feature might indicate a structural change produced by the replacement at this position. Binding sites for target proteins have also been mapped in the N-terminal part of the human Frq/NCS-1 protein. Consequently, substitutions at position 58 (and perhaps at 79 and 102 in Frq1) might have altered the interaction properties of Frq2 with some of its partners. The other good candidate to participate in functional divergence between NCS subfamilies, position 102, is located in the loop connecting the second and third EF-hands. The homologous region in GUCA2 determines the concentration of Ca2+ that activates the target of this protein. Thus, the replacement at this position might also affect the Ca2+ binding properties of Frq1, contributing to the functional divergence of duplicates in Drosophila.
Finally, in addition to the retention and diversification of protein coding regions, regulatory divergence is also prevalent in the evolution of duplicated genes. In fact, it has been proposed that subfunctionalization of regulatory regions can increase the mutational space accessible to duplicates, removing selective constraints and facilitating neofunctionalization. The mRNAs of frq1 and frq2 are expressed in D. melanogaster with a similar spatio-temporal pattern, but with important quantitative differences. These quantitative differences could be related to the significantly different silent evolutionary rates found between these two genes, since it has been shown that gene expression levels are negatively correlated with evolutionary rates. Here we found that the Brc and Eip74EF factors might be involved in the regulatory divergence of these two duplicates. Suitable Brc binding sites are only present in regulatory regions of frq1. In fact, frq1 and br (the locus encoding Brc proteins) mRNAs have a very similar expression pattern in late stages of embryo development (both appear in brain and ventral nervous system at embryo stages 13-16; http://www.fruitfly.org/cgi-bin/ex/insitu.pl), coincident with the peak of frq1 expression. Eip74EF, in contrast, appears only in early metamorphosis, after the major pulse of ecdysone in the third larval instar, which is consistent with the higher expression of frq2 mRNA in adult flies. Hence, differences in the response to these two ecdysone-induced early genes might be responsible, at least in part, for the differences in gene expression levels between duplicates. This regulatory diversification might have facilitated the action of positive selection, resulting in further functional diversification at the protein level. The current in silico analysis sets the framework for future experimental studies on the regulation of frq1 and frq2 expression, as well as on the functional mechanisms of the corresponding proteins.