FoxO gene family evolution in vertebrates
© Wang et al; licensee BioMed Central Ltd. 2009
Received: 3 March 2009
Accepted: 7 September 2009
Published: 7 September 2009
Forkhead box, class O (FoxO) belongs to the large family of forkhead transcription factors that are characterized by a conserved forkhead box DNA-binding domain. To date, the FoxO group has four mammalian members: FoxO1, FoxO3a, FoxO4 and FoxO6, which are orthologs of DAF16, an insulin-responsive transcription factor involved in regulating longevity of worms and flies. The degree of homology between these four members is high, especially in the forkhead domain, which contains the DNA-binding interface. Yet, mouse FoxO knockouts have revealed that each FoxO gene has its unique role in the physiological process. Whether the functional divergences are primarily due to adaptive selection pressure or relaxed selective constraint remains an open question. As such, this study aims to address the evolutionary mode of FoxO, which may lead to the functional divergence.
Sequence similarity searches have performed in genome and scaffold data to identify homologues of FoxO in vertebrates. Phylogenetic analysis was used to characterize the family evolutionary history by identifying two duplications early in vertebrate evolution. To determine the mode of evolution in vertebrates, we performed a rigorous statistical analysis with FoxO gene sequences, including relative rate ratio tests, branch-specific d N /d S ratio tests, site-specific d N /d S ratio tests, branch-site d N /d S ratio tests and clade level amino acid conservation/variation patterns analysis. Our results suggest that FoxO is constrained by strong purifying selection except four sites in FoxO6, which have undergone positive Darwinian selection. The functional divergence in this family is best explained by either relaxed purifying selection or positive selection.
We present a phylogeny describing the evolutionary history of the FoxO gene family and show that the genes have evolved through duplications followed by purifying selection except for four sites in FoxO6 fixed by positive selection lie mostly within the non-conserved optimal PKB motif in the C-terminal part. Relaxed selection may play important roles in the process of functional differentiation evolved through gene duplications as well.
Mammalian FoxO proteins (FoxO1, FoxO3a, FoxO4 and FoxO6) which are homologous to Caenorhabditis elegans protein DAF-16, belong to the O ('other') class of the Fox superfamily [1, 2]. FOXO1 is the first identified member of the FoxO family of transcription factors  and is involved in the transcriptional activity of alveolar rhabdomyosarcomas . Since then, the discovery of mammalian FoxO genes has grown rapidly, now FoxO proteins have been identified in several different organisms, including zebrafish, mouse, rat and human. As transcription factors in the nucleus, the primary function of FoxO proteins is to bind to their cognate DNA targeting sequences as monomers. The co-crystal structure of HNF-3γ with DNA shows that there are 14 protein-DNA contacts distributing throughout the forkhead domain, but the third α-helix (H3) plays the most important role in a winged helix/forkhead protein's DNA-binding specificity . In addition, both winged loops also make important interactions with DNA [4, 5]. Although the molecular basis of the DNA-binding specificity of FoxO transcription factors is poorly understood, high-affinity DNA-binding studies have identified a consensus FoxO-recognized element (FRE) as (G/C) (T/A)AA(C/T)AA [6–8]. Indeed, functional FRE sites that match this consensus sequence have been identified in the promoters of many genes, such as Fas ligand (FasL), insulin-like growth factor binding protein 1 (IGFBP1) and the apoptotic regulator BIM[9, 10]. Additional putative FoxO-target genes and their potential cis-regulatory binding sites have been predicted by systematic bioinformatic approaches . Thus, FoxO transcription factors appear to be involved in various signaling pathways and control a wide range of biochemical processes including cellular differentiation, tumor suppression, metabolism, cell-cycle arrest, cell death, and protection from stress [1, 9, 10]. In the mouse, four different FoxO members have been identified to date: Foxo1, Foxo3, Foxo4 and Foxo6 [12, 13]. FoxO6 is the latest member of the FoxO family to be cloned and shares significant sequence similarity with the other members of the family .
FoxO1 and its close paralogus (FoxO3, FoxO4 and FoxO6) are thought to some degree of functional diversification during development [14–16] and their potential physiological roles might be different . Indeed, a rapid overview of the data collected on FoxO1, 3, 4 and 6 highlights how these proteins may be different. First, each FoxO gene showed different expression patterns in tissues [7, 12, 17–19]. While Foxo1 was strongly expressed in the striatum and neuronal subsets of the hippocampus (dentate gyrus and the ventral/posterior part of the CA regions), Foxo3 was more diffusely expressed throughout the brain including all hippocampal areas, cortex and cerebellum, and Foxo6 expression was eminent in various parts of the adult mouse brain. Moreover, the individual disruption of Foxo1, Foxo3 and Foxo4 genes in mice results in different phenotypes [14, 16]. While a homozygous knockout of Foxo1 (FKHR) was embryonic lethal due to failures in angiogenesis and vessel formation, Foxo3a-/- (FKHRL1) and Foxo4-/- (AFX) were viable and appeared to develop normally. Later in development, Foxo3a-/- females were found to be age-dependently infertile and showed abnormal ovarian follicular development. As to the physiological role, each FoxO gene exhibits a distinct response under a variety of conditions [20–22]. Therefore, despite the high sequence identity shared by FoxO genes domain (more then 60% in humans ), the physiological roles of FoxO genes are functionally diverse in mammals.
Single copy genes are thought to evolve conservatively because of strong negative selective pressure. Gene duplications produce a redundant gene copy and thus release one or both copies from negative selection pressure . There are a number of models for the fate of duplicate gene that predict functional differentiation of paralogs based on protein sequence or regulatory divergence [24, 25]. Currently four most prominent models are neofunctionalization , subfunctionalization , the Dykhuizen-Hartl effect  and adaptive diversification. Very recently, the list has been expanded by the introduction of the subneofunctionalization  and the adaptive radiation  models that predict rapid subfunctionalization after duplication followed by a prolonged period of neofunctionalization and adaptive divergence of duplicate genes in a process analogous to species radiations, respectively. Thus, duplications are thought to be an important precursor of functional divergence . Here, we are interested in the specific role that natural selection might play in the evolutionary history of this gene duplication.
The increased availability of FoxO sequences in the public databases allows us to explore the functional diversity from a phylogenetic perspective within the FoxO family in vertebrates. The study was conducted by analyzing amino acid and nucleotide-based divergence data from different species covering the entire vertebrates. Our aim was to elucidate the evolutionary mechanisms operating in the retention of these genes and evaluate the changes in selection pressures following duplication. We also identified the sites under positive Darwinian selection. Finally, we tried to map the positively selected sites to the structural and functional regions of FoxO molecules.
Materials and methods
Sequence Data Collection
The DNA sequences and amino acids sequences of FoxO genes were downloaded from NCBI's GenBank http://www.ncbi.nlm.nih.gov. PSI-BLAST searches were conducted against the non-redundant database of vertebrate genomes at NCBI (e-value cutoff = 1e-24) using the amino acid sequences of Foxo1, Foxo3, Foxo4 and Foxo6 of mouse (gi: 56458, gi: 56484, gi: 54601 and gi: 329934) as queries. Only full length coding sequences were included in our analysis. Jalview 2.3  was used to remove the sequences with the identity higher than 95%. A table with species names, abbreviations and accession numbers are provided in supplementary materials (Additional file 1).
Sequence alignment and phylogenetic analysis
The sequences of FoxO proteins were aligned by MUSCLE  and the resulting alignment was manually optimized by BioEdit . Incomplete sequences, and highly divergent regions or gaps resulting in uncertain alignments were excluded from the further analysis. The final data set included a total of 66 sequences from 19 species. The amino acid alignment was subsequently transformed into an aligned cds fasta file using PAL2NAL  which is a program to construct multiple codon alignments from matching amino acid sequences. The nucleotide alignment was then converted to nexus format with DnaSP  version 4.10 for phylogenetic analysis.
The full alignment of 66 sequences was used to perform the phylogenetic analysis. Tree reconstructions were done by the Bayesian method from the DNA alignment done in the MrBayes version 3.1.2 [36, 37] software package, and rooted with the BfFoxO, Cifoxo and SpFoxO from amphioxus (Branchiostoma floridae), Ciona intestinalis and Strongylocentrotus purpuratus. We analyzed four independent runs, each using the general time reversible (GTR) model plus gamma distribution plus invariant sites model of molecular evolution (GTR+G+I), as determined by Modeltest version 3.7 . We ran 2 million generation Markov Chain Monte Carlo simulations with four separate chains (three heated, one cold), with the first 500,000 generations discarded as burn-in. Trees were summarized for each independent run and compared to check for concordant topologies. The consensus tree of all compatible groupings among all runs was used in all analyses.
Synonymous codon usage analyses
Codon usage bias was estimated by the effective number of codons (ENC; ), the frequency of optimal codons (F OP ; ) and proportion of G and C in the third codon position (G/C 3rd). For ENC, lower values indicate stronger synonymous codon usage bias, while for F OP higher values indicate stronger bias. These measures were calculated for all genes using the CodonW program http://bioweb.pasteur.fr/seqanal/interfaces/codonw.html and used to test whether the degree of synonymous codon usage biases in individual genes.
Relative rate tests
The substitution rates of the FoxO genes were compared between different paralogous genes that had undergone duplication events recently, using the RRTree software . The orthologs FoxOs (Cifoxo, BfFoxOA and SpFoxOl) from amphioxus (Branchiostoma floridae), Ciona intestinalis and Strongylocentrotus purpuratus were used as an outgroup. The null hypothesis is that the rate of substitution of the tested clade is the same as that of the reference group.
Estimation of substitution rates and testing natural selection
We estimated the selective pressures acting on coding regions by applying a phylogenetic-based Maximum Likelihood (ML) analysis. ML estimated of the relevant parameters -as branch lengths and the ratio of the nonsynonymous (d N ) to synonymous substitution rates (d S ), ω = d N /d S -that were obtained using the codeml program implemented in the PAML package version 4 . The ω parameter was used as a measure of the protein selective constraints . These analyses were conducted under different competing evolutionary hypothesis. We first investigated whether the distribution of selective constraints acting on the each gene fluctuated across lineages; for that, we compared the fit to the data of the "one ratio" model (M0), which assumes a constant selective pressure across branches, with the "free ratios" model (FR), where the rate parameters were estimated independently in each lineage. We also examined other evolutionary scenarios; i) to determine which FoxO lineage had evolved at a different rate, as compared to the rest of the phylogeny, we applied a branch-specific model to the data. Sequences were divided into four groups according to their phylogenetic analysis, and each FoxO lineage was set as the foreground branch. ii) to detect sites under positive selection in four lineages, we applied three codon-based ML substitution models that are site-specific (i.e., models that allow variation in the ω ratio across sites) of  but assume the same selection pattern for a site in all lineages; iii) to investigate the existence of sites evolving under positive selection in only a specific lineage, we applied the modified branch-site model A of  in two consecutive tests (test1 and test2 in ) to the same alignment used for the site-based models. The model allowing for positive selection is denoted model A and the lineage to be tested is the foreground lineage, whereas the remaining ones are the background lineages; the multiple hypothesis testing problem  was taking into account using Bonferroni's correction . The likelihood Ratio Test (LRT) was used to compare the fit to the data of two nested models, assuming that twice the log likelihood difference between the two models (2ΔL) follows a χ2 distribution with a number of degrees of freedom equal to the difference in the number of free parameters .
We used the TreeSAAP version 3.2  to determine the FoxO physicochemical properties affected by natural selection. This program for examining the effects of nonsynonymous substitutions on protein evolution compares the observed distribution of physicochemical changes inferred from a phylogenetic tree with an expected distribution based on the assumption of completely random amino acid replacement expected under the condition of selective neutrality. For all possible pairwise amino acid changes, the range of effect size for each of the 31 properties was determined and equally divided into 8 magnitude categories. Categories 1 to 3 indicate small variation in the amino acid characteristics while categories 6 to 8 represent the most radical substitutions. For all properties that differed significantly from neutrality, Z-scores were then calculated in each magnitude category to determine which classes contributed to this deviation. The critical Z-score values for P = 0.001 are 3.09, indicating positive selection on that magnitude category, and -3.09, which indicates negative (purifying) selection. That is, positive and negative Z-scores indicate positive and purifying selection, respectively. Radical substitutions affecting a particular property that occurred more frequently than expected by chance constituted the signature of adaptive evolution .
Testing functional divergence and structure analysis
To study the functional divergence and structural differences after the gene duplication, we used the Diverge 2.0 software to estimate the type I (θI) and type II (θII) functional divergence coefficients [52, 53] among paralogous proteins. Type I and type II refer to shifts in the evolutionary rate pattern after the emergence of a new phylogenetic cluster (indicative of changes in functional constrains), and amino acid replacements completely fixed between duplicates (resulting in cluster-specific alterations of amino acid physiochemical properties), respectively.
Genes which have been predicted to subject to positive selection were used to search for homologous sequences in the PDB database of protein structures http://www.rcsb.org/pdb/home/home.do using Blastp [54, 55]. The Rasmol http://rasmol.org/ was used for all structural manipulations and highlighting the relevant amino acid replacements identified in the evolutionary analyses..
Sequence similarity searches and multiple alignments
Available FoxO1, FoxO3, FoxO4 and FoxO6 sequences were retrieved from 19 species ranging from amphioxus (Branchiostoma floridae) to mammals. Additional file 1 outlines the sequences (protein and DNA) used in the phylogenetic analyses. The highly conserved forkhead domain remained in all alignments. It should be noted that additional FoxO genes for eutherians and teleosts were identified. Inclusion of these did not improve the reliability of the phylogeny, and as the aim of this study was to determine the evolutionary history of the FoxO gene family, only representatives from the major vertebrate clades were included.
Phylogenetic analyses of FoxO gene lineages
Synonymous codon usage analyses
Mean values of GC%, GC3%, ENC, CAI and Fop of the FoxO genes
0.5775 ± 0.0416
0.6478 ± 0.1009
51.0256 ± 5.1246
0.0661 ± 0.0119
0.3506 ± 0.0259
0.5797 ± 0.0538
0.6744 ± 0.1194
47.9308 ± 6.3169
0.0765 ± 0.0171
0.3851 ± 0.0283
0.5974 ± 0.0430
0.6082 ± 0.0459
50.9273 ± 2.5006
0.0660 ± 0.0129
0.3522 ± 0.0385
0.6773 ± 0.0867
0.7695 ± 0.1486
42.4620 ± 8.7015
0.0462 ± 0.0212
0.3032 ± 0.0599
Relative rates of evolution of FoxO6 lineage
Evolutionary Rate of the FoxO Gene Families
Selective constraints and functional divergence
Maximum likelihood estimates of the coefficient of functional divergence (θ) from pairwise comparisons between FoxO groups
FoxO1 Vs FoxO3
P < 0.01
FoxO1 Vs FoxO4
P < 0.01
FoxO1 Vs FoxO6
P < 0.01
FoxO3 Vs FoxO4
P < 0.01
FoxO3 Vs FoxO6
P < 0.01
FoxO4 Vs FoxO6
P < 0.01
Recently, a method has been developed to test for type II functional divergence . Type II sites are those that are highly conserved in both clusters but are fixed for amino acids with different biochemical properties between sister clusters, implying these residues are responsible for the functional differences between these groups. Although at least one site with evidence of type II divergence was found for comparisons between FoxO1/FoxO3, FoxO3/FoxO4, and FoxO1/FoxO4 clusters, the θ II values are extremely small (θ II = 0.005 ~0.074) that highlighted the conservation between different clusters. These results are not unexpected given that this method calculates θ across all sites in an alignment and thus effectively averages site-wise θ values. With only ~3% of sites/cluster showing a pattern of type II divergence in our concatenated alignment, it is not likely that the ~9 possible type II sites have θ II values high enough to compensate for the extremely low θ II values of the over 300 sites with θ II effectively equal to zero. Our results are similar to the analysis of Hox-gene .
LRTs done to detect heterogeneous selection regimes among lineages for each gene
ω = 0.08442
two-ratio vs one-ratio
ω0 = 0.0898ω1 = 0.0758
p > 0.05
two-ratio vs one-ratio
ω0 = 0.0910ω1 = 0.0730
p < 0.05
two-ratio vs one-ratio
ω0 = 0.0811ω1 = 0.1044
P < 0.05
two-ratio vs one-ratio
ω0 = 0.0786ω1 = 0.1358
P < 0.01
Site model analyses for the FoxO1, FoxO3, FoxO4 and FoxO6 phylogenies
M3 vs M0
M2a vs M1a
M8 vs M7
2ΔL = (L1-L0)
2ΔL = (L1-L0)
2ΔL = (L1-L0)
Positively selected sites
p < 0.01
p < 0.01
p < 0.05
ω = 1.36
66 L (p > 0.90)
p < 0.01
p < 0.01
p < 0.01
ω = 127.02
264K* 266P* 434G* 439T*
Parameter estimations and likelihood ratio tests for the branch-site models
Positive selected sites
MA Vs M1a (test 1)
p0 = 0.70592 p1 = 0.18493 (p2 = 0.10915) w0 = 0.07205 (w1 = 1.00000) w2 = 1.00000
P < 0.01
213S** 216S* 219S* 252M* 276V** 285P* 296L** 306A** 340F* 360E*
MA Vs M1a (test 1)
p0 = 0.77299 p1 = 0.10453 (p2 = 0.12248) w0 = 0.06958 (w1 = 1.00000) w2 = 1.00000
P < 0.01
5H** 25D* 26F** 33D** 34L**37N** 217A* 231G** 329G*
MA Vs M1a (test 1)
p0 = 0.67705 p1 = 0.18055 (p2 = 0.1424) w0 = 0.07468 (w1 = 1.00000) w2 = 1.00000
P < 0.01
7V** 173R** 194T** 201I** 202L** 211F**223H* 225P** 242T* 254R** 314S**
MA Vs M1a (test 1)
p0 = 0.58516 p1 = 0.13030 (p2 = 0.28454) w0 = 0.06572 (w1 = 1.00000) w2 = 1.05311
P < 0.01
42Q** 46K** 155I* 164T** 165N** 173R* 174E** 176E** 178L** 179F** 180C** 188I* 189V** 203L* 207R* 223H** 230I** 231G* 232Y** 233K** 234N** 237Y** 258S** 265N* 269T** 271E** 272N** 273E** 274V** 275H** 276V** 277S** 278Q* 279G** 280L** 281H** 282P** 283S** 286N* 314S** 316V** 320H** 330Y* 366T** 367G** 368T** 369P*
The molecular adaptation processes occurred after the gene duplication event were also investigated by comparing the magnitude of the physicochemical changes produced by the observed amino acid replacements with those expected at random . We used the program Tree-SAAP  to model how 31 different physicochemical properties were affected by amino acid substitutions in each FoxO gene. Consistent overrepresentation of radical amino acid changes (i.e., categories 7 and 8) would suggest repeated adaptive substitution . The results indicated that, some amino acid replacements altering these physicochemical properties in the FoxO1 and FoxO3 proteins accumulated more (or less) often than expected by chance (likely reflecting fitness differences) (supplementary materials (Additional file 1)). Moreover, for each physicochemical property, the distribution of the Z-scores across 8 magnitude classes  indicated that, amino acid substitutions occurred less often than expected by chance at the most extreme magnitude-classes (supplementary Additional file 1); these FoxO1 and FoxO3 protein properties, therefore, were likely evolving under purifying selection. For FoxO4 and FoxO6 genes, less physicochemical properties were affected by amino acid substitutions. The FoxO6 gene, on the contrary, seems to evolve positive selection, because category 8 occurs more frequently than expected by chance for 2 of the properties (alpha-helical tendencies and compressibility).
Spatial distributions of possible selected FoxO6 Sites on three-dimensional structure
It has long been know that FoxO transcription factors play important roles in regulating various signals, which translate various environmental stimuli into dynamic gene expression programs to influence many physiological and pathological processes, including cancer and aging. The functions of FoxO proteins are regulated at multiple levels, which include but are not limited to phosphorylation, ubiquitylation and acetylation. Interestingly, all of these activities affect nuclear/cytoplasmic trafficking of FoxO proteins. The specific function of each member of this family is different . As the accumulation of gene sequences in the database, it is feasible to explore the functional diversity from a phylogenetic perspective. We performed firstly to the resolution of the evolutionary relationships of these FoxOs using molecular sequence data. Whereas, the incorrect phylogenetic topology resulting from mutationally saturated positions, inadequate modeling of the evolutionary process and systematic bias due to variable rates of evolution among species or within sequences  may make LRT generate many false positives. Anisimova et al (2003) examined the effect of assuming a "wrong" tree , and he found that LRT falsely suggested positive selection in 96% of the replicates in the M0-M3 comparison and in 86% of the replicates in the M7-M8 comparison at the ΰ = 5% significance level. In order to overcome this problem, we adopted a number of ways in combination. Firstly, the addition of more taxa to the dataset: denser sampling of species can reduce the effect of long branch attraction (LBA) by reducing the overall distances between taxa. Secondly, we used the best model of DNA substitution, determined by Modeltest version 3.7 . And finally, our inclusion of enough sequences in each lineage helped alleviate loss of LRT power from short conserved sequences. From phylogenetic result, we focused on the 2 main duplications along the evolutionary history of FoxO genes, the FoxO1-FoxO4 and the FoxO3-FoxO6 duplication, which formed four gene lineages (all with the high confidence values, nearly 100% posterior probability in Bayesian analysis) and used for further analysis.
Codon bias is largely thought to be due to weak selection acting to optimize protein production [64–66]. Selection intensity for codon usage bias, therefore, is expected to vary among genes. Our survey of synonymous codon usage in FoxO genes revealed a strong and consistent pattern of codon bias in genes with FoxO6 relative to those with FoxO1, FoxO3 and FoxO4 (Table 1). At the same time, there appears to be some conflicting results observed between FOP and ENC, which may be caused by differences in the way that the two methods estimate codon bias. FOP is based on the frequency of a set of species specific "optimal" codons, while ENC is based on the observed number of codons used for each amino acid. Thus it is possible for the two methods to give different estimates of codon bias.
It is widely accepted that gene duplication can create opportunities for functional divergence in paralogues. Divergence is thought to occur where one duplicate retains the original protein function and the other accumulates changes, (either through redundancy or by positive selection) or alternatively, through the partitioning of the functions of an unduplicated ancestor protein. Whatever the mechanism, if functional divergence has occurred between duplicated genes, then it should be observable as changes within their coding regions.
The functional divergence of FoxO genes has been studied by . The branch length leading to the FoxO6 clade is extended relative to other FoxO genes in gene phylogeny, (Figure 1). This suggested that after the duplications, FoxO6 evolved at a faster rate than other FoxO genes. This result was confirmed by significant relative rate test results for FoxO gene lineages (Table 2). In this sense, we performed type I functional divergence analysis, and we detected significant type I divergence among FoxOs. The comparison between the FoxO3 and FoxO6 groups showed the highest value for θ (0.40 ± 0.05), suggesting that these two groups had diverged considerably more at the functional level. Next, DIVERGE was used to establish the posterior probability of type I divergence at each site in the alignment, employing two cut-off posterior probability values of 0.87 and 0.95. However, the cutoff value for residue selection is an empirical decision and is expected to depend on the intrinsic properties of the protein family being analyzed. Thus, we predicted 32 candidate functional divergence-related sites using 0.87 as a cutoff value (supplementary materials (Additional file 1)). When we narrowed our criteria to 0.95, we got 15 candidate residues as the most likely candidate sites for type I functional divergence (supplementary materials (Additional file 1)), but we lacked a way to verify how the rate-shift in these sites contributed to functional divergence among the FoxO gene groups. For comparative purposes, the same alignment and phylogeny was submitted to a ML LRT, which, like the Bayesian method provided a statistical framework where evolutionary rate shifts at particular protein positions could be established . At last, the statistically most likely positions predicted to underlie functional divergence were agreement by both methods, particularly for the highest-ranking candidates (Table 6).
In this study we used codon substitution models implemented through a maximum likelihood framework to estimate the rate of evolution at silent and replacement sites in FoxO1 and its paralogs, FoxO3, FoxO4 and FoxO6. Different models were used to investigate variation in the rate of evolution between lineages of a phylogeny, and to estimate ω for specific lineages and sites across phylogenies. Our objective was to determine the mode of evolution on each FoxO gene lineage, and to determine whether increased positive selection or decreased constraint led to the functional divergence of FoxO genes. As we have demonstrated, variation among branch and sites was observed in the FoxO6 phylogeny. Moreover, physicochemical amino acid properties analysis also provided evidence that the entire FoxO6 gene had experienced repeated episodes of adaptive evolution. The site models showed that adaptation had appeared at four sites located at C-terminal of FoxO6. For FoxO3 gene, Model M8 fits data significantly better than Model M7 (2ΔL = 8.23, df = 2, P < 0.05), and because 1.5% of sites are located in the positively selected site class with ω = 1.36, weak positive selection may be indicated with this comparison. However, it has been found that a poor fit of the data to a beta distribution may result in a high frequency of significant tests when comparing models M7 and M8 even in the absence of positive selection. To take account for the elevated type I error rates, the original model M8 was compared with a more restricted null model (M8a), where the extra site class was constrained to have ω = 1. When performing this analysis with the sequence data from the FoxO3 gene, model M8 did not significantly differ from model M8a (2ΔL = 0.067, df = 1, P > 0.05), indicating that the estimated ω = 1.36 was not significantly different than 1 and that there was little indication of positive selection in this gene. Further, our observation of strong purifying selection being the primary mode of evolution throughout the FoxO phylogeny is consistent with the findings of a recent study about forkhead family [67, 68].
When we performed branch-site model analysis, we found relaxed functional constraint was most consistent with the molecular evolutionary analyses of the FoxO data. Our conclusion is in contrast to a previous study which concluded that one site was found to be under positive selection in the FoxO3 lineage . In our paper, we applied the modified branch-site model A of  in two consecutive tests (test1 and test2 in ) to the same alignment used for the site-based models, but we could not detected the positive selection site. To determine why these two articles are giving drastically different results, we had a look at the sequences used for branch-site model analysis in , and we found that only 12 sequences (4 for FoxO3) used for testing evolution selection. Test 2 is a more direct test for identifying positive selections in the foreground branch, and significant LRT from test 1 can be resulted from either positive selection or relaxed selective constraint in the foreground branch , however, the power of this method may be limited when sample size or divergence time is low . Therefore, we concluded that the contradiction between our results and the previous study  due to the number of sequences used for analysis. Moreover, our work on site-model analysis, relative rate test and physicochemical changes indicated that FoxO6 was under positive selection. Four positive selected sites were identified by site-model analysis, two (264K, 266P, corresponding to mouse Gly337 and Pro339) of them fell into the region of non-conserved optimal PKB motif in the C-terminal part (Thr338) . The C-terminal PKB recognition sequence is not conserved in FoxO6 . Besides a PKB phosphorylation motif, this region contains a stretch of 3 additional serine residues, present in the other members of the FoxO group, but FoxO6. All these suggest that these serines may be functionally important in the other sequences analyzed with the exception of FoxO6 gene, and positive selection may lead to functional divergence between FoxO6 and the other members too. Another two positive selective site are Gly545 and Pro550 in the mouse Foxo6, and the functional role remains elusive. That is to say that the real reason for their accelerated evolution is unclear. However, it should be mentioned that there is a gap in the knowledge of the relationship between amino acid sequence and structure for full-length FoxO sequence, and we are unable to speculate on the particular role of this region in these FoxO6 genes. Unfortunately, the shared evolutionary history and molecular selection alone cannot be used as the unique criterion to infer protein function, and the true nature of each FoxO gene needs to be determined experimentally and independently. Therefore, the positively selected site may play an important functional role and could represent an interesting target site for future mutagenesis experiment thus facilitating our understanding of the structure-function relationships in FoxO genes. Molecular testing is required to validate this hypothesis. The result from branch-site analysis (relaxation of functional constraints) of FoxO6 also differs somewhat from previously signature of positive selection. We infer that the weak positive selection and multiple branches are considered as foreground branch may explain this phenomenon, because power will be reduced unless the same sites and selective constraints are occurring along all foreground branches .
Genomic data have provided an opportunity to gain a better understanding about the evolution of FoxOs using phylogenetic analyses. The FoxO gene family phylogeny showed that two duplications took place early in the evolution of vertebrates and triggered diversification of the FoxO gene family into four groups. However, further genome projects on a greater diversity of evolutionary lineages would help better understand the gene-duplication history. The relative rate analysis and physicochemical changes indicated that FoxO6 seemed to be different from other members. Evolutionary rate analysis showed that molecular adaptation can also play an important role in the evolution of this gene family. Indeed, positive selection was likely involved in the functional differentiation of FoxO6 gene; likewise, relaxed selection might play important roles over evolutionary time and shape variation of some members of the family. Considering the evolutionary history of the FoxO gene family, we provided insight into which amino acid residues might have undergone positive selection and could be targeted for site-directed mutagenesis. However, the identification of four sites under positive selection requires supporting evidence from further functional experiments to demonstrate the adaptive character of the amino acids. All these studies and experiments will certainly contribute to better understand the precise role of natural selection and functional divergence of this family.
The authors wish to express their gratitude to the members of animal sciences laboratory of Shanghai Jiao Tong University. The authors also thank editor for his suggestions about the manuscript and Jing Li for her help in revising the manuscript. This work is supported by the National High Technology Research and Development Program of China (863 project) (grant no. 2006AA10Z1E3, 2008AA101002), the National 973 Key Basic Research Program (grant no, 2006CB102102, 2004CB117500) and the National Natural Science Foundation of China (grant no. 30671492, 30871782).
- Barthel A, Schmoll D, Unterman TG: FoxO proteins in insulin action and metabolism. Trends Endocrinol Metab. 2005, 16 (4): 183-189. 10.1016/j.tem.2005.03.010.View ArticlePubMedGoogle Scholar
- Kaestner KH, Knochel W, Martinez DE: Unified nomenclature for the winged helix/forkhead transcription factors. Genes Dev. 2000, 14 (2): 142-146.PubMedGoogle Scholar
- Fredericks WJ, Galili N, Mukhopadhyay S, Rovera G, Bennicelli J, Barr FG, Rauscher FJ: The PAX3-FKHR fusion protein created by the t (2; 13) translocation in alveolar rhabdomyosarcomas is a more potent transcriptional activator than PAX3. Mol Cell Biol. 1995, 15 (3): 1522-1535.PubMed CentralView ArticlePubMedGoogle Scholar
- Clark KL, Halay ED, Lai E, Burley SK: Co-crystal structure of the HNF-3/fork head DNA-recognition motif resembles histone H5. Nature. 1993, 364 (6436): 412-420. 10.1038/364412a0.View ArticlePubMedGoogle Scholar
- Boura E, Silhan J, Herman P, Vecer J, Sulc M, Teisinger J, Obsilova V, Obsil T: Both the N-terminal Loop and Wing W2 of the Forkhead Domain of Transcription Factor Foxo4 Are Important for DNA Binding. J Biol Chem. 2007, 282 (11): 8265-8275. 10.1074/jbc.M605682200.View ArticlePubMedGoogle Scholar
- Biggs Iii WH, Meisenhelder J, Hunter T, Cavenee WK, Arden KC: Protein kinase B/Akt-mediated phosphorylation promotes nuclear exclusion of the winged helix transcription factor FKHR1. Proc Natl Acad Sci USA. 1999, 96 (13): 7421-7426. 10.1073/pnas.96.13.7421.View ArticleGoogle Scholar
- Furuyama T, Nakazawa T, Nakano I, Mori N: Identification of the differential distribution patterns of mRNAs and consensus binding sequences for mouse DAF-16 homologues. Biochem J. 2000, 349 (Pt 2): 629-634. 10.1042/0264-6021:3490629.PubMed CentralView ArticlePubMedGoogle Scholar
- Gilley J, Coffer PJ, Ham J: FOXO transcription factors directly activate bim gene expression and promote apoptosis in sympathetic neurons. J Cell Biol. 2003, 162 (4): 613-622. 10.1083/jcb.200303026.PubMed CentralView ArticlePubMedGoogle Scholar
- Accili D, Arden KC: FoxOs at the Crossroads of Cellular Metabolism, Differentiation, and Transformation. Cell. 2004, 117 (4): 421-426. 10.1016/S0092-8674(04)00452-0.View ArticlePubMedGoogle Scholar
- Greer EL, Brunet A: FOXO transcription factors at the interface between longevity and tumor suppression. Oncogene. 2005, 24: 7410-7425. 10.1038/sj.onc.1209086.View ArticlePubMedGoogle Scholar
- Xuan Z, Zhang MQ: From worm to human: bioinformatics approaches to identify FOXO target genes. Mech Ageing Dev. 2005, 126 (1): 209-215. 10.1016/j.mad.2004.09.021.View ArticlePubMedGoogle Scholar
- Biggs Iii WH, Cavenee Karen CWK: Identification and characterization of members of the FKHR (FOX O) subclass of winged-helix transcription factors in the mouse. Mamm Genome. 2001, 12 (6): 416-425. 10.1007/s003350020002.View ArticleGoogle Scholar
- Jacobs FM, Heide van der LP, Wijchers PJ, Burbach JP, Hoekman MF, Smidt MP: FoxO6, a novel member of the FoxO class of transcription factors with distinct shuttling dynamics. J Biol Chem. 2003, 278 (38): 35959-35967. 10.1074/jbc.M302804200.View ArticlePubMedGoogle Scholar
- Hosaka T, Biggs WH, Tieu D, Boyer AD, Varki NM, Cavenee WK, Arden KC: Disruption of forkhead transcription factor (FOXO) family members in mice reveals their functional diversification. Proc Natl Acad Sci USA. 2004, 101 (9): 2975-2980. 10.1073/pnas.0400093101.PubMed CentralView ArticlePubMedGoogle Scholar
- Arden KC: FoxOs in tumor suppression and stem cell maintenance. Cell. 2007, 128 (2): 235-237. 10.1016/j.cell.2007.01.009.View ArticlePubMedGoogle Scholar
- Castrillon DH, Miao L, Kollipara R, Horner JW, DePinho RA: Suppression of ovarian follicle activation in mice by the transcription factor Foxo3a. Science. 2003, 301 (5630): 215-218. 10.1126/science.1086336.View ArticlePubMedGoogle Scholar
- Anderson MJ, Viars CS, Czekay S, Cavenee WK, Arden KC: Cloning and Characterization of Three Human Forkhead Genes That Comprise an FKHR-like Gene Subfamily. Genomics. 1998, 47 (2): 187-199. 10.1006/geno.1997.5122.View ArticlePubMedGoogle Scholar
- Kitamura T, Nakae J, Kitamura Y, Kido Y, Biggs WH, Wright CV, White MF, Arden KC, Accili D: The forkhead transcription factor Foxo1 links insulin signaling to Pdx1 regulation of pancreatic beta cell growth. J Clin Invest. 2002, 110 (12): 1839-1847.PubMed CentralView ArticlePubMedGoogle Scholar
- Hoekman MF, Jacobs FM, Smidt MP, Burbach JP: Spatial and temporal expression of FoxO transcription factors in the developing and adult murine brain. Gene Expr Patterns. 2006, 6 (2): 134-140. 10.1016/j.modgep.2005.07.003.View ArticlePubMedGoogle Scholar
- Nakae J, Kitamura T, Kitamura Y, Biggs WH, Arden KC, Accili D: The forkhead transcription factor Foxo1 regulates adipocyte differentiation. Dev Cell. 2003, 4 (1): 119-129. 10.1016/S1534-5807(02)00401-X.View ArticlePubMedGoogle Scholar
- Richards JS, Sharma SC, Falender AE, Lo YH: Expression of FKHR, FKHRL1, and AFX genes in the rodent ovary: evidence for regulation by IGF-I, estrogen, and the gonadotropins. Mol Endocrinol. 2002, 16 (3): 580-599. 10.1210/me.16.3.580.View ArticlePubMedGoogle Scholar
- Bois PRJ, Grosveld GC: FKHR (FOXO1a) is required for myotube fusion of primary mouse myoblasts. EMBO J. 2003, 22: 1147-1157. 10.1093/emboj/cdg116.PubMed CentralView ArticlePubMedGoogle Scholar
- Hughes J, Criscuolo F: Evolutionary history of the UCP gene family: gene duplication and selection. BMC Evol Biol. 2008, 8: 306-10.1186/1471-2148-8-306.PubMed CentralView ArticlePubMedGoogle Scholar
- Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J: Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999, 151 (4): 1531-1545.PubMed CentralPubMedGoogle Scholar
- Kimura M: The neutral theory of molecular evolution. Sci Am. 1979, 241 (5): 98-100.View ArticlePubMedGoogle Scholar
- Goodman M, Moore GW, Matsuda G: Darwinian evolution in the genealogy of haemoglobin. Nature. 1975, 253 (5493): 603-608. 10.1038/253603a0.View ArticlePubMedGoogle Scholar
- Dykhuizen D, Hartl DL: Selective neutrality of 6PGD allozymes in E. coli and the effects of genetic background. Genetics. 1980, 96 (4): 801-817.PubMed CentralPubMedGoogle Scholar
- He X, Zhang J: Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics. 2005, 169 (2): 1157-1164. 10.1534/genetics.104.037051.PubMed CentralView ArticlePubMedGoogle Scholar
- Francino MP: An adaptive radiation model for the origin of new gene functions. Nat Genet. 2005, 37 (6): 573-577. 10.1038/ng1579.View ArticlePubMedGoogle Scholar
- Lynch VJ, Roth JJ, Wagner GP: Adaptive evolution of Hox-gene homeodomains after cluster duplications. BMC Evol Biol. 2006, 6: 86-10.1186/1471-2148-6-86.PubMed CentralView ArticlePubMedGoogle Scholar
- Clamp M, Cuff J, Searle SM, Barton GJ: The Jalview Java alignment editor. Bioinformatics. 2004, 20 (3): 426-427. 10.1093/bioinformatics/btg430.View ArticlePubMedGoogle Scholar
- Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004, 5: 113-10.1186/1471-2105-5-113.PubMed CentralView ArticlePubMedGoogle Scholar
- Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. 1999, 95-98.Google Scholar
- Suyama M, Torrents D, Bork P: PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006, W609-612. 10.1093/nar/gkl315. 34 Web Server
- Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R: DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 2003, 19 (18): 2496-2497. 10.1093/bioinformatics/btg359.View ArticlePubMedGoogle Scholar
- Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19 (12): 1572-1574. 10.1093/bioinformatics/btg180.View ArticlePubMedGoogle Scholar
- Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17 (8): 754-755. 10.1093/bioinformatics/17.8.754.View ArticlePubMedGoogle Scholar
- Posada D, Crandall KA: MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998, 14 (9): 817-818. 10.1093/bioinformatics/14.9.817.View ArticlePubMedGoogle Scholar
- Wright F: The'effective number of codons' used in a gene. Gene. 1990, 87 (1): 23-29. 10.1016/0378-1119(90)90491-9.View ArticlePubMedGoogle Scholar
- Ikemura T: Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J Mol Biol. 1981, 151 (3): 389-409. 10.1016/0022-2836(81)90003-6.View ArticlePubMedGoogle Scholar
- Robinson-Rechavi M, Huchon D: RRTree: Relative-Rate Tests between groups of sequences on a phylogenetic tree. Bioinformatics. 2000, 16 (3): 296-297. 10.1093/bioinformatics/16.3.296.View ArticlePubMedGoogle Scholar
- Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24 (8): 1586-1591. 10.1093/molbev/msm088.View ArticlePubMedGoogle Scholar
- Yang Z: Inference of selection from multiple species alignments. Curr Opin Genet Dev. 2002, 12 (6): 688-694. 10.1016/S0959-437X(02)00348-9.View ArticlePubMedGoogle Scholar
- Yang Z, Nielsen R, Goldman N, Pedersen A: Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000, 155 (1): 431-449.PubMed CentralPubMedGoogle Scholar
- Yang Z, Wong W, Nielsen R: Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005, 22 (4): 1107-1118. 10.1093/molbev/msi097.View ArticlePubMedGoogle Scholar
- Zhang J, Nielsen R, Yang Z: Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005, 22 (12): 2472-2479. 10.1093/molbev/msi237.View ArticlePubMedGoogle Scholar
- Anisimova M, Yang Z: Multiple hypothesis testing to detect lineages under positive selection that affects only a few sites. Mol Biol Evol. 2007, 25 (4): 1219-1228. 10.1093/molbev/msm042.View ArticleGoogle Scholar
- Miller R: Simultaneous statistical inference. 1981, Springer-Verlag, New YorkView ArticleGoogle Scholar
- Whelan S, Goldman N: Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics. Mol Biol Evol. 1999, 16 (9): 1292-1299.View ArticleGoogle Scholar
- Woolley S, Johnson J, Smith M, Crandall K, McClellan D: TreeSAAP: selection on amino acid properties using phylogenetic trees. Bioinformatics. 2003, 19 (5): 671-672. 10.1093/bioinformatics/btg043.View ArticlePubMedGoogle Scholar
- McClellan DA, Palfreyman EJ, Smith MJ, Moss JL, Christensen RG, Sailsbery JK: Physicochemical evolution and molecular adaptation of the cetacean and artiodactyl cytochrome b proteins. Mol Biol Evol. 2005, 22 (3): 437-455. 10.1093/molbev/msi028.View ArticlePubMedGoogle Scholar
- Gu X: Statistical methods for testing functional divergence after gene duplication. Mol Biol Evol. 1999, 16 (12): 1664-1674.View ArticlePubMedGoogle Scholar
- Gu X: Maximum-likelihood approach for gene family evolution under functional divergence. Mol Biol Evol. 2001, 18 (4): 453-464.View ArticlePubMedGoogle Scholar
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215 (3): 403-410.View ArticlePubMedGoogle Scholar
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.PubMed CentralView ArticlePubMedGoogle Scholar
- Gu X: Functional divergence in protein (family) sequence evolution. Genetica. 2003, 118 (2-3): 2-3.Google Scholar
- Gu X: A simple statistical method for estimating type-II (cluster-specific) functional divergence of protein sequences. Mol Biol Evol. 2006, 23 (10): 1937-1945. 10.1093/molbev/msl056.View ArticlePubMedGoogle Scholar
- Yang Z, Nielsen R: Codon-Substitution Models for Detecting Molecular Adaptation at Individual Sites Along Specific Lineages. Mol Biol Evol. 2002, 19 (6): 908-917.View ArticlePubMedGoogle Scholar
- McClellan DA, McCracken KG: Estimating the influence of selection on the variable amino acid sites of the cytochrome B protein functional domains. Mol Biol Evol. 2001, 18 (6): 917-925.View ArticlePubMedGoogle Scholar
- Schwede T, Kopp J, Guex N, Peitsch MC: SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res. 2003, 31 (13): 3381-3385. 10.1093/nar/gkg520.PubMed CentralView ArticlePubMedGoogle Scholar
- Laskowski RA, MacArthur MW, Moss DS, Thornton JM: PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst. 1993, 26 (2): 283-291. 10.1107/S0021889892009944.View ArticleGoogle Scholar
- Moreira D, Philippe H: Molecular phylogeny: pitfalls and progress. Int Microbiol. 2000, 3 (1): 9-16.PubMedGoogle Scholar
- Anisimova M, Nielsen R, Yang Z: Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003, 164 (3): 1229-1236.PubMed CentralPubMedGoogle Scholar
- Akashi H: Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics. 1994, 136 (3): 927-935.PubMed CentralPubMedGoogle Scholar
- Akashi H: Inferring Weak Selection From Patterns of Polymorphism and Divergence at"Silent"Sites in Drosophila DNA. Genetics. 1995, 139 (2): 1067-1076.PubMed CentralPubMedGoogle Scholar
- Carlini DB, Stephan W: In Vivo Introduction of Unpreferred Synonymous Codons Into the Drosophila Adh Gene Results in Reduced Levels of ADH Protein. Genetics. 2003, 163 (1): 239-243.PubMed CentralPubMedGoogle Scholar
- Christina F, Bruce R, Michael W: Identification and analysis of evolutionary selection pressures acting at the molecular level in five forkhead subfamilies. BMC Evol Biol. 8:
- Wang M, Wang Q, Zhao H, Zhang X, Pan Y: Evolutionary selection pressure of forkhead domain and functional divergence. Gene. 2009, 432 (1-2): 19-25. 10.1016/j.gene.2008.11.018.View ArticlePubMedGoogle Scholar
- Heide van der LP, Jacobs FMJ, Burbach JPH, Hoekman MFM, Smidt MP: FoxO6 transcriptional activity is regulated by Thr26 and Ser184, independent of nucleo-cytoplasmic shuttling. Biochem J. 2005, 391 (Pt 3): 623-629.PubMed CentralPubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.