Skip to main content

Avian phenotypic convergence is subject to low genetic constraints based on genomic evidence

Abstract

Background

Phenotypic convergence between distinct species provides an opportunity to examine the predictability of genetic evolution. Unrelated species sharing genetic underpinnings for phenotypic convergence suggests strong genetic constraints, and thus high predictability of evolution. However, there is no clear big picture of the genomic constraints on convergent evolution. Genome-based phylogenies have confirmed many cases of phenotypic convergence in birds, making them a good system for examining genetic constraints in phenotypic convergence. In this study, we used hierarchical genomic approaches to estimate genetic constraints in three convergent avian traits: nocturnality, raptorial behavior and foot-propelled diving.

Results

Phylogeny-based hypothesis tests and positive selection tests were applied to compare 16 avian genomes, representing 14 orders, and identify genes with strong convergence signals. We found 43 adaptively convergent genes (ACGs) associated with the three phenotypic convergence cases and assessed genetic constraints in all three cases, from (amino acid) site mutations to genetic pathways. We found that the avian orders shared few site mutations in the ACGs that contributed to the convergent phenotypes, and that these ACGs were not enriched in any genetic pathways. In addition, different pairs of orders with convergent foot-propelled diving or raptorial behaviors shared few ACGs. We also found that closely related orders that shared foot-propelled diving behavior did not share more ACGs than did distinct orders, suggesting that convergence among these orders could not be explained by their initial genomic backgrounds.

Conclusions

Our analyses of three avian convergence events suggest low constraints for phenotypic convergence across multiple genetic levels, implying that genetic evolution is unpredictable at the phylogenetic level of avian order. Ours is one of first studies to apply hierarchical genomic examination to multiple avian convergent cases to assess the genetic constraints in life history trait evolution.

Background

Stephen Jay Gould argued that, if the tape of life could be replayed, the outcome would be different every time [1]. This suggests that evolution is unrepeatable, but cases of phenotypic convergence—distinct lineages independently reaching a similar phenotype [2]—have long fascinated biologists because they seem to contradict Gould’s hypothesis. While the genetic mechanism underlying phenotypic convergence is largely unclear, it is critical to understanding how evolution operates. For example, investigating how species converge genetically may help us evaluate whether evolution is subject to strong constraints and is thus predictable [3,4,5], i.e., whether evolution is governed solely by probabilities or confined to a limited number of genetic variables rendering predictable phenotypic outcomes [6, 7].

The debate over the molecular underpinnings of phenotypic convergence has lasted for decades [4]. Stayton [8] used a simulation to show that simple traits can converge by chance. However, actual cases of convergence often involve traits that are coded by complex genetic pathways, making it unclear whether most examples of convergence occur by chance. On the other hand, one genetic mutation may impact multiple traits that have antagonistic effects on the organism’s fitness (i.e., pleiotropic effect), and thus the number of effective mutations could be restrained [9]. Therefore, authors argue that different species may find the same genetic solutions in response to similar environmental pressures, causing adaptive convergence [10, 11]. If this is the case, then another question is whether genetic convergence tends to occur at the level of (nucleotide or amino acid) site mutations, individual genes or genetic pathways.

The evolution of convergence in complex traits cannot be fully understood using conventional candidate gene approaches, but instead requires genome-wide analyses [12]. Liu et al. [13] found that bats and toothed whales share 14 derived amino acids in one motor protein, Prestin, which is important for sensing ultrasound and is assumed to be critical for the organisms’ convergent echolocation trait. However, Parker et al. [12] used genome sequence data to show that signatures of convergence between bats and toothed whales are detected in up to 200 genes, many of which are linked to hearing, deafness or vision. Therefore, genome-wide analyses provide a new avenue to examine the molecular mechanisms of phenotypic convergence [14,15,16] and identify novel genes associated with complex traits, even for non-model species.

The whole-genome sequences of many taxa, including almost every avian order, were recently published [17, 18], making it possible to extensively test for genome-wide signatures of convergent evolution. Avian phylogenies based on the above data [17] or genome-wide data from most avian families [19] confirm that there are many cases of phenotypic convergence in birds. Thus, birds provide a good system to examine genomic constraints in phenotypic convergence. In this study, we explore the genomic bases of avian convergence on three traits—nocturnality, raptorial behavior and foot-propelled diving. These three convergent traits are diverse and occur in various parts of the avian phylogeny, and we chose them to generate a broad view of the genetic mechanisms underlying phenotypic convergence among avian orders.

Avian lineages with each of the three convergent traits are widely distributed across all continents except the Antarctic, where only a few species occur [20]. Thus, the convergence of these traits occurs in lineages that are generally sympatric. Early molecular studies based on DNA–DNA hybridization suggested that owls were sister to nightjars and their allies, all of which are active at night [21, 22]. More recent studies have shown that the nocturnality is convergent in owls (234 owl species constituting the order Strigiformes [20]) and nightjars (98 nightjar species and 34 species of their nocturnal allies constituting five families in the order Caprimulgiformes [20]) because they belong to distinct lineages and thus evolved the trait independently [23,24,25]. Owls are more closely related to Coraciimorphae birds such as woodpeckers than to nightjars; nightjars are more closely related to hummingbirds, both of which are Caprimulgimorphae birds, than to owls [17]. Thus, owls and nightjars represent a typical case of convergent evolution [23]. Until around two decades ago, diurnal raptors—birds of prey that generally have hooked beaks and taloned feet and are active in the daytime—were considered to be monophyletic [26,27,28]; however, recent research suggests that diurnal raptors should be divided into two non-sister groups: falcons (66 falcon species constituting the order Falconiformes [20]) and other diurnal raptors (252 eagle, hawk or kite species constituting the order Accipitriformes and seven New World vulture species constituting its sister order, Cathartiformes [20]) [25]. Falcons are more closely related to parrots and passerines, all of which belong to Australaves, than to other diurnal raptors, which belong to Afroaves [17]. Thus, the evidence suggests that raptorial traits shared between falcons and other diurnal raptors have evolved independently.

Grebes (22 species constituting the order Podicipediformes [20]), loons (5 species constituting the order Gaviiformes [20]) and cormorants (40 cormorant species and their foot-propelled diving allies, including four anhinga species, constituting two families in the order Suliformes [20]), also phylogenetically distinct, independently acquired similar foot-propelled diving traits [17, 23]. Loons and grebes were once treated as each other’s closest relatives [29], partly because they both use feet instead of wings to propel through the water and have rearward positioned legs; however, three decades ago, genetic evidence revealed that the two groups are distinct [21]. Cormorants are another avian lineage characterized by foot-propelled diving behavior, but they are also not sister to grebes or loons [19].

Interestingly, loons are phylogenetically closer to cormorants [17], but more similar to grebes in morphology and behavior [30] because the two groups faced similar selective forces. In general, phylogenetically more closely related species share more similar genomic background and ancestral alleles [31], and are thus predicted to achieve convergent traits via more similar genetic bases [32]. Therefore, the foot-propelled diving birds provide a good system to test whether selective forces can surpass phylogenetic closeness and cause adaptive genes to reoccur. Note that, because no study has performed ancestral state reconstruction for nocturnality, raptorial behavior or foot-propelled diving, we cannot rule out the possibility that these traits were present in the common ancestors of the studied lineages and subsequently lost in other lineages. However, we assume that the focal traits in the studied lineages converged because it is the most parsimonious explanation.

In this study, we conducted hierarchical genomic analyses, including phylogeny-based hypothesis tests, positive selection tests and enrichment tests, on the three convergence cases—nocturnality, raptorial behavior and foot-propelled diving—across avian orders (Fig. 1). We aimed to answer two specific questions: (1) Do the three convergent traits show high or low genetic constraints at the different hierarchical levels from site mutations to genetic pathways? (2) Is it phylogenetically or ecologically closer pairs that share more genes associated with the foot-propelled diving convergence? The results help us evaluate the predictability of evolution, especially evolution of the molecular underpinnings of complex traits. In addition, we demonstrate that analyses framed by convergent genome comparison can also help identify candidate genes for phenotypic traits in non-model species.

Fig. 1
figure1

(All bird illustrations were reproduced with permission from Lynx Edicions)

Hypotheses on genomic sequence convergence. The species phylogeny of the focal species is based on Jarvis et al. [17] and was used as the null hypothesis. The three alternative convergence hypotheses, Hnoc, Hfoot and Hrap, forced nocturnal, foot-propelled diving and raptorial birds to form monophyletic clades, respectively

Results

Identification of adaptively convergent genes

We obtained the genome sequences of 16 study species (Table 1), representing 14 avian orders, from an avian genomic project database [17, 33, 34] and chose 4278 orthologous genes containing at least partial sequences for all study species for downstream analyses. We developed a two-step approach to identify “adaptively convergent genes (ACGs)” that might contribute to the convergent traits among the focal taxa. The approach was modified from Parker et al.’s [12] approach, which was criticized by two studies [35, 36]; we used these critiques to improve the approach by establishing reasonable null models for assessing the statistical significance of sequence convergence signals and adding a second step to detect positive selection signals in genes with convergent sequences to enable the identification of ACGs (see “Methods” for details).

Table 1 The 16 avian genomes used in this study

We first estimated protein sequence convergent levels between the avian taxa with phenotypic convergence by comparing the likelihoods that the sequences fit to a species tree (used as the null hypothesis, H0) and three hypothetical trees that forced the convergent taxa into a monophyletic clade (an step modified from Parker et al. [12]; Fig. 1): (1) Hnoc, nocturnal convergence between the barn owl (Tyto alba) and chuck-will’s-widow (Antrostomus carolinensis); (2) Hfoot, foot-propelled diving convergence among the red-throated loon (Gavia stellate), great cormorant (Phalacrocorax carbo) and great crested grebe (Podiceps cristatus); (3) Hrap, raptorial convergence between the peregrine falcon (Falco peregrinus) and three other raptors—the turkey vulture (Cathartes aura), white-tailed eagle (Haliaeetus albicilla) and bald eagle (Haliaeetus leucocephalus). Among the 4278 genes, 352 were found to contain convergent sequences consistent with Hnoc, 206 with Hfoot and 117 with Hrap (i.e., genes with significantly higher likelihoods fitting to the convergent hypotheses than to H0; see “Methods” for details; Fig. 2).

Fig. 2
figure2

ΔGSLS (gene-specific log-likelihood support) values for a Hnoc, b Hfoot and c Hrap and positive selection signals from 4278 coding genes. The smaller negative ΔGSLS values indicate higher support for Hfoot, Hnoc or Hrap and thus higher levels of sequence convergence. Genes showing significantly smaller ΔGSLS values are in blue, and genes showing both signally smaller ΔGSLS values and significant positive selection signals (i.e., adaptively convergent genes) are in red

We then applied the positive selection test to the above sequence-convergent genes to identify ones contributing to adaptive evolution and thus likely driving convergent phenotypical changes (the second step of the approach). For the sequence-convergent genes, positive selection signals were then detected in 24 genes for nocturnal birds, 13 for foot-propelled diving birds and 6 for diurnal raptors (see “Methods” for details;); we considered these ACGs contributing to the focal phenotypic convergence because they had convergent sequences and were under positive selection in the convergent taxa (Fig. 2 and Additional file 1: Table S1).

Evolutionary constraints from site mutations to genetic pathways

To gauge genetic constraints across different genetic hierarchical levels, we first focused on constraints below the gene level by estimating the numbers of positively-selected (amino acid) site mutations shared between convergent taxa (but not other study taxa; see “Methods” for details). We only found a few “adaptively convergent site mutations (ACSMs)” in a small part of the ACGs. We identified nine ACSMs in eight (out of 24) nocturnal ACGs, three ACSM in two (out of 13) foot-propelled diving ACGs and one ACSM in one (out of six) diurnal raptorial ACG. To understand whether the observed numbers of ACSMs surpassed the numbers of convergent site mutations that would be independently obtained by lineages under neutral evolution, we calculated locus-wise “neutral values” (see “Methods” for details) against which we compared the observed values. With neutral values summed up by hypotheses, we found that the numbers of observed ACSMs were higher than or close to the neutral expectations (= 3.0440 sites for the nocturnal ACGs, 0.0001 sites for the foot-propelled diving ACGs and 1.0878 sites for the diurnal raptorial ACGs, Additional file 1: Table S2). That is, the ACGs identified by our approach indeed had generally higher levels of sequence convergence than did the neutral expectation. However, given that ACSMs were only found in a few ACGs, divergent amino acid replacements in most ACGs might still cause phenotypes to converge, a finding consistent with previous studies [37]. The results suggest low constraint for phenotypic convergence at the site mutation level.

We then assessed constraints above the level of individual genes by examining whether the ACGs were highly enriched in particular genetic pathways or functional groups. We conducted DAVID [38] analyses and found that none of the ACGs of the three convergent characters showed significant signals of functional enrichment. Furthermore, we conducted STRING analyses [39] and found that these ACGs had no significantly enriched signal for any protein–protein interaction (PPI) or protein functionality. In addition, we conducted the enrichment tests on expanded datasets, including ACGs and sequence-convergent genes that showed a positive selection signal in only one of the lineages with the focal convergent traits (n = 40, 29 and 11 for nocturnal, foot-propelled diving and raptorial convergence, respectively; Additional file 1 Table S3). The DAVID analyses also returned no enrichment signal. The STRING analyses showed no PPI enrichment, but did return a functional enrichment signal in two genes (PLCG1 and PLCXD2; false discovery rate = 0.0029) for the raptorial convergence. However, the enriched term (PLC-like phosphodiesterase, TIM beta/alpha-barrel domain superfamily) is only related to the protein structure, and the two genes share no biological function. Take together, these results indicate low constraint above the gene level for convergent evolution.

To infer evolutionary constraints at the individual gene level, we formulated two hypothetical trees derived from Hfoot and Hrap to force different convergent pairs to form clades (i.e., Hfoot-a, Hfoot-b, Hrap-a and Hrap-b; Fig. 3; see “Methods” for details). If the ACGs supporting the two derived hypotheses largely overlapped in each case of phenotypic convergence, then the results would suggest strong genetic constraint. For the foot-propelled diving convergence, we found 12 ACGs supporting Hfoot-a and 12 supporting Hfoot-b (Fig. 4 and Additional file 1: Table S4). However, only three of these ACGs were found in both of the above lists (Fig. 4 and Additional file 1: Tables S1 and S4). To obtain a neutral expectation for the overlap ratio, we calculated the number of genes with neutrally convergent sequences (i.e., sequence-convergent genes that were not ACGs) to be 82 for Hfoot-a and 87 for Hfoot-b, and found that 42 of them overlapped. We found that the proportion of overlapping ACGs between Hfoot-a and Hfoot-b was not significantly higher than that of the neutrally sequence-convergent genes (p = 0.1299, two-tailed Fisher’s exact test). For the raptorial convergence, we found 16 ACGs supporting Hrap-a and 12 supporting Hrap-b; only one of these was found in both of the above lists (Additional file 1: Tables S1 and S4). The number of neutrally sequence-convergent genes was 130 sequences for Hrap-a and 162 for Hrap-b, and 20 of these overlapped. The proportion of overlapping ACGs between Hrap-a and Hrap-b was also not significantly higher than that of the neutrally sequence-convergent genes (p = 0.7055, two-tailed Fisher’s exact test). Thus, adaptive selection regarding foot-propelled diving and raptorial behavior did not tend to use the same genes to reach phenotypic convergence between different convergent pairs, suggesting low constraint at the gene level.

Fig. 3
figure3

(All bird illustrations were reproduced with permission from Lynx Edicions)

Derived hypotheses for genomic sequence convergence. Three convergence hypotheses (Hfoot-a, Hfoot-b and Hfoot-c) were derived from the foot-propelled diving convergence hypothesis (Hfoot in Fig. 1) and two (Hrap-a and Hrap-b) from the raptorial convergence hypothesis (Hrap in Fig. 1)

Fig. 4
figure4

Numbers of adaptively convergent genes that support Hfoot-a, Hfoot-b and Hfoot-c. The Venn diagram was generated using the R package, VennDiagram [78]

Relationships between phylogenetic distance and molecular convergence

A trend in which more ACGs are shared between the phylogenetically closer taxa than the ecologically closer taxa would suggest that genetic background plays a dominant role in convergent evolution; from there, evolutionary processes and outcomes could be predicted. To determine whether there was a trend in the present study, we formulated a third hypothesis derived from Hfoot (i.e., Hfoot-c, in which loons and cormorants were forced into a clade; Fig. 3), and found that the number of ACGs shared between loons and cormorants (n = 10) was similar to those between loons and grebes (i.e., Hfoot-a, n = 12; p = 0.67, chi-square test; Fig. 4 and Additional file 1: Table S4). That is, there were not more ACGs associated with convergence between loons and cormorants, which were phylogenetically closer, than such ACGs between loons and grebes, which were more phenotypically similar. Thus, the shared genetic mechanism underlying foot-propelled diving convergence could not be predicted by phylogenetic closeness. In addition, only one ACG overlapped between the two pairs (Fig. 4) and the overlap level was not higher than the neutral expectation (p = 1, two-tailed Fisher’s exact test).

Discussion

Even though Gould’s [1] theory about replaying the tape of evolutionary history cannot be physically tested, the convergence of lineages that face similar selective forces presents a reasonable analogy [40]. Additionally, in studying one single lineage, it is difficult to remove linage-specific historical signals from genotype–phenotype associations; however this can be overcome by comparing independent lineages with convergent traits [41]. In this study, we applied comparative genomics approaches to three cases of phenotypic convergence in birds to decipher the genetic codes for complex behavioral adaptations. Our results suggest that evolution is subject to low genetic constraint, implying that evolution is unpredictable—at least at the level of avian order.

Low genetic constraints for avian convergence, from site mutations to genetic pathways

The numbers of convergent genes identified in our study are lower than those of other genome-level studies. This study shows that the convergence of nocturnality, foot-propelled diving and raptorial behavior can be attributed to at least 24, 13 and 6 ACGs, respectively. In contrast, Parker et al. [12] found nearly 200 genes associated with echolocation convergence between bats and dolphins; Chikina et al. [41] identified hundreds of genes associated with convergence in marine mammals. However, it should be noted that our analytical approaches, which apply restrictive statistical criteria to determined ACGs (see “Methods” for details), are not identical to theirs. Furthermore, the targeted phenotypic traits in this and previous studies have different levels of complexity, making a direct comparison unproductive. Nevertheless, the three convergence cases examined in our study suggest that the numbers of ACGs underlying complex life history traits can vary. The relatively low numbers of ACGs found in our study may also imply that these taxa do not extensively share the genetic bases of their convergent phenotypes, a conclusion further supported by our other analyses, discussed below.

Most ACGs identified in this study did not have any ACSMs, and those that did only had one or two. These convergent genes were also barely enriched in any genetic pathways or biological function and showed no significant protein interactions among one another. Given that complex behavioral traits are likely determined by multiple genes associated with particular functional groups [12, 41], the above results imply that the taxa we analyzed might use different genes in their genetic pathways or functional groups to generate similar phenotypic changes. Furthermore, our tests of derived convergence hypotheses (i.e., Hfoot-a, Hfoot-b, Hrap-a and Hrap-b) showed that overlaps in ACGs among pairs of avian orders with the same convergent phenotypes were not different from the neutral expectation, indicating low constraint at the gene level. Therefore, these cases of convergent evolution are subject to low genetic constraints, from site mutations [42], to genes and genetic pathways [43]. The results indicate that the evolutionary history of these convergent traits is unlikely to be predictable from a genomic perspective; this is consistent with the contingent view of evolution proposed by Gould [1] and some recent studies ([42, 44, 45]; but see [9, 14, 40, 46]).

Selection force surpasses phylogenetic closeness in causing adaptive genes to reoccur in foot-propelled diving convergence

Another way to assess the predictability of evolution is to examine whether more closely related taxa tend to use more similar genetic mechanisms to generate convergent phenotypes [41]—i.e., whether the probability of gene reuse in convergent phenotypic evolution is determined by the initial (genomic) states of focal organisms. Even though phenotypic convergence is often treated as the results of similar selection pressures imposed by shared environmental challenges (i.e., external factors), it can also be attributed to limited genetic variations (i.e., internal factors; [31]). Phylogenetic relationships may impact the availability of shared genetic variations because more closely related taxa have a better chance of sharing genetic variations passed down from their common ancestors and thus use them to generate the same phenotypes [31]. It is also possible that the same amino acid substitutions can have the same function only under similar genomic backgrounds that tend to be found among closely related taxa [42, 47]. If phenotypic convergence is mainly caused by the limit in available genetic variations or similar genomic backgrounds, then we can expect to find more ACGs between closely related taxa than distinct taxa. In contrast, if natural selection is the dominant force driving convergent evolution, then ACGs between distinct taxa should be at least as common as those between closely related taxa [31].

In this study, we found that foot-propelled diving convergent lineages with closer phylogenetic relationship did not share more ACGs. Loons and grebes have high levels of morphological and behavioral similarities, but are distinct lineages [21, 23]; cormorants, another avian lineage characterized by foot-propelled diving behavior, are more closely related but less phenotypically similar to loons than are grebes [19]. Interestingly, our results showed that loons shared a similar number of ACGs with cormorants to they did with grebes, suggesting that adaptive convergence between the foot-propelled diving lineages is not channeled by their initial genomic backgrounds. In addition, our results also showed low ACG overlap between the two convergent hypotheses (Fig. 4), corroborating that the molecular underpinnings of the foot-propelled diving lifestyle are unrepeatable.

Several previous studies reported negative correlations between phylogenetic distance and convergent gene numbers between convergent taxa [40, 47], suggesting evolutionary processes and outcomes predictable. For example, more closely related birds were shown to have a more similar molecular basis for their hemoglobin [42]. However, some suggested that convergent evolution can only be predicted (i.e., subject to strong constraint) when focusing on a suitable phylogenetic scale, in which species share similar genomic backgrounds [48]. Our study provides evidence that natural selection can be as important as or even outweigh phylogenetic closeness, causing the same genes to reoccur in avian orders that diverged up to nearly 65 million years ago [19]. Our results suggest that shared history is not always the predominant forces driving convergent evolution.

Our findings have two opposing implications. On the one hand, evolution in birds is less predictable at phylogenetic levels equal to or higher than the level of order. This is perhaps because the different avian orders have very different genomic backgrounds, but different genotypes can yield almost identical phenotypes [42, 47]. On the other hand, we found 12 ACGs between grebes and loons, meaning that the phylogenetic distinctness between them still does not completely preclude the possibility that shared genes are responsible for their phenotypic convergence. Even if we attribute convergent evolution to natural selection, it cannot completely rule out the possible impact of constraint imposed by shared genomic backgrounds [49]. Thus, a dichotomous view of convergent evolution’s driver as either natural selection or phylogenetic closeness may not be justified [31]. Essentially, natural selection can only act on genetic variation that is available [3, 40].

Functional interpretation of ACGs and the genotype–phenotype relationship

Our approach to comparing genomes based on convergent taxa is a promising one for establishing genotype–phenotype relationships, a central issue in evolutionary biology. In the past, searching for candidate genes for phenotypic changes has relied on genes with known functions in model species [46] or genomic data from multiple populations with divergent traits [47]. Now, the increasing number of published genomes of species with convergent phenotypic traits has become an alternative resource for finding candidate genes for phenotypic changes. Here, we identify 43 candidate genes for future functional examination that are strongly associated with one of the three convergent evolution cases (Additional file 1: Table S1).

Owls and nightjars have unique, nocturnal lifestyles that might be explained by a nocturnal ACG, GRIN1, which is central in the circadian photic entrainment pathway [50]. Owls and nightjars have relatively large orbits for visual acuity at night [51]; this might be associated with INHBA, mutations of which result in enophthalmos, a condition in humans in which the orbits are enlarged [52]. Foot-propelled diving birds have morphologically and physiologically adapted to aquatic environments. For instance, auditory senses might be crucial for diving birds because dim light in deep water may obscure their vision for hunting; the great cormorant was found to have acute underwater hearing comparable to seals and toothed whales [53]. A foot-propelled diving ACG, KCNC1, which may impact auditory neuron functioning in mice, may allow these birds to acutely hear underwater [54]. Diurnal raptors are strong predators that often use their feet to grasp preys. Although the connection between their grasping behavior and muscular or neuronal adaptation remains unclear, one raptorial ACG, CREBRF, which is associated with the hind limb grasping behavior of mice [55], might contribute to this trait in raptors.

The ACGs that we found to be associated with circadian rhythms in nocturnal birds, auditory adaptation in foot-propelled diving birds and limb grasping in raptors may have medical applications. These avian ACGs are associated with relevant traits in mice or humans, suggesting that birds and mammals could use the same genes to generate convergent traits. Thus, the genotype–phenotype relationships revealed in birds may provide insights to genetic disorders in humans. A better understanding of these ACGs will grant us a better ability to detect people carrying genetic variations associated with circadian rhythm, auditory or muscle problems.

Conclusions

Our genome-wise analyses consistently identified low genetic constraints in three cases of phenotypic convergence—nocturnality, raptorial behavior and foot-propelled diving—across avian orders. The ACGs that contributed to the focal phenotypic convergence were examined from site mutation to gene and functional pathway levels. We observed (1) few ACSMs among focal taxa, (2) low levels of ACG overlap between taxon pairs with the same convergent phenotypes, and (3) no functional pathways enriched with ACGs; these observations indicate that the focal taxa use largely different molecular bases to achieve phenotypic convergence. The results suggest low genetic constraints across multiple genetic levels. We also found that the frequency of gene reuse in convergent phenotypic evolution is not determined by the initial genomic background of convergent taxa. Taken together, our findings suggest that convergent evolution among avian orders is unpredictable at the molecular level.

While some studies have addressed the genetic basis of convergent evolution using only a few genes associated with particular physiological function [11, 42, 56], our genome-wide assessment of complex traits provides a broad view to explicitly test evolutionary hypotheses such as genetic constraint in phenotypic traits. Our study also demonstrates that genome-wide comparisons among species with phenotypic convergence facilitate the identification of new genes underlying phenotypic traits for non-model species; this practice will become feasible in diverse taxa as genomic data is accumulating rapidly. Although the ACGs identified in this study require further functional tests, they may have medical applications or become new genetic models for functional trait evolution.

Methods

Data collection

The coding DNA sequences and protein sequences of 8295 orthologous genes from 48 avian species genomes were obtained from an avian genomic project dataset on the GigaScience Database (https://gigadb.org/dataset/101000; [17, 33, 34]). For the detailed methods used to identify and align the orthologous genes, please refer to the two original studies [17, 33]. In brief, the authors identified orthologous genes across the 48 avian species genomes based on sequence identity to the chicken, zebra finch and human genomes [33]. Firstly, Zhang et al. [33] used synteny in the chicken-zebra finch genome alignment from UCSC [57] to generate 12,484 orthologous genes between the two species. The protein sequences of these orthologous genes were blasted against the human genome. For each pair of chicken-zebra finch orthologous genes, they used the one with the better alignment rate as the reference gene for homology-based gene prediction in the 48 avian genomes; they then generated 13,000–18,000 predicted coding genes. Jarvis et al. [17] re-annotated the orthologous genes in the chick genome based on the zebra finch ones using Genewise [58] and vice versa. The re-annotated reference dataset was used to generate 8295 orthologous genes across the 48 avian genomes based on protein similarity, gene synteny and genome synteny. Jarvis et al. [17] then performed the alignment of orthologous genes using SATé [59] and Prank [60] for the first round and SATé and MAFET [61] for the second round. They used protein sequences to build the alignment and then translated them back to DNA sequences using a custom Perl script.

In this study, we extracted the DNA and protein sequences of 4278 genes from the above alignment dataset, which contained entire or partial sequences for all 16 study species—red-throated loon (Gavia stellate, abbreviated as GAVST), great cormorant (Phalacrocorax carbo, PHACA), great crested grebe (Podiceps cristatus, PODCR), chuck-will’s-widow (Antrostomus carolinensis; older name: Caprimulgus carolinensis, CAPCA), barn owl (Tyto alba, TYTAL), chicken (Gallus gallus, GALGA), ostrich (Struthio camelus, STRCA), zebra finch (Taeniopygia guttata, TAEGU), turkey vulture (Cathartes aura, CATAU), peregrine falcon (Falco peregrinus, FALPE), white-tailed eagle (Haliaeetus albicilla, HALAL), bald eagle (Haliaeetus leucocephalus, HALLE), Anna’s hummingbird (Calypte anna, CALAN), little egret (Egretta garzetta, EGRGA), American flamingo (Phoenicopterus ruber, PHORU) and downy woodpecker (Dryobates pubescens; older name: Picoides pubescens, PICPU; Table 1)—using a custom Perl script. Only 16 of the 48 species published by Jarvis et al. [17] were chosen for analyses because (1) these species were enough to test the proposed hypotheses (see below for details), (2) more species would lead to fewer genes that could be aligned due to missing data, and (3) we aimed to reduce biases introduced by species that show partial levels of the focal characteristics, such as semi-nocturnality [62] or diving adaptation. We removed gaps, ambiguous sites and in-frame stop codons from the coding sequences using PAL2NAL [63].

Two-step ACG identification approach (step one): testing genomic sequence convergence

We developed a two-step approach to identify genes that contribute to the phenotypic convergence of the study taxa. The first step was to find genes with sequences that were convergent among phenotypically convergent taxa (referred to as sequence-convergent genes). The second step was to identify sequence-convergent genes that showed positive selection signals in all of the focal convergent taxa; the genes identified were referred to as “adaptively convergent genes” (ACGs). The first step was modified from Parker et al.’s [12] site-specific log-likelihood support (SSLS) approach. Because the SSLS approach has been criticized by two studies [35, 36] for failing to establish reasonable null models, we modified this step to address the critiques (see below for details) and added the second step to increase the chance of identifying genes causing phenotypic convergence, analogous to Thomas and Hahn’s [35] approach.

For the first step, we estimated the level of gene-wise protein sequence convergence for avian orders with phenotypic convergence by testing three convergent hypotheses (Fig. 1)—(1) Hnoc, nocturnal convergence between one Caprimulgiformes species (CAPCA) and one Strigiformes species (TYTAL); (2) Hfoot, foot-propelled diving convergence among three water birds in Gaviiformes, Suliformes and Podicipediformes (i.e., GAVST, PHACA and PODCR, respectively); (3) Hrap, raptorial convergence between three species from Accipitriformes/Cathartiformes (HALAL, HALLE and CATAU) and one from Falconiformes (FALPE). A species tree of the 16 study species modified from Jarvis et al. [17] was used as the null hypothesis (H0). For each convergent hypothesis, we restricted the hypothetical tree topology by forcing each convergent group into a monophyly. When moving one or more of the focal species in the phylogenetic tree to generate a hypothetical convergent clade, we consistently made this clade far from, rather than close to, STRCA (i.e., the first lineage to branch out in the avian phylogeny). We used PAML 4.9 [64] to estimate gene-specific log-likelihood support (GSLS), which measured how well each gene’s protein sequences fit to the species tree (H0) and hypothetical trees (Hrap, Hfoot and Hnoc). The GSLS approach was modified from Parker et al.’s [12] SSLS approach to estimate a likelihood for each gene instead of each site. The differences between the GSLS values of H0 and alternative convergent hypotheses (ΔGSLS(Hn) = GSLS(H0) − GSLS(Hn)) were used to test whether the targeted genes showed sequence convergence among the focal convergent taxa. Lower negative ΔGSLS values indicated higher levels of support for sequence convergence.

Studies [35, 36] have criticized Parker et al.’s [12] SSLS approach for failing to establish reasonable null models for ΔSSLS values by taking into account the signals from non-convergent pairs. To address the critiques, we conducted simulations based on a branch-moving approach to generate null models for assessing the statistical significance of empirical ΔGSLS values. This approach aimed to control random chances of sequence convergence among the study taxa. We used Mesquite v3.5 [65] to randomly switch one and two branches in the species tree (H0) to generate simulated trees corresponding to the numbers of branch switching for Hnoc/Hrap and Hfoot, respectively. That is, we simulated all possible restricted monophylies for any two or three study taxa, both convergent and non-convergent ones, against the species tree with the same amount of phylogenetic distortion from H0 to Hnoc/Hrap or Hfoot. In total, there were 663 and 18,570 possible tree topologies generated in the one- and two-branch-moved simulations, respectively. Given the large number of possible topologies in the latter, we randomly sampled 1000 topologies from it for subsequent analyses (i.e., ΔGSLS calculation) to make computational time more manageable. The ΔGSLS values of the 663 one-branch-moved and 1000 two-branch-moved trees were estimated for each of the 4278 examined genes. That is, two sets of simulated ΔGSLS values (one for 663 one-branch-moved trees v.s. the species tree, and the other for 1000 two-branch-moved trees v.s. the species tree) for each gene were generated and used as the null distributions of ΔGSLS values.

For the one-branch-moved simulation (corresponding to Hnoc and Hrap), we calculated the 5th percentile of the simulated ΔGSLS distribution (i.e., the lower boundary of its one-sided 95% confidence interval, CI), and used it as a threshold value to determine whether the empirical ΔGSLS value was significantly low. The procedures were conducted separately for each of the 4278 examined genes. Only genes with an empirical ΔGSLS value smaller than their corresponding threshold values were treated as having significantly convergent sequences (i.e., sequence-convergent genes) for the focal convergence hypotheses.

For the two-branch-moved simulation (corresponding to Hfoot), given that we could not estimate ΔGSLS values for all possible simulated trees due to extremely long computing time, we used a bootstrapping method to estimate the 5th percentile of its complete ΔGSLS distribution. We bootstrapped the ΔGSLS values from the simulated datasets (i.e., randomly sampled them with replacement to their original sample size, 1000) to obtain one bootstrap resample of ΔGSLS values. We then calculated the 5th percentile of the bootstrap resample. We repeated the bootstrap procedure 1000 times and calculated the mean of the 1000 bootstrap 5th percentiles, which was then used it as a threshold value to determine the statistical significance of the empirical ΔGSLS value. The procedures were conducted separately for each of the 4278 examined genes. To confirm that this bootstrap approach could generate estimated 5th percentiles close to true values, we also applied the same procedure to the one-branch-moved simulated dataset, for which we knew the true 5th percentile of the complete ΔGSLS distribution for every examined gene. For the one-branch-moved dataset, we found that (1) true 5th percentiles were always within the 95% CIs of the bootstrap 5th percentiles for all genes, (2) the difference between the true and estimated 5th percentiles (i.e., the means of 1000 bootstrap 5th percentiles) only averaged 5% of the full ranges of the bootstrap values, and (3) genes that were concluded to be significantly sequence-convergent based on the estimated 5th percentiles were the same as those based the true 5th percentiles. The above results suggest that this bootstrap approach was reliable. We used the R package boot [66] to conduct the bootstrap analyses.

Two-step ACG identification approach (step two): testing positive selection signal

For the second step, we further examined whether the genes with significant sequence convergence between phenotypically convergent orders (i.e., with significantly low ΔGSLS values; sequence-convergent genes) were also under positive selection. Only genes with significant sequence convergence and positive selection signals in phenotypically convergent taxa were treated as contributing to the convergence of the focal traits and referred to as ACGs. The selection tests were based on the dN/dS ratios (ω; dN, non-synonymous substitutions; dS, synonymous substitutions) of DNA sequences, estimated by PAML. We conducted the branch-site models to compare the likelihood of the species tree with pre-specified branches under positive selection (i.e., foreground; the model A) with that of the same phylogeny with ω2 fixed to 1 (i.e., foreground as neutral; the corresponding null model). The foreground taxa were the focal taxa for each convergence hypothesis. We used the likelihood ratio test to examine whether the differences between the likelihood of these two models were large enough to conclude that the genes of the focal taxa have experienced positive selection. A significance level of 0.05 for the likelihood ratio corresponded to a threshold χ2 value of 3.84 [64].

To rule out the possibility that genes showing positive selection signals in the foreground taxa were also subject to selection in some of the background taxa (i.e., the non-focal taxa), we applied the site models to a species tree that excluded the focal taxa. We compared the likelihood of the M1a model (as a neutral model) against that of the M2a model (as a positive selection model) for genes with the significant results of the branch-site model A. A significance level of 0.05 for the likelihood ratio for the site models corresponded to a threshold χ2 value of 5.99 [64]. If genes showed significant results for both branch-site and site models, then they were excluded from subsequent analyses; one to five sequence-convergent genes significantly supported both models for each convergence hypothesis. Therefore, we focused on genes that showed positive selection signals in the focal taxa but not other taxa.

Ones might argue that our approach risks misidentifying AGCs. That is mainly because the results of the ΔGSLS test (i.e., sequence-convergent genes identified in step one) might also reflect phylogenetic discordance between the species tree and gene trees caused by evolutionary events other than adaptive convergence, including paralogs of genes, introgression or incomplete lineage sorting [67]. We argue that these confounding factors had limited impact on our final results because (1) the genes we examined were orthologs [17], (2) the second step of our approach could largely exclude sequence-convergent genes with no selection signal (i.e., non-adaptive genes) in the focal taxa, and (3) even if, in some cases, sequence-convergent genes caused by incomplete lineage sorting or introgression might not be excluded by the selection tests, their impact on our conclusion should be relatively minor (see below for details).

Here we used the species tree for the positive selection tests because selection may bias the estimation of gene trees, and thus the species tree better represents the evolutionary history of adaptive genes. For example, convergent taxa could be a monophyly in a reconstructed gene tree when selection independently retained convergent mutations in distinct lineages (i.e., more shared mutations do not essentially mean a closer phylogenetic relationship between two taxa in a gene’s history) [e.g., see Ref 13]. Such incorrectly reconstructed gene trees could lead positive selection tests to yield false-negative results due to an underestimated number of substitutions along the foreground branches. On the other hand, we acknowledged that use of the species tree, instead of gene trees, may potentially lead to false-positive selection test results if taxa with phenotypic convergence form a monophyly in the latter due to incomplete lineage sorting or introgression. This is because the former might assume more substitutions along foreground branches than the latter [68]. To assess the prevalence of this condition, we first reconstructed maximum likelihood gene trees for the 43ACGs identified in this study using IQ-TREE v.1.6.1 [69, 70] and found that the convergent taxa formed a monophyly in only around 1/6 of the trees (6 out 24 ACGs, 2 out of 13 ACGs and 1 out of 6 ACGs for nocturnal, foot-propelled diving and raptorial convergence, respectively) (Additional file 1: Fig. S1). Thus, false-positive results should be relatively rare in selection tests. Even if there were any false-positive results, our conclusion that these convergent phenotypes have low genetic constraint would remain unchanged because the ACGs could only decrease in number, suggesting an even lower level of constraint.

Secondly, if incomplete lineage sorting erroneously increased the identification of sequence-convergent genes and then caused false-positive results in the selection tests, we would expect the biased effect to be worse in closely related than distinct convergent taxa. That is because the coalescence time between closely related taxa is shorter than that between distinct taxa, leading to higher chances of incomplete lineage sorting. If so, the number of ACGs supporting Hfoot-c should be higher than that of Hfoot-a because the former forced a monophyly for a pair of more closely related taxa (loon-cormorant) than the latter (loon-grebe; see the section “Relationships between phylogenetic distance and molecular convergence in foot-propelled birds” for the details of the two hypotheses). However, the two hypotheses returned similar numbers of ACGs, suggesting weak impact of incomplete lineage sorting on identifying ACGs.

Thirdly, the sequences we used in this study were amino acid or coding regions, which might be less subject to incomplete lineage sorting than non-coding regions due to smaller effective population sizes [71]. Overall, we believe that the biased effect of incomplete lineage sorting or introgression on our results was relatively minor and did not affect our conclusion.

Evolutionary constraint below, above and at the gene level

We assessed the evolutionary constraints below the gene level based on the numbers of shared (amino acid) site mutations between convergent taxa. If the ACGs had a large number of convergent site mutations with positive selection signals, then the focal phenotypic convergence was likely caused by a high degree of convergence at those site mutations, suggesting strong evolutionary constraint below the level of individual genes. In contrast, a low number of convergent site mutations would suggest low evolutionary constraint below the gene level.

We used the Bayes empirical Bayes (BEB) approach implemented in PAML to calculate the posterior probabilities of site-wise positive selection signals for the ACGs identified in the last section. If the posterior probability that ω > 1 at one site was greater than 95% (i.e., significant positive selection signal), then we further examined whether its sequences converged in the focal taxa but not others using Jalview [72]. We referred to sites with significant BEB signals of positive selection and convergent sequences among the focal taxa as “adaptively convergent site mutations” (ACSMs). These ACSMs were assumed to contribute to the corresponding types of phenotypic convergence.

For each ACG, we further calculated the expected number of convergent site mutations (M) that could arise through neutral evolution and compared this to the number of identified ACSMs. We used this test to examine whether the ACGs identified by our approach had higher sequence convergent levels than the neutral expectation. For the nocturnal and the raptorial convergent hypotheses, such M values were each calculated as (D − P)/20 × L, where D, P and L were mean pairwise Poisson-correction distance, mean pairwise p distance and locus length, respectively; thus, M represented the number of multiple-substitution sites between two distinct lineages that each such site used the same amino acid residue among a total of 20 options. For the foot-propelled diving hypothesis, we calculated each M value as [(D − P)/20]2 × L (the number of multiple-substitution sites between three distinct lineages that each such site used the same amino acid residue). D and P were calculated in MEGA X [73] with pairwise deletions for gaps and a uniform rate among sites, and were computed only between non-convergent birds depending on the hypothesis considered, so as to avoid any non-neutral signals. Provided that, for each ACG, both observed (ACSM number) and expected (M) values were small and thus subject to stochastic errors, we summed up values over genes based on hypotheses to make comparisons (Additional file 1: Table S2).

We also assumed that more ACGs identified along genetic pathways meant that the focal taxa experienced stronger constraint above the gene level during their convergent evolution. Thus, we assessed the extent to which the focal taxa shared ACGs along genetic pathways or within functional groups to generate their convergent traits by examining the functional relationship among the ACGs. We did this by testing for functional enrichment in gene ontology (GO) terms, KEGG pathways, UP_TISSUE and INTERPRO databases for the ACGs using DAVID 6.8 (https://david.ncifcrf.gov/, [38]) We applied the Fisher’s exact test with the Benjamini–Hochberg correction [74]. We also used STRING 11.0 (https://string-db.org/, [39]) to assess over-representation in protein–protein interactions (PPI) and functional enrichment for the ACGs that imply functional partnerships among them. In addition, we also considered genes that contributed to adaptive convergence in only one of the focal lineages; the analyses might facilitate the identification of overlapped pathways and functions among different adaptive genes, although such involved analyses might somewhat overestimate the overlap level due to the possibility of including genes that did not cause the convergent traits. In practice, we applied the DAVID and STRING tests to datasets containing ACGs and sequence-convergent genes with positive selection signals in one (set as the foreground of branch-site models for the PAML analyses) of the convergent lineages. Both DAVID and STRING used all of the 4278 genes as a background list.

We further formulated two hypotheses derived from Hfoot and Hrap (Fig. 3) to examine the evolutionary constraint at the gene level. In one derived hypothesis (Hfoot-a) from Hfoot, we forced GAVST and PODCR together in a clade and excluded PHACA; in the other hypothesis (Hfoot-b), we forced PHACA and PODCR into a clade, without GAVST (Fig. 3). In one derived hypothesis (Hrap-a) from Hrap, we forced FALPE and CATAU into a clade, not including HALAL and HALLE; in the other hypothesis (Hrap-b), we forced FALPE, HALAL and HALLE into a clade, without CATAU (Fig. 3). We performed sequence convergence hypothesis tests for Hfoot-a, Hfoot-b, Hrap-a and Hrap-b against H0, respectively. We applied the Benjamini–Hochburg correction to the multiple derived hypotheses that focused on the same phenotypic convergence. First, we calculated the percentile rank (assuming Xth percentile) of the derived hypothesis’ ΔGSLS value in the one-branch-moved simulated values (for 663 possible trees) for each of the 4278 examined genes. We then corrected the X values with the Benjamini–Hochburg procedure using the R function “p.adjust” and concluded that genes have significantly convergent sequences (i.e., sequence-convergent genes) only if their adjusted X values < 5. Finally, we conducted positive selection tests for the sequence-convergent genes to identify ACGs, as what we did for the original hypotheses. We compared the ACGs supporting Hfoot-a with those supporting Hfoot-b, and the ACGs supporting Hrap-a with those supporting Hrap-b to measure the levels of shared ACGs between pairs of convergent lineages. We then compared the overlap proportion of ACGs with that of neutrally sequence-convergent genes (i.e., sequence-convergent genes that were not ACGs) between Hfoot-a and Hfoot-b or between Hrap-a and Hrap-b using the two-tailed Fisher’s exact test. If the ACGs supporting the two derived hypotheses had a significantly higher overlap level than did that of the neutrally sequence-convergent genes, then the results suggested that natural selection tended to use the same genes to reach to phenotypic convergence; such results indicate strong evolutionary constraint at the gene level.

Relationships between phylogenetic distance and molecular convergence in foot-propelled birds

We used the foot-propelled diving birds as a system to examine the relative effects of natural selection and phylogenetic closeness on genes that make up convergent phenotypes. We formulated a third hypothesis derived from Hfoot (Hfoot-c; the first two were Hfoot-a and Hfoot-b, shown above). In Hfoot-c, we forced GAVST and PHACA into a clade, without PODCR (Fig. 3). We performed the sequence convergence hypothesis tests with the Benjamini–Hochburg correction (see previous section) and then conducted positive selection tests to examine whether there were more ACGs supporting Hfoot-c than Hfoot-a (which forced GAVST and PODCR into a clade). By doing so, we could determine whether the level of genetic convergence was higher in the phylogenetically closer pair (GAVST and PHACA) or the ecologically closer pair (GAVST and PODCR). This examination might illuminate the predictability of evolution. If the initial background of convergent taxa could predict the convergent level of their genetic underpinnings, then we would expect the phylogenetically closer taxa to share more ACGs than the ecologically closer taxa.

Availability of data and materials

The analyzed genome sequences are publicly available and were downloaded from the GigaScience Database (https://gigadb.org/dataset/101000; [34]). The DNA and protein sequence alignments and in-house developed codes used in this study are available in the FigShare repository, https://doi.org/10.6084/m9.figshare.13175135.

Abbreviations

ACG:

Adaptively convergent gene

H0 :

Null hypothesis of phylogeny

Hnoc :

Hypothetical phylogeny for nocturnal convergence

Hfoot :

Hypothetical phylogeny for foot-propelled diving convergence

Hrap :

Hypothetical phylogeny for raptorial convergence

ACSM:

Adaptively convergent site mutation

PLCG1 :

Phospholipase C Gamma 1

PLCXD2 :

Phosphatidylinositol Specific Phospholipase C X Domain Containing 2

Hfoot-a :

Derived hypothetical phylogeny for foot-propelled diving convergence, type a

Hfoot-b :

Derived hypothetical phylogeny for foot-propelled diving convergence, type b

Hfoot-c :

Derived hypothetical phylogeny for foot-propelled diving convergence, type c

Hrap-a :

Derived hypothetical phylogeny for raptorial convergence, type a

Hrap-b :

Derived hypothetical phylogeny for raptorial convergence, type b

GRIN1 :

Glutamate Ionotropic Receptor NMDA Type Subunit 1

INHBA :

Inhibin Subunit Beta A

KCNC1 :

Potassium Voltage-Gated Channel Subfamily C Member 1

CREBRF :

CREB3 Regulatory Factor

UCSC:

University of California Santa Cruz Genome Brower database

GAVST:

Gavia stellate

PHACA:

Phalacrocorax carbo

PODCR:

Podiceps cristatus

CAPCA:

Caprimulgus carolinensis

TYTAL:

Tyto alba

GALGA:

Gallus gallus

STRCA:

Struthio camelus

TAEGU:

Taeniopygia guttata

CATAU:

Cathartes aura

FALPE:

Falco peregrinus

HALAL:

Haliaeetus albicilla

HALLE:

Haliaeetus leucocephalus

CALAN:

Calypte anna

EGRGA:

Egretta garzetta

PHORU:

Phoenicopterus ruber

PICPU:

Picoides pubescens

SSLS:

Site-specific log-likelihood support

GSLS:

Gene-specific log-likelihood support

BEB:

Bayes empirical Bayes

M :

Expected number of convergent site mutations

D :

Poisson-correction distance

P :

Mean pairwise p distance

L :

Locus length

References

  1. 1.

    Gould SJ. Wonderful life: the Burgess shale and the nature of history. New York: WW Norton & Company; 1990.

    Google Scholar 

  2. 2.

    Conway Morris S. Evolution and convergence: some wider consideration. In: The deep structure of biology: is convergence sufficiently ubiquitous to give a directional signal? West Conshohocken (PA): Templeton Foundation Press; 2009. p. 46–67.

  3. 3.

    Jacob F. Evolution and thinking. Science. 1977;196:1161–6.

    CAS  Article  PubMed  Google Scholar 

  4. 4.

    Losos JB. Convergence, adaptation, and constraint. Evolution. 2011;65:1827–40.

    Article  PubMed  Google Scholar 

  5. 5.

    Flohr RCE, Blom CJ, Rainey Paul B, Beaumont HJE. Founder niche constrains evolutionary adaptive radiation. Proc Natl Acad Sci. 2013;110:20663–8.

    CAS  Article  PubMed  Google Scholar 

  6. 6.

    Brakefield PM. Evo-devo and constraints on selection. Trends Ecol Evol. 2006;21:362–8.

    Article  PubMed  Google Scholar 

  7. 7.

    Jaekel M, Wake DB. Developmental processes underlying the evolution of a derived foot morphology in salamanders. Proc Natl Acad Sci. 2007;104:20437–42.

    CAS  Article  PubMed  Google Scholar 

  8. 8.

    Stayton CT. Is convergence surprising? An examination of the frequency of convergence in simulated datasets. J Theor Biol. 2008;252:1–14.

    Article  PubMed  Google Scholar 

  9. 9.

    Stern DL. The genetic causes of convergent evolution. Nat Rev Genet. 2013;14:751–64.

    CAS  Article  PubMed  Google Scholar 

  10. 10.

    Castoe TA, de Koning AJ, Kim H-M, Gu W, Noonan BP, Naylor G, et al. Evidence for an ancient adaptive episode of convergent molecular evolution. Proc Natl Acad Sci. 2009;106:8986–91.

    CAS  Article  PubMed  Google Scholar 

  11. 11.

    Chan YF, Marks ME, Jones FC, Villarreal G, Shapiro MD, Brady SD, et al. Adaptive evolution of pelvic reduction in sticklebacks by recurrent deletion of a Pitx1 enhancer. Science. 2010;327:302–5.

    CAS  Article  PubMed  Google Scholar 

  12. 12.

    Parker J, Tsagkogeorga G, Cotton JA, Liu Y, Provero P, Stupka E, et al. Genome-wide signatures of convergent evolution in echolocating mammals. Nature. 2013;502:228–31.

    CAS  Article  PubMed  Google Scholar 

  13. 13.

    Liu Y, Cotton JA, Shen B, Han X, Rossiter SJ, Zhang S. Convergent sequence evolution between echolocating bats and dolphins. Curr Biol. 2010;20:R53–4.

    CAS  Article  PubMed  Google Scholar 

  14. 14.

    Brown AP, McGowan KL, Schwarzkopf EJ, Greenway R, Rodriguez LA, Tobler M, et al. Local ancestry analysis reveals genomic convergence in extremophile fishes. Philos Trans R Soc B Biol Sci. 2019;374:20180240.

    CAS  Article  Google Scholar 

  15. 15.

    Sackton TB, Clark N. Convergent evolution in the genomics era: new insights and directions. Philos Trans R Soc B Biol Sci. 2019;374:20190102.

    Article  Google Scholar 

  16. 16.

    Elmer KR, Meyer A. Adaptation in the age of ecological genomics: insights from parallelism and convergence. Trends Ecol Evol. 2011;26:298–306.

    Article  PubMed  Google Scholar 

  17. 17.

    Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, et al. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science. 2014;346:1320–31.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  18. 18.

    Mukherjee S, Stamatis D, Bertsch J, Ovchinnikova G, Katta HY, Mojica A, et al. Genomes OnLine database (GOLD) vol 7: updates and new features. Nucleic Acids Res. 2019;47:D649–59.

    CAS  Article  PubMed  Google Scholar 

  19. 19.

    Prum RO, Berv JS, Dornburg A, Field DJ, Townsend JP, Lemmon EM, et al. A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing. Nature. 2015;526:569–73.

    CAS  Article  PubMed  Google Scholar 

  20. 20.

    Billerman SM, Keeney BK, Rodewald PG, Schulenberg TS. (Editors) Birds of the world. Ithaca: Cornell Laboratory of Ornithology. https://birdsoftheworld.org/bow/home. Accessed Aug 2020.

  21. 21.

    Sibley CG, Ahlquist JE. Phylogeny and classification of birds: a study in molecular evolution. London: Yale University Press; 1990.

    Google Scholar 

  22. 22.

    Bleiweiss R, Kirsch JA, Lapointe FJ. DNA–DNA hybridization-based phylogeny for" higher" nonpasserines: reevaluating a key portion of the avian family tree. Mol Phylogenet Evol. 1994;3:248–55.

    CAS  Article  PubMed  Google Scholar 

  23. 23.

    Fain MG, Houde P. Parallel radiations in the primary clades of birds. Evolution. 2004;58:2558–73.

    Article  PubMed  Google Scholar 

  24. 24.

    Fidler AE, Kuhn S, Gwinner E. Convergent evolution of strigiform and caprimulgiform dark-activity is supported by phylogenetic analysis using the arylalkylamine N-acetyltransferase (Aanat) gene. Mol Phylogenet Evol. 2004;33:908–21.

    CAS  Article  PubMed  Google Scholar 

  25. 25.

    Hackett SJ, Kimball RT, Reddy S, Bowie RC, Braun EL, Braun MJ, et al. A phylogenomic study of birds reveals their evolutionary history. Science. 2008;320:1763–8.

    CAS  Article  PubMed  Google Scholar 

  26. 26.

    Peters JL. Check-list of the Birds of the World, vol. 1. Cambridge: Harvard University Press; 1931.

    Google Scholar 

  27. 27.

    Brown L, Amadon D. Eagles, Hawks and Falcons of the World. Feltham: Country Life Books; 1968.

    Google Scholar 

  28. 28.

    Dickinson EC. The howard and moore complete checklist of the birds of the world. Princeton: Princeton University Press; 2003.

    Google Scholar 

  29. 29.

    Cracraft J. Phylogenetic relationships and monophyly of loons, grebes, and hesperornithiform birds, with comments on the early history of birds. Syst Biol. 1982;31:35–56.

    Article  Google Scholar 

  30. 30.

    Kaiser GW. The inner bird: anatomy and evolution. Vancouver: The University of British Columbia Press; 2007.

    Google Scholar 

  31. 31.

    Pearce T. Convergence and parallelism in evolution: a neo-gouldian account. Br J Philos Sci. 2012;63:429–48.

    Article  Google Scholar 

  32. 32.

    Shapiro MD, Summers BR, Balabhadra S, Aldenhoven JT, Miller AL, Cunningham CB, et al. The genetic architecture of skeletal convergence and sex determination in ninespine sticklebacks. Curr Biol. 2009;19:1140–5.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  33. 33.

    Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, et al. Comparative genomic data of the Avian Phylogenomics Project. Science. 2014;346:1311–20.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  34. 34.

    Zhang G, Li B, Li C, Gilbert MTP, Jarvis ED, Wang J. Comparative genomic data of the Avian Phylogenomics Project. GigaScience. 2014;3:1–8.

    Article  Google Scholar 

  35. 35.

    Thomas GW, Hahn MW. Determining the null model for detecting adaptive convergence from genomic data: a case study using echolocating mammals. Mol Bio Evol. 2015;32:1232–6.

    CAS  Article  Google Scholar 

  36. 36.

    Zou Z, Zhang J. No genome-wide protein sequence convergence for echolocation. Mol Biol Evol. 2015;32:1237–41.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  37. 37.

    Rosenblum EB, Römpler H, Schöneberg T, Hoekstra HE. Molecular and functional basis of phenotypic convergence in white lizards at White Sands. Proc Natl Acad Sci. 2010;107:2113–7.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  38. 38.

    Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4:44–57.

    CAS  Article  Google Scholar 

  39. 39.

    Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v11: protein–protein interaction networks with increasd coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2018;47:D607–13.

    Article  CAS  Google Scholar 

  40. 40.

    Blount ZD, Lenski RE, Losos JB. Contingency and determinism in evolution: replaying life’s tape. Science. 2018;362:eaam5979.

    Article  CAS  PubMed  Google Scholar 

  41. 41.

    Chikina M, Robinson JD, Clark NL. Hundreds of genes experienced convergent shifts in selective pressure in marine mammals. Mol Biol Evol. 2016;33:2182–92.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  42. 42.

    Natarajan C, Hoffmann FG, Weber RE, Fago A, Witt CC, Storz JF. Predictable convergence in hemoglobin function has unpredictable molecular underpinnings. Science. 2016;354:336–9.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  43. 43.

    Cassin-Sackett L, Callicrate TE, Fleischer RC. Parallel evolution of gene classes, but not genes: evidence from Hawai’ian honeycreeper populations exposed to avian malaria. Mol Ecol. 2019;28:568–83.

    Article  PubMed  Google Scholar 

  44. 44.

    Christin P-A, Arakaki M, Osborne CP, Edwards EJ. Genetic enablers underlying the clustered evolutionary origins of C4 photosynthesis in angiosperms. Mol Biol Evol. 2015;32:846–58.

    CAS  Article  PubMed  Google Scholar 

  45. 45.

    Foote AD, Liu Y, Thomas GWC, Vinař T, Alföldi J, Deng J, et al. Convergent evolution of the genomes of marine mammals. Nat Genet. 2015;47:272–5.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  46. 46.

    Petschenka G, Wagschal V, von Tschirnhaus M, Donath A, Dobler S. Convergently evolved toxic secondary metabolites in plants drive the parallel molecular evolution of insect resistance. Am Nat. 2017;190:S29-43.

    Article  PubMed  Google Scholar 

  47. 47.

    Conte GL, Arnegard ME, Peichel CL, Schluter D. The probability of genetic parallelism and convergence in natural populations. Proc R Soc B Biol Sci. 2012;279:5039–47.

    Article  Google Scholar 

  48. 48.

    Bridgham JT. Predicting the basis of convergent evolution. Science. 2016;354:289–289.

    CAS  Article  PubMed  Google Scholar 

  49. 49.

    Agrawal AA. Toward a predictive framework for convergent evolution: integrating natural history, genetic mechanisms, and consequences for the diversity of life. Am Nat. 2017;190:S1-12.

    Article  PubMed  Google Scholar 

  50. 50.

    Ebling FJP. The role of glutamate in the photic regulation of the suprachiasmatic nucleus. Prog Neurobiol. 1996;50:109–32.

    CAS  Article  PubMed  Google Scholar 

  51. 51.

    Menegaz RA, Christopher KE. Septa and processes: convergent evolution of the orbit in haplorhine primates and strigiform birds. J Hum Evol. 2009;57:672–87.

    Article  PubMed  Google Scholar 

  52. 52.

    Brown CW, Houston-Hawkins DE, Woodruff TK, Matzuk MM. Insertion of Inhbb into the Inhba locus rescues the Inhba-null phenotype and reveals new activin functions. Nat Genet. 2000;25:453.

    CAS  Article  PubMed  Google Scholar 

  53. 53.

    Hansen KA, Maxwell A, Siebert U, Larsen ON, Wahlberg M. Great cormorants (Phalacrocorax carbo) can detect auditory cues while diving. Sci Nat. 2017;104:45.

    Article  CAS  Google Scholar 

  54. 54.

    Macica CM, Von Hehn CA, Wang L-Y, Ho C-S, Yokoyama S, Joho RH, et al. Modulation of the kv3. 1b potassium channel isoform adjusts the fidelity of the firing pattern of auditory neurons. J Neurosci. 2003;23:1133–41.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  55. 55.

    Martyn AC, Choleris E, Gillis DJ, Armstrong JN, Amor TR, McCluggage ARR, et al. Luman/CREB3 recruitment factor regulates glucocorticoid receptor activity and is essential for prolactin-mediated maternal instinct. Mol Cell Biol. 2012;32:5140–50.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  56. 56.

    Härer A, Meyer A, Torres-Dowdall J. Convergent phenotypic evolution of the visual system via different molecular routes: how neotropical cichlid fishes adapt to novel light environments. Evol Lett. 2018;2:341–54.

    Article  PubMed  PubMed Central  Google Scholar 

  57. 57.

    Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, Cline MS, et al. The UCSC genome browser database: update 2011. Nucleic Acids Res. 2011;39:D876–82.

    CAS  Article  PubMed  Google Scholar 

  58. 58.

    Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res. 2004;14:988–95.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  59. 59.

    Liu K, Raghavan S, Nelesen S, Linder CR, Warnow T. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 2009;324:1561–4.

    CAS  Article  PubMed  Google Scholar 

  60. 60.

    Löytynoja A, Goldman N. An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci USA. 2005;102:10557–62.

    Article  CAS  PubMed  Google Scholar 

  61. 61.

    Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–8.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  62. 62.

    Le Duc D, Schöneberg T. Adaptation to nocturnality—learning from avian genomes. BioEssays. 2016;38:694–703.

    Article  PubMed  Google Scholar 

  63. 63.

    Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34(Web Server):W609–12.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  64. 64.

    Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.

    CAS  Article  PubMed  Google Scholar 

  65. 65.

    Maddison WP, Maddison DR. Mesquite v3.51. 2018.

  66. 66.

    Angelo C, Ripley B. boot: Bootstrap R (S-Plus) functions version 1.3–23. 2019.

  67. 67.

    Degnan JH, Rosenberg NA. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol Evol. 2009;24:332–40.

    Article  PubMed  Google Scholar 

  68. 68.

    Mendes FK, Hahn Y, Hahn MW. Gene tree discordance can generate patterns of diminishing convergence over time. Mol Biol Evol. 2016;33:3299–307.

    CAS  Article  PubMed  Google Scholar 

  69. 69.

    Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. Mol Biol Evol. 2015;32:268–74.

    CAS  Article  PubMed  Google Scholar 

  70. 70.

    Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14:587–9.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  71. 71.

    Hobolth A, Dutheil JY, Hawks J, Schierup MH, Mailund T. Incomplete lineage sorting patterns among human, chimpanzee, and orangutan suggest recent orangutan speciation and widespread selection. Genome Res. 2011;21:349–56.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  72. 72.

    Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ. Jalview Version 2—a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25:1189–91.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  73. 73.

    Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35:1547–9.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  74. 74.

    Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multipletesting. J R Stat Soc Ser B Methodol. 1995;57:289–300.

    Google Scholar 

  75. 75.

    Zhan X, Pan S, Wang J, Dixon A, He J, Muller MG, et al. Peregrine and saker falcon genome sequences provide insights into evolution of a predatory lifestyle. Nat Genet. 2013;45:563–6.

    CAS  Article  PubMed  Google Scholar 

  76. 76.

    Warren WC, Clayton DF, Ellegren H, Arnold AP, Hillier LW, Künstner A, et al. The genome of a songbird. Nature. 2010;464:757–62.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  77. 77.

    International Chicken Polymorphism Map Consortium. A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms. Nature. 2004;432:717–22.

    Article  CAS  PubMed Central  Google Scholar 

  78. 78.

    Chen H, Boutros PC. VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R. BMC Bioinform. 2011;12:35.

    Article  Google Scholar 

Download references

Acknowledgements

We are grateful to Joe Parker for his advice on our data analyses. We thank John O’Brien for discussing the research design with us and Noah Last of the Third Draft Editing for his English language editing.

Funding

The study is supported by the internal fund of Academia Sinica. The funding body played no role in the design of the study, data collection and analysis, interpretation of data, or writing of the manuscript.

Author information

Affiliations

Authors

Contributions

CMH conceived the research idea. YCC, HCK, WSL and CMH analyzed the data. YCC, HCK and CMH wrote the manuscript. WSL commented on the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chih-Ming Hung.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Table S1.

Adaptively convergent genes for nocturnal, foot-propelled diving or raptorial life history traits. Table S2. The total number of adaptively convergent site mutations (ACSMs) in adaptively convergent genes (ACGs) and the number of expected point mutations under neutral selection. Table S3. Sequence-convergent genes that show positive selection signals in parts or all of the foreground branches regarding nocturnal, foot-propelled diving or raptorial life history traits. Table S4. Adaptively convergent genes supporting for one of the derived convergence hypotheses (Hfoot-a, Hfoot-b, Hfoot-c, Hrap-a and Hrap-b, see Fig. 3). Figure S1. Maximum likelihood gene trees of 43 adaptively convergent genes, supporting one of the tree convergent hypotheses – nocturnality (Hnoc), foot-propelled diving (Hfoot) and raptorial behaviors (Hrap)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Chen, YC., Kuo, HC., Lo, WS. et al. Avian phenotypic convergence is subject to low genetic constraints based on genomic evidence. BMC Evol Biol 20, 147 (2020). https://doi.org/10.1186/s12862-020-01711-7

Download citation

Keywords

  • Convergent evolution
  • Genomic comparison
  • Genetic constraint
  • Nocturnal birds
  • Foot-propelled diving birds
  • Raptors