Structuring evolution: biochemical networks and metabolic diversification in birds
BMC Evolutionary Biology volume 16, Article number: 168 (2016)
Recurrence and predictability of evolution are thought to reflect the correspondence between genomic and phenotypic dimensions of organisms, and the connectivity in deterministic networks within these dimensions. Direct examination of the correspondence between opportunities for diversification embedded in such networks and realized diversity is illuminating, but is empirically challenging because both the deterministic networks and phenotypic diversity are modified in the course of evolution. Here we overcome this problem by directly comparing the structure of a “global” carotenoid network – comprising all known enzymatic reactions among naturally occurring carotenoids – with the patterns of evolutionary diversification in carotenoid-producing metabolic networks utilized by birds.
We found that phenotypic diversification in carotenoid networks across 250 species was closely associated with enzymatic connectivity of the underlying biochemical network – compounds with greater connectivity occurred the most frequently across species and were the hotspots of metabolic pathway diversification. In contrast, we found no evidence for diversification along the metabolic pathways, corroborating findings that the utilization of the global carotenoid network was not strongly influenced by history in avian evolution.
The finding that the diversification in species-specific carotenoid networks is qualitatively predictable from the connectivity of the underlying enzymatic network points to significant structural determinism in phenotypic evolution.
Only a small proportion of theoretically possible changes seems to be realized in phenotypic evolution and diversification, with some outcomes appearing recurrently whereas others are seemingly forbidden [1–5]. Such determinism and predictability of phenotypic outcomes is surprising considering the dimensionality of the genome, the proteome, and the developmental dynamics linking them, and points to the existence of constraints in phenotypic variation. Theoretical and empirical studies have suggested that such constraints may be a reflection of the connectivity of the network of interactions among elements such as genes, proteins, enzymes and metabolites (defined here as a deterministic network) caused by genomic or developmental epistasis [1, 6–11], internal integration during development [12–15], and physical stability or historical contingency of gene and protein associations [16–22]. Direct examination of the correspondence between opportunities for diversification embedded in such networks and realized phenotypic diversity is needed to illuminate the structural properties of networks that delineate phenotypic diversity.
Phenotypic diversification on a deterministic network is the result of the gain or loss of elements and interactions that convey different fitness [1, 3, 22]. Mechanistically, the evolutionary representation and variability of network elements tends to be associated with their topological positions [23–28]. In particular, two structural properties of networks – the number of reactions per element, which represents the connectivity of the network, and the number of reactions that separate elements in a network, which defines the length of pathways between elements in the network – provide distinct ways by which elements and interactions in the network are gained or lost and result in different patterns of phenotypic diversification (Fig. 1) [29–33].
Greater connectivity of an element – the number of direct interactions it has with other elements in a network – enables an evolving lineage to include different elements that both directly interact with the same element [34–36]. In this mode of network diversification (hereafter pathway diversification), the gain of different interactions associated with the same element represents the start of divergent pathways comprised of unique elements and interactions (Fig. 1a). For example, in metabolic networks, the use of different enzymatic reactions from the same substrate metabolite produces different products resulting in distinct metabolic pathways. Theory and empirical data suggest that metabolic and protein networks commonly evolve by the preferential attachment of new enzymatic reactions or protein interactions to the most connected elements in these networks [24, 34, 37, 38]. Correspondingly, the genes underlying proteins and enzymes with greater connectivity tend to be represented in a greater number of taxa, have longer evolutionary persistence and lower rates of evolutionary change than elements with fewer direct interactions in a network [23, 39, 40]. Thus, the divergence among species’ networks should be driven by the gain or loss of interactions among highly connected elements, whereas the connected elements themselves should be conserved across species. Differences in the number of interactions that start from these conserved elements should be reflected in differences in the overall network connectivity (number of interactions per element) across species’ networks, because a greater number of opportunities exist for species to express different interactions at densely connected compounds. 
If pathway diversification causes divergence among species’ networks, then we expect differences in the elements and interactions present across species networks to increase with the differences in the connectivity of their networks, such that interactions and elements associated with the most connected compounds in the network should vary the most across species.
The length of pathways – the number of interactions (e.g., enzymatic reactions) that connect elements in a network – enables an evolving lineage to express different elements and reactions along the same pathway. This mode of network diversification (hereafter pathway elongation) results from differences in the number of sequential interactions from the same starting element (Fig. 1b). Most genes, proteins, and metabolites are regulated by multistep interactions [35, 41] and thus in most cases, the activation or expression of an element is dependent on several prior interactions. Changes in interactions at the beginning of a pathway may prevent the expression of interactions located further downstream in the pathway and result in shorter pathways and the loss of elements. Alternatively, the addition of a new interaction to the end of a pathway can increase the length of the pathway and produce a novel product. Models of network growth and empirical results suggest that most of the change in networks occurs at their periphery, such that terminal elements are most likely to be gained or lost, whereas the central or upstream elements are the most conserved [39, 42–44]. Longer pathways between elements in a network therefore provide more opportunities for the use of different numbers of sequential reactions from the same starting element, such that some species networks only express the intermediate elements that lie along a pathway of interactions from one element to another and the final product is never expressed. If network diversification is driven by differences in the elongation of a sequence of interactions among species, then we expect species’ networks to have different pathway lengths from the same starting element. The difference in the length of the pathways among species’ networks should be reflected in the diversification among the elements and interactions present in each species.
In this case, the elements located at the beginning of pathways should be conserved across networks, and species’ networks should diverge more from each other at elements located closer to the ends of potential pathways.
Networks are often organized in discrete functional modules in which a group of metabolites, enzymes, genes, or proteins interact more often with each other than with other elements in the network [45, 46]. Functional modules play an important role in the evolvability of organisms [47–51]. Empirical studies have shown that genes in the same regulatory modules tend to be co-expressed [52–55], resulting in similar evolutionary rates of proteins in the same modules [56, 57]. Additionally, genes that underlie within-module enzymatic reactions have similar rates of evolutionary gain and loss (e.g., [58, 59]), such that multiple enzymatic reactions that comprise a pathway are gained or lost together. Therefore, another mode of network divergence among species could be the result of the gain or loss of complete functional modules (hereafter module diversification) (Fig. 1c). If this is the case, then species should differ in modules they express, and neither the connectivity of elements nor the length of a pathway between elements in a network should be related to the differences in species’ networks.
Here we examined the extent to which the structure of enzymatic reactions in the global carotenoid network – that comprises all of the documented enzymatic reactions among naturally occurring carotenoids (Additional file 1a) – is associated with patterns of avian diversification in carotenoid-producing metabolic networks. The connectivity and topology of enzymatic reactions of the global carotenoid network have evolved largely in the context of bacterial evolution (e.g., [60, 61]) and subsets of this global network are utilized in the carotenoid metabolism of all lineages studied to date, such as fungi, plants, insects and animals (e.g., [62, 63]). Here we studied the patterns of utilization of this network associated with the production of carotenoid pigmentation in the plumage and integument of 250 bird species. Specifically, we were interested in the effect of the structure of the global metabolic network on the frequency of occurrence of individual carotenoid compounds and reactions across species.
In birds, metabolism of carotenoids expressed in feathers and integument necessarily starts with the consumption of dietary carotenoids (e.g., [64, 65]). This property of avian carotenoid biosynthesis allows for the identification of the starting points of metabolic pathways in species’ networks and provides an opportunity to distinguish the effects of pathway diversification from the effects of pathway elongation and module diversification on network divergence across species. In birds, pathway diversification from the same highly connected compounds, pathway elongation starting at the same dietary compounds, or the consumption of different dietary compounds representing different functional modules in the network could produce evolutionary transitions across species’ networks. In the global carotenoid network, opportunities for pathway diversification and elongation vary across metabolic pathways that start at different dietary carotenoids (Figs. 2 and 3). Additionally, the consumption of different dietary compounds results in access to different enzymatic reactions and metabolites that could comprise different functional modules (Fig. 2). Here, we first mapped species’ carotenoid networks onto the global avian carotenoid metabolic network  and examined whether differences in enzyme connectivity or relative pathway position of individual carotenoid compounds were associated with their evolutionary representation among species. We then repeated these analyses for biochemical modules of interconnected elements and examined their evolutionary representation in relation to their structural properties. We examined the relative contribution of enzymatic connectivity, metabolic pathway lengths, and module representation on network divergence and identified the structural properties of both individual compounds and modules associated with diversification hotspots on the global carotenoid network. 
We discuss the extent to which the structure of the carotenoid metabolic network can be used to understand and predict patterns of realized phenotypic diversity.
Data collection and metabolic network construction
The global carotenoid biosynthesis network includes all of the enzymatic reactions that occur among naturally-occurring carotenoids in bacteria, plants, fungi and animals (Additional file 1a, ). This network delineates biochemical pathways of carotenoid biosynthesis based on the chemical properties of the compounds. We collected an exhaustive list of all the carotenoid compounds and reactions documented in birds (n = 339 species), using carotenoids that are found in plumage, integument (bill, tarsi, skin), plasma, liver, fat, feces, retina, and seminal fluid, or are known to be consumed in the diet (Additional file 1b; data current as of July 2015). The chromatography and mass spectrometry methods that are listed in Additional file 1b document the presence or absence of specific compounds against known standards. All of the distinct compounds identified in the species of birds were then used to construct the “avian subset” of the global carotenoid metabolic network, consisting of 66 carotenoids and 97 enzymatic reactions (Fig. 2). The global metabolic network was then used as a template to construct 250 species-specific carotenoid metabolic networks between known dietary carotenoid compounds (the upstream elements of carotenoid metabolic networks in birds), metabolized compounds (e.g., circulating in plasma or found in other organism tissues), and the expressed compounds identified from species’ plumage and integument (Additional file 2). Briefly, after mapping compounds found in the diet, plasma, and plumage or integument of species under this study on the “avian space” of the global carotenoid biosynthesis network (Fig. 2), we recorded biochemical pathways that link dietary, intermediate and plumage-expressed compounds for each species (Additional files 1b and 2; details and justification in Badyaev et al. , which also see for phylogenetic analyses of avian carotenoid networks). 
For species that had no known dietary or intermediate compounds (but not both), missing compounds and reactions were assigned based on the mapping of the species’ known compounds and reactions on the global network and recording all biochemically possible connections (e.g., between a known dietary and a known expressed compound or between a known intermediate and a known expressed compound and a possible dietary compound). Networks were not built for species if the carotenoids expressed in their plumage or integument were unknown even when all other components of the network were documented. Thus, not all of the compounds and reactions in the avian subset of the global carotenoid metabolic network (Additional file 1a, Fig. 2) were present in the species-specific networks. In the 250 species-specific complete networks that were constructed, 53 compounds and 81 enzymatic reactions occurred at least once. Species under this study represent eleven avian orders (Anseriformes, Charadriiformes, Ciconiiformes, Columbiformes, Galliformes, Passeriformes, Pelecaniformes, Phaethontiformes, Phoenicopteriformes, Piciformes, Trogoniformes) and span over 110 MYA of avian carotenoid diversification (Fig. 4a, 4b, 4c, 4d and 4e, Additional file 3).
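The gap-filling step described above (recording all biochemically possible connections between known compounds on the global template) can be sketched as a path enumeration on a directed graph. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the template `global_net`, the compound labels, and the function names are all hypothetical, and reactions are assumed to be directed from substrate to product.

```python
def simple_paths(out_edges, start, goal, path=None):
    """Yield every directed path without repeated compounds from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in out_edges.get(start, []):
        if nxt not in path:  # avoid revisiting a compound
            yield from simple_paths(out_edges, nxt, goal, path)


def species_network(out_edges, dietary, expressed):
    """Collect all compounds and reactions on paths from dietary to expressed compounds."""
    compounds, reactions = set(), set()
    for d in dietary:
        for e in expressed:
            for p in simple_paths(out_edges, d, e):
                compounds.update(p)
                reactions.update(zip(p, p[1:]))  # consecutive pairs = reactions
    return compounds, reactions


# Hypothetical template: keys are substrates, values are their products.
global_net = {"D1": ["A", "B"], "A": ["C"], "B": ["C"], "C": ["E"], "D2": ["F"]}

# A species known to consume D1 and to express C in its plumage.
compounds, reactions = species_network(global_net, ["D1"], ["C"])
```

In this toy case both routes from D1 to C are recorded, so the species' network contains four compounds and four reactions, while the unrelated D2 branch is left out.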
Metabolic distance and modularity in networks
We used a modified metabolic distance based on the Jaccard distance  and Rodrigues and Wagner  to calculate the fraction of reactions and compounds differing between any two metabolic networks. Species’ networks were coded based on the presence of compounds and reactions in the avian subset of the global carotenoid metabolic network. The uncorrected P-distance is the fraction of the number of compounds and reactions that differ between each pair of networks ($d$) out of the total number of compounds and reactions in the global network ($N_G$):

$$P = \frac{d}{N_G}$$
The pairwise P-distances were computed in Mesquite (version 3.03)  using the PDAP:PDTREE (version 1.16) package . The metabolic distance (D) between networks represents the fraction of compounds and reactions in which two networks differ out of the total number of compounds and reactions that occur in each of the networks:
where $N_1$ and $N_2$ are the total number of compounds and reactions in networks $S_1$ and $S_2$, respectively. The 53 compounds expressed in the global carotenoid network at least once among the species’ networks were partitioned into ten structurally defined modules based on the density of the compounds’ enzymatic interconnectivity using the simulated annealing program netcarto (https://amaral.northwestern.edu/resources/software/netcarto) [71, 72]. This approach to module partitioning has previously been used to reliably assign metabolites to the correct functional pathway based only on the structural properties of the metabolites . In the avian carotenoid metabolic network, the modules are partitioned by different dietary compounds; seven of the ten modules include at least one starting, upstream dietary compound. For module assignments of the individual compounds in the global carotenoid metabolic network refer to Fig. 2 and Additional file 1c.
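As a rough illustration of the two distances described above, the sketch below codes each species' network as a set of compound and reaction labels. The P-distance follows the definition in the text (differing elements over the global total). The second function uses the plain Jaccard distance (differing elements over the union of the two networks) as a stand-in; the study used a modified form, so this is an assumption and not the authors' exact formula. The labels, the toy networks, and the global total of 10 are hypothetical.

```python
def p_distance(s1, s2, n_global):
    """Uncorrected P-distance: elements that differ, over the global total."""
    d = len(s1 ^ s2)  # symmetric difference: present in only one network
    return d / n_global


def jaccard_distance(s1, s2):
    """Plain Jaccard distance (assumed stand-in for the modified metric)."""
    union = s1 | s2
    if not union:
        return 0.0  # two empty networks are treated as identical
    return len(s1 ^ s2) / len(union)


# Hypothetical toy networks: "c*" are compounds, "r*" are reactions.
net_a = {"c1", "c2", "c3", "r1", "r2"}
net_b = {"c1", "c2", "c4", "r1", "r3"}

p = p_distance(net_a, net_b, n_global=10)
j = jaccard_distance(net_a, net_b)
```

Here four elements differ between the two networks, so the P-distance is 4/10 and the Jaccard distance is 4/7. The second value is larger because it ignores global elements that neither species uses.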
Network structural measurements
For each compound in the avian carotenoid network (Fig. 2) we calculated the number of directly linked enzymatic reactions  and the distance from a dietary compound (minimum number of reactions between a compound and any of the dietary compounds in the network) to represent the connectivity and the pathway position of each compound, respectively. The connectivity ($C$) of each of the modules in the global network and each of the species’ networks was the average number of reactions per compound:

$$C = \frac{r}{n}$$
where r is the total number of reactions in the module or network and n is the total number of compounds in the module or network. The diameter of each of the species’ networks is the shortest distance (number of reactions) between the two most distant dietary and expressed compounds in the network. The diameter of each of the modules in the global network is the fewest number of reactions between the two most distant compounds in the module. Both the connectivity of the species' networks and the modules and the diameter of the modules were computed using Cytoscape 2.8.2  with NetworkAnalyzer 2.7 [75, 76] and RandomNetworks 1.0 .
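The structural measures defined above can be sketched in a few lines of standard-library Python. This is a minimal sketch under assumptions that the text does not spell out: reactions are treated as directed (substrate, product) pairs, a compound's connectivity counts every reaction that touches it, and distance from the diet is measured along the direction of the reactions. The study itself used Cytoscape; the toy network and function names here are hypothetical.

```python
from collections import defaultdict, deque


def degree(reactions):
    """Number of reactions touching each compound (as substrate or product)."""
    deg = defaultdict(int)
    for substrate, product in reactions:
        deg[substrate] += 1
        deg[product] += 1
    return dict(deg)


def distance_from_diet(reactions, dietary):
    """Minimum number of reactions from any dietary compound (multi-source BFS)."""
    out = defaultdict(list)
    for substrate, product in reactions:
        out[substrate].append(product)
    dist = {c: 0 for c in dietary}
    queue = deque(dietary)
    while queue:
        node = queue.popleft()
        for nxt in out[node]:
            if nxt not in dist:  # first visit is the shortest route
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def connectivity(reactions, compounds):
    """C = r / n: average number of reactions per compound."""
    return len(reactions) / len(compounds) if compounds else 0.0


# Hypothetical toy network: D1 and D2 are dietary compounds.
reactions = [("D1", "A"), ("D1", "B"), ("A", "C"), ("D2", "C"), ("C", "E")]
compounds = {"D1", "D2", "A", "B", "C", "E"}

deg = degree(reactions)
dist = distance_from_diet(reactions, ["D1", "D2"])
C = connectivity(reactions, compounds)
```

In this toy network, compound C is the most connected (three reactions) and sits one reaction from the diet via D2, even though the route through D1 takes two reactions.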
Species representation and realized phenotypic diversification
The species representation of a compound or reaction is the number of species that have this compound or reaction (e.g., ). Whereas species representation characterizes the evolutionary representation of a compound, it does not include information on species’ phylogenetic relationships, and instead enables the examination of metabolic network evolution from a structural, rather than historical perspective (e.g., ). In a companion study we found that the phylogenetic relationships among the species in this study were not reflected in the similarity of their biochemical networks; the small biochemical space on which birds diversify and the structure of the biochemical network instead leads to recurrent convergence of distantly related and ecologically distinct taxa in metabolic networks . Having examined the historical sequence of exploration of the global carotenoid network by extant avian species in that study, here we explore whether the structure of the global carotenoid network is reflected in the pattern of network exploration across avian lineages. Several other studies have taken similar approaches to compare structural features of metabolic networks across species of bacteria, eukaryotes, and archaea independently of their phylogenetic relationships (e.g., [24, 35, 78]).
The realized diversification ($R$) of an enzymatic reaction was measured as the fraction of species that do not have a reaction among all of the species that have the substrate compound for the reaction ($n_c$), where $n_r$ is the number of species that have the reaction:

$$R = \frac{n_c - n_r}{n_c}$$
An enzymatic reaction with a realized diversification score of zero represents a location in the network with little or no divergence between species’ networks along that part of a pathway; meaning that the enzyme is conserved across species that also have the enzyme’s substrate compound. The realized diversification of an enzymatic reaction with a score close to 1 represents a point of major divergence between species (i.e., the enzyme is only present in a small fraction of the total species that have the enzyme’s substrate compound).
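The two species-level counts used in this section can be illustrated with a small sketch. It assumes each species' network is stored as a pair of sets (compounds, reactions), with reactions written as (substrate, product) tuples; the species names, compounds, and function names are hypothetical.

```python
def species_representation(species_networks, compound):
    """Number of species whose network contains the compound."""
    return sum(1 for compounds, _ in species_networks.values() if compound in compounds)


def realized_diversification(species_networks, reaction):
    """R = (n_c - n_r) / n_c for one reaction given as (substrate, product)."""
    substrate, _ = reaction
    n_c = species_representation(species_networks, substrate)
    n_r = sum(1 for _, reactions in species_networks.values() if reaction in reactions)
    if n_c == 0:
        return None  # undefined when no species has the substrate
    return (n_c - n_r) / n_c


# Hypothetical data: species -> (compounds, reactions)
species_networks = {
    "sp1": ({"A", "B"}, {("A", "B")}),
    "sp2": ({"A", "B", "C"}, {("A", "B"), ("A", "C")}),
    "sp3": ({"A", "B"}, {("A", "B")}),
    "sp4": ({"A", "C"}, {("A", "C")}),
}

r_ab = realized_diversification(species_networks, ("A", "B"))
r_ac = realized_diversification(species_networks, ("A", "C"))
```

All four toy species have substrate A. Three of them carry the reaction from A to B, giving R = 0.25 (largely conserved), and two carry the reaction from A to C, giving R = 0.5 (a point of greater divergence).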
Global carotenoid network structural properties and diversity of species’ networks
Connectivity and the distance from dietary carotenoids of compounds varied widely in the avian subset of the global carotenoid network (Figs. 2 and 3). All but one compound were associated with at least one reaction, up to a maximum of 10 reactions. Non-dietary compounds were one to eight reactions away from starting dietary carotenoids (Fig. 3). The species’ networks (Fig. 4a, 4b, 4c, 4d and 4e; Additional file 1b) differed widely in the number of total compounds (1-21), number of reactions (0-46), connectivity (0-4.53 average reactions per compound), diameter (0-8 reactions), number of modules (1-6), and number of dietary carotenoids (1-6).
Structural determinants of compound occurrence among species
The connectivity of a compound contributed the most to its species representation; carotenoids with higher connectivity had greater species representation (Fig. 5a; $b_{ST}$ = 0.73, t = 7.63, P < 0.001, n = 55). Species representation of a compound did not vary with its distance from a dietary carotenoid (Fig. 5b; $b_{ST}$ = -0.07, t = -0.72, P = 0.48, n = 55).
The role of modules in compound occurrence among species
The representation of functional modules of the avian carotenoid network varied across species' networks (Fig. 6a and b). Modules of higher connectivity occurred in more species (Fig. 6a; Spearman’s ρ = 0.80, P = 0.006, n = 10), but the diameter of a module was not related to the occurrence of the module across species (Fig. 6b; ρ = 0.49, P = 0.15, n = 10). Differences in the numbers of species with each of the compounds in a module were correlated with the connectivity of the module (Fig. 6c; ρ = 0.74, P = 0.01, n = 10), but not with the diameter of the module (Fig. 6d; ρ = 0.59, P = 0.07, n = 10).
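For readers unfamiliar with the statistic reported here, Spearman's ρ is the Pearson correlation computed on ranks. The sketch below is a generic standard-library implementation with average ranks for ties; it is not the software used in the study, and the example values are hypothetical rather than the module data.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the block of tied values
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five modules (not the study's data).
module_connectivity = [1.0, 2.0, 3.0, 4.0, 5.0]
species_count = [20, 10, 40, 30, 50]
rho = spearman_rho(module_connectivity, species_count)
```

Because the statistic depends only on rank order, it suits comparisons across a small number of modules whose connectivity and species counts are unlikely to be normally distributed.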
Structural determinants of metabolic distance among species networks
In pairs of species networks that shared dietary carotenoids, differences in network connectivity accounted for more of the metabolic distance between species’ networks (Fig. 7a; $b_{ST}$ = 0.67, t = 75.24, P < 0.001, n = 4,839) than did differences in the diameters of the networks (Fig. 7b; $b_{ST}$ = 0.28, t = 31.50, P < 0.001, n = 4,839). Pairs of networks with large differences in the average number of reactions per compound were more metabolically distinct than networks with large differences in their diameters.
Structural properties of realized diversification of enzymatic reactions
The connectivity of a substrate compound contributed to the realized diversification across species of the reactions associated with the substrate compound (Fig. 8a; $b_{ST}$ = 0.38, t = 3.10, P = 0.003, n = 81). The realized diversification of reactions in the network was not predicted by the distance of their substrate compounds from dietary compounds (Fig. 8b; $b_{ST}$ = -0.05, t = -0.39, P = 0.70, n = 81).
To what extent is the exploration of a deterministic network and its associated phenotypic diversification the result of the network’s structural properties? The divergence between species’ networks could be driven by either the exploration of pathways from conserved compounds, the elongation of conserved pathways, or the addition of different modules. Our findings suggest that pathway diversification is the main mechanism of divergence among species’ metabolic networks; differences in the enzymatic connectivity among species’ networks contributed more to their metabolic divergence than did differences in the length of their diameters (Fig. 7). In the avian subset of the global carotenoid metabolic network, the connectivity of a compound strongly contributed to further network diversification: compound connectivity contributed the most to both the frequency of compound occurrence across species (Fig. 5a) and the realized diversification of the reactions associated with the compound among species’ networks (Fig. 8a). In contrast, pathway elongation did not play a major role in the diversification of avian carotenoid networks: the relative distance from a dietary compound was not related to a compound’s representation across species (Fig. 5b) or to the realized diversification of reactions associated with the compound among species’ networks (Fig. 8b). The presence of distinct structural modules and differences in the species representation of compounds within these modules contributed to the metabolic divergence across species: the most densely connected modules were the most prevalent across species’ networks. Metabolic divergence across species, however, was not due to the concurrent gain or loss of all of the compounds in a module (Fig. 6c and d). Thus, pathway diversification strongly contributes to metabolic divergence among species: modules characterized by greater connectivity provided more opportunities for the use of distinct pathways.
A central assumption of these tests and their interpretation is that species are co-opting elements (genes or enzymes) that comprise the global avian carotenoid metabolic network and are selectively expressing a particular subset of these elements, rather than evolving them de novo. Several lines of evidence support this assumption. First, there was no correspondence between the historical relationships across study species and their utilization of carotenoid network space (i.e., use or disuse of particular reactions and compounds; [66, 79]). Instead, the structure of networks, in particular the link between pathway elongation and pathway diversification, accounted for recurrent convergence of phylogenetically distant and ecologically distinct species in the utilization of network space and expression of carotenoid compounds (ibid.). Although such a pattern could be produced by the independent evolution of enzymes with identical functions, it is highly unlikely (e.g., ). In other taxa, horizontal gene transfer [58, 81–84] and symbiotic events  accounted for enzymatic convergence in carotenoid metabolism between unrelated species, but neither of these processes plays a significant role in avian carotenoid biosynthesis. Gene duplications could similarly account for the evolution of convergent enzymes [24, 83, 86, 87], but the rate of gene duplications in birds  seems orders of magnitude lower than would be required to explain the documented rates of carotenoid enzyme convergence across bird species . Instead, species-specific expression of compounds and reactions by the selective expression of different enzyme-encoding genes from the global carotenoid network appears to be the dominant mode of avian carotenoid network evolution [88, 89], with de novo evolution of new carotenoid pathways (e.g., [90–92]) playing a secondary role (Additional file 1b).
A potential mechanism that could drive pathway diversification of enzymatic reactions at these connected compounds is differences in the control of metabolic flux among species across different pathways . Alternatively, different threshold concentrations of a substrate compound associated with several enzymatic reactions may be required to activate different enzymatic reactions [94, 95], such that the diversification of these pathways among species should be dependent on changes in the concentrations of these connected compounds.
We showed that the evolutionary representation of compounds and enzymatic reactions reflected their structural properties in the global carotenoid network (Fig. 5a). Why do compounds with the greatest connectivity tend to be overrepresented across species? The longer evolutionary persistence of the most connected elements is a common property of protein and gene deterministic networks across many taxa [e.g., 23, 24, 39, 40] and could reflect their role in maintaining the overall structural cohesiveness and function of the network. The removal or modification of highly connected elements could have greater pleiotropic effects that are more harmful to the function of the network than the removal of less connected compounds [96–98]. This property can result in stronger selection against the loss of these elements (e.g., ) or, alternatively, in lesser effectiveness of purifying selection for the deletion of centrally located elements in the network [100, 101]. Further, metabolic flux theory suggests that enzymes with the highest flux control coefficients should be located at the branching points of pathways in metabolic networks [102–105]. Such enzymes experience stronger stabilizing selection than those that contribute less to the flow of metabolites through metabolic pathways (e.g., ), accounting for the link between enzymatic connectivity and evolutionary persistence found in this study (Fig. 5a). These conclusions are corroborated by the models of network evolution and empirical studies of network growth that find that new elements in a network preferentially attach to evolutionarily stable elements that have greater connectivity rather than to sparsely connected, but more evolutionarily labile, downstream elements [24, 28, 34, 38].
It is possible that dietary compounds – the upstream-most elements of avian carotenoid networks – are not evolutionarily stable enough to contribute to incremental pathway elongation over evolutionary time. The evolutionary rates of the gain and loss of dietary carotenoids were orders of magnitude higher than the evolutionary lability of other compounds across avian metabolic networks , and our results show that dietary compounds were no more likely to be present in a network than metabolized downstream compounds (Fig. 5b). Theory predicts that rate-limiting enzymes should occur at upstream positions in pathways (e.g., ); however, the evolutionary instability of dietary compounds can decrease the effectiveness of selection on these compounds. Instead, due to the high enzymatic connectivity of some compounds in carotenoid networks, pathways from different dietary starting points can ultimately produce the same end products (Fig. 2). Thus, network robustness to evolutionarily labile dietary compounds – a central feature of avian carotenoid networks [66, 107] – may also contribute to the evolutionary stability of the connected compounds and explain why the diversification of species’ networks was centered on connected compounds instead of the continued lengthening of pathways from specific dietary compounds.
Variance in the species representations of compounds and enzymatic reactions within the same modules (Fig. 6c and d) implies that the modules partitioned by their structural properties do not correspond to actual biological processes (e.g. ), despite the fact that the structural modules used in this study were associated with different dietary compounds. Differences in the number of species with each compound in a module, however, could be the result of the connectivity of each of the compounds to other modules, which has been shown to explain the evolutionary rate of genes in protein interaction networks . Furthermore, it is possible that species utilize all of the enzymatic reactions and produce all of the compounds in a module but selectively express only some of the compounds in their plumage [107, 110–112], and so the variation of the species representations of compounds in modules captures this selective compound deposition of the products of a module.
By identifying the topological structural properties in a deterministic network that underlie phenotypic differences we can begin to establish specific mechanisms for the microevolutionary sequences behind observed macroevolutionary patterns. For example, if highly connected network elements determine phenotypic differences, then phenotypic diversification in a lineage might not occur in sequential order (structural or temporal) because different pathways can be explored from the same initial conserved element, and so we would expect weak phylogenetic signal among phenotypes. If pathway elongation is the source of phenotypic differences, then the dependence between downstream and upstream elements imposes a clear sequential order to phenotypic diversification along the pathway, resulting in stronger historical associations across species’ networks. The incorporation or loss of entire modules of elements in a deterministic network may be ordered or unordered, depending on their relative positions, but either would result in recurrent bursts of diversification across lineages’ phenotypes [113–115]. Because we found no evidence of avian carotenoid network diversification due to pathway elongation, we would not expect a sequential order in patterns of realized diversification in carotenoid pathways during avian evolutionary history. Instead, our finding that differences among species’ networks were due to pathway diversification from highly connected compounds suggests that related species should have similar carotenoid networks only when they utilize the same pathways from the same shared compound.
The results of this study thus explain why phenotypic diversification in expressed carotenoids between related species was overwhelmingly due to unordered periodic bursts of biochemical diversification of several compounds at once in the same pathway module across species, with ecological divergence in the use of dietary carotenoids – the process closely associated with ecological speciation, pathway elongation, and species relatedness – playing a significantly weaker role [66, 107].
The goal of this study was to explicitly consider how the structural interactions among elements of a trait affect its diversification. Our results show that the structure of the enzymatic reactions in the avian space of the global carotenoid network delineates opportunities for diversification of expressed carotenoids in birds. Within-species studies can establish the proximate mechanisms underlying the observed association of network topology, enzymatic connectivity and evolutionary diversification in carotenoid compounds. Explicit consideration of spatial and temporal organization of interactions between genes, proteins, enzymes and other elements of deterministic networks brings us closer to an understanding of the relationship between potential and realized phenotypic diversity.
MYA, million years ago
Gavrilets S. Fitness landscapes and the origin of species. Princeton: Princeton University Press; 2004.
Gerhart J, Kirschner M. The theory of facilitated variation. Proc Natl Acad Sci U S A. 2007;104:8582–9.
Maynard Smith J. Natural selection and the concept of a protein space. Nature. 1970;225:563–4.
Wagner GP. Homology, genes, and evolutionary innovation. Princeton: Princeton University Press; 2014.
Newman SA. The developmental genetic toolkit and the molecular homology—analogy paradox. Biol Theory. 2006;1:12–6.
Badyaev AV, Walsh JB. Epigenetic processes and genetic architecture in character origination and evolution. In: Charmantier A, Garant D, Kruuk LEB, editors. Quantitative genetics in the wild. Oxford: Oxford University Press; 2014. p. 177–89.
Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature. 2006;444:929–32.
Breen MS, Kemena C, Vlasov PK, Notredame C, Kondrashov FA. Epistasis as the primary factor in molecular evolution. Nature. 2012;490:535–8.
Gravner J, Pitman D, Gavrilets S. Percolation on fitness landscapes: effects of correlation, phenotype, and incompatibilities. J Theor Biol. 2007;248:627–45.
Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ. Empirical fitness landscapes reveal accessible evolutionary paths. Nature. 2007;445:383–6.
Rice SH. The evolution of developmental interactions: Epistasis, canalization, and integration. In: Wolf JB, Brodie III ED, Wade MJ, editors. Epistasis and the evolutionary process. New York: Oxford University Press; 2001. p. 82–98.
Alberch P. From genes to phenotype: Dynamical systems and evolvability. Genetica. 1991;84:5–11.
Arthur W. Developmental drive: An important determinant of the direction of phenotypic evolution. Evol Dev. 2001;3:271–8.
Forgacs G, Newman SA. Biological physics of the developing embryo. Cambridge: Cambridge University Press; 2005.
Whyte LL. Internal factors in evolution. New York: George Braziller; 1965.
Bloom JD, Labthavikul ST, Otey CR, Arnold FH. Protein stability promotes evolvability. Proc Natl Acad Sci U S A. 2006;103:5869–74.
Bridgham JT, Ortlund EA, Thornton JW. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature. 2009;461:515–9.
Harms MJ, Thornton JW. Historical contingency and its biophysical basis in glucocorticoid receptor evolution. Nature. 2014;512:203–7.
Newman SA. Physico-genetic determinants in the evolution of development. Science. 2012;338:217–9.
Pagel M, Pomiankowski A. Evolutionary genomics and proteomics. Sunderland: Sinauer Associates; 2008.
Povolotskaya IS, Kondrashov FA. Sequence space and the ongoing expansion of the protein universe. Nature. 2010;465:922–7.
Wagner A. The molecular origins of evolutionary innovations. Trends Genet. 2011;27:397–410.
Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW. Evolutionary rate in the protein interaction network. Science. 2002;296:750–2.
Light S, Kraulis P, Elofsson A. Preferential attachment in the evolution of metabolic networks. BMC Genomics. 2005;6:159.
Liu WC, Lin WH, Davis AJ, Jordán F, Yang HT, Hwang MJ. A network perspective on the topological importance of enzymes and their phylogenetic conservation. BMC Bioinformatics. 2007;8:121.
Yamada T, Bork P. Evolution of biomolecular networks-lessons from metabolic and protein interactions. Nat Rev Mol Cell Biol. 2009;10:791–803.
Zhao J, Ding G-H, Tao L, Yu H, Yu Z-H, Luo J-H, et al. Modular co-evolution of metabolic networks. BMC Bioinformatics. 2007;8:311.
Maslov S, Krishna S, Pang TY, Sneppen K. Toolbox model of evolution of prokaryotic metabolic networks and their regulation. Proc Natl Acad Sci U S A. 2009;106:9743–8.
Banerjee A. Structural distance and evolutionary relationship of networks. BioSyst. 2011;107:186–96.
Borenstein E, Kupiec M, Feldman MW, Ruppin E. Large-scale reconstruction and phylogenetic analysis of metabolic environments. Proc Natl Acad Sci U S A. 2008;105:14482–7.
Ebenhöh O, Handorf T, Kahn D. Evolutionary changes of metabolic networks and their biosynthetic capacities. IEE P Syst Biol. 2006;153:354–8.
Mithani A, Hein J, Preston GM. Comparative analysis of metabolic networks provides insight into the evolution of plant pathogenic and non pathogenic lifestyles in Pseudomonas. Mol Biol Evol. 2011;28:483–99.
Navlakha S, Kingsford C. Network archaeology: uncovering ancient networks from present-day interactions. PLoS Comp Biol. 2011;7:e1001119.
Barabási A-L, Albert R. Emergence of scaling in random networks. Science. 1999;286:509–12.
Jeong H, Tombor B, Albert R, Oltvai ZN, Barabási A-L. The large-scale organization of metabolic networks. Nature. 2000;407:651–4.
Thieffry D, Huerta AM, Pérez-Rueda E, Collado-Vides J. From specific gene regulation to genomic networks: a global analysis of transcriptional regulation in Escherichia coli. Bioessays. 1998;20:433–40.
Barabási A-L. Luck or reason. Nature. 2012;489:507–8.
Eisenberg E, Levanon EY. Preferential attachment in the protein network evolution. Phys Rev Lett. 2003;91:138701.
Bernhardsson S, Gerlee P, Lizana L. Structural correlations in bacterial metabolic networks. BMC Evol Biol. 2011;11:20.
Hahn MW, Kern AD. Comparative genomics of centrality and essentiality in three Eukaryotic protein-interaction networks. Mol Biol Evol. 2005;22:803–6.
Xu K, Bezakova I, Bunimovich L, Yi SV. Path lengths in protein–protein interaction networks and biological complexity. Proteomics. 2011;11:1857–67.
Ramsay H, Rieseberg LH, Ritland K. The correlation of evolutionary rate with pathway position in plant terpenoid biosynthesis. Mol Biol Evol. 2009;26:1045–53.
Rausher MD, Miller RE, Tiffin P. Patterns of evolutionary rate variation among genes of the anthocyanin biosynthetic pathway. Mol Biol Evol. 1999;16:266–74.
Wright KM, Rausher MD. The evolution of control and distribution of adaptive mutations in a metabolic pathway. Genetics. 2010;184:483–502.
Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–52.
Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi A-L. Hierarchical organization of modularity in metabolic networks. Science. 2002;297:1551–5.
Badyaev A. Evolvability and robustness in color displays: Bridging the gap between theory and data. Evol Biol. 2007;34:61–71.
Nagy L. Changing patterns of gene regulation in the evolution of arthropod morphology. Am Zool. 1998;38:818–28.
Raff EC, Raff RA. Dissociability, modularity, evolvability. Evol Dev. 2000;2:235–7.
von Dassow G, Munro E. Modularity in animal development and evolution: elements of a conceptual framework for EvoDevo. J Exp Zool. 1999;285:307–25.
Wagner GP, Altenberg L. Perspective: Complex adaptations and the evolution of evolvability. Evolution. 1996;50:967–76.
Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998;95:14863–8.
Halfon MS, Grad Y, Church GM, Michelson AM. Computation-based discovery of related transcriptional regulatory modules and motifs using an experimentally validated combinatorial model. Genome Res. 2002;12:1019–28.
Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. Revealing modular organization in the yeast transcriptional network. Nat Genet. 2002;31:370–7.
Niehrs C, Pollet N. Synexpression groups in eukaryotes. Nature. 1999;402:483–7.
Campillos M, von Mering C, Jensen LJ, Bork P. Identification and analysis of evolutionarily cohesive functional modules in protein networks. Genome Res. 2006;16:374–82.
Chen Y, Dokholyan NV. The coordinated evolution of yeast proteins is constrained by functional modularity. Trends Genet. 2006;22:416–9.
Pál C, Papp B, Lercher MJ. Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nat Genet. 2005;37:1372–5.
Wagner A. Evolutionary constraints permeate large metabolic networks. BMC Evol Biol. 2009;9:231.
Klassen JL. Phylogenetic and evolutionary patterns in microbial carotenoid biosynthesis are revealed by comparative genomics. PLoS One. 2010;5:e11257.
Umeno D, Tobias AV, Arnold FH. Diversifying carotenoid biosynthetic pathways by directed evolution. Microbiol Mol Biol Rev. 2005;69:51–78.
Britton G, Liaaen-Jensen S, Pfander H, editors. Carotenoids. Boston: Birkhäuser Verlag; 2004.
Schmidt K, Connor A, Britton G. Analysis of pigments: carotenoids and related polyenes. In: Goodfellow M, O'Donnell AG, editors. Chemical methods in prokaryotic systematics. Chichester: John Wiley & Sons; 1994. p. 403–61.
Brush AH. Metabolism of carotenoid-pigments in birds. FASEB J. 1990;4:2969–77.
McGraw KJ. The mechanics of carotenoid coloration in birds. In: Hill GE, McGraw KJ, editors. Bird coloration volume 1: Mechanisms and measurements. Cambridge: Harvard University Press; 2006. p. 177–242.
Badyaev A, Morrison E, Belloni V, Sanderson M. Tradeoff between robustness and elaboration in carotenoid networks produces cycles of avian color diversification. Biol Direct. 2015;10:45.
Jaccard P. The distribution of the flora in the alpine zone. New Phytol. 1912;11:37–50.
Rodrigues JFM, Wagner A. Genotype networks, innovation, and robustness in sulfur metabolism. BMC Syst Biol. 2011;5:39.
Maddison WP, Maddison DR. Mesquite: a modular system for evolutionary analysis. Version 3.03. 2015. [http://mesquiteproject.wikispaces.com/]
Midford PE, Garland T, Jr., Maddison WP. PDAP package of Mesquite. Version 1.16. 2011. [http://mesquiteproject.org/pdap_mesquite/].
Guimerà R, Amaral LAN. Functional cartography of complex metabolic networks. Nature. 2005;433:895–900.
Guimerà R, Amaral LAN. Cartography of complex networks: Modules and universal roles. J Stat Mech Theor Exp. 2005;2005:P02001-1-13.
Harary F. Graph theory. Reading: Addison-Wesley; 1969.
Smoot ME, Ono K, Ruscheinski J, Wang P-L, Ideker T. Cytoscape 2.8: New features for data integration and network visualization. Bioinformatics. 2011;27:431–2.
Assenov Y, Ramírez F, Schelhorn S-E, Lengauer T, Albrecht M. Computing topological parameters of biological networks. Bioinformatics. 2008;24:282–4.
Doncheva NT, Assenov Y, Domingues FS, Albrecht M. Topological analysis and interactive visualization of biological networks and protein structures. Nat Protoc. 2012;7:670–85.
McSweeney PJ. Randomnetworks. Version 1.0. 2008. [http://apps.cytoscape.org/apps/randomnetworks]
Ebenhöh O, Handorf T, Heinrich R. A cross species comparison of metabolic network functions. Genome Inform. 2005;16:203–13.
Thomas DB, McGraw KJ, Butler MW, Carrano MT, Madden O, James HF. Ancient origins and multiple appearances of carotenoid-pigmented feathers in birds. Proc R Soc B. 2014;281:20140806.
Furnham N, Sillitoe I, Holliday GL, Cuff AL, Laskowski RA, Orengo CA, et al. Exploring the evolution of novel enzyme functions within structurally defined protein superfamilies. PLoS Comp Biol. 2012;8:e1002403.
Altincicek B, Kovacs JL, Gerardo NM. Horizontally transferred fungal carotenoid genes in the two-spotted spider mite Tetranychus urticae. Biol Lett. 2012;8:253–7.
Kreimer A, Borenstein E, Gophna U, Ruppin E. The evolution of modularity in bacterial metabolic networks. Proc Natl Acad Sci U S A. 2008;105:6976–81.
Moran NA, Jarvik T. Lateral transfer of genes from fungi underlies carotenoid production in aphids. Science. 2010;328:624–7.
Nováková E, Moran NA. Diversification of genes for carotenoid biosynthesis in aphids following an ancient transfer from a fungus. Mol Biol Evol. 2012;29:313–23.
Sloan DB, Moran NA. Endosymbiotic bacteria as a source of carotenoids in whiteflies. Biol Lett. 2012;8:986–9.
Kelley BP, Sharan R, Karp RM, Sittler T, Root DE, Stockwell BR, et al. Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proc Natl Acad Sci U S A. 2003;100:11394–9.
Kondrashov FA. Gene duplication as a mechanism of genomic adaptation to a changing environment. Proc R Soc B. 2012;279:5048–57.
Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, et al. Comparative genomics reveals insights into avian genome evolution and adaptation. Science. 2014;346:1311–20.
Walsh N, Dale J, McGraw KJ, Pointer MA, Mundy NI. Candidate genes for carotenoid coloration in vertebrates and their expression profiles in the carotenoid-containing plumage and bill of a wild bird. Proc R Soc B. 2012;279:58–66.
Hudon J, Anciães M, Bertacche V, Stradi R. Plumage carotenoids of the pin-tailed manakin (Ilicura militaris): Evidence for the endogenous production of rhodoxanthin from a colour variant. Comp Biochem Physiol B Biochem Mol Biol. 2007;147:402–11.
Prum RO, LaFountain AM, Berro J, Stoddard MC, Frank HA. Molecular diversity, metabolic transformation, and evolution of carotenoid feather pigments in cotingas (Aves: Cotingidae). J Comp Physiol B. 2012;182:1095–116.
Prum R, LaFountain A, Berg C, Tauber M, Frank H. Mechanism of carotenoid coloration in the brightly colored plumages of broadbills (Eurylaimidae). J Comp Physiol B. 2014;184:651–72.
Morrison ES, Badyaev AV. The landscape of evolution: Reconciling structural and dynamic properties of metabolic networks in adaptive diversifications. Integr Comp Biol. 2016;56:235–46.
Bongaerts GP, Vliegenthart JS. Effect of aminoglycoside concentration on reaction rates of aminoglycoside-modifying enzymes. Antimicrob Agents Chemother. 1988;32:740–6.
Matsuno R, Nakanishi K, Ohnishi M, Hiromi K, Kamikubo T. Threshold in a single enzyme reaction system: reaction of maltose catalyzed by saccharifying α-amylase from B. subtilis. J Biochem. 1978;83:859–62.
Albert R, Jeong H, Barabási A-L. Error and attack tolerance of complex networks. Nature. 2000;406:378–82.
Jeong H, Mason SP, Barabási A-L, Oltvai ZN. Lethality and centrality in protein networks. Nature. 2001;411:41–2.
Schmidt S, Sunyaev S, Bork P, Dandekar T. Metabolites: a helping hand for pathway evolution? Trends Biochem Sci. 2003;28:336–41.
Aris-Brosou S. Determinants of adaptive evolution at the molecular level: The extended complexity hypothesis. Mol Biol Evol. 2005;22:200–9.
Badyaev AV. “Homeostatic hitchhiking”: A mechanism for the evolutionary retention of complex adaptations. Integr Comp Biol. 2013;53:913–22.
Kauffman S, Levin S. Towards a general theory of adaptive walks on rugged landscapes. J Theor Biol. 1987;128:11–45.
Heijnen JJ, van Gulik WM, Shimizu H, Stephanopoulos G. Metabolic flux control analysis of branch points: an improved approach to obtain flux control coefficients from large perturbation data. Metab Eng. 2004;6:391–400.
LaPorte DC, Walsh K, Koshland DE. The branch point effect. Ultrasensitivity and subsensitivity to metabolic control. J Biol Chem. 1984;259:14068–75.
Pritchard L, Kell DB. Schemes of flux control in a model of Saccharomyces cerevisiae glycolysis. Eur J Biochem. 2002;269:3894–904.
Rausher MD. The evolution of genes in branched metabolic pathways. Evolution. 2013;67:34–48.
Flowers J, Sezgin E, Kumagai S, Duvernell D, Matzkin L, Schmidt P, et al. Adaptive evolution of metabolic pathways in Drosophila. Mol Biol Evol. 2007;24:1347–54.
Higginson DM, Belloni V, Davis SN, Morrison ES, Andrews JE, Badyaev AV. Evolution of long-term coloration trends with biochemically unstable ingredients. Proc R Soc B. 2016;283:20160403.
Wang Z, Zhang J. In search of the biological significance of modular structures in protein networks. PLoS Comp Biol. 2007;3:e107.
Fraser HB. Modularity and evolutionary constraint on proteins. Nat Genet. 2005;37:351–2.
Fox DL. Metabolic fractionation, storage and display of carotenoid pigments by flamingoes. Comp Biochem Physiol. 1962;6:1–24.
Fox DL, Smith VE, Wolfson AA. Carotenoid selectivity in blood and feathers of lesser (African), Chilean and greater (European) flamingos. Comp Biochem Physiol. 1967;23:225–32.
McGraw KJ, Beebee MD, Hill GE, Parker RS. Lutein-based plumage coloration in songbirds is a consequence of selective pigment incorporation into feathers. Comp Biochem Physiol B Biochem Mol Biol. 2003;135:689–96.
Gerhart J, Kirschner M. Evolution and evolvability. In: Cells, embryos, and evolution. Malden: Blackwell Science; 1997. p. 580–614.
Reid RGB. Biological emergence: Evolution by natural experiment. Cambridge: MIT Press; 2007.
Yang AS. Modularity, evolvability, and adaptive radiations: a comparison of the hemi- and holometabolous insects. Evol Dev. 2001;3:59–72.
Jetz W, Thomas GH, Joy JB, Hartmann K, Mooers AO. The global diversity of birds in space and time. Nature. 2012;491:444–8.
We thank V. Belloni, V. Farrar and J. Andrews for help with the data collection, and R. Duckworth, M. Sanderson, D. Higginson, A. Potticary, C. Gurguis, G. Semenov and three anonymous reviewers for thorough comments on previous versions and helpful suggestions.
This work was supported by the David and Lucile Packard Foundation, Amherst College graduate fellowships, and the University of Arizona Open Access Publishing Fund.
Availability of data and material
ESM designed the study. ESM and AVB analyzed the data. ESM wrote the manuscript with help from AVB. Both authors have read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
(a) Appendix S1: Confirmed enzymatic reactions in the “avian space” of global carotenoid biosynthesis network in bacteria, plants, and animals. This appendix contains references supporting the presence of specific compounds and the enzymatic reactions that comprise the avian carotenoid biosynthesis global network. (b) Appendix S2: Characteristics of carotenoid metabolic networks for species used in the study. This appendix contains the structural measurements and references for compound identification and the method of identification for each of the species’ metabolic networks. (c) Appendix S3: Module assignments in the avian subset of the global carotenoid metabolic network. This appendix contains the module assignments for each of the compounds in the global avian carotenoid metabolic network. The number of the module corresponds to the partitioned regions in Fig. 2. (PDF 1501 kb)
Species’ binary metabolic networks. This appendix contains binary metabolic networks for each of the species included in the study. (XLSX 123 kb)
Majority rule consensus phylogeny of species included in the study. This appendix contains the Newick tree format of the majority rule consensus phylogeny visually presented in Figs. 4, 5, 6, 7 and 8. The tree is based on 1,000 randomly sampled trees from the Hackett All Species pseudo-posterior distribution downloaded from birdtree.org, which is based on Jetz et al. 2012. (TXT 20 kb)