Evolutionary constraints permeate large metabolic networks
© Wagner. 2009
Received: 17 February 2009
Accepted: 11 September 2009
Published: 11 September 2009
Skip to main content
© Wagner. 2009
Received: 17 February 2009
Accepted: 11 September 2009
Published: 11 September 2009
Metabolic networks show great evolutionary plasticity, because they can differ substantially even among closely related prokaryotes. Any one metabolic network can also effectively compensate for the blockage of individual reactions by rerouting metabolic flux through other pathways. These observations, together with the continual discovery of new microbial metabolic pathways and enzymes, raise the possibility that metabolic networks are only weakly constrained in changing their complement of enzymatic reactions.
To ask whether this is the case, I characterized pairwise and higher-order associations in the co-occurrence of genes encoding metabolic enzymes in more than 200 completely sequenced representatives of prokaryotic genera. The majority of reactions show constrained evolution. Specifically, genes encoding most reactions tend to co-occur with genes encoding other reaction(s). Constrained reaction pairs occur in small sets whose number is substantially greater than expected by chance alone. Most such sets are associated with single biochemical pathways. The respective genes are not always tightly linked, which renders horizontal co-transfer of constrained reaction sets an unlikely sole cause for these patterns of association.
Even a limited number of available genomes suffices to show that metabolic network evolution is highly constrained by reaction combinations that are favored by natural selection. With increasing numbers of completely sequenced genomes, an evolutionary constraint-based approach may enable a detailed characterization of co-evolving metabolic modules.
Evolutionary history can constrain future evolution. It can constrain both the production and the preservation of phenotypic variation [1–6]. For instance, the acquisition of some traits may require the presence of other traits. To take a metabolic example, the biosynthesis of steroid hormones uses cholesterol as its starting point and prerequisite . Cholesterol, in turn is a eukaryotic invention. Conversely, the loss of certain traits constrains future evolution, because it may be difficult to reverse, as exemplified by the independent and irreversible loss of planktonic feeding stages in multiple echinoderms .
The best-studied cases of constrained variation regard macroscopic and readily observable organismal traits . However, if one wants to study genetic contributions to constrained variation, such traits are not ideal study objects. This is because hundreds to thousands of genes with often poorly understood interactions are typically involved in forming any macroscopic trait. Such incomplete characterization of genotypes, and of how exactly they produce phenotypes render genetic causes of constrained evolution difficult to understand for complex traits.
This problem suggests that more tractable genetic systems, where genotypic information is readily available, may be a useful starting point to learn more about the extent and pervasiveness of evolutionary constraints. Molecules such as proteins and RNA are the best-studied such candidate systems, but regulatory and metabolic networks are increasingly accessible with the available of genome-scale sequence and functional data [10–34]. Taken together, the phenotypic diversity of molecules and the networks they form is sufficiently rich to encapsulate the phenotypic diversity of organismal traits. Especially for metabolic networks, significant amounts of information about network genotypes and how they vary among species are available [35–37].
Studies of evolutionary constraints as applied to DNA, RNA, or protein sequences have a long history [38, 39]. They show that most amino acid or nucleotide residues of these molecules cannot vary freely, and that variation in some residues is much more constrained than in others. Only a minority of residues may be under weak or no constraint, for example those that cause silent changes in lowly expressed proteins. We know less about evolutionary constraints for biological networks such as genome-scale metabolic networks, despite intriguing experimental observations that raise many questions about such constraints. Specifically, gene knockout experiments and computational work [13, 14, 16, 22, 26, 40–44] show that in any one environment, many individual reactions of a metabolic network are expendable. Even reactions in the most central parts of metabolism, such as glycolysis or the citric acid cycle may be dispensable . One reason lies in the distributed nature of metabolic systems, where several bypasses may exist around any blocked pathway. Does that mean that metabolic networks are unconstrained, or only weakly constrained in changing their complement of enzymatic reactions on evolutionary time scales?
I will here ask this and related questions with data from more than 200 prokaryotic genome-scale metabolic networks. Such networks are central to all life. They sustain it by producing metabolic energy and biosynthetic precursors. The metabolic network of typical free-living heterotrophic organisms comprises of the order of 103 different biochemical reactions [40, 45, 46]. These reactions are catalyzed by enzymes which are encoded by genes. Variation in the structure of such a network occurs through either mutational elimination of individual reactions (enzyme-coding genes), or through addition of one or more reactions, for which horizontal gene transfer is a major mechanism in prokaryotes. Information about which reactions are catalyzed in any one organism has been assembled into various databases [36, 37, 47] through a combination of manual curation and comparative genome analysis. For example, the KEGG database whose data I use here contains information about the complement of enzymes encoded in more than 600 completely sequenced prokaryotic genomes.
Can individual reactions in a metabolic network vary independently from other such reactions? If so, what fraction of reactions can vary independently? If co-variation among reactions occurs, does it affect pairs of enzymes or larger groupings? Only a few years ago, these and similar questions could not have been addressed, because the number of completely sequenced genomes required for at least a coarse metabolic annotation  was too small. With genome-scale information for hundreds of organisms, such analysis is now becoming tractable.
The elementary unit of evolutionary change in metabolic networks is the individual chemical reaction catalyzed by an enzyme that is encoded by a metabolic gene. Except where mentioned otherwise, I here represent each such reaction on the level of the gene, as represented by orthologs of metabolic enzyme-coding genes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/; ). This representation facilitates evolutionary analysis, because it is genes (and not reactions) that undergo mutations, and that are exchanged between organisms through horizontal gene transfer.
These observations suggest that metabolic networks are very diverse in their complement of metabolic reactions, even for organisms living in similar habitats. Together with the resilience of such networks to elimination of reactions [13, 14, 16, 22, 26, 40–44], which indicate enormous plasticity in metabolic network organization in a given environment, they raise the question whether the evolution of such networks is perhaps only subject to weak constraints. I first addressed this question on the smallest level of evolutionary change, that of the individual reaction.
The same kind of test can also be used to ask whether there are pairs of reactions that tend to "avoid" each other, that is, whether a network that harbors one reaction tends not to harbor the other reaction. Such reaction pairs exist, but their numbers are much smaller. Specifically, fewer than five percent of reactions show such a negative association among genomes at the Bonferroni-corrected P = 0.05, and no reaction pair shows a P-value smaller than 10-17 (Figure 2a). In sum, these associations suggest that a substantial fraction of reactions covary extensively with other reactions.
In any metabolic network, the production of important metabolites for biomass production requires the cooperation of multiple reactions. This observation calls for an extension of the above reaction-centered approach to larger units. Co-occurrence of reaction sets would suggest joint constrained evolution and joint requirement for key metabolic processes. A substantial technical problem to identifying such sets is the astronomical number of possible combinations (triples, quadruples, and so forth) of reactions, which renders an exhaustive evaluation infeasible. To circumvent this problem, I take the following graph-theoretic approach. I define a reaction constraint graph whose nodes are individual reactions. Two reactions are connected by an undirected edge in this graph if one of the reactions is associated with another reaction in the pairwise assay above, at a P-value that lies below a given threshold. A connected component in this graph is defined as a set of reactions that shows pairwise associations among set members, but where no reaction is associated with other reactions outside the set. (I eliminate isolated reactions, that is, components of size one from this graph.) Such connected components can be thought of as sets of constrained or co-occurring reactions, and their identification is the target of this part of my analysis. The size and number of components of the reaction constraint graph may vary, depending on which P-value threshold is used to define the graph. At a high threshold (low required statistical significance), the graph may consist only of one or few large components that comprise most reactions, whereas at lower thresholds, the graph may fragment into multiple components of decreasing size that indicate increasingly strongly associated sets of reactions. To have a frame of reference, I compared the structure of this graph at any given threshold to that of a randomized graph. This randomized graph was generated from the original reaction association graph through swapping of edge pairs (see Methods), which leaves the number of edges, and the number of edges per node constant, but randomizes the graph in other respects. More specifically, I generated 20 such randomized graphs and characterized the component size distributions of each of these graphs.
In sum, sets of evolutionary constrained reactions are typically small, much smaller than would be expected if reaction associations were distributed randomly across metabolic networks. These observations raise the question whether the constrained sets of reactions are largely congruent with traditional classifications of metabolic pathways. To address this question, I took advantage of the biochemical pathway classification of reactions in KEGG . For any constrained reaction set as defined above, I determined the fraction of all reaction pairs that are assigned to the same metabolic pathway. For those reaction sets at the Bonferroni-corrected P = 0.05 where reactions can be assigned to individual pathways, all reaction pairs within a set can be assigned to the same pathway for almost 90% (124/139) of sets. More than half of the remaining few (15) reaction sets have between four and 1271 reactions, and are thus biased towards larger reaction sets. For more tightly associated reaction sets (very small P-values), this bias towards constrained reactions in the same pathway increases. For example, at P = 10-15 and P = 10-20, respectively 92% (127/138) and 96.6% (85/88) constrained reaction sets belong in one pathway. In sum, individual reactions for most constrained reaction sets can be allocated to the same pathway.
The 15 most highly constrained sets of metabolic reactions.
K01468 K01712 K01745
Transmembrane sensor/Copper resistance
Pyrroloquinoline quinone (PQQ) biosynthesis
K06139 K06136 K06138
K02232 K02227 K02233 K02231
Acetyl CoA, fatty acid and amino acid metabolism
Inositol phosphate catabolism
K03335 K03336 K03337 K03338
Starch and glycogen biosynthesis
4-hydroxy 3-oxovalerate aldolase/acetaldehyde dehydrogenase
The same causes may also explain the conspicuous absence of reactions in the most central parts of metabolism in the set of Table 1, despite the importance of such reactions in life. Central carbon metabolism is highly reticulate, with many alternative metabolic routes for missing reactions , which may lead to fewer highly constrained reaction sets.
I next asked whether constrained gene pairs in this set are always tightly linked, which might indicate that they are always transferred jointly. Figure 5c shows a histogram of the distance between the genes encoding the first and third reaction in the pathway of Figure 5a, for all genomes that encode both of these reactions. Figure 5d shows an analogous histogram for the distance in terms of the number of intervening genes. Perhaps the most striking feature of this distribution is its bimodality. That is, a substantial fraction of gene pairs seems to be tightly linked, with a distance of fewer than 10 kilo base pairs and fewer than 10 intervening genes, but an equally substantial fraction is loosely linked or unlinked, with more than 100 kbp and hundreds of genes in between them. The tightly linked gene pairs likely reflect the well-known organization of histidine utilization genes into one or two linked operons, which has been observed for some organisms [56–58]. The existence of many unlinked gene pairs suggests that not all of the covariation of these genes can be explained through constrained variation. Bimodial distributions (not shown) are also observed for the other two reaction pairs analyzed in Figure 5a.
The remaining most highly constrained reaction sets of size greater than two occur in PQQ synthesis (3 reactions in 23-25 genomes; P < 10-18), whose reactions are poorly characterized, and in the catabolism of myo-inositol (4 reactions in 26-30 genomes; P < 10-17). As opposed to eukaryotes, where inositol derivatives have signaling roles, in prokaryotes they serve structural roles as membrane anchors of proteins and glycolipods, and they can aid the infectivity of pathogens . Myo-inositol can also serve as sole carbon source for several microorganisms . The reactions in question catalyze four consecutive steps in a pathway that converts myo-inositol into acetyl-CoA or glyceraldehyde-3-phosphate (KEGG pathway identifier: ko00562). In these last two examples, too few genomes contain the reactions to carry out a meaningful statistical analysis of the genomic distance distribution, but I note that also here, some of the genes encoding individual enzyme pairs are not closely linked (not shown).
Both experimental and computational work shows that metabolic networks vary greatly in their organization. For example, earlier work on the organization of the citric acid cycle in 19 completely sequenced microbes showed that almost every organism encodes a different subset of the cycle's reactions . Given the centrality of this cycle in energy metabolism, this variability is especially remarkable. The genome-scale analysis of multiple metabolic networks from Figure 1 highlights such variability. It shows that metabolic networks are highly diverse in their reaction content. In addition, metabolic networks can be quite resilient to elimination of individual reactions in any given environment [13, 14, 16, 22, 26, 40–44], partly because the blocked reactions can readily be bypassed through alternative metabolic routes. Furthermore, new metabolic pathways and reactions continue to be discovered [63, 64]. Taken together, these observations raise the possibility that there are so many different ways of organizing the flow of matter through a metabolic network -- many of which still unknown -- that individual reactions may only be weakly constrained in their evolution. In this view, the mutational elimination of any one reaction might be readily compensated by an alternative metabolic route that is either already present in the genome or can be readily transferred into the genome through horizontal transfer.
In contrast to this scenario, the merely 222 genomes studied here suffice to show that the majority of reactions are indeed highly constrained in their evolution. This is indicated by their statistically significant co-occurrence with other reactions in constrained reaction pairs. The genes in such a pair are not always tightly linked, which renders horizontal co-transfer of constrained reaction sets an unlikely sole cause for these patterns of association. Constrained reaction pairs can be grouped into small sets whose number is substantially greater than would be expected if the same number of associations occurred among randomly chosen reaction pairs in a metabolic network. The reactions in a set typically belong to the same biochemical pathway(s). Despite the distributed nature of metabolic networks, where perturbations in one part of a large network can be compensated through changes in other, superficially remote parts, clearly identifiable constrained reaction sets exist, and are usually highly localized. These observations are consistent with earlier observations of modular structures in metabolic (and other) networks [29, 51, 65–73]. For example  focused on the identification of conserved modules in metabolic pathways, and showed that many such modules exist, have a skewed size distribution, and may be hierarchically organized.
In the bioinformatics community, questions regarding the modular organization of biological networks have attracted significant attention, unlike the topic of evolutionary constraints. In contrast, whether evolutionary constraints exist, and how pervasive they are has been an important and controversial topic in evolutionary biology. Standard textbooks  would discuss these questions extensively. This paper is complementary to earlier work from the bioinformatics community [29, 51, 65–73], not only in approach, but also by focusing on the concept of evolutionary constraints. It points to the fertile ground that data on molecular networks can provide for the analysis of such constraints.
In the evolutionary biology literature, constraints are hardly ever absolute. Classical examples include the tetrapod limb, which has mostly five digits, although ichthyosaurs had more; the lower jaw of frogs, which generally lacks teeth, except for the genus Gastrotheca . Constraints are thus best viewed as statistical biases in the occurrence or co-occurrence of traits (here: chemical reactions). For metabolic networks, the reason behind the lack of absolute constraints are easily explained: Matter can flow along many alternative routes through a metabolic network. A reaction in a constrained reaction set whose presence is be essential in one metabolic network, may be dispensable in another, because its role can be assumed by other reactions. Statistical analyses like mine help avoid the opposite extreme of assuming maximal flexibility, because they show that network reactions show constrained evolution.
The tightly constrained evolution of many reaction sets raises a question about the causes of these constraints. Specifically, is it caused merely by natural selection favoring certain reaction combinations, or do metabolic reactions also co-vary in their transfer from bacterium to bacterium? An important source of variation in metabolic and other networks is horizontal gene transfer [21, 74–82]. In such transfer, immediately adjacent genes are more likely to be transferred together in any one transfer event than more distantly related genes. For example, a study on horizontal transfer among five different E. coli species showed that the majority of transfer events involved fewer than 15 kilo base pairs of DNA . Tight linkage of genes thus introduces a constraint, in addition to any constraint imposed by selection, because linked genes are not transferred independently of one another. Selection may of course itself be the ultimate cause of such tight linkage, because tight linkage may allow beneficial coregulation or co-transfer of genes [84–88]. In the latter case, selection's preference of certain gene combinations may have caused variation in these gene combinations itself to become constrained. In other words, if certain reaction combinations are favorable, then the genes involved in them might become tightly linked over time.
I asked how important gene linkage is in constraining reaction evolution by examining the most highly constrained reaction pairs and how closely together their encoding genes are linked in a genome. Not surprisingly, the results show clear evidence that some such linkage occurs. However, there are also many cases where linkage is not likely to be solely responsible for constrained variation. The reason is that covarying metabolic genes are often unlinked in many genomes in which they occur. This observation is in line with previous work , which suggested that gene clusters and operons are highly dynamic on an evolutionary time scale. They form and disintegrate readily, and their constituent genes are sometimes tightly linked, sometimes scattered throughout the genome [84, 86, 89, 90].
Horizontal gene transfer is perhaps the most rapidly acting form of change in prokaryotic metabolic networks, because it can introduce multiple new genes into a genome on short evolutionary time scales. However, metabolic networks can change also without such horizontal transfer, albeit on longer time scales. Aside from the mutational loss of reactions (which may be slow for reactions in a highly constrained reaction set), new enzymes can be created through gene duplication and subsequent sequence divergence, as well as through recombination and shuffling of domains and exons [91–94]. Such processes are the source of new enzymatic reactions that can then be "shared" among organisms through horizontal transfer. Their successful transfer will depend on whether reactions of in the same constrained reaction set are co-transferred, or are already present in the new host.
Current limitations of any statistical approach to analyze constrained evolution of metabolic networks include the limited number of available genomes. With hundreds of genomes at hand, such statistical characterization is beginning to be meaningful, but it is unlikely to resolve the fine-structure of such associations, for example by distinguishing more conserved from less conserved pathways with any confidence. The observation that some 20 percent of reactions appear to be unconstrained (Figure 2a) may be explicable through this limitation. Some reactions occur only in few of the 222 genomes I studied, and to preserve a meaningful statistical analysis, I excluded reaction pairs whose reactions occurred in fewer than five genomes. The further the number of genomes that contain each reaction in a pair deviates from this lower limit, the more readily a statistical test can reveal constraints. Thus, the number of apparently unconstrained reaction pairs will undoubtedly decrease as the number of completely sequenced genomes increases. It may well decrease to zero. Similarly, metabolic databases contain annotation errors. Their numbers are unknown, and their presence is a source of noise for such statistical analyses. Their incidence will undoubtedly decrease, as more genomes become sequenced and analyzed comparatively.
A second limitation comes from a particular class of potential network misannotation. A reaction can be catalyzed by an enzyme that is unrelated in sequence to any other known enzyme catalyzing this reaction. If so, then identification of metabolic network composition based on genome-sequence alone would miss the reaction. This is an example of convergent or parallel protein evolution, and has also been called orthologous gene replacement. For example, 5 out of 14 reactions in the citric acid cycle were reported to be subject to non-orthologous displacement . (The respective genes are now all represented in KEGG). Unfortunately, manual curation of metabolic reactions becomes impractical in surveys of many metabolic networks. The problem is of unknown magnitude. It also will be alleviated only with time, as more unrelated enzymes encoding the same reactions are discovered.
Thirdly, the occurrence of particular pathways in data such as that of Table 1 depends on pathway annotation, and in particular on pathway size. Complex pathways with many reactions are more likely to harbor constrained reaction sets than small pathways with few reactions, merely by virtue of their larger size. In addition, different databases may contain somewhat different pathway annotations. Because of the dangers of overinterpreting constraints in particular pathways, I thus here focus only on more generic features, namely whether the reactions in a constrained set occur in the same pathway.
A final limitation is that the constrained reaction sets I analyze have evolved in the context of a phylogeny, but that statistical analysis neglects this evolutionary history. Some phylogenetic methods have been developed to address correlations caused by shared evolutionary history for phenotypic and sequence data . Aside from their various limitations [95, 96], such methods would have limited applicability to the microbial genomes that I study. First, most such methods require either an implicit or explicit model of character (here: reaction) evolution. Multiple such models exist for sequence data, because enormous amounts of such data exist that can inform these models. In contrast, such models do not yet exist for metabolic network evolution. Second, and more importantly, extensive horizontal transfer -- in particular of metabolic genes [75, 78, 83] -- obscures evolutionary history, which can differ greatly among reactions. If, for example, even genomes as closely related as those of different Escherichia coli strains differ in more than a megabasepair of sequence  and some 25% of metabolic reactions, then existing methods are surely inadequate for more distantly related species. Visual inspection and manual analysis is feasible for a small number of examples like those I study (Figures 5, 6, 8), but not for large numbers of constrained reaction sets. We sorely need new methods to incorporate evolutionary history and horizontal gene transfer into the analysis of genome-scale metabolic data sets.
Despite the great apparent flexibility of metabolic networks suggested by gene knockout studies, and despite considerable network differences among closely related species, most individual reactions are not free to vary independently of other reactions. The discrepancy between these two observations can be resolved if one considers that the environment plays a crucial role in determining the effect of changing a network's reactions. Environmental variation is difficult to account for comprehensively in gene knockout studies, and its extent is often unknown for different species.
16S rDNA data and phylogenetic analysis. All publicly available complete prokaryotic genome sequences were obtained in December 2008 from the National Center for Biotechnology Information (NCBI; http://ftp.ncbi.nih.gov/genomes). One 16S rDNA sequence was extracted from the genbank file for each genome. 16S rDNA sequences were aligned using the NAST (Nearest Alignment Space Termination) algorithm, as implemented in a web server specifically designed to align 16S rDNA sequences (http://greengenes.lbl.gov; ). Pairwise 16S rDNA nucleotide divergence was calculated from this multiple alignment as the fraction of (non-gap) characters that differ between two sequences. A prokaryotic 16S rDNA maximum-likelihood tree was constructed using the package phyml , with the Hasegawa-Kishino-Yano  substitution model, where the transition-transversion ratio and the proportion of variable sites were estimated from the data. To accommodate variable substitution rates among sites, I allowed for four different substitution rates and estimated the parameter of the gamma distribution determining the rate variation from the data. A tree generated by neighbor joining  was used as the starting tree to be refined by the maximum likelihood algorithm. ITOL  was used for tree visualization.
NCBI provides a broad classification of habitat types for completely sequenced prokaryotes http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi. From this classification, I identified 10 anaerobic aquatic species, 17 aerobic terrestrial species, 24 thermophilic species, and 8 moderate halophilic species for which complete genome sequences and metabolic network information were available. In addition, I identified seven completely sequenced marine prokaryotic species from the Marine Microbiology Initiative http://www.moore.org/microgenome/strain-list.aspx with metabolic network information in KEGG. The horizontal bars in Figure 1 are based on these sets of species.
If two reactions R 1 and R 2 occur in n 1 and n 2 networks, such that n 1 ≥ n 2, and if R 2 occurs independently from R 1, then the number of metabolic networks harboring both R 1 and R 2 can be modeled as a binomially distributed random variable X with parameters n 1 and p, where p is simply the total fraction of networks that harbor R 2. To assess whether R 2 co-occurs with R 1 to a significantly greater extent than expected by chance alone, one can determine the probability P(X ≥ n 12), where n 12 is the observed number of networks that harbor both R 1 and R 2. If P(X ≥ n 12) is smaller than a pre-determined significance threshold (e.g., P = 0.05) then the association of the two reactions is deemed significant. If n 1 ≤ n 2, then the test is carried out with reversed roles of the two reactions R 1 and R 2. This test essentially asks if the less frequent of two reactions is significantly associated with the more frequent one. An exactly analogous test can be carried out to determine whether two reactions co-occur significantly less often than expected by chance alone. Specifically, values of 1- P(X ≥ n 12) that are smaller than a predetermined significance threshold would indicate reaction pairs with a tendency to not occur in the same metabolic network. In applications of these tests, I restricted myself to reaction pairs R 1 and R 2 where genes encoding these reactions occur in at least 5 genomes.
A reaction constraint graph is a graph whose nodes are reactions, and where two reactions are connected by an edge if the statistical significance of their pairwise association falls below a given significance threshold. To randomize such a graph, I used an edge swapping algorithm  that preserves each node's number of neighbors and the node's degree distribution. This algorithm first chooses two edges e 1 and e 2 at random that do not share any nodes. It then reconnects e 1 such that its source node becomes linked to the target node of e 2, and e 2 such that its source node becomes linked to the target node of e 1. I iterate this algorithm 2 E times, where E is the total number of edges in the graph, to yield one randomized reaction graph. All results for random reaction graphs reported here are based on 20 randomized graphs for each significance threshold. To evaluate whether members of a constrained reaction set can be assigned to the same pathway, I first determined for each pair of members of this set whether they share at least one KEGG pathway annotation. I then calculated the fraction of pairs in the set for which this was the case.
For any two gene pairs that show a statistically significant association, I identified the names of all known orthologs of genes encoding these reactions from the KEGG "ko" file (available at ftp://ftp.genome.jp/pub/kegg/genes/;). I then searched for the respective genes and their genomic position in the annotated genbank genome sequence files available at http://ftp.ncbi.nih.gov/genomes. In those cases where one genome contains more than one ortholog encoding the same reaction, I calculated pairwise distances for each of these orthologs separately, and include these distances in the distributions reported here.
I would like to acknowledge support from Swiss National Science Foundation grants 315200-116814 and 315200-119697, as well as from support through the SystemsX.ch project YeastX.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.