On the nature of fur evolution: A phylogenetic approach in Actinobacteria
BMC Evolutionary Biology volume 8, Article number: 185 (2008)
An understanding of the evolution of global transcription regulators is essential for comprehending the complex networks of cellular metabolism that have developed among related organisms. The fur gene encodes one of those regulators – the ferric uptake regulator Fur – widely distributed among bacteria and known to regulate different genes committed to varied metabolic pathways. On the other hand, members of the Actinobacteria comprise an ecologically diverse group of bacteria able to inhabit various natural environments, and for which relatively little is currently understood concerning transcriptional regulation.
BLAST analyses revealed the presence of more than one fur homologue in most members of the Actinobacteria whose genomes have been fully sequenced. We propose a model to explain the evolutionary history of fur within this well-known bacterial phylum: the postulated scenario includes one duplication event from a primitive regulator, which probably had a broad range of co-factors and DNA-binding sites. This duplication predated the appearance of the last common ancestor of the Actinobacteria, while six other duplications occurred later within specific groups of organisms, particularly in two genera: Frankia and Streptomyces. The resulting paralogues maintained main biochemical properties, but became specialised for regulating specific functions, coordinating different metal ions and binding to unique DNA sequences. The presence of syntenic regions surrounding the different fur orthologues supports the proposed model, as do the evolutionary distances and topology of phylogenetic trees built using both Neighbor-Joining and Maximum-Likelihood methods.
The proposed fur evolutionary model, which includes one general duplication and two in-genus duplications followed by divergence and specialization, explains the presence and diversity of fur genes within the Actinobacteria. Although a few rare horizontal gene transfer events have been reported, the model is consistent with the view of gene duplication as a main force of microbial genome evolution. The parallel study of Fur phylogeny across diverse organisms offers a solid base to guide functional studies and allows the comparison between response mechanisms in relation to the surrounding environment. The survey of regulators among related genomes provides a relevant tool for understanding the evolution of one of the first lines of cellular adaptability: control of DNA transcription.
Fur proteins form a ubiquitous family of metal-responsive transcription factors known to regulate the transcription of several different genes in many diverse bacterial lineages. Upon binding to a metal ion, a conformational change is induced in the Fur regulator that promotes interaction with a cognate DNA sequence, typically known as a Fur or iron box . Initially, Fe (II) was thought to be the only metal able to play this role. In fact, Fur was initially characterised as being an iron-responsive regulator of ferric iron uptake systems in Escherichia coli [2, 3], hence its name. However, several studies have shown that Fur can bind other metals – besides iron – as co-factors, and thus the range of known regulated genes became broader than what was initially thought. Fur will bind Fe (II) and regulate iron homeostasis in several organisms [2, 4–8]. However, in addition to iron, different Fur homologues specifically require other divalent metals, including Zn2+, Ni2+, Mn2+ or Co2+, in order to bind to their cognate promoter targets [9–11]. These transition metal ions are considered fundamental for bacterial growth, given that they perform various essential functions in cellular metabolism. However, most of them are toxic at elevated levels. Therefore, a strict balance between their uptake and efflux, effected by metalloregulators like Fur, is essential for homeostasis [12–15]. Excess amounts of these metal ions elicit a number of stress conditions inside the cells, particularly oxidative stress . Accordingly, some Fur-like proteins, through sensing the availability of their specific metal co-factor, are sensitive to the redox status of the cell, establishing a relationship between these regulators and the oxidative stress response [17–20].
Recent publications support a major role for Fur in the regulation of various environmental conditions including acid shock response, detoxification of oxygen radicals, production of toxins and virulence factors, and several other metabolic functions ( and references therein). Particularly during pathogenic infections, iron and possibly other metals become generally unavailable. Therefore, bacterial metalloregulatory proteins including Fur are often crucial in pathogenesis processes. These observations have led to a growing recognition of the importance of Fur as a global transcriptional factor.
Among the Fur-like proteins, those responding to oxidative stress by regulating a downstream catalase-peroxidase have been the most studied in the Actinobacteria [19, 21–24]. However, despite detailed knowledge on catalase-peroxidase regulation and the variety of functional and structural studies on numerous microorganisms, very little is known about the origin and molecular evolution of fur. The increasingly recognised importance of Fur as a virulence factor [24–26] and as a potential target for novel antibiotics [10, 25] would be better addressed with a deeper knowledge on its evolutionary history. The enormous diversity of Fur in terms of both required co-factors and regulated genes has led to several efforts to organise the family [9, 27].
The phylum Actinobacteria is comprised of Gram-positive bacteria with an overall high Mol% G+C content. The primary habitat for many of these bacteria is the soil, where they degrade organic compounds and play an important role in mineralisation. The lineage also contains important secondary metabolite-producers and several important pathogens and symbionts. The latter groups include the mycobacterial agents of tuberculosis and leprosy and the nitrogen-fixing plant microsymbionts Frankia spp., among other ecologically and economically important microorganisms . Actinobacteria also inhabit aquatic systems, while others are associated with extreme environments such as acidic thermal springs , Antarctic regolith  or gamma  and UV irradiated biotopes . The ability to inhabit these different environments probably selects for the capacity to sense and cope with a wide range of metals, for which regulators of the Fur family are important. It is this diversity of habitats and lifestyles that makes Actinobacteria such an excellent subject for an evolutionary study of a global regulator like Fur.
The diversity of genes encoding regulators in a genome defines an organism's ability to adjust to the surrounding environment. Therefore, towards a parallel vision of the different organisms and their conserved response mechanisms, we have undertaken a phylogenetic approach to the Fur family using Actinobacteria as the model clade, intending to extend and complement previous functional and structural studies in an evolutionary perspective. We have analysed the factors that shaped Fur regulatory functions in different bacteria in order to create a bridge between origin and cellular role. A hypothesis that clarifies the presence and diversity of the Fur homologues in Actinobacteria is presented, leading to a stable protein family division based on functionality and phylogeny. Furthermore, a relationship is established between each organism's ecological niche, genome size, and number of Fur homologues.
Results and Discussion
Overview of Fur homologues
To describe the phylogenetic history of the Fur regulators, and knowing by previous reports that organisms may have more than one Fur, only completely sequenced actinobacterial genomes present in the NCBI (National Centre for Biotechnology Information) database (March/2007) were used. Since regulatory proteins are small and not highly conserved, the Fur homologues included in this study were chosen based not only on sequence similarity/identity values, but also on the presence of specific residues necessary for the in vivo regulatory activity of Fur. Functional studies on Streptomyces reticuli FurS have shown the importance of five key residues: C96 and C99 are involved in reversible S-S bond formation, Y59 is required for DNA binding and C96, H92 and H93 are implicated in zinc coordination . These residues were conserved not only in the closely related mycobacterial FurA, but also in the more distantly related Escherichia coli Fur, making them good indicators to validate the occurrence of a Fur homologue.
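The residue-based screen described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each candidate has already been aligned to FurS and that both rows are available as equal-length gapped strings. The function names are hypothetical, and the check is strict, so variants that the study retained after inspection (for example a histidine in place of Y59) would need to be reviewed by hand.

```python
# Key residues reported for S. reticuli FurS, keyed by 1-based position
# in the ungapped reference sequence.
KEY_RESIDUES = {59: "Y", 92: "H", 93: "H", 96: "C", 99: "C"}


def residues_at_reference_positions(ref_aligned, cand_aligned, positions):
    """Return the candidate residue found in the alignment column of each
    requested (ungapped, 1-based) reference position."""
    if len(ref_aligned) != len(cand_aligned):
        raise ValueError("aligned rows must have equal length")
    found = {}
    ref_pos = 0
    for ref_char, cand_char in zip(ref_aligned, cand_aligned):
        if ref_char == "-":
            continue  # gap in the reference: no reference position consumed
        ref_pos += 1
        if ref_pos in positions:
            found[ref_pos] = cand_char
    return found


def missing_key_residues(ref_aligned, cand_aligned, key=KEY_RESIDUES):
    """Return {position: observed residue} for every key position where the
    candidate does not carry the expected residue ('-' means a gap or an
    alignment too short to reach that position)."""
    observed = residues_at_reference_positions(ref_aligned, cand_aligned, set(key))
    return {
        pos: observed.get(pos, "-")
        for pos, expected in key.items()
        if observed.get(pos) != expected
    }
```

An empty result means all key residues are present; a non-empty result lists the positions to inspect manually.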
An initial BlastP screening against each of the 36 actinobacterial genomes yielded 82 putative homologues. These sequences were aligned with S. reticuli FurS to check for the presence of the above mentioned key residues [see Additional File 1]. Two putative homologues were eliminated at this stage, the Mjls_1895 from Mycobacterium sp. JLS, that lacks all the key residues, and the Rxyl_1224 from Rubrobacter xylanophilus DSM 9941, that lacks one of the histidines. Conversely, Lxx02790, from Leifsonia xyli subsp. xyli str. CTCB07, has an H instead of the Y59. Since histidines and tyrosines are both polar and have similar properties, this homologue was nevertheless retained. Also retained were a number of homologues that presented the two histidines corresponding to H92 and H93 in the form HXH. The remaining 80 sequences were distributed as listed in Table 1: five genomes had four Fur homologues, ten genomes had three, eleven genomes had two, eight genomes had one, and finally two genomes did not present any Fur homologue.
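As a quick consistency check on the counts given above, the distribution of homologues per genome can be tallied. The numbers are taken directly from the text; the variable names are only illustrative.

```python
# Number of genomes (value) carrying a given number of Fur homologues (key),
# as reported in the text.
genomes_by_count = {4: 5, 3: 10, 2: 11, 1: 8, 0: 2}

total_genomes = sum(genomes_by_count.values())
total_homologues = sum(n * g for n, g in genomes_by_count.items())

initial_hits = 82  # putative homologues from the BlastP screening
eliminated = 2     # Mjls_1895 and Rxyl_1224

# The distribution accounts for all 36 genomes and all retained sequences.
assert total_genomes == 36
assert total_homologues == initial_hits - eliminated
```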
fur genes as part of a paralogous gene family
Since multiple fur homologues are present in many genomes, it seems logical that duplication and divergence from a common ancestor gene have occurred during their evolution. To test this hypothesis and to analyse the degree of similarity and identity among the different homologues, a multiple alignment was done using ClustalX [see Additional File 2] and a phylogenetic tree was computed using the Neighbor-Joining (NJ) method (Fig. 1). The distribution of the evolutionary distances and the tree topology strongly suggest that a duplication event took place before the divergence of the actinobacterial lineages, implying that two paralogues were already present in the last common ancestor. Duplication of fur after divergence would have yielded a tree more closely resembling the 16S tree (Fig. 2A). However, this pattern was not observed. Instead, homologues within each genome are almost always separated by a node close to the root, while orthologues from different organisms cluster together, with strong bootstrap support.
We propose a duplication-based model to explain the evolutionary history that took place leading to the various fur genes in the Actinobacteria (Fig. 2B). According to this scenario, the ancestral organism possessed a regulator that had affinity for several metals and that could bind to various DNA sequences. This ancestral fur gene underwent a paralogous duplication event, giving rise to homologues designated as A and B in the Figs. 1 and 2. After the lineage leading to Frankia diverged, the initial fur gene underwent a second paralogous duplication event in this specific lineage, giving rise to the C homologue. Finally, in the progenitor of Streptomyces, the A homologue underwent a third paralogous duplication leading to the D homologue. The E homologues (Fig. 1), which are only present in seven copies, apparently had a different origin and will be discussed below.
These duplication events were followed by specific losses and further duplications that likely modulated the response capacity of each organism to the particular set of evolutionary pressures that characterise its ecological niche. Based on the model described above, one would expect to find one homologue of each kind in each taxon. While the B homologue was generally conserved – there are only two losses recognizable in R. xylanophilus and in Tropheryma spp. – the A homologue was the object of several duplications and losses. This pattern suggests that the selective pressures acting on this homologue were more variable than the ones affecting the B homologue. Seven organisms have two A homologues and one organism has three A homologues (Table 1). Based on the principle of parsimony, and looking at the 16S tree (Fig. 2A), one can postulate four gains to explain these nine additional A homologues (Fig. 2): one in the common ancestor of M. smegmatis, Mycobacterium JLS, Mycobacterium KMS, Mycobacterium MCS and M. vanbaalenii; one in the common ancestor of N. farcinica and Rhodococcus RHA1; one in M. smegmatis and another in Nocardioides. Given the synteny surrounding some orthologues of each gain (Fig. 3), duplication events are the most parsimonious explanation. This issue will be further discussed below. On the other hand, thirteen organisms do not retain the A homologue, which can be explained by eight independent gene losses (Fig. 2): one in the common ancestor of B. adolescentis and B. longum; one in the common ancestor of C. diphtheriae, C. efficiens, C. glutamicum and C. jeikeium; one in the common ancestor of Tropheryma whipplei TW08/27 and Tropheryma whipplei Twist; and one in each of S. avermitilis, Frankia CcI3, M. leprae, P. acnes and R. xylanophilus.
Taking the two major groups originating from the first duplication, one can speculate why A homologues are subjected to such strong and diverse selective forces when compared to the more conserved B homologues. As mentioned previously in the introduction, A homologues have been extensively studied in several Actinobacteria, and at least in Mycobacterium and Streptomyces spp. they control the transcription of a downstream catalase-peroxidase, having therefore a major role in the oxidative stress response. Given that this kind of stress is inherent to all oxygen-consuming organisms and able to affect many molecules inside a living cell, it is not surprising to find that anti-oxidant mechanisms and their regulators are quite sensitive to evolutionary pressures related to each specific ecological niche. Since orthologues most likely retain their function across different species, it is reasonable to argue that the A homologues are involved in oxidative stress response regulation. On the other hand, B homologues may either be involved in regulating a cellular function more conserved across organisms, or its involvement in cellular metabolism is not as broad as in a situation of oxidative stress, and therefore these homologues are more stable across organisms throughout geological time.
The E group of sequences likely originated from three HGT (horizontal gene transfer) events: one to the Streptomyces ancestor (originating SCO4180 and SAV_4029), and then from that to the Frankia – Acidothermus common ancestor (originating Acel_0061, Franean1_6149, Francci3_2661 and FRAAL2798) and to Nocardioides (originating Noca_4251) in two separate horizontal transfers. These transfers would include the fur and the surrounding genes which, in parallel with the fur-katG case, could be putatively regulated by Fur, constituting an operon. In the common ancestor of Frankia CcI3 and ACN14a, a reshuffling of the genome could be responsible for disrupting the genomic context, which is maintained among Noca_4251, Acel_0061, Franean1_6149, SCO4180 and SAV_4029 (Fig. 3). This hypothesis can explain both the presence and the synteny encountered among these genes, and is supported by their %GC values – FRAAL2798, Francci3_2661, SAV_4029 and SCO4180 have a %GC value that deviates from the genome (Table 1). Lastly, with the exception of A. cellulolyticus, all these organisms inhabit soil ecological niches, which facilitates the occurrence of HGT.
Finally, the R. xylanophilus homologues are the only ones that are not considered in our model – they seem to have been acquired independently from other Actinobacteria by three individual HGTs. In fact, their %GC values differ both among the sequences and from that of the genome. A single HGT followed by duplication is also a hypothesis to be considered, especially since it is known that laterally transferred genes have higher rates of duplication.
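The %GC comparisons used above as supporting evidence for HGT can be illustrated with a short sketch. The text does not state what size of difference was treated as a deviation, so the `threshold` parameter below (in percentage points) is a hypothetical value chosen only for illustration, as are the function names.

```python
def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def deviates_from_genome(gene_seq, genome_gc_percent, threshold=5.0):
    """Flag a gene whose %GC differs from the genome-wide value by more than
    `threshold` percentage points (threshold is an assumed, illustrative cutoff)."""
    return abs(gc_percent(gene_seq) - genome_gc_percent) > threshold
```

As the discussion of the A homologues points out, a flag of this kind is only one line of evidence and should be weighed together with synteny and tree topology.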
In order to validate the NJ results and to evaluate the strength of the proposed model, a Maximum-Likelihood (ML) tree was computed [see Additional File 3]. The outcome of it corroborates the NJ phylogeny. Despite the fact that the ML tree is not completely resolved, having 14.8% unresolved quartets, it is clear that the groups identified by NJ are maintained, as well as the relations between them. In fact, the C group of homologues emerges from a node close to the root, while the D group of homologues is related to the A group, which suggests that C had its origin by duplication of the ancestor gene while D had its origin on a duplication of the A.
From Actinobacteria to the big picture
One interesting question regarding evolution of Fur in Actinobacteria is whether the first duplication occurred in the ancestor of the Actinobacteria, or if it was an earlier event that would have appeared in other lineages. To address this question, we randomly selected three different species from each Eubacteria group (except in cases like Acidobacteria where fewer than three genomes are available) and investigated the presence of Fur homologues combining a relaxed BlastP screening against each genome individually with annotation information. The resulting sequences were aligned with the Fur homologues from the Actinobacteria [see Additional File 4] and a NJ tree was computed [see Additional File 5]. The first bifurcation of this tree divides it into two major groups: one of them contains the actinobacterial A homologues and the other contains the actinobacterial B homologues, indicating that the initial duplication occurred in the eubacterial common ancestor. This tree also supports a common origin for the E group of homologues, which are clustered together and separated from the other actinobacterial groups. Multiple origins for this group, either by HGT or duplication, would result in the distribution of these homologues all through the tree. Their clustering shows that either they resulted from a common duplication that was lost in all the other Actinobacteria, or, more parsimoniously, that they resulted from the described HGT. The R. xylanophilus homologues appeared later, which results in a scattered distribution in the tree, supporting an origin in independent HGT events.
Interestingly, previously published work concerning the evolutionary history of Fur and other iron and manganese-responsive transcriptional regulators in alphaproteobacteria indicates that most bacteria in this lineage have only one Fur, involved in iron homeostasis regulation, which evolved towards a manganese-homeostasis regulator (Mur) in Rhizobiales and Rhodobacteraceae. However, in the same study, another regulator present in some of the alphaproteobacteria and named Irr was characterised and considered to be part of the Fur superfamily. Putative Irr-binding sites have been found upstream of genes encoding iron-homeostasis and iron-containing proteins, in particular catalase-peroxidases, suggesting that this regulator might functionally correspond to the actinobacterial oxidative stress-related Fur. Although Rodionov et al. (2006) have studied the Fur/Mur phylogeny separately from the Irr one, the fact that they are placed in the same superfamily suggests a common origin, thus supporting the hypothesis that the first Fur duplication occurred in the eubacterial common ancestor, from which a group related with oxidative stress response has emerged.
Duplicate to evolve
The proposed evolutionary scenario for fur is consistent with the current view of gene duplication as a major means of microbial genome evolution . It has been suggested that broadly functional genes are more easily duplicated than functionally established ones, and that the modifications that follow the duplications should provide the appearance of new functional specificities. In fact, although paralogues and orthologues have the same general function, paralogues usually differ in specific biochemical details such as the primary target or a required co-factor . Thus, it is reasonable to conceive that an ancestral fur gene, encoding a Fur protein with a broad range of DNA-binding motifs and ionic co-factors, through duplication and divergence, gave rise to the modern fur genes, now optimised and specialised.
Consistent with the described model are the results of independent biochemical characterization of three of the S. coelicolor homologues: one of them (SCO0561 – group A) is able to bind in vitro several divalent metals (Ni2+, Mn2+, Zn2+ and Fe2+) and regulates the downstream catalase-peroxidase in a redox-dependent manner; a second (SCO4180 – group E) binds specifically Ni2+ and regulates the transcription of a FeSOD and a cluster of genes related to nickel-uptake; and the third (SCO5206 – group D) binds metals yet to be identified, responds to the redox changes of the cell and regulates a monofunctional catalase. In the same way, the biochemical characterisation of the Fur homologues in M. tuberculosis has shown that the A homologue (Rv1909c) regulates the downstream catalase-peroxidase in the presence of metals and in a redox-dependent manner [21, 24], while the B homologue (Rv2359) binds Zn2+ and is likely involved in the regulation of genes responsible for zinc-uptake. Thus, the main function – transcription regulation – is maintained among paralogues and orthologues. However, functional specificities such as the metal coordinated and the genes regulated are only maintained within orthologues (SCO0561 and Rv1909c), and diverge in the paralogues of the same organism (SCO0561/SCO4180/SCO5206 and Rv1909c/Rv2359). As might be expected, paralogues resulting from recent duplications are more similar than more ancient ones. In fact, SCO5206, which according to the present model is predicted to be the result of a duplication of the A homologue, is functionally closer to its corresponding A paralogue (SCO0561) than to the others. Although they regulate different genes, both are functionally related to oxidative stress response and dependent on the redox status of the cell. Paralogues from older duplication events have had more time to evolve and have accumulated more differences than those generated from recent duplication events.
In support of the proposed evolutionary model, it can be observed that several fur orthologues are located in equivalent regions of their genomes (Fig. 3). Syntenic genes reveal the core chromosomal segments present in a common ancestor, encoding a high proportion of essential gene functions and presenting a significantly lower HGT rate. These syntenic regions are evidence for a group with a common origin, and were considered significant whenever at least two genes remained contiguous across different chromosomes. The high degree of synteny observed for the fur orthologues points toward an early origin. The identification of the regulators that were lost or gained in each specific case may provide clues concerning the metabolic properties and pathways that are common to the lineage and those whose specificity is more species-related.
Correlation between genome size, ecological niche and Fur homologues
Assuming that genome size is somehow related to the selective pressures acting upon an organism, and that Fur is a global transcriptional regulator able to sense different metals and to regulate the expression of different genes, we argue that the larger the genome size and the underlying selective pressures, the higher the need for Fur regulators and the lower the pressure toward gene loss. Indeed, the number of Fur homologues tends to increase with increasing genome size. As seen in Table 1, organisms with genomes smaller than 2 Mbps have no Fur homologues, organisms with genomes between 2 and 5 Mbps have 1 or 2 Fur homologues, while organisms with genomes between 5 and 7 Mbps have 3 Fur homologues and organisms with genomes between 7 and 9 Mbps have 4 Fur homologues, with seven exceptions: A. cellulolyticus, M. avium 104, M. ulcerans, Nocardioides JS614, Rhodococcus RHA1, R. xylanophilus and S. avermitilis. This relation finds support in statistical analysis: the Pearson Product Moment Correlation was calculated using genome size and number of Furs as variables, and the result was 0.823, statistically different from 0 with α = 0.05.
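For readers who wish to reproduce this kind of analysis from Table 1, the correlation coefficient and the usual t statistic for testing it against zero can be computed as sketched below. This is a generic illustration and not the XLSTAT procedure used in the study; the function names are hypothetical, and the sample size passed to `t_statistic` has to be the number of genomes actually included, which the text does not state explicitly.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))
```

The t value is then compared with the two-sided critical value of Student's t distribution at the chosen α and n - 2 degrees of freedom.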
T. whipplei spp. is an interesting case. Adaptation to a strictly host-adapted lifestyle has led to gene loss and several metabolic pathways, namely those related to amino acid biosynthesis and energy production, have been lost . In this scenario, the need for regulators is reduced, leading to the loss of Fur proteins. On the other hand, the ecological niche occupied by each organism may explain the seven exceptions noted above. A. cellulolyticus, Nocardioides JS614, and R. xylanophilus have one more Fur than predicted by their chromosome size. While the first organism is a thermophile and the second one inhabits the soil, the third organism exhibits high tolerance to radiation. These considerable stressful situations may have imposed selective pressures that maintained an extra fur homologue in the chromosome, despite its size. On the other hand, M. avium 104 and M. ulcerans have one less homologue than predicted by their chromosome size. This may be explained by the fact that these organisms are host associated: it is known that host-associated bacteria tend to undergo a genome reducing process. Finally, Rhodococcus RHA1 and S. avermitilis also have one less homologue than predicted by their chromosome size. There is no obvious explanation for these cases.
Frankiae, a group of plant microsymbionts able to fix atmospheric nitrogen, illustrates what is stated above. Recently, three Frankia strains have been sequenced, ACN14a, EAN1pec and CcI3, and the presence of a high number of transcriptional regulators has been noted. Interestingly, these genomes present highly divergent sizes: ACN14a has a genome of 7.5 Mbps, encoding 6711 proteins; EAN1pec has a genome of 9.0 Mbps, encoding 7976 proteins; and finally, CcI3 has a genome of 5.4 Mbps, encoding 4499 proteins. One possible explanation proposed to account for the differences in genome size is related to the variation in their lifestyles. While ACN14a and EAN1pec survive well in the soil, CcI3 appears to be undergoing a genome reducing process, becoming more and more dependent on the plant symbiont and less able to survive by itself. The increased gene contents of ACN14a and EAN1pec provide a variety of "extra" genes that allow these strains to survive in a variety of soils and in symbiosis. CcI3 is apparently losing its ability to survive in the free-living state so selective pressures are lower and fewer regulators are needed. This is clearly in agreement with the number of Fur homologues that were identified for each strain: Frankia ACN14a and EAN1pec have 4 homologues each, while CcI3 has only 3.
As mentioned above, there are nine gains of the A homologue that cannot be explained by the proposed model. Although four independent duplications appear as the most parsimonious explanation for their origin, the hypothesis of HGT should also be considered, especially since six out of the nine homologues have a %G+C content different from the genome (MSMEG_6383, Mjls_5253, Mkms_4974, Mmcs_3447, nfa3250 and Noca_0874). The presence of synteny within at least one of the groups of homologues (MSMEG_6253, Mjls_5253, Mmcs_4885, Mkms_4974) favours the hypothesis of duplication but it does not exclude HGT, since the transfer of entire chromosomal fragments (instead of single genes) is possible.
On the other hand, one of the factors that limits HGT is that the transferred genes must outcompete indigenous ones, which are already part of a complex and adapted network, in order to be fixed in the genomes [40, 41]. One expects essential regulatory genes to be stable during evolution. In fact, it is not simple for a regulator like fur to be horizontally transferred, enter a complex network, and establish itself as a major regulator. Furthermore, recent work has suggested that HGT seldom affects orthologues. Therefore, due to the number of the fur homologues in the genomes, their synteny and their nature as global regulators, explaining the presence of these genes by HGT should be done with care, and consequently the %G+C value alone should not be the exclusive argument to sustain an HGT situation.
Regarding the E sequences, the situation is inverted. Besides %G+C values, other factors seem to favour HGT. One could hypothesise three independent duplication events as the origin of these genes: in the Streptomyces ancestor, in the Frankia – Acidothermus common ancestor and in Nocardioides. However, the presence of synteny across these different groups – and not only within them, as happens in the duplicated A homologues – indicates a common origin for all of the genes that constitute them, excluding the three independent duplications as well as three independent HGTs. Another explanatory hypothesis would include a single genomic duplication in the last common ancestor of the organisms involved. However, this last common ancestor is actually the last common ancestor of most of the other Actinobacteria considered, and so the possibility of duplication would imply a high rate of gene loss. In fact, in terms of the number of evolutionary steps, the three successive HGT events described above are the most parsimonious explanation for this group of genes.
The abundance of fur genes in Actinobacteria and their phylogenetic relationship point towards early duplications in the evolution of these regulators, along with additional HGT and later intra-species duplications. A strong synteny between fur orthologue regions is consistent with the proposed model and supported by functional studies. These observations provide clues for future studies concerning the importance of Fur in regulating other systems besides oxidative stress in organisms inhabiting diverse ecological niches and under dissimilar selective pressures. Furthermore, they help to differentiate between the basic essential processes and the species-specific ones. Exploring the phylogeny of regulators at the same time as their functionality and the organisms' ecology is a promising strategy to explore how different bacteria adapt to their various habitats and lifestyles by a fine-tuned control of DNA transcription.
Blast searches and sequence retrieval
In order to identify all the fur homologues in the completely sequenced actinobacterial genomes present in NCBI database (March/2007), a two-step approach was used. Initially, BlastP analyses were performed against each genome individually, using Frankia alni ACN14a FRAAL3168 as the query sequence: only hits with an e-value below or equal to e-05 were retained for further analyses. Afterwards, the retrieved sequences were aligned with Streptomyces reticuli FurS (CAA74697), in order to check for the presence of five key residues shown by D. Ortiz de Orué Lucana et al. (2003) to be essential for Fur functionality: cysteines 96 and 99, histidines 92 and 93 and tyrosine 59.
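The first filtering step can be sketched as follows. The sketch assumes that the BlastP results are available in the standard 12-column tabular format produced by BLAST+ (`-outfmt 6`), in which the subject identifier is the second column and the e-value the eleventh; the study does not say how its searches were run or parsed, so this format and the function name are assumptions. The cutoff "e-05" is read here as 1e-05.

```python
E_VALUE_CUTOFF = 1e-5  # "e-05" in the text, interpreted as 1e-05


def filter_blast_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Keep (subject_id, e_value) for tabular BLAST hits with e-value <= cutoff.

    `lines` is any iterable of tab-separated rows in the default 12-column
    layout: qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
        subject_id = fields[1]
        e_value = float(fields[10])
        if e_value <= cutoff:
            kept.append((subject_id, e_value))
    return kept
```

The retained subject sequences would then go on to the alignment against FurS and the key-residue check.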
Multiple alignments and phylogenetic trees
Multiple alignments were performed using ClustalX 1.81 with all the default parameters. The data set included the Fur homologues' amino-acid sequences, with or without Archaeoglobus fulgidus DSM 4304 Fur (NP_071057) as the outgroup, and 16S ribosomal RNA nucleotide sequences [see Additional File 6] retrieved from each genome page on NCBI. The resulting alignments were used to generate phylogenetic trees by the Neighbor-Joining (NJ) method using the same software, and by the Maximum Likelihood (ML) method using Tree-Puzzle 5.2. Bootstrap values were calculated for the NJ trees using 10000 replicates to evaluate the robustness of the nodes.
For the ML analysis, the evolution model used was the WAG model, selected by Tree-Puzzle as being the one that best described our data. The parameter estimation was exact and used quartet sampling (for substitution process) and NJ data (for rate variation). The chosen tree search procedure was Quartet Puzzling and 50000 puzzling steps were computed in order to obtain the consensus tree.
For each fur homologue, the adjacent regions were visually inspected in order to detect the presence or absence of synteny. The inferred amino acid sequences for the four genes found upstream and downstream of each fur were used as queries in a BlastP search against all the sequenced actinobacterial genomes, and hits with e-values below e-05 were analysed to determine whether any of them corresponded to a protein-encoding gene occupying a similar position relative to a fur orthologue in another organism.
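The neighbourhood comparison can be approximated programmatically. The sketch below is a simplification under stated assumptions, not the visual inspection used in the study: each genome is represented as a linear, ordered list of gene identifiers (strand and replicon circularity are ignored), and BlastP results are assumed to have been reduced to a dictionary mapping each gene to its hits in the other genome. The names `neighbours` and `conserved_neighbours` are hypothetical.

```python
def neighbours(gene_order, fur_gene, window=4):
    """Map each gene within `window` positions of `fur_gene` to its signed offset."""
    i = gene_order.index(fur_gene)
    start = max(0, i - window)
    stop = i + window + 1  # slicing past the end of the list is safe
    return {
        gene: j - i
        for j, gene in enumerate(gene_order[start:stop], start)
        if j != i
    }


def conserved_neighbours(order_a, fur_a, order_b, fur_b, homologues, window=4):
    """List fur neighbours in genome A that have a hit among fur neighbours in genome B.

    Each result is (gene_a, offset_a, gene_b, offset_b); comparing the two
    offsets shows whether the pair occupies a similar position relative to fur.
    """
    near_a = neighbours(order_a, fur_a, window)
    near_b = neighbours(order_b, fur_b, window)
    shared = []
    for gene_a, offset_a in near_a.items():
        for gene_b in homologues.get(gene_a, ()):
            if gene_b in near_b:
                shared.append((gene_a, offset_a, gene_b, near_b[gene_b]))
    return shared
```

An empty result suggests no conserved neighbourhood within the chosen window, while several pairs with matching offsets would be candidates for a syntenic region.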
The Pearson Product Moment Correlation was computed using XLSTAT 2008.2.02.
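For readers without access to XLSTAT, the coefficient itself is straightforward to compute. The following Python is a minimal sketch of the standard Pearson formula (covariance divided by the product of the standard deviations); it is not XLSTAT's implementation, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred cross-products and squares; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / math.sqrt(var_x * var_y)
```

The result ranges from -1 (perfect negative linear relationship) to 1 (perfect positive linear relationship).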
Escolar L, Perez-Martin J, de Lorenzo V: Opening the iron box: transcriptional metalloregulation by the Fur protein. J Bacteriol. 1999, 181 (20): 6223-6229.
Bagg A, Neilands JB: Ferric uptake regulation protein acts as a repressor, employing iron (II) as a cofactor to bind the operator of an iron transport operon in Escherichia coli. Biochemistry. 1987, 26 (17): 5471-5477. 10.1021/bi00391a039.
Hantke K: Regulation of ferric iron transport in Escherichia coli K12: isolation of a constitutive mutant. Mol Gen Genet. 1981, 182 (2): 288-292. 10.1007/BF00269672.
Holmes K, Mulholland F, Pearson BM, Pin C, McNicholl-Kennedy J, Ketley JM, Wells JM: Campylobacter jejuni gene expression in response to iron limitation and the role of Fur. Microbiology. 2005, 151 (Pt 1): 243-257. 10.1099/mic.0.27412-0.
van Vliet AHM, Ketley JM, Park SF, Penn CW: The role of iron in Campylobacter gene regulation, metabolism and oxidative stress defense. FEMS Microbiol Rev. 2002, 26 (2): 173-186.
Delany I, Spohn G, Pacheco AB, Ieva R, Alaimo C, Rappuoli R, Scarlato V: Autoregulation of Helicobacter pylori Fur revealed by functional analysis of the iron-binding site. Mol Microbiol. 2002, 46 (4): 1107-1122. 10.1046/j.1365-2958.2002.03227.x.
Delany I, Spohn G, Rappuoli R, Scarlato V: The Fur repressor controls transcription of iron-activated and -repressed genes in Helicobacter pylori. Mol Microbiol. 2001, 42 (5): 1297-1309. 10.1046/j.1365-2958.2001.02696.x.
Ochsner UA, Vasil AI, Vasil ML: Role of the ferric uptake regulator of Pseudomonas aeruginosa in the regulation of siderophores and exotoxin A expression: purification and activity on iron-regulated promoters. J Bacteriol. 1995, 177 (24): 7194-7201.
Ahn B-E, Cha J, Lee E-J, Han A-R, Thompson CJ, Roe J-H: Nur, a nickel-responsive regulator of the Fur family, regulates superoxide dismutases and nickel transport in Streptomyces coelicolor. Mol Microbiol. 2006, 59 (6): 1848-1858. 10.1111/j.1365-2958.2006.05065.x.
Lucarelli D, Russo S, Garman E, Milano A, Meyer-Klaucke W, Pohl E: Crystal structure and function of the zinc uptake regulator FurB from Mycobacterium tuberculosis. J Biol Chem. 2007, 282 (13): 9914-9922. 10.1074/jbc.M609974200.
Bellini P, Hemmings AM: In vitro characterization of a bacterial manganese uptake regulator of the fur superfamily. Biochemistry. 2006, 45 (8): 2686-2698. 10.1021/bi052081n.
Coombs JM, Barkay T: Molecular Evidence for the Evolution of Metal Homeostasis Genes by Lateral Gene Transfer in Bacteria from the Deep Terrestrial Subsurface. Appl Environ Microbiol. 2004, 70 (3): 1698-1707. 10.1128/AEM.70.3.1698-1707.2004.
Beard SJ, Hashim R, Membrillo-Hernandez J, Hughes MN, Poole RK: Zinc(II) tolerance in Escherichia coli K-12: evidence that the zntA gene (o732) encodes a cation transport ATPase. Mol Microbiol. 1997, 25 (5): 883-891. 10.1111/j.1365-2958.1997.mmi518.x.
Rensing C, Fan B, Sharma R, Mitra B, Rosen BP: CopA: An Escherichia coli Cu(I)-translocating P-type ATPase. PNAS. 2000, 97 (2): 652-656. 10.1073/pnas.97.2.652.
Rutherford JC, Cavet JS, Robinson NJ: Cobalt-dependent Transcriptional Switching by a Dual-effector MerR-like Protein Regulates a Cobalt-exporting Variant CPx-type ATPase. J Biol Chem. 1999, 274 (36): 25827-25832. 10.1074/jbc.274.36.25827.
Rodionov DA, Dubchak I, Arkin A, Alm E, Gelfand MS: Reconstruction of regulatory and metabolic pathways in metal-reducing δ-proteobacteria. Genome Biol. 2004, 5:
Ortiz de Orue Lucana D, Schrempf H: The DNA-binding characteristics of the Streptomyces reticuli regulator FurS depend on the redox state of its cysteine residues. Mol Gen Genet. 2000, 264 (3): 341-353. 10.1007/s004380000328.
Hahn JS, Oh SY, Chater KF, Cho YH, Roe JH: H2O2-sensitive fur-like repressor CatR regulating the major catalase gene in Streptomyces coelicolor. J Biol Chem. 2000, 275 (49): 38254-38260. 10.1074/jbc.M006079200.
Milano A, Forti F, Sala C, Riccardi G, Ghisotti D: Transcriptional regulation of furA and katG upon oxidative stress in Mycobacterium smegmatis. J Bacteriol. 2001, 183 (23): 6801-6806. 10.1128/JB.183.23.6801-6806.2001.
Ortiz de Orue Lucana D, Troller M, Schrempf H: Amino acid residues involved in reversible thiol formation and zinc ion binding in the Streptomyces reticuli redox regulator FurS. Mol Genet Genomics. 2003, 268 (5): 618-627.
Sala C, Forti F, Di Florio E, Canneva F, Milano A, Riccardi G, Ghisotti D: Mycobacterium tuberculosis FurA autoregulates its own expression. J Bacteriol. 2003, 185 (18): 5357-5362. 10.1128/JB.185.18.5357-5362.2003.
Hahn JS, Oh SY, Roe JH: Regulation of the furA and catC operon, encoding a ferric uptake regulator homologue and catalase-peroxidase, respectively, in Streptomyces coelicolor A3(2). J Bacteriol. 2000, 182 (13): 3767-3774. 10.1128/JB.182.13.3767-3774.2000.
Zou P, Borovok I, Ortiz de Orue Lucana D, Muller D, Schrempf H: The mycelium-associated Streptomyces reticuli catalase-peroxidase, its gene and regulation by FurS. Microbiology. 1999, 145 (Pt 3): 549-559.
Pym AS, Domenech P, Honore N, Song J, Deretic V, Cole ST: Regulation of catalase-peroxidase (KatG) expression, isoniazid sensitivity and virulence by furA of Mycobacterium tuberculosis. Mol Microbiol. 2001, 40 (4): 879-889. 10.1046/j.1365-2958.2001.02427.x.
Harvie DR, Vilchez S, Steggles JR, Ellar DJ: Bacillus cereus Fur regulates iron metabolism and is required for full virulence. Microbiology. 2005, 151 (Pt 2): 569-577. 10.1099/mic.0.27744-0.
Delany I, Rappuoli R, Scarlato V: Fur functions as an activator and as a repressor of putative virulence genes in Neisseria meningitidis. Mol Microbiol. 2004, 52 (4): 1081-1090. 10.1111/j.1365-2958.2004.04030.x.
Lee JW, Helmann JD: Functional specialization within the Fur family of metalloregulators. Biometals. 2007, 20 (3–4): 485-499. 10.1007/s10534-006-9070-7.
Prescott L, Harley JP, Klein DA: Microbiology. 1999, The McGraw-Hill Companies, 4
Mohagheghi A, Grohmann K, Himmel M, Leighton L, Updegraff MD: Isolation and characterization of Acidothermus cellulolyticus gen. nov., sp. nov., a new genus of thermophilic, acidophilic, cellulolytic bacteria. Int J Syst Bacteriol. 1986, 36: 435-443.
Mevs U, Stackebrandt E, Schumann P, Gallikowski CA, Hirsch P: Modestobacter multiseptatus gen. nov., sp. nov., a budding actinomycete from soils of the Asgard Range (Transantarctic Mountains). Int J Syst Evol Microbiol. 2000, 50 (Pt 1): 337-346.
Phillips RW, Wiegel J, Berry CJ, Fliermans C, Peacock AD, White DC, Shimkets LJ: Kineococcus radiotolerans sp. nov., a radiation-resistant, gram-positive bacterium. Int J Syst Evol Microbiol. 2002, 52 (Pt 3): 933-938. 10.1099/ijs.0.02029-0.
Warnecke F, Sommaruga R, Sekar R, Hofer JS, Pernthaler J: Abundances, identity, and growth state of actinobacteria in mountain lakes of different UV transparency. Appl Environ Microbiol. 2005, 71 (9): 5551-5559. 10.1128/AEM.71.9.5551-5559.2005.
Hooper SD, Berg OG: Duplication is more common among laterally transferred genes than among indigenous genes. Genome biology. 2003, 4 (8): R48-10.1186/gb-2003-4-8-r48.
Rodionov DA, Gelfand MS, Todd JD, Curson AR, Johnston AW: Computational Reconstruction of Iron- and Manganese-Responsive Transcriptional Networks in alpha-Proteobacteria. PLoS Comput Biol. 2006, 2 (12): e163-10.1371/journal.pcbi.0020163.
Hooper SD, Berg OG: On the nature of gene innovation: duplication patterns in microbial genomes. Mol Biol Evol. 2003, 20 (6): 945-954. 10.1093/molbev/msg101.
Mirny LA, Gelfand MS: Using orthologous and paralogous proteins to identify specificity-determining residues in bacterial transcription factors. J Mol Biol. 2002, 321: 7-20. 10.1016/S0022-2836(02)00587-9.
Guerrero G, Peralta H, Aguilar A, Diaz R, Villalobos MA, Medrano-Soto A, Mora J: Evolutionary, structural and functional relationships revealed by comparative analysis of syntenic genes in Rhizobiales. BMC Evol Biol. 2005, 5: 55-10.1186/1471-2148-5-55.
Bentley SD, Maiwald M, Murphy LD, Pallen MJ, Yeats CA, Dover LG, Norbertczak HT, Besra GS, Quail MA, Harris DE, et al: Sequencing and analysis of the genome of the Whipple's disease bacterium Tropheryma whipplei. The Lancet. 2003, 361 (9358): 637-644. 10.1016/S0140-6736(03)12597-4.
Normand P, Lapierre P, Tisa LS, Gogarten JP, Alloisio N, Bagnarol E, Bassi CA, Berry AM, Bickhart DM, Choisne N, et al: Genome characteristics of facultatively symbiotic Frankia sp. strains reflect host range and host plant biogeography. Genome Res. 2007, 17 (1): 7-15. 10.1101/gr.5798407.
Kurland CG: Something for everyone. Horizontal gene transfer in evolution. EMBO reports. 2000, 1 (2): 92-95. 10.1093/embo-reports/kvd042.
Berg OG, Kurland CG: Evolution of microbial genomes: sequence acquisition and loss. Mol Biol Evol. 2002, 19 (12): 2265-2276.
Daubin V, Moran NA, Ochman H: Phylogenetics and the cohesion of bacterial genomes. Science. 2003, 301 (5634): 829-832. 10.1126/science.1086568.
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25 (24): 4876-4882. 10.1093/nar/25.24.4876.
Saitou N, Nei M: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987, 4 (4): 406-425.
Felsenstein J: Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981, 17 (6): 368-376. 10.1007/BF01734359.
Schmidt HA, Strimmer K, Vingron M, von Haeseler A: TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 2002, 18: 502-504. 10.1093/bioinformatics/18.3.502.
Felsenstein J: Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985, 39: 783-791. 10.2307/2408678.
Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 2001, 18 (5): 691-699.
Perrière G, Gouy M: WWW-Query: An on-line retrieval system for biological sequence banks. Biochimie. 1996, 78: 364-369. 10.1016/0300-9084(96)84768-7.
PhyloDraw: A Phylogenetic Tree Drawing System. [http://pearl.cs.pusan.ac.kr/phylodraw/]
The authors are grateful to Marta V. Mendes for careful reading of the manuscript and useful discussion. This work was funded by a research grant (POCTI/BCI/35283/2000) from Fundação para a Ciência e Tecnologia (FCT, Portugal), and by a GRICES/CNRS mobility exchange grant. CLS was supported by the FCT fellowship SFRH/BD/21461/2005.
CLS, JV, FT and PN conceived the study. CLS collected and analysed the data and wrote the manuscript. DRB, LST, AMB and PM–F assisted in the drafting and provided substantial editorial advice and a critical revision of the manuscript. FT and PN helped in coordinating the study. All authors have read and approved the manuscript.