Evolutionary diversification and characterization of the eubacterial gene family encoding DXR type II, an alternative isoprenoid biosynthetic enzyme
© Carretero-Paulet et al.; licensee BioMed Central Ltd. 2013
Received: 14 May 2013
Accepted: 16 August 2013
Published: 3 September 2013
Isoprenoids constitute a vast family of natural compounds performing diverse and essential functions in all domains of life. In most eubacteria, isoprenoids are synthesized through the methylerythritol 4-phosphate (MEP) pathway. The production of MEP is usually catalyzed by deoxyxylulose 5-phosphate reductoisomerase (DXR-I) but a few organisms use an alternative DXR-like enzyme (DXR-II).
Searches through 1498 bacterial complete proteomes detected 130 sequences with similarity to DXR-II. Phylogenetic analysis identified three well-resolved clades: the DXR-II family (clustering 53 sequences including eleven experimentally verified as functional enzymes able to produce MEP), and two previously uncharacterized NAD(P)-dependent oxidoreductase families (designated DLO1 and DLO2 for DXR-II-like oxidoreductases 1 and 2). Our analyses identified amino acid changes critical for the acquisition of DXR-II biochemical function through type-I functional divergence, two of them mapping onto key residues for DXR-II activity. DXR-II showed a markedly discontinuous distribution, which was verified at several levels: taxonomic (being predominantly found in Alphaproteobacteria and Firmicutes), metabolic (being mostly found in bacteria with complete functional MEP pathways with or without DXR-I), and phenotypic (as no biological/phenotypic property was found to be preferentially distributed among DXR-II-containing strains, apart from pathogenicity in animals). By performing a thorough comparative sequence analysis of GC content, 3:1 dinucleotide frequencies, codon usage and codon adaptation indexes (CAI) between DXR-II sequences and their corresponding genomes, we examined the role of horizontal gene transfer (HGT), as opposed to a scenario of massive gene loss, in the evolutionary origin and diversification of the DXR-II subfamily in bacteria.
Our analyses support a single origin of the DXR-II family through functional divergence, in what constitutes an exceptional model of acquisition and maintenance of redundant gene functions between non-homologous genes as a result of convergent evolution. Subsequently, although old episodic events of HGT could not be excluded, the results supported a prevalent role of gene loss in explaining the distribution of DXR-II in specific pathogenic eubacteria. Our results highlight the importance of the functional characterization of evolutionary shortcuts in isoprenoid biosynthesis for screening specific antibacterial drugs and for regulating the production of isoprenoids of human interest.
Keywords: DXR-II; Isoprenoid metabolism; Horizontal gene transfer; Gene loss; Functional divergence
Isoprenoids are essential in all eubacteria in which they have been studied, playing key roles in several core cellular functions, e.g. ubiquinones and menaquinones, which act as electron carriers of the aerobic and anaerobic respiratory chains respectively, and dolichols, which are required for cell wall peptidoglycan synthesis. Because of the essential role of the MEP pathway in most eubacteria and its absence from animals, it has been proposed as a promising new target for the development of novel antibiotics [14, 15]. Besides that, many isoprenoids also have substantial industrial, pharmacological, and nutritional interest. Therefore, understanding the biochemical and genetic plasticity of isoprenoid biosynthesis in bacteria is crucial both for attempts to block it pharmacologically and for its use in biofactories producing isoprenoids of human interest.
The occurrence of alternative enzymes for isoprenoid biosynthesis in specific bacterial lineages has been previously reported. The enzyme 3-hydroxy-3-methyl-glutaryl-CoA reductase (HMGR), which catalyzes the rate-limiting step of the MVA pathway, is structurally distant from its archaebacterial and eukaryotic homologs in most eubacteria [8, 18, 19]. Similarly, two different classes of isopentenyl diphosphate isomerase (IDI), the enzyme catalyzing the isomerization of IPP to produce DMAPP, have been identified in bacteria: type I IDI (similar to its animal, fungi and plant counterparts) and type II IDI, acquired from archaebacteria and apparently unrelated to the latter [20–22]. Although IDI activity is only essential in organisms dependent on the MVA pathway for IPP and isoprenoid biosynthesis, both types of IDI have been identified in bacterial strains dependent on the MEP pathway.
We recently reported the occurrence of a group of bacteria harbouring the entire set of enzymes of the MEP pathway with the exception of 1-deoxy-d-xylulose 5-phosphate (DXP) reductoisomerase (DXR), the enzyme catalyzing the NADPH-dependent production of MEP from DXP in the first committed step of the pathway. In these species, a novel family of previously uncharacterized oxidoreductases related to homoserine dehydrogenases (HD) involved in the common pathway (CP) of amino acid biosynthesis (Figure 1) was found to perform the DXR biochemical reaction. This alternative enzyme, referred to as DXR-like (DRL) or DXR type II (DXR-II) to distinguish it from the canonical DXR (renamed DXR-I), displayed a markedly discontinuous distribution. DXR-II was found forming single or multigene families in bacterial strains from diverse taxonomic groups, independent of the presence or absence of a DXR-I sequence in their genome.
Different evolutionary scenarios might explain DXR-II emergence and evolutionary diversification. In this study we examined how the DXR-II family emerged through functional divergence from related oxidoreductase families and identified amino acid changes critical for the acquisition of its specific biochemical function. Furthermore, we assessed the contrasting roles of horizontal gene transfer (HGT) and massive gene loss, major forces in microbial genome evolution known to affect other genes involved in IPP and isoprenoid biosynthesis, in the discontinuous distribution of DXR-II across eubacteria.
DXR-IIs cluster into a single clade closely related to two uncharacterized oxidoreductase families
List of DXR-II and DLO related sequences examined in this study
GenBank and RefSeq
GenBank and RefSeq
Anaerococcus prevotii DSM 20548
Frankia sp. EuI1c
Bacillus clausii KSM-K16
Gloeobacter violaceus PCC 7421
Bacillus halodurans C-125
Hirschia baltica ATCC 49814
Bacillus pumilus SAFR-032
Kineococcus radiotolerans SRS30216
Bartonella bacilliformis KC583
Methanosphaerula palustris E1-9c
Bartonella clarridgeiae 73
Nakamurella multipartita DSM 44233
Bartonella grahamii as4aup
Nostoc azollae 0708
Bartonella henselae str. Houston-1
Nostoc punctiforme PCC 73102
Bartonella quintana str. Toulouse
Nostoc sp. PCC 7120
Bartonella tribocorum CIP 105476
Pseudomonas stutzeri A1501
Brucella abortus bv. 1 str. 9-941
Pseudomonas stutzeri ATCC 17588 = LMG 11199
Brucella abortus S19
Pseudoxanthomonas spadix BD-a59
Brucella canis ATCC 23365
Ramlibacter tataouinensis TTB310
Brucella melitensis ATCC 23457
Rhodobacter sphaeroides 2.4.1
Brucella melitensis biovar Abortus 2308
Rhodobacter sphaeroides ATCC 17025
Brucella melitensis bv. 1 str. 16 M
Rhodobacter sphaeroides ATCC 17029
Brucella microti CCM 4915
Rhodobacter sphaeroides KD131
Brucella ovis ATCC 25840
Rhodothermus marinus DSM 4252
Brucella pinnipedialis B2/94
Rhodothermus marinus SG0.5JP17-172
Brucella suis 1330
Sphingomonas wittichii RW1
Brucella suis ATCC 23445
Streptomyces griseus subsp. griseus NBRC 13350
Chelativorans sp. BNC1
Xanthomonas campestris pv. campestris str. 8004
Chloroflexus aurantiacus J-10-fl
Xanthomonas campestris pv. campestris str. ATCC 33913
Chloroflexus sp. Y-400-fl
Xanthomonas campestris pv. campestris str. B100
Clostridium difficile 630
Achromobacter xylosoxidans A8
Clostridium difficile CD196
Acidiphilium cryptum JF-5
Clostridium difficile R20291
Eubacterium limosum KIST612
Acidovorax ebreus TPSY
Finegoldia magna ATCC 29328
Acidovorax sp. JS42
Actinosynnema mirum DSM 43827
Listeria innocua Clip11262
Agrobacterium sp. H13-3
Agrobacterium tumefaciens str. C58
Anaeromyxobacter sp. Fw109-5
Listeria monocytogenes 08-5923
Arthrobacter sp. FB24
Listeria monocytogenes EGD-e
Azorhizobium caulinodans ORS 571
Listeria monocytogenes HCC23
Bordetella avium 197 N
Listeria monocytogenes serotype 4b str. CLIP 80459
Bordetella bronchiseptica RB50
Listeria monocytogenes serotype 4b str. F2365
Bordetella parapertussis 12822
Listeria welshimeri serovar 6b str. SLCC5334
Bordetella petrii DSM 12804
Mesorhizobium ciceri biovar biserrulae WSM1271
Bradyrhizobium japonicum USDA 110
Mesorhizobium loti MAFF303099 (1)
Bradyrhizobium sp. BTAi1
Mesorhizobium loti MAFF303099 (2)
Bradyrhizobium sp. ORS278
Mesorhizobium opportunistum WSM2075
Candidatus Pelagibacter ubique HTCC1062
Ochrobactrum anthropi ATCC 49188 (1)
Cupriavidus necator N-1
Ochrobactrum anthropi ATCC 49188 (2)
Pelagibacterium halotolerans B2
Methylibium petroleiphilum PM1
Roseobacter litoralis Och 149
Methylobacterium nodulans ORS 2060
Sebaldella termitidis ATCC 33386
Methylobacterium radiotolerans JCM 2831
Sinorhizobium fredii NGR234
Methylobacterium sp. 4-46
Starkeya novella DSM 506
Mycobacterium smegmatis str. MC2 155
Tepidanaerobacter sp. Re1
Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111
Thermosediminibacter oceani DSM 16646
Paracoccus denitrificans PD1222
Verminephrobacter eiseniae EF01-2
Polaromonas sp. JS666
Anabaena variabilis ATCC 29413
Polymorphum gilvum SL003B-26A1
Chloroflexus aggregans DSM 9485
Polynucleobacter necessarius subsp. asymbioticus QLW-P1DMWA-1
Coraliomargarita akajimensis DSM 45221
Pusillimonas sp. T7-7
Coxiella burnetii CbuG_Q212
Rhodopseudomonas palustris BisB5
Coxiella burnetii CbuK_Q154
Rhodospirillum rubrum ATCC 11170
Coxiella burnetii Dugway 5 J108-111
Spirochaeta smaragdinae DSM 11293
Coxiella burnetii RSA 331
Spirochaeta sp. Buddy
Coxiella burnetii RSA 493
Streptomyces flavogriseus ATCC 33331
Cyanothece sp. PCC 7425
Streptomyces sp. SirexAAcpoE
Cyclobacterium marinum DSM 745
Variovorax paradoxus EPS
Deinococcus maricopensis DSM 21211
Variovorax paradoxus S110
Desulfococcus oleovorans Hxd3
Xanthobacter autotrophicus Py2
Using the amino acid sequence alignment of the resulting full dataset of 130 hits (Additional file 1), a maximum likelihood (ML) phylogenetic analysis was performed (Figure 2 and Additional file 2). Alternative methods of phylogenetic inference (Bayesian, Additional file 3, and neighbor joining, Additional file 4) were also implemented, resulting in trees with almost identical topologies. Three main clades were consistently retrieved with high support values (Figure 2). A clade grouping 53 sequences, including 11 encoding functional DXR-II as shown in complementation assays (previously reported and in Additional file 5), was designated as the DXR-II family and likely corresponds to actual DXR-II sequences (Figure 2). The remaining 77 sequences cluster into two additional clades and might not be true functional DXR-II sequences (Figure 2). As such, these were tentatively designated DLO1 and DLO2, for DXR-II-Like Oxidoreductases 1 and 2. Indeed, four representative sequences belonging to the DLO1 and DLO2 families had also been previously tested for DXR-II activity, failing to complement the DXR-defective mutant (Figure 2).
DXR-II and DLO sequences showed similarity to NAD(P)-dependent oxidoreductases, and particularly to HD enzymes, at both the sequence and structural levels. Correspondingly, searches for INTERPRO functional domains identified the NAD-binding domain with a core Rossmann-type fold at the N-terminal region of every single protein sequence (domain 1; Figure 2). Up to five additional domains could also be found in DXR-II and DLO proteins. To examine whether these protein domains were differentially distributed across the DXR-II, DLO1, and DLO2 families, we mapped the architecture of protein domains onto the corresponding tree (Figure 2). Most sequences from the DXR-II family shared NAD-binding (domain 1) and SAF (domain 6) domains, while a significant fraction also included N-terminal NAD/NADP-binding domains of aspartate/homoserine dehydrogenase (domain 2). However, no common domain architecture was shared among proteins within families DLO1 and DLO2.
The DXR-II family emerged through functional divergence
Phylogenetic analysis revealed the shared ancestry of all functional DXR-II, supporting their common evolutionary origin, and suggested the functional divergence of this family from related oxidoreductases through the acquisition of DXR-II specific biochemical activity. To examine the role of specific amino acid substitutions in functional specialization of DXR-II protein sequences, two different statistical approaches under a ML framework were followed. The first one permits the detection of amino acid sites subjected to different evolutionary rates between the families under examination, i.e., highly conserved in one family but variable in the other (type-I functional divergence). The second approach relies on site-specific shifts of amino acid physicochemical properties in positions otherwise highly conserved in each family (type-II functional divergence).
Analysis of functional divergence
Columns: comparison; coefficient θ ± SE; critical amino acid sites (Qk > 0.7; *, Qk > 0.95)
Type-I functional divergence (θ1)
DXR-II vs DLO1; θ1 = 0.277 ± 0.045 (LRT = 83.233; p = 7.292E-20); sites: 35, 46, 118, 121, 146, 161*, 176, 198*, 205*, 218, 229, 234, 237, 247*, 265, 282*, 291, 297, 310, 340, 342, 351*, 353*, 376, 404*, 410, 422, 424, 429
DXR-II vs DLO2; θ1 = 0.253 ± 0.043 (LRT = 114.991; p = 7.907E-27); sites: 35, 47, 64, 122*, 128, 133, 197*, 202, 205, 210, 239, 248, 250*, 253*, 258*, 260, 282, 291, 296, 305, 310*, 311*, 314, 320, 324*, 330*, 346, 351*, 359, 383*, 410*, 413, 428*, 432
Type-II functional divergence (θ2)
DXR-II vs DLO1; θ2 = −0.998 ± 0.487
DXR-II vs DLO2; θ2 = −1.115 ± 0.575
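The p-values attached to the two θ1 likelihood ratio tests can be checked with a short calculation. The sketch below assumes that each LRT statistic is compared against a chi-square distribution with one degree of freedom; that reference distribution is an assumption made here for illustration, and the function name is hypothetical. For one degree of freedom the upper-tail probability reduces to the complementary error function, so only the standard library is needed.

```python
import math


def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    If X = Z**2 with Z standard normal, then
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# LRT statistics taken from the table above (type-I functional divergence).
reported = {
    "DXR-II vs DLO1": 83.233,
    "DXR-II vs DLO2": 114.991,
}

for comparison, lrt in reported.items():
    print(f"{comparison}: LRT = {lrt}, p = {chi2_sf_1df(lrt):.3e}")
```

Under that assumption the computed tail probabilities should fall close to the tabulated p-values, which is consistent with both θ1 estimates being significantly greater than zero.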
These sites were mapped onto the corresponding amino acid sequence alignment (Additional file 1 and Additional file 6: Table S1). At many of these sites, amino acid residues are highly conserved in DXR-II sequences, but are variable in the DLO1 (e.g. positions 161 and 429 in B. melitensis biovar abortus 2308 DXR-II), the DLO2 (e.g. positions 210, 248 and 324), or both the DLO1 and the DLO2 (e.g. positions 35, 64, 118, 121, 122, 133, 197, 229, 250, 291, 320, 330, 346, 351, 353, 413, 428, 429, 432) families, likely reflecting a change in their functional roles. Some apparently represented minor changes, as they involved amino acids with similar physicochemical features (e.g. positions 291 or 428). Some others involved radical amino acid changes, such as position 121, occupied by the highly conserved Gly in DXR-II proteins, but also by the unrelated Ala and Ser amino acids in DLO1 and DLO2 proteins. Another example is position 229, filled by the absolutely conserved polar amino acid Thr in DXR-II proteins, but replaced by the highly hydrophobic Leu, Ile and Val amino acids in DLO1 or the physicochemically unrelated Pro, Ser and Ala residues in DLO2. Likewise, position 250, with a basic polar His found in all but four DXR-II proteins was replaced by different hydrophobic amino acids, and finally position 351, with a conserved Val in most DXR-II proteins was substituted by different physicochemically unrelated amino acids in DLO1 and DLO2 proteins.
DXR-IIs show a discontinuous taxonomic, metabolic and phenotypic distribution among eubacteria
The markedly scattered distribution of sequences belonging to the DXR-II family across higher order eubacterial taxonomic groups was previously observed. In this up-to-date survey, DXR-IIs were found encoded in the genomes of free-living eubacterial strains mostly from Alphaproteobacteria (26 strains, mainly from the genera Brucella, 11, and Bartonella, 6) and Firmicutes (21 strains, mainly from the genus Listeria, 9). However, genes coding for functional DXR-II representatives were also found in the genomes of three additional distantly related bacterial taxonomic lineages, i.e. the Chloroflexi, Betaproteobacteria and Fusobacteria (Figure 2). Within the DXR-II family, Alphaproteobacteria, Firmicutes and Chloroflexi sequences clustered into separate subclades, while the single Betaproteobacteria and Fusobacteria representatives grouped within the Alphaproteobacteria and Firmicutes subclades, respectively (Figure 2).
We examined the distribution of functional DXR-II at lower taxonomical levels. For example, the occurrence of discontinuities was evident when we mapped DXR-II onto a tree depicting the evolutionary relationships of 72 alphaproteobacterial species (Additional file 7). DXR-II genes could only be found in the genomes of 25 strains among the 64 with fully sequenced genomes represented in the tree. They mainly belong to the order Rhizobiales, although significant hits were also retrieved from other taxonomic ranks, such as Rhodospirillales or Rhodobacteraceae. Within these alphaproteobacterial groups, strains whose genomes contained genes both encoding and not encoding DXR-II and/or DXR-I could be found. Discontinuities in DXR-II distribution could be appreciated with, e.g., the closely related pairs of Rhodospirillales species Magnetospirillum magneticum AMB-1/Rhodospirillum rubrum ATCC 11170 and Acidiphilium cryptum JF-5/Gluconobacter oxydans 621H. More strikingly, we retrieved a DXR-II sequence in only one out of the five examined genomes of strains from Rhodopseudomonas palustris (strain BisB5), a feature perhaps related to the metabolic versatility attributed to this species (Additional file 7). A similar patchy distribution of DXR-II was observed when DXR-II and DXR-I were mapped onto a phylogeny of Firmicutes (Additional file 8).
Searches for enzymes of the MEP and MVA pathways of IPP and isoprenoid biosynthesis were also performed (Additional file 6: Table S2). The 51 DXR-II-containing eubacterial strains were classified according to the distribution of enzymes of these pathways, revealing the occurrence of multiple patterns (Figure 2 and Additional file 6: Table S3). The majority of surveyed eubacterial genomes contained genes coding for enzymes of the MEP pathway, but a significant number of them had lost one or more of these enzymes. DXR-I would have been preferentially lost among Alphaproteobacterial strains, but some losses were also found in Firmicutes and Chloroflexi (class A). These species would then exclusively rely on DXR-II for IPP biosynthesis through the MEP pathway. A group mainly composed of Firmicutes strains showed genes encoding both DXR-II and DXR-I (class B). A significant number of genomes also encoded enzymes of the MVA pathway. Some of these strains would then use solely the MVA pathway for isoprenoid biosynthesis, such as the two Chloroflexi representatives (class C). DXR-II activity has been experimentally shown for one of these strains, Chloroflexus aurantiacus J-10-fl, by complementation assays (Additional file 5). Most of them also have a complete and functional MEP pathway, such as Listeria monocytogenes (class D). Finally, in the genomes of two Firmicutes strains (Anaerococcus prevotii DSM 20548 and Finegoldia magna ATCC 29328) no genes encoding enzymes from the MEP (apart from DXR-II) or the MVA pathways could be found (class E). Interestingly, however, DXR-II activity had been confirmed experimentally for the latter.
Similarly, the distribution of DXR-II was compared to that of enzymes of the CP pathway of amino acid biosynthesis. The CP comprises three enzymatic steps. The first is the phosphorylation of aspartate, carried out by aspartate kinase (AK) and leading to β-aspartyl-phosphate, which in turn is converted by aspartate semialdehyde dehydrogenase (ASDH) to aspartate semialdehyde. Subsequently, HD catalyses the reduction of aspartate beta-semialdehyde into homoserine, in the third and last step of the CP pathway (Figure 1). The evolutionary diversification of enzymes of the CP in bacteria is known to have been shaped by gene duplication and fusion events, resulting in bifunctional AK_HD proteins. Most genomes of the 51 DXR-II-containing strains encoded AK and HD. The genomes of five strains also showed bifunctional AK_HD genes, while the genomes of only three Alphaproteobacteria strains encoded ASDH and were believed to have functional CP (class B) (Figure 1 and Additional file 6: Table S3). However, none of the genomes of DXR-II-containing strains encoded the complete set of enzymes of the CP (class A, AK, HD, AK_HD and ASDH).
Distribution of biological properties in DXR-II and non-DXR-II containing bacterial strains and statistical tests of enrichment (table columns: number of strains, optimal temperature, genome size, GC content)
Comparative sequence-based analysis of HGT in DXR-II evolution
The markedly discontinuous phylogenetic distribution shown by DXR-II might be explained by recurrent events of HGT occurring between unrelated bacterial strains. So long as a DXR-II sequence retains sequence features of the donor strain that are significantly distinct from those of the genome of the recipient strain, it could be inferred as having been acquired by HGT. Consequently, comparative nucleotide sequence analyses of DXR-II genes against their host genomes could yield clues about their origin and the putative role of HGT in the distribution of DXR-II across eubacteria.
Several methods and criteria were applied to identify signatures of HGT (see Methods for a complete description). Firstly, GC content at each of the three codon positions (GC1, GC2, GC3), as well as the total (GCt), was estimated. As previously observed [34, 35], GC content was relatively constant among genes of a particular species’ genome, although displaying wide variation among species (Additional file 6: Table S4). This was particularly evident at the third codon position, as the majority of these sites are synonymous and, consequently, differences due to mutational biases are higher. In contrast, the first and second codon positions appear to be more conserved between genomes and are, consequently, less informative (Additional file 6: Table S4). The GC contents of all DXR-II coding sequences were compared to the mean for all genes encoded by the corresponding genomes. DXR-II from both Chloroflexi representatives and the single Fusobacteria representative Sebaldella termitidis ATCC 33386 showed significantly lower GCt and GC3 content relative to the respective mean for all genes in the genome (Additional file 6: Table S4). A fourth bacterial strain, Rhizobium NGR234, showed higher GCt and GC3 content (Additional file 6: Table S4).
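The GC comparison described above can be illustrated with a minimal sketch. It assumes in-frame coding sequences given as plain strings, and it expresses the deviation of a gene from its genome as a z-score against the per-gene values of that genome; the z-score is an assumption chosen for illustration, not the criterion detailed in Methods, and both function names are hypothetical.

```python
import statistics


def gc_by_codon_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3 and overall (GCt)."""
    seq = cds.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # drop a trailing partial codon
    if not seq:
        raise ValueError("sequence shorter than one codon")
    result = {}
    for pos in range(3):
        bases = seq[pos::3]  # every base found at this codon position
        result[f"GC{pos + 1}"] = sum(b in "GC" for b in bases) / len(bases)
    result["GCt"] = sum(b in "GC" for b in seq) / len(seq)
    return result


def z_score(gene_value, genome_values):
    """Deviation of one gene from the genome-wide mean, in standard deviations.

    Assumption: a simple z-score stands in for the significance test.
    """
    mean = statistics.mean(genome_values)
    spread = statistics.stdev(genome_values)
    return (gene_value - mean) / spread


if __name__ == "__main__":
    print(gc_by_codon_position("ATGGCC"))
    print(z_score(0.30, [0.50, 0.60, 0.70]))
```

In practice the GC3 value of a DXR-II gene would be passed to `z_score` together with the GC3 values of every other gene in the same genome, so that a strongly negative score corresponds to the lower GC3 content reported for the Chloroflexi and S. termitidis genes.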
Secondly, we examined biases in dinucleotide relative frequencies, a remarkably stable property of the DNA of an organism claimed to constitute a ‘genomic signature’ that can discriminate sequences from different organisms. We focused on the dinucleotide biases at third and first (3:1) codon positions, which are less sensitive to selective constraints. Consequently, the 3:1 dinucleotide frequencies were calculated for all DXR-II coding sequences and for the entire set of genes in the corresponding genomes. They both showed significant variation across organisms, and therefore could be used as such genomic signatures. The significance of the differences between DXR-II genes and their genomes was examined by calculating the dinucleotide relative abundance difference or σ difference (Additional file 6: Table S5). Pairwise co-variation was further assessed through the Spearman and Kendall rank tests (Additional file 6: Table S5). In all but one example, both Spearman’s ρ and Kendall’s τ correlation coefficients indicated strong positive correlation. An exception was provided by Halanaerobium hydrogeniformans, which showed negative correlation. All tests revealed significant covariation of 3:1 dinucleotide frequencies of DXR-II with the frequencies of the corresponding genomes, contrary to the expectations of HGT.
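A minimal sketch of the 3:1 dinucleotide comparison follows. It assumes that the relative abundance of a dinucleotide xy is its observed frequency divided by the product of the frequency of x at third codon positions and of y at the following first positions, and that the difference between a gene and its genome is the mean absolute difference over the 16 dinucleotides. Both definitions follow the usual genomic-signature formulation and are assumptions here; the exact statistic used in this study is described in Methods. Function names are hypothetical, and the rank correlation tests are not reproduced.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_31_counts(cds):
    """Count dinucleotides formed by a third codon position and the next first position."""
    seq = cds.upper()
    counts = Counter()
    # Indices 2, 5, 8, ... are third codon positions of an in-frame sequence.
    for i in range(2, len(seq) - 1, 3):
        pair = seq[i:i + 2]
        if all(base in BASES for base in pair):
            counts[pair] += 1
    return counts


def relative_abundance(counts):
    """Observed/expected ratio for each of the 16 dinucleotides."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no 3:1 dinucleotides to analyse")
    third = Counter()
    first = Counter()
    for pair, n in counts.items():
        third[pair[0]] += n
        first[pair[1]] += n
    rho = {}
    for x, y in product(BASES, repeat=2):
        expected = (third[x] / total) * (first[y] / total)
        observed = counts[x + y] / total
        rho[x + y] = observed / expected if expected > 0 else 0.0
    return rho


def abundance_difference(gene_counts, genome_counts):
    """Mean absolute difference in relative abundance over the 16 dinucleotides."""
    rho_gene = relative_abundance(gene_counts)
    rho_genome = relative_abundance(genome_counts)
    return sum(abs(rho_gene[k] - rho_genome[k]) for k in rho_gene) / 16
```

Genome-wide counts can be obtained by summing the `Counter` returned for every gene in the genome, after which `abundance_difference` gives a single number per DXR-II gene; small values indicate a gene whose 3:1 signature resembles that of its host.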
Next, we estimated relative synonymous codon usage (RSCU) values, which provide a simple, effective measure of synonymous codon usage bias. Differences in RSCU between DXR-II genes and all other genes in each corresponding genome were assessed by means of χ2 tests (Additional file 6: Table S6). Chloroflexi strains and S. termitidis ATCC 33386 showed the highest χ2 statistic values, revealing higher variation. However, none of the tests was significant, indicating that DXR-II genes have codon usage patterns consistent with those of their corresponding genomes, and are therefore unlikely to reflect HGT.
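RSCU can be computed directly from codon counts. The sketch below uses the standard definition, in which the observed count of a codon is divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code (the bacterial code assigns the same amino acids to sense codons) and in-frame sequences; the function names are hypothetical, and the χ2 test itself is not reproduced.

```python
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def count_codons(cds):
    """Count in-frame codons, skipping any that contain ambiguous bases."""
    seq = cds.upper()
    counts = Counter()
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in CODON_TABLE:
            counts[codon] += 1
    return counts


def rscu(counts):
    """RSCU = observed count * number of synonyms / total count for the amino acid."""
    values = {}
    for aa, codons in SYNONYMS.items():
        if aa == "*":
            continue  # stop codons are not scored
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in codons:
            values[c] = counts.get(c, 0) * len(codons) / total
    return values
```

A value of 1 means a codon is used as often as expected under equal usage, values above 1 mark preferred codons, and values below 1 mark avoided ones. Comparing the RSCU profile of a DXR-II gene with that built from all other genes of the genome is the basis of the test described above.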
Finally, we examined the degree of bias in codon usage of DXR-II genes towards the codon usage of the most expressed genes by comparing Codon Adaptation Index (CAI) values. A significant deviation from the average CAI of the genome was found in strains of Chloroflexi and S. termitidis ATCC 33386 (Additional file 6: Table S7).
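For readers unfamiliar with the index, CAI is the geometric mean of the relative adaptiveness of the codons in a gene, where the adaptiveness of each codon is its frequency in a reference set of highly expressed genes divided by the frequency of the most used synonymous codon. The sketch below is a minimal illustration under several assumptions: the standard genetic code, a user-supplied reference set, the common convention of replacing zero reference counts with 0.5, and the exclusion of Met, Trp and stop codons. Function names are hypothetical, and the reference sets used in this study are not reproduced.

```python
import math
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def codons_of(cds):
    """Split an in-frame sequence into unambiguous codons."""
    seq = cds.upper()
    triplets = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return [t for t in triplets if t in CODON_TABLE]


def relative_adaptiveness(reference_cds_list):
    """Weight of each codon relative to the most used synonym in the reference set."""
    counts = Counter(c for cds in reference_cds_list for c in codons_of(cds))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    weights = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # stop codons, Met and Trp carry no usage information
        # Assumed convention: unseen codons receive a count of 0.5.
        adjusted = {c: counts[c] if counts[c] > 0 else 0.5 for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of the weights of all scored codons in the gene."""
    logs = [math.log(weights[c]) for c in codons_of(cds) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))
```

A gene whose CAI departs markedly from the genome average uses codons differently from the bulk of the genome, which is the kind of deviation reported above for the Chloroflexi and S. termitidis DXR-II genes.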
Discussion and conclusions
The structural and functional diversity of isoprenoids correlates with the existence of a wide biochemical and genetic plasticity for their biosynthesis. In eubacteria, this is commonly achieved through the use of alternative metabolic pathways and enzymatic steps in specific lineages. Interesting examples are provided by HMGR and IDI, which are encoded by at least two distinct gene families in bacteria. In this paper we focus on DXR-II, recently characterized as an alternative family to DXR-I in performing the second step of the MEP pathway of isoprenoid biosynthesis in a selected group of eubacteria.
Apart from the NAD-binding domain with a core Rossmann-type fold found at the N-terminal region of all oxidoreductases, no significant similarity at the sequence level was observed between DXR-I and DXR-II to infer homology. Correspondingly, the recent determination of the DXR-II crystal structure showed only slight structural relationship with DXR-I proteins and revealed a unique arrangement of the active site. Examples of enzymes catalyzing identical reactions through the same catalytic mechanisms but showing structurally unrelated active sites are known outside the isoprenoid field [38–41]. In some of these though, key catalytic residues may be conserved between functionally redundant enzymes, as also reported for DXR-I and DXR-II. DXR-I and DXR-II likely represent analogous genes that evolved redundant biochemical functions through mechanistic convergence.
Our results support the emergence of the DXR-II family through type I, but not type II, functional divergence from DLO1 and DLO2 families of previously uncharacterized oxidoreductases. These data suggest that DXR-II acquired additional structural and/or functional constraints rather than shifted constraints in amino acids that were already ancestrally constrained. Amino acid changes critical for functional divergence and acquisition of DXR-II biochemical activity were predicted, many of them corresponding to positions highly conserved in DXR-II, but otherwise variable in DLO1 and/or DLO2. Interestingly, two of these predicted amino acids, Thr229 and Arg320, had been previously identified for their role in fosmidomycin/substrate binding and in dimerization, respectively, suggesting that functional shifts in a limited number of amino acid positions could be at the origin of the acquisition of DXR-II biochemical activity.
It could be assumed that the MEP pathway is the ancestral route for IPP and isoprenoid biosynthesis in eubacteria, including the membrane-associated hopanoids, which are among the oldest known biomolecules. The entire set of genes encoding enzymes involved in the MEP pathway, including DXR-I, has been found widespread in all eubacterial taxonomic groups. In a significant number of DXR-II-containing eubacterial genomes (31), including those from closely related strains, DXR-I has been lost. This raises the question of how DXR-II evolved in DXR-I containing strains, as acquisition of redundant biochemical activities should not be favoured by evolution. The DXR-II family could have emerged under an ecological context that conferred a selective advantage to the emergence and maintenance of a functionally redundant enzyme, e.g. when gene dosage is selectively advantageous. Due to the wide and diverse functions played by isoprenoids and their essential role for cell viability, critical situations in which their biosynthesis was absolutely required may have occurred multiple times throughout eubacterial evolution. Emergence of the DXR-II family should have occurred at an early time in evolution, as supported by the scattered distribution of DXR-II and related oxidoreductases from DLO1 and DLO2 families in distantly related lineages of eubacteria. After relaxation of that burst in selective constraints for isoprenoid biosynthesis, some strains could then have lost one redundant enzyme, commonly DXR-II, which shows less catalytic activity in vitro. In addition, maintenance of DXR-II, which shows less sensitivity to inhibition by fosmidomycin than DXR-I, might have provided a selective advantage in bacterial strains sharing the same ecological niches as those naturally producing the antibiotic (e.g. Streptomyces species).
The taxonomic distribution of DXR-II across eubacteria showed a marked discontinuity, which was also verified at the metabolic and phenotypic level. Although most genes encoding DXR-II were found in eubacteria with the MEP pathway, their occurrence was not linked to a unique pattern of distribution of enzymes of the MEP or MVA pathways. Similarly, HD, the oxidoreductase family that showed the highest level of similarity with DXR-II, was found in most DXR-II-containing bacterial strains, but not all. In addition, examination of the distribution of biological properties across DXR-II-containing strains showed that maintenance of DXR-II in the genomes was not linked to a unique pattern of ecological or phenotypic traits. The only exception was ‘pathogenic in animals’, significantly enriched among DXR-II-containing strains, reflecting the occurrence of DXR-II among pathogenic strains of Brucella, Bartonella, Listeria and Clostridium [44–47].
The outstanding phylogenetic discontinuity in DXR-II distribution across eubacteria could be explained through two alternative, though not mutually exclusive, evolutionary mechanisms, i.e., gene gain through HGT or gene loss. HGT is known to have shaped the evolution of multiple metabolic pathways, including IPP and isoprenoid biosynthesis [8, 24, 48]. However, a unique event of HGT cannot properly explain DXR-II phylogeny. According to our phylogenetic analysis, such HGT events should instead have occurred at different time points throughout eubacterial evolution, e.g. between the Alphaproteobacteria and Firmicutes phyla, between the Alphaproteobacteria and Betaproteobacteria classes within the Proteobacteria phylum, between Firmicutes and specific Chloroflexi strains or between Firmicutes and specific Fusobacteria. More recently, HGT should also have occurred between closely related Alphaproteobacteria or Firmicutes strains. If this was the case, HGT events should have left a signature of atypical sequence features in DXR-II genes, provided they were recent enough and occurred between taxonomically distant donor and acceptor bacterial strains [34, 35]. Weak signatures of HGT were found only in Chloroflexi and the Fusobacterium S. termitidis ATCC 33386 at the level of GC content and CAI values. However, no biases in dinucleotide frequencies or codon usage were observed in any strain comparison. These results suggested that HGT events were not at the origin of all discontinuities, or were so ancient that DXR-II genes ameliorated their sequence to the specific base composition and codon usage of the host genome, making them indistinguishable from ancestral sequences [34, 35].
Consequently, although old episodic events of HGT cannot be excluded, the alternative hypothesis of recurrent DXR-II (or eventually DXR-I) gene loss is more likely to explain DXR-II phylogeny. This mechanism has been traditionally considered less parsimonious, as it involves a complex ancestor and gene loss events occurring independently in multiple evolutionary lineages. However, recent work suggests that, on average, gene loss might be a more likely event than gene gain through HGT [49–51].
The DXR-I/DXR-II pair constitutes an exceptional natural system in which to experimentally test the emergence and maintenance of redundant gene function between non-homologous genes as a result of convergent evolution, as opposed to their emergence from intragenomic duplicates, or paralogs. Furthermore, our results highlight the importance of the functional characterization of evolutionary shortcuts in isoprenoid biosynthesis for screening specific antibacterial drugs and for regulating the production of isoprenoids of human interest.
Sequence and phylogenetic analysis
Sequence databases from the complete genomes of 1489 bacterial strains were downloaded from the NCBI. Orthologs of enzymes from the MEP and MVA pathways for IPP biosynthesis, as well as of enzymes of the CP of amino acid biosynthesis (Figure 1), were defined as the best reciprocal hits resulting from all-against-all local BLASTP searches with an E-value cutoff of 1E-5 and a bit score cutoff of 50, using selected previously characterized sequences as queries (Additional file 6: Table S2). Only hits corresponding to full-length sequences were considered. The resulting hits were scanned for the presence of INTERPRO domains.
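As an illustration of the reciprocal-best-hit criterion described above, the following minimal Python sketch filters tabular BLASTP hits by the two cutoffs and keeps query/subject pairs that are each other's best hit. The tuple layout, the function names and the toy identifiers are assumptions made for this example, and the cutoffs are read here as E-value ≤ 1E-5 and bit score ≥ 50; it is not the pipeline actually used in the study.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit layout: (query_id, subject_id, e_value, bit_score)
Hit = Tuple[str, str, float, float]


def best_hits(hits: Iterable[Hit], max_evalue: float = 1e-5,
              min_bits: float = 50.0) -> Dict[str, str]:
    """Return the highest-scoring subject per query among hits passing both cutoffs."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue, bits in hits:
        if evalue > max_evalue or bits < min_bits:
            continue  # hit fails the E-value or bit score cutoff
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Keep (query, subject) pairs where each is the other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Toy data with made-up identifiers and scores
forward = [("dxr2_query", "strainA_p1", 1e-80, 300.0),
           ("dxr2_query", "strainA_p2", 1e-10, 60.0)]
reverse = [("strainA_p1", "dxr2_query", 1e-78, 295.0),
           ("strainA_p2", "other_query", 1e-20, 90.0)]
pairs = reciprocal_best_hits(forward, reverse)
```

In this toy case only `strainA_p1` is retained, because `strainA_p2` finds a different protein as its own best hit in the reverse search.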
Phylogenetic analysis was performed on the basis of an alignment of protein sequences obtained using MUSCLE. Maximum Likelihood (ML) phylogenetic reconstruction was carried out in PhyML v3.0 using the LG protein evolution model, with heterogeneity of amino acid substitution rates corrected using a γ-distribution (G) with eight categories plus a proportion of invariant sites (I). This model was selected by ProtTest v2.4 as the best-fitting amino acid substitution model according to the Akaike information criterion. Starting phylogenetic trees were constructed using the BIONJ algorithm. Tree topology searching was optimized using the subtree pruning and regrafting option. The statistical support of the retrieved topology was assessed using the Shimodaira-Hasegawa-like approximate likelihood ratio test (aLRT).
Bayesian analysis was conducted in MrBayes v3.1.2 using the WAG model plus G with eight categories plus I. Searches were run using four Markov chain Monte Carlo (MCMC) chains of 1000000 generations, sampling every 100th tree. Once the stationary phase was reached (determined by the average standard deviation of split frequencies approaching 0, which reflects convergence of independent tree samples), the first 2500 trees were discarded as burn-in, and a 50% majority-rule consensus tree was then constructed to evaluate Bayesian posterior probabilities on clades. Neighbor Joining phylogenetic analysis was performed in MEGA 5.0. The evolutionary distances for Neighbor Joining phylogenetic reconstruction were computed using the Poisson correction method. To obtain statistical support for the resulting clades, a bootstrap analysis with 1000 replicates was performed. The resulting trees were represented and edited using FigTree v1.3.1.
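Model selection by the Akaike information criterion, as mentioned above, amounts to choosing the model with the lowest AIC = 2k − 2lnL, where k is the number of free parameters and lnL the maximized log-likelihood. The short Python sketch below shows the comparison; the log-likelihood values and parameter counts are invented for illustration (only the extra model parameters are counted, which is an assumption of this example) and are not results from the study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 lnL."""
    return 2.0 * n_params - 2.0 * log_likelihood


def best_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to (lnL, number of free parameters).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits, for illustration only
fits = {
    "LG+I+G": (-12000.0, 2),
    "WAG+I+G": (-12050.0, 2),
    "LG": (-12300.0, 0),
}
selected = best_model(fits)
```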
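For reference, the Poisson correction converts the proportion p of differing amino acid sites between two aligned sequences into a distance d = −ln(1 − p). The following Python sketch computes it for a pair of aligned sequences. Ignoring columns where either sequence has a gap (pairwise deletion) is an assumption of this example, and the toy peptides are made up.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)


# Toy aligned peptides: one difference over four sites, so p = 0.25
d = poisson_distance("MKTA", "MKSA")
```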
Analysis of functional divergence
The analysis of functional divergence was performed using DIVERGE v2.0. DIVERGE performs the ML estimation of the theta (θ) type-I and type-II coefficients of functional divergence, based on the occurrence of altered selective constraints or radical shifts of physicochemical properties, respectively [27, 28]. The θ value indicates the extent of functional divergence, ranging from 0 (no functional divergence) to 1 (maximum divergence). Functional divergence can be explicitly tested by comparing the fit of a model allowing for functional divergence with that of a null model in which functional divergence is not permitted (θ = 0). A Likelihood Ratio Test (LRT) is then used to examine the significance of the difference between the lnL values of the two nested models (calculated as 2ΔlnL, i.e. twice the difference between their lnL values). As the LRT statistic asymptotically follows a χ2 distribution with one degree of freedom, i.e. the difference in the number of parameters between the models being compared (θ), a p-value for the fit of the model accounting for functional divergence can be computed. DIVERGE also uses a site-specific profile to estimate the posterior probabilities (Qk) of individual amino acid sites being critical for functional divergence.
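The LRT described above can be sketched in a few lines of Python. For one degree of freedom, the upper tail of the χ2 distribution has the closed form erfc(√(x/2)), so no statistics library is needed. The lnL values below are invented for illustration and are not DIVERGE output.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the difference between the lnL values of the two nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_pvalue_df1(statistic: float) -> float:
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    if statistic <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods: null model (theta = 0) vs. model allowing divergence
stat = lrt_statistic(lnl_null=-1000.0, lnl_alt=-995.0)
p_value = chi2_pvalue_df1(stat)
```

With these toy numbers the statistic is 10.0, which corresponds to a p-value well below 0.01.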
G + C% content, dinucleotide frequencies, codon usage, and CAI analyses
The following sequence features were extracted for individual DXR-II sequences and for the rest of the genes in the corresponding genomes using Perl and R scripts built on CPAN and BioPerl modules: i) GC% content at the three codon positions and in total (GC1, GC2, GC3 and GCt), ii) dinucleotide frequencies at 3:1 codon sites (third base of a codon and first base of the succeeding codon), and iii) relative synonymous codon usage (RSCU). Codon Adaptation Indexes (CAI) for individual genes and genomes were calculated as implemented in the DAMBE software. These sequence features were compared between DXR-II genes and the rest of the genes in the genome, and differences were assessed using the statistical tests described below.
i) Differences in G and C nucleotide content were considered significant when GC% deviated by two or more standard errors (SEs) from the respective mean for all genes in the genome, or when the deviations at the first and third codon positions were of the same sign and at least one of them was two or more SEs [35, 66].
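The GC criterion above can be illustrated with a small Python sketch that computes GC1, GC2, GC3 and GCt for a coding sequence and then applies the deviation rule. The function names, the dictionary layout and the numeric values are assumptions of this example, and the rule is implemented as read from the text (total GC% deviating by at least two SEs, or GC1 and GC3 deviating in the same direction with at least one of them by two or more SEs).

```python
from typing import Dict


def gc_by_position(cds: str) -> Dict[str, float]:
    """GC% at codon positions 1-3 and overall, using complete codons only."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence shorter than one codon")
    result = {}
    for pos in range(3):
        bases = cds[pos:usable:3]
        result[f"GC{pos + 1}"] = 100.0 * sum(b in "GC" for b in bases) / len(bases)
    result["GCt"] = 100.0 * sum(b in "GC" for b in cds[:usable]) / usable
    return result


def is_atypical(gene: Dict[str, float], mean: Dict[str, float],
                se: Dict[str, float], k: float = 2.0) -> bool:
    """Apply the two-SE rule to total GC% or to concordant GC1/GC3 deviations."""
    dev = {key: (gene[key] - mean[key]) / se[key] for key in ("GC1", "GC3", "GCt")}
    total_rule = abs(dev["GCt"]) >= k
    same_sign = dev["GC1"] * dev["GC3"] > 0
    codon_rule = same_sign and max(abs(dev["GC1"]), abs(dev["GC3"])) >= k
    return total_rule or codon_rule


# Toy coding sequence: codons ATG GCG AAA
gc = gc_by_position("ATGGCGAAA")

# Made-up genome-wide means and SEs, for illustration only
genome_mean = {"GC1": 50.0, "GC3": 50.0, "GCt": 50.0}
genome_se = {"GC1": 10.0, "GC3": 8.0, "GCt": 5.0}
flag_a = is_atypical({"GC1": 60.0, "GC3": 70.0, "GCt": 55.0}, genome_mean, genome_se)
flag_b = is_atypical({"GC1": 40.0, "GC3": 70.0, "GCt": 55.0}, genome_mean, genome_se)
```

In the toy comparison the first gene is flagged because GC1 and GC3 both lie above the genome mean and GC3 deviates by 2.5 SEs, whereas the second is not, since its GC1 and GC3 deviations have opposite signs.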
averaged over all 16 dinucleotides. Furthermore, pairwise covariation of the 3:1 dinucleotide differences was assessed using Spearman’s rank correlation coefficient ρ and Kendall’s rank correlation coefficient τ. Both are nonparametric statistics that test for dependence between two variables.
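A minimal Python sketch of the two ingredients mentioned here is given below: extracting 3:1 dinucleotide frequencies (third base of a codon followed by the first base of the next codon) and computing Spearman's ρ between two 16-element profiles. Which vectors were actually correlated in the study is not restated here, so comparing two frequency profiles is an assumption of this example, as are the function names. Ties are handled with average ranks.

```python
import math
from itertools import product
from typing import List

# All 16 dinucleotides in a fixed order
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def dinuc_3_1_freqs(cds: str) -> List[float]:
    """Frequencies of dinucleotides spanning codon position 3 and the next position 1."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = dict.fromkeys(DINUCS, 0)
    for i in range(2, usable - 1, 3):  # i indexes the third base of each codon
        pair = cds[i:i + 2]
        if pair in counts:  # skips pairs containing ambiguous bases
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCS]


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x: List[float], y: List[float]) -> float:
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("rho is undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


# Toy sequence: codons ATG GCG AAA give the 3:1 dinucleotides GG and GA
freqs = dinuc_3_1_freqs("ATGGCGAAA")
```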
iii) The RSCU of synonymous codon i of an n-fold degenerate amino acid was computed as

$$\mathrm{RSCU}_i = \frac{X_i}{\frac{1}{n}\sum_{j=1}^{n} X_j}$$

where X_i is the number of occurrences of codon i, and n is the number of synonymous codons encoding a given amino acid, i.e. 1, 2, 3, 4, or 6. In the absence of any codon usage bias (i.e. all synonymous codons are used equally), the RSCU value would be 1. A codon that is used less or more frequently than expected will have an RSCU value lower or higher than 1, respectively. Start, stop and tryptophan codons were excluded from the analysis. To measure bias in synonymous codon usage between DXR-II and all genes in the genome, a χ2 test of RSCU with 41 degrees of freedom was implemented.
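The RSCU calculation can be sketched in Python as follows: for each amino acid, the count of a codon is divided by the mean count of its synonymous codons. The example uses the standard genetic code and treats the excluded start codon as ATG (methionine), which is an assumption of this sketch. Under that reading, 59 codons in 18 synonymous families remain, which is consistent with the 41 degrees of freedom (59 − 18) quoted above.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Dict

# Standard genetic code, codons ordered with bases T, C, A, G
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}

# Synonymous families, excluding stop (*), Met (M) and Trp (W) codons
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa not in "*MW":
        FAMILIES[aa].append(codon)


def rscu(cds: str) -> Dict[str, float]:
    """RSCU_i = X_i / mean count of the n synonymous codons of the same amino acid."""
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    values = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        n = len(codons)
        for c in codons:
            values[c] = counts[c] * n / total
    return values


# Toy sequence: four alanine codons (GCT x2, GCC x1, GCA x1, GCG x0)
example = rscu("GCTGCTGCCGCA")
degrees_of_freedom = sum(len(c) for c in FAMILIES.values()) - len(FAMILIES)
```

In the toy sequence, GCT is used twice as often as expected under equal usage (RSCU = 2.0), while the unused GCG has an RSCU of 0.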
iv) CAI was calculated as

$$\mathrm{CAI} = \frac{\mathrm{CAI}_{obs}}{\mathrm{CAI}_{max}}$$

where CAIobs is the mean of the RSCUs for all codons in a particular gene, and CAImax is the mean of the RSCUs for the most frequently used codon of each amino acid in a genome. CAI ranges from 0 to 1, being 1 if the gene only uses the most frequently used synonymous codons in the reference set. Differences in CAI between DXR-II and all genes in the genome were considered significant if higher than 1.5 times the SE.
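The following Python sketch illustrates a CAI calculation in the classical Sharp and Li formulation, in which each codon receives a relative adaptiveness w = (count of the codon) / (count of the most used synonymous codon) in a reference set, and the CAI of a gene is the geometric mean of w over its codons. Using geometric means, assigning a pseudocount of 0.5 to unobserved codons, and excluding Met, Trp and stop codons are assumptions of this example; the DAMBE implementation used in the study may differ in these details.

```python
import math
from collections import Counter, defaultdict
from itertools import product
from typing import Dict, Iterable, List

# Standard genetic code, codons ordered with bases T, C, A, G
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}


def codons_of(cds: str) -> List[str]:
    """Split a coding sequence into complete codons."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def relative_adaptiveness(reference: Iterable[str],
                          pseudocount: float = 0.5) -> Dict[str, float]:
    """w for each codon: its count divided by the top count among its synonyms."""
    counts = Counter(c for cds in reference for c in codons_of(cds))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":  # single-codon amino acids and stops carry no information
            families[aa].append(codon)
    weights = {}
    for codons in families.values():
        adjusted = {c: counts[c] if counts[c] > 0 else pseudocount for c in codons}
        top = max(adjusted.values())
        for c in codons:
            weights[c] = adjusted[c] / top
    return weights


def cai(cds: str, weights: Dict[str, float]) -> float:
    """Geometric mean of w over the scored codons of a gene."""
    logs = [math.log(weights[c]) for c in codons_of(cds) if c in weights]
    if not logs:
        raise ValueError("no scorable codons")
    return math.exp(sum(logs) / len(logs))


# Toy reference set in which GCT is the preferred alanine codon (3 GCT, 1 GCC)
w = relative_adaptiveness(["GCTGCTGCTGCC"])
cai_optimal = cai("GCTGCT", w)
cai_mixed = cai("GCTGCC", w)
```

A gene built only from the preferred codon scores 1.0, while mixing in the less used GCC lowers the index, in line with the 0 to 1 range described above.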
Availability of supporting data
The multiple sequence alignment and the phylogenetic tree files supporting the results of this article have been deposited and are publicly available in the TreeBASE repository under accession number S14611 (http://purl.org/phylo/treebase/phylows/study/TB2:S14611).
Aspartate semialdehyde dehydrogenase
CAI: Codon adaptation index
HGT: Horizontal gene transfer
DXR: Deoxyxylulose 5-phosphate reductoisomerase
DXR-like
LRT: Likelihood ratio test
RSCU: Relative synonymous codon usage
(Taxonomy) unique identifier
We thank all our laboratory members for stimulating discussions and suggestions. We thank Derek Taylor and Mario A Fares for critical reading of the manuscript and helpful comments. Financial support for this research was provided by the Spanish Ministerio de Ciencia e Innovación (grants BIO2011-23680 to MRC and BFU2011-25658 to FJS) and Generalitat de Catalunya (2009SGR-26 and XRB) to MRC.
- Croteau R, Kutchan TM, Lewis NG: Secondary Metabolites. Biochemistry & Molecular Biology of Plants. Edited by: American Society of Plant Physiologists, Buchanan WG B, Jones R. 2000, 1250-1318.
- Daum M, Herrmann S, Wilkinson B, Bechthold A: Genes and enzymes involved in bacterial isoprenoid biosynthesis. Curr Opin Chem Biol. 2009, 13 (2): 180-188. 10.1016/j.cbpa.2009.02.029.
- Kuzuyama T, Seto H: Diversity of the biosynthesis of the isoprene units. Nat Prod Rep. 2003, 20 (2): 171-183. 10.1039/b109860h.
- Rodríguez-Concepción M, Boronat A: Elucidation of the methylerythritol phosphate pathway for isoprenoid biosynthesis in bacteria and plastids. A metabolic milestone achieved through genomics. Plant Physiol. 2002, 130: 1079-1089. 10.1104/pp.007138.
- Lange BM, Rujan T, Martin W, Croteau R: Isoprenoid biosynthesis: the evolution of two ancient and distinct pathways across genomes. Proc Natl Acad Sci U S A. 2000, 97 (24): 13172-13177. 10.1073/pnas.240454797.
- Begley M, Gahan CG, Kollas AK, Hintz M, Hill C, Jomaa H, Eberl M: The interplay between classical and alternative isoprenoid biosynthesis controls gammadelta T cell bioactivity of Listeria monocytogenes. FEBS Lett. 2004, 561 (1–3): 99-104.
- Laupitz R, Hecht S, Amslinger S, Zepeck F, Kaiser J, Richter G, Schramek N, Steinbacher S, Huber R, Arigoni D, et al: Biochemical characterization of Bacillus subtilis type II isopentenyl diphosphate isomerase, and phylogenetic distribution of isoprenoid biosynthesis pathways. Eur J Biochem. 2004, 271 (13): 2658-2669. 10.1111/j.1432-1033.2004.04194.x.
- Boucher Y, Doolittle WF: The role of lateral gene transfer in the evolution of isoprenoid biosynthesis pathways. Mol Microbiol. 2000, 37: 703-716. 10.1046/j.1365-2958.2000.02004.x.
- Phillips MA, Leon P, Boronat A, Rodriguez-Concepcion M: The plastidial MEP pathway: unified nomenclature and resources. Trends Plant Sci. 2008, 13 (12): 619-623. 10.1016/j.tplants.2008.09.003.
- Jomaa H, Wiesner J, Sanderbrand S, Altincicek B, Weidemeyer C, Hintz M, Turbachova I, Eberl M, Zeidler J, Lichtenthaler HK, et al: Inhibitors of the nonmevalonate pathway of isoprenoid biosynthesis as antimalarial drugs. Science. 1999, 285 (5433): 1573-1576. 10.1126/science.285.5433.1573.
- Kuzuyama T, Seto H: Two distinct pathways for essential metabolic precursors for isoprenoid biosynthesis. Proc Jpn Acad Ser B Phys Biol Sci. 2012, 88 (3): 41-52. 10.2183/pjab.88.41.
- Lichtenthaler HK: The 1-Deoxy-D-Xylulose-5-Phosphate pathway of Isoprenoid Biosynthesis in plants. Annu Rev Plant Physiol Plant Mol Biol. 1999, 50: 47-65. 10.1146/annurev.arplant.50.1.47.
- Rodríguez-Concepción M, Boronat A: Isoprenoid biosynthesis in prokaryotic organisms. Isoprenoid Synthesis in Plants and Microorganisms. Edited by: Bach TJ, Rohmer M. 2013, New York: Springer, 1-16.
- Rodriguez-Concepcion M: The MEP pathway: a new target for the development of herbicides, antibiotics and antimalarial drugs. Curr Pharm Des. 2004, 10 (19): 2391-2400. 10.2174/1381612043384006.
- Rohdich F, Bacher A, Eisenreich W: Isoprenoid biosynthetic pathways as anti-infective drug targets. Biochem Soc Trans. 2005, 33 (Pt 4): 785-791.
- Bouvier F, Rahier A, Camara B: Biogenesis, molecular regulation and function of plant isoprenoids. Prog Lipid Res. 2005, 44 (6): 357-429. 10.1016/j.plipres.2005.09.003.
- Perez-Gil J, Rodriguez-Concepcion M: Metabolic plasticity for isoprenoid biosynthesis in bacteria. Biochem J. 2013, 452 (1): 19-25.
- Boucher Y, Huber H, L’Haridon S, Stetter KO, Doolittle WF: Bacterial origin for the isoprenoid biosynthesis enzyme HMG-CoA reductase of the archaeal orders thermoplasmatales and archaeoglobales. Mol Biol Evol. 2001, 18 (7): 1378-1388. 10.1093/oxfordjournals.molbev.a003922.
- Gophna U, Thompson JR, Boucher Y, Doolittle WF: Complex histories of genes encoding 3-hydroxy-3-methylglutaryl-CoenzymeA reductase. Mol Biol Evol. 2006, 23 (1): 168-178.
- Kaneda K, Kuzuyama T, Takagi M, Hayakawa Y, Seto H: An unusual isopentenyl diphosphate isomerase found in the mevalonate pathway gene cluster from Streptomyces sp. strain CL190. Proc Natl Acad Sci USA. 2001, 98 (3): 932-937. 10.1073/pnas.98.3.932.
- Barkley SJ, Cornish RM, Poulter CD: Identification of an Archaeal type II isopentenyl diphosphate isomerase in methanothermobacter thermautotrophicus. J Bacteriol. 2004, 186 (6): 1811-1817. 10.1128/JB.186.6.1811-1817.2004.
- Barkley SJ, Desai SB, Poulter CD: Type II isopentenyl diphosphate isomerase from synechocystis sp. strain PCC 6803. J Bacteriol. 2004, 186 (23): 8156-8158. 10.1128/JB.186.23.8156-8158.2004.
- Sangari FJ, Perez-Gil J, Carretero-Paulet L, Garcia-Lobo JM, Rodriguez-Concepcion M: A new family of enzymes catalyzing the first committed step of the methylerythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis in bacteria. Proc Natl Acad Sci U S A. 2010, 107 (32): 14081-14086. 10.1073/pnas.1001962107.
- Boucher Y, Douady CJ, Papke RT, Walsh DA, Boudreau MER, Nesbø CL, Case RJ, Doolittle WF: Lateral gene transfer and the origins of prokaryotic groups. Annu Rev Genet. 2003, 37: 283-328. 10.1146/annurev.genet.37.050503.084247.
- Moreno-Hagelsieb G, Latimer K: Choosing BLAST options for better detection of orthologs as reciprocal best hits. Bioinformatics. 2008, 24 (3): 319-324. 10.1093/bioinformatics/btm585.
- Perez-Gil J, Calisto BM, Behrendt C, Kurz T, Fita I, Rodriguez-Concepcion M: Crystal structure of brucella abortus deoxyxylulose-5-phosphate reductoisomerase-like (DRL) enzyme involved in isoprenoid biosynthesis. J Biol Chem. 2012, 287 (19): 15803-15809. 10.1074/jbc.M112.354811.
- Gu X: Statistical methods for testing functional divergence after gene duplication. Mol Biol Evol. 1999, 16 (12): 1664-1674. 10.1093/oxfordjournals.molbev.a026080.
- Gu X: A simple statistical method for estimating type-II (cluster-specific) functional divergence of protein sequences. Mol Biol Evol. 2006, 23 (10): 1937-1945. 10.1093/molbev/msl056.
- Humphrey W, Dalke A, Schulten K: VMD: visual molecular dynamics. J Mol Graph. 1996, 14 (1): 33-38. 10.1016/0263-7855(96)00018-5. 27–38
- Williams KP, Sobral BW, Dickerman AW: A robust species tree for the alphaproteobacteria. J Bacteriol. 2007, 189 (13): 4578-4586. 10.1128/JB.00269-07.
- Larimer FW, Chain P, Hauser L, Lamerdin J, Malfatti S, Do L, Land ML, Pelletier DA, Beatty JT, Lang AS, et al: Complete genome sequence of the metabolically versatile photosynthetic bacterium rhodopseudomonas palustris. Nat Biotechnol. 2004, 22 (1): 55-61. 10.1038/nbt923.
- Moreno-Letelier A, Olmedo G, Eguiarte LE, Martinez-Castilla L, Souza V: Parallel evolution and horizontal gene transfer of the pst operon in firmicutes from oligotrophic environments. Int J Evol Biol. 2011, 2011: 781642.
- Fondi M, Brilli M, Fani R: On the origin and evolution of biosynthetic pathways: integrating microarray data with structure and organization of the common pathway genes. BMC bioinformatics. 2007, 8 (Suppl 1): S12. 10.1186/1471-2105-8-S1-S12.
- Lawrence JG, Ochman H: Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol. 1997, 44 (4): 383-397. 10.1007/PL00006158.
- Lawrence JG, Ochman H: Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci U S A. 1998, 95 (16): 9413-9417. 10.1073/pnas.95.16.9413.
- Karlin S, Burge C: Dinucleotide relative abundance extremes: a genomic signature. Trends Genet. 1995, 11 (7): 283-290. 10.1016/S0168-9525(00)89076-9.
- Hooper SD, Berg OG: Detection of genes with atypical nucleotide sequence in microbial genomes. J Mol Evol. 2002, 54 (3): 365-375.
- Genschel U: Coenzyme a biosynthesis: reconstruction of the pathway in archaea and an evolutionary scenario based on comparative genomics. Mol Biol Evol. 2004, 21 (7): 1242-1251. 10.1093/molbev/msh119.
- Gherardini PF, Wass MN, Helmer-Citterich M, Sternberg MJ: Convergent evolution of enzyme active sites is not a rare phenomenon. J Mol Biol. 2007, 372 (3): 817-845. 10.1016/j.jmb.2007.06.017.
- Kulkarni N, Lakshmikumaran M, Rao M: Xylanase II from an alkaliphilic thermophilic Bacillus with a distinctly different structure from other xylanases: evolutionary relationship to alkaliphilic xylanases. Biochem Biophys Res Commun. 1999, 263 (3): 640-645. 10.1006/bbrc.1999.1420.
- Watanabe S, Yamada M, Ohtsu I, Makino K: alpha-ketoglutaric semialdehyde dehydrogenase isozymes involved in metabolic pathways of D-glucarate, D-galactarate, and hydroxy-L-proline. Molecular and metabolic convergent evolution. J Biol Chem. 2007, 282 (9): 6685-6695.
- Brocks JJ, Logan GA, Buick R, Summons RE: Archean molecular fossils and the early rise of eukaryotes. Science. 1999, 285 (5430): 1033-1036. 10.1126/science.285.5430.1033.
- Iguchi E, Okuhara M, Kohsaka M, Aoki H, Imanaka H: Studies on new phosphonic acid antibiotics. II. Taxonomic studies on producing organisms of the phosphonic acid and related compounds. J Antibiot (Tokyo). 1980, 33 (1): 19-23.
- Guptill L: Bartonellosis. Vet Microbiol. 2010, 140 (3–4): 347-359.
- Allerberger F, Wagner M: Listeriosis: a resurgent foodborne infection. Clin Microbiol Infect. 2010, 16 (1): 16-23. 10.1111/j.1469-0691.2009.03109.x.
- von Bargen K, Gorvel JP, Salcedo SP: Internal affairs: investigating the brucella intracellular lifestyle. FEMS Microbiol Rev. 2012, 36 (3): 533-562. 10.1111/j.1574-6976.2012.00334.x.
- Wells CL, Wilkins TD: Clostridia: sporeforming anaerobic bacilli. Medical Microbiology. Edited by: Baron S. 1996, Galveston (TX), 4
- Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405 (6784): 299-304. 10.1038/35012500.
- Kunin V, Ouzounis CA: The balance of driving forces during genome evolution in prokaryotes. Genome Res. 2003, 13 (7): 1589-1594. 10.1101/gr.1092603.
- Kurland CG, Canback B, Berg OG: Horizontal gene transfer: a critical view. Proc Natl Acad Sci U S A. 2003, 100 (17): 9658-9662. 10.1073/pnas.1632870100.
- Mirkin BG, Fenner TI, Galperin MY, Koonin EV: Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC Evol Biol. 2003, 3: 2. 10.1186/1471-2148-3-2.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5): 1792-1797. 10.1093/nar/gkh340.
- Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O: New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010, 59 (3): 307-321. 10.1093/sysbio/syq010.
- Le SQ, Gascuel O: An improved general amino acid replacement matrix. Mol Biol Evol. 2008, 25 (7): 1307-1320. 10.1093/molbev/msn067.
- Abascal F, Zardoya R, Posada D: ProtTest: selection of best-fit models of protein evolution. Bioinformatics. 2005, 21 (9): 2104-2105. 10.1093/bioinformatics/bti263.
- Anisimova M, Gascuel O: Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst Biol. 2006, 55 (4): 539-552. 10.1080/10635150600755453.
- Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19 (12): 1572-1574. 10.1093/bioinformatics/btg180.
- Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 2001, 18 (5): 691-699. 10.1093/oxfordjournals.molbev.a003851.
- Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) Software Version 4.0. Mol Biol Evol. 2007, 24 (8): 1596-1599. 10.1093/molbev/msm092.
- Gu X, Vander Velden K: DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics. 2002, 18 (3): 500-501. 10.1093/bioinformatics/18.3.500.
- Goldman N, Yang Z: A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 1994, 11 (5): 725-736.
- Sharp PM, Li WH: The codon Adaptation Index–a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987, 15 (3): 1281-1295. 10.1093/nar/15.3.1281.
- Xia X: An improved implementation of codon adaptation index. Evol Bioinform Online. 2007, 3: 53-58.
- Xia X, Xie Z: DAMBE: software package for data analysis in molecular biology and evolution. J Hered. 2001, 92 (4): 371-373. 10.1093/jhered/92.4.371.
- Garcia-Vallve S, Romeu A, Palau J: Horizontal gene transfer in bacterial and archaeal complete genomes. Genome Res. 2000, 10 (11): 1719-1725. 10.1101/gr.130000.
- Karlin S: Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol. 2001, 9 (7): 335-343. 10.1016/S0966-842X(01)02079-0.
- Spearman C: The proof and measurement of association between Two things. Am J Psychol. 1904, 15 (1): 72-101. 10.2307/1412159.
- Kendall MG: A new measure of rank correlation. Biometrika. 1938, 30 (1–2): 81-93.
- Sharp PM, Tuohy TM, Mosurski KR: Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res. 1986, 14 (13): 5125-5143. 10.1093/nar/14.13.5125.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.