Horizontal gene transfer in chromalveolates
© Nosenko and Bhattacharya; licensee BioMed Central Ltd. 2007
Received: 30 March 2007
Accepted: 25 September 2007
Published: 25 September 2007
Horizontal gene transfer (HGT), the non-genealogical transfer of genetic material between different organisms, is considered a potentially important mechanism of genome evolution in eukaryotes. Using phylogenomic analyses of expressed sequence tag (EST) data generated from a clonal cell line of a free living dinoflagellate alga Karenia brevis, we investigated the impact of HGT on genome evolution in unicellular chromalveolate protists.
We identified 16 proteins that have originated in chromalveolates through ancient HGTs before the divergence of the genera Karenia and Karlodinium and one protein that was derived through a more recent HGT. Detailed analysis of the phylogeny and distribution of identified proteins demonstrates that eight have resulted from independent HGTs in several eukaryotic lineages.
Recurring intra- and interdomain gene exchange provides an important source of genetic novelty not only in parasitic taxa as previously demonstrated but as we show here, also in free-living protists. Investigating the tempo and mode of evolution of horizontally transferred genes in protists will therefore advance our understanding of mechanisms of adaptation in eukaryotes.
Horizontal gene transfer (HGT) is the movement of genetic material between different species and is considered to be one of the major driving forces of prokaryotic evolution [1–4]. Until recently, it was believed that this phenomenon was largely restricted to the prokaryotic domain. In eukaryotes, gene duplication has classically been viewed as the major source of genetic novelty [5, 6]; this paradigm of eukaryotic evolution is based on genome studies of model organisms such as multicellular plants, animals, and fungi. In the last decade, rapid accumulation of genome data from unicellular eukaryotes, protists, has allowed researchers to reassess the role of HGT in eukaryotic evolution. The results of comparative analyses of genomes of anaerobic parasitic protists provided a major breakthrough in our understanding of the impact of interdomain HGT in eukaryotes. For example, 96 potential cases of prokaryote-to-eukaryote HGT were identified in the genome of an intestinal parasite of humans and animals Entamoeba histolytica, 84 in the fish parasite Spironucleus salmonicida, 152 in a sexually transmitted human pathogen Trichomonas vaginalis, 24 in Cryptosporidium parvum, and 148 in anaerobic rumen ciliates. These numbers comprise up to 4% of genes in the extremely reduced genomes of these anaerobic protists. It is believed that the acquisition of bacterial genes by these eukaryotes accelerated their adaptation to anaerobic environments and the transition to a parasitic life style. Several recent reports indicate that HGT also plays a role in the genome evolution of free-living protists. Analysis of the complete genome sequence of the soil amoeba Dictyostelium discoideum led to the identification of 18 genes derived from prokaryotes. Several cases of HGT have been reported for dinoflagellate and chlorarachniophyte algae [13–16].
The fact that complete genome sequences are available now for a limited number of free living protists explains a significant disproportion in the study of HGT in different groups of protists. However, public databases also contain Expressed Sequence Tag (EST) libraries for over 50 species of free living unicellular eukaryotes [17, 18] that can also be used to assess the impact of HGT on genome evolution in protists.
Here we analyze EST and complete genome data to study HGT in chromalveolate protists. Chromalveolates comprise six eukaryotic lineages (cryptophytes, haptophytes, stramenopiles, ciliates, apicomplexans, and dinoflagellates) and have adapted to a wide variety of environments. They are characterized by a tremendous diversity of forms and modes of nutrition including heterotrophy, parasitism, phototrophy, and mixotrophy. According to the chromalveolate hypothesis, the common ancestor of the six constituent lineages was a free living photosynthetic organism that derived its plastid via a red algal secondary endosymbiosis. Within-chromalveolate taxon relationships and the monophyly of this group are controversial. Nuclear gene phylogenies support the monophyly of a group comprising stramenopiles, ciliates, apicomplexans, and dinoflagellates and the monophyly of a group comprising cryptophytes and haptophytes [21, 22]. However, the relationship between these two clades remains unresolved.
Gene movement from the endosymbiont to the host nucleus is a specific instance of HGT that is referred to as endosymbiotic gene transfer (EGT). The impact of EGT on the evolution of chromalveolate genomes has been intensively studied in the last decade [23–27] and will not be considered here. We limited our research to gene transfers from non-organellar sources. To identify genes acquired by chromalveolates through HGT at different time points in their evolutionary history, we performed a broad scale phylogenetic analysis of the EST data generated for Karenia brevis, a free living phototrophic dinoflagellate alga renowned as an agent of toxic algal blooms that annually cause massive fish and marine mammal mortality in the Gulf of Mexico. Detailed analyses of the identified genes presented in this paper suggest that recurring inter- and intradomain gene movement should be considered an important source of genetic novelty in chromalveolates.
In this study, we used a combination of four different approaches to identify genes acquired by chromalveolates through HGT (see Methods). The major goal of this study was to discover genes uniquely present in chromalveolates and bacteria. This study is based on the assumption that HGT is the most plausible explanation for the occurrence of bacterial genes in a single eukaryotic lineage. An alternative explanation is that these bacterial genes were derived via intracellular transfer from the mitochondrial progenitor by the ancestral eukaryote and subsequently lost from most taxa. Apart from invoking independent gene losses from potentially many eukaryotic lineages, the latter scenario implies (improbably) that the genome size of the eukaryotic ancestor was far larger than in extant taxa.
Horizontal gene transfers from bacteria to chromalveolates
Protein family [Function]
Pyridoxal phosphate dependent aminotransferase [Cell envelope biogenesis, outer membrane]
NAD dependent epimerase/dehydratase [Cell envelope biogenesis, outer membrane]
Clavaminic acid synthetase [Biosynthesis of clavulanic acid]
Malate-quinone oxidoreductase [Energy metabolism]
Monomeric NADP(+)-dependent isocitrate dehydrogenase [Energy metabolism]
Iron-containing alcohol dehydrogenase [Energy metabolism]
Additional file 1
NAD-dependent aldehyde dehydrogenases [Energy metabolism]
Additional file 1
Substrate-bound, membrane-associated, periplasmic binding protein [Substrate transport]
Additional file 1
Silent information regulator 2 [Gene silencing, DNA repair]
Additional file 1
Arylsulfatase A [Substrate transport]
Additional file 1*
Additional file 1*
Alpha-tubulin suppressor [Cell division and chromosome partitioning, cytoskeleton]
Additional file 1*
Pyridoxal phosphate biosynthetic protein PdxA [Amino acid metabolism]
Additional file 1*
Metal-dependent hydrolase of the TIM-barrel fold
Additional file 1*
Additional file 1*
Below we provide a detailed description of the most interesting cases of prokaryote-to-eukaryote HGTs. The direction of interdomain HGTs was inferred based on the relative distribution of the gene among bacteria and eukaryotes. Genes widespread among prokaryotes and rare among eukaryotes were considered to be derived from a prokaryotic donor. The identified proteins are classified in various functional groups based on known functions of their bacterial homologs including plasma membrane biogenesis and biosynthesis of secondary metabolites, energy and amino acid metabolism, substrate transport, regulation of gene expression and DNA repair. In addition, we present results of phylogenetic analyses of translation elongation factor EF2 that represents the only identified case of a recent transfer that occurred after the Karenia and Karlodinium divergence and the only example of HGT involving a gene of eukaryotic origin.
Plasma membrane biogenesis and biosynthesis of secondary metabolites
Dehydrogenase MVIM-sugar aminotransferase fusion protein
The mviM/wecE-14 genes have a bipartite structure (Fig. 1). The N-terminal region of this sequence contains a NAD-binding Rossmann fold domain typical of the GFO/IDH/MocA oxidoreductase family and shows high similarity to the dehydrogenase MVIM found in Bacteria and Archaea (55% amino acid sequence identity). The C-terminal domain of mviM/wecE-14 encodes WECE, which shares 81–84% amino acid sequence identity with the protein encoded by wecE-17 and about 67% amino acid sequence identity with its bacterial homologs. Using analyses of nucleotide differences in protein coding regions and insertion-deletions in the 5' and 3' UTRs, we identified at least seven MVIM-WECE-encoding genes in K. brevis. Because these genes share 92–99% amino acid sequence identity and retain significant sequence similarity in their 5' and 3' UTRs, they likely originated through recent gene duplications. The K. brevis culture used for EST data collection was a vegetative haploid clonal cell line; therefore, all sequence variants represent non-allelic gene copies.
Nucleotide composition of selected genes acquired by protists through HGT: GC content (%)
Study of HGT in bacteria demonstrates that genes encoding physiologically coupled reactions are often co-transferred, frequently in operons. The fact that MVIM- and WECE-encoding genes are fused in K. brevis and linked in several proteobacteria may indicate that proteins encoded by these genes are functionally coupled. Functions of the dehydrogenase MVIM are poorly characterized in bacteria. The bacterial homologs of WECE have been intensively studied for their involvement in the biosynthesis of macrolide antibiotics that belong to the large family of secondary metabolites known as polyketides [34–37], outer membrane lipopolysaccharides [38–40], and surface layer glycoproteins [41–43]. According to the results of phylogenetic analyses, WECE proteins identified in eukaryotes form a monophyletic clade with bacterial proteins from two distinct groups (Fig. 2A). The first group is represented by actinobacterial proteins involved in the biosynthesis of macrolide antibiotics such as narbomycin, erythromycin, pikromycin, and neomethymycin (Fig. 2A). These catalyze the biosynthesis of the deoxy sugar D-desosamine, the addition of which to the actinobacterial polyketides is crucial for their antibiotic activity. The second group includes monofunctional sugar aminotransferases involved in the glycosylation of the surface layer proteins in the firmicutes Aneurinibacillus thermoaerophilus (ftdB) and Thermoanaerobacterium thermosaccharolyticum (qdtB). FtdB and qdtB encode a key enzyme of the biosynthesis of thymidine diphosphate-activated 3-acetamido-3,6-dideoxy-D-galactose (dTDP-D-Fucp3NAc), which serves as a precursor for the assembly of structural polysaccharides in bacteria [41, 43]. The presence of multiple WECE-encoding genes and a high expression level of these genes in K. brevis suggest that this protein is involved in physiologically important processes in this organism.
NAD dependent sugar nucleotide epimerase/dehydratase
The functions of these enzymes have not been studied in chromalveolates. In Bacteria, epimerase-I and III catalyze the biosyntheses of dTDP-D-Fucp3NAc and dTDP-L-rhamnose, compounds that serve as precursors for the assembly of outer membrane structural polysaccharides [41–43, 45]. On the dTDP-D-Fucp3NAc pathway, they function upstream of the WECE proteins described above. We cannot exclude that WECE and epimerase-III represent a functionally coupled enzyme pair in K. brevis that might have been acquired by dinoflagellates in one HGT event (Figs. 2, 3). Available experimental data suggest that epimerase-I and II are also involved in cell wall polysaccharide biosynthesis in plants and diplomonads. Based on these data we propose that epimerase-III performs similar function(s) in K. brevis.
Clavaminic acid synthetase-like protein
Clavaminic acid synthetase (CAS) belongs to the large family of iron and 2-oxoacid-dependent dioxygenases, an important class of enzymes that mediates a variety of oxidative reactions. Most studies of CAS have been carried out using the Streptomyces isozymes. In Streptomyces, CAS catalyzes three major steps of the clavulanic acid biosynthesis. Clavulanic acid is a natural inhibitor of β-lactamases, enzymes that confer resistance to β-lactam antibiotics in bacteria.
Iron-containing alcohol dehydrogenase and NAD-dependent aldehyde dehydrogenase
Iron-containing alcohol dehydrogenase (Fe-ADH) and NAD-dependent aldehyde dehydrogenase (PutA) are probably the best-studied proteins from the perspective of HGT in eukaryotes. Aldehyde-alcohol dehydrogenase protein (AdhE) has arisen through the fusion of two protein domains, PutA and Fe-ADH, and is considered to be a key enzyme in energy metabolism in parasitic amitochondriate protists. Previous studies on parasitic protists demonstrated that AdhE-encoding genes have been subjects of multiple independent prokaryote-to-eukaryote HGTs. For information about AdhE functions and phylogeny in parasitic protists we direct readers to references [51, 52]. In addition to previous findings, we identified Fe-ADH in free-living dinoflagellates and jakobids (see Additional file 1). These sequences share over 50% amino acid identity with bacterial homologs. Results of phylogenetic analyses suggest that the two lineages acquired Fe-ADH-encoding genes independently from closely related prokaryotes. The Fe-ADH tree obtained in this study is shown in Additional file 1. PutA has been found in free living dinoflagellates, stramenopiles, and jakobids (see Additional file 1). PutA sequences in these three lineages show over 50% amino acid identity to corresponding bacterial proteins and, according to the results of phylogenetic analyses, originated through independent interdomain transfers from distinct prokaryotic donors (see Additional file 1). Phylogenetic analyses strongly support the monophyly of dinoflagellate and bacterial sequences (BPml = 100%; BPnj = 100%; BPP = 1.0) and suggest that dinoflagellates acquired PutA before the divergence of Karenia and Karlodinium.
Malate-quinone oxidoreductase (MQO) is a functional analog of the better-known NAD-dependent malate dehydrogenase (MDH) that catalyses the conversion of malate to oxaloacetate in the tricarboxylic acid (TCA) cycle. In contrast to MDH, bacterial MQO is a membrane-associated enzyme that utilizes flavin adenine dinucleotide (FAD) as a cofactor and donates the electrons from malate oxidation to quinones instead of NAD [53–55]. MQO is a protein common in bacteria. Among eukaryotes, this enzyme has been previously reported only for apicomplexans. Apicomplexans lack the mitochondrial form of MDH. It has been shown that MQO compensates for mitochondrial MDH in the TCA cycle in this lineage.
Surprisingly, comparison of the MQO sequences identified in dinoflagellates and haptophytes (MQO-DH) with the apicomplexan MQO (MQO-A) shows that these proteins share significant similarity only at the short N-terminal FAD-binding domains (22% overall amino acid sequence identity). The analysis of the protein distribution and phylogeny showed that MQO-DH and MQO-A have been acquired by chromalveolates from different bacterial donors through independent transfer events (Figs. 5A, 5B). The fact that MQO-A shows highest similarity to homologs in epsilon proteobacteria (all BLAST hits with e-value ≤ 10^-20) suggests that apicomplexans acquired MQO-A from this bacterial group. Homologs of MQO-DH have been identified in multiple bacterial lineages including firmicutes, actinobacteria, and three proteobacterial groups: alpha-, beta-, and gamma proteobacteria. Although the tree topology (Fig. 5A) does not allow us to identify the bacterial donor of the MQO-DH in chromalveolates, the presence of an N-terminal extension in both chromalveolate and proteobacterial MQO-DH sequences suggests a proteobacterial origin of this protein. Highly hydrophobic N-terminal extensions of the proteobacterial MQO sequences are likely responsible for the interaction of the protein with the bacterial membrane. The corresponding regions of the dinoflagellate MQO sequences have a low hydrophobicity and, according to the results of analyses with protein topology prediction programs, do not encode mitochondrial-, plastid-, peroxisomal-targeting or signal peptides (results not shown). Most likely, MQO-DH represents a cytosolic enzyme. To verify this hypothesis we assessed the presence/absence of MDH isoforms in haptophytes and dinoflagellates. We found a mitochondrial-targeted MDH in both lineages and a cytosolic MDH in haptophytes (see Additional file 1). The cytosolic isoform is absent from EST libraries of six dinoflagellate species that have been analyzed.
Because haptophytes retain both cytosolic enzymes, this observation suggests that cytosolic MDH was replaced by MQO in dinoflagellates. Analogous cases have been observed in prokaryotes. For example, Escherichia coli and Corynebacterium glutamicum contain both MQO and MDH [54, 55], and Helicobacter pylori has only MQO. The study of bacterial MQO shows that reactions catalyzed by this enzyme have a very favorable standard free energy difference (ΔG°) in comparison with reactions catalyzed by MDH. In addition, MQO uses carbon and energy sources different from those used by MDH. Therefore this enzyme may be beneficial for the cell under conditions unfavorable for MDH activity.
Monomeric NADP-dependent isocitrate dehydrogenase
NADP-dependent isocitrate dehydrogenase (NADP-IDH) is an important enzyme of the intermediary metabolism that controls the carbon flux within the TCA cycle and supplies the cell with 2-oxoglutarate and NADPH for biosynthesis. There are several NADP-IDH isoforms in photosynthetic organisms including cytosolic, mitochondrial, plastid, and peroxisomal enzymes. These four NADP-IDH isoforms have arisen in eukaryotes from a single progenitor enzyme. Eukaryotic NADP-IDH proteins form a dimeric structure composed of identical subunits of 40–50 kDa and share about 40% identity to the prokaryotic dimeric NADP-IDH (NADP-IDH-I) [58, 59].
Substrate-bound periplasmic binding protein
Bacterial substrate-bound periplasmic binding proteins (PBPb) are components of membrane-associated complexes that transport a wide variety of substrates, such as amino acids, peptides, sugars, vitamins, and inorganic ions. We found homologs of a bacterial PBPb in three lineages of photosynthetic chromalveolates and a photosynthetic excavate, Euglena gracilis (see Additional file 1). Although PBPb has a restricted distribution similar to NADP-IDH-II in photosynthetic eukaryotes, analyses with protein topology prediction programs do not support a plastid localization of PBPb. Phylogenetic analyses support the monophyly of chromalveolate and E. gracilis PBPb sequences (see Additional file 1), suggesting that, like MVIM-WECE and epimerase-II, this protein spread among eukaryotes through intradomain HGT.
Translation elongation factor 2
Recurrent HGT in protists
Gene distribution by multiple independent HGTs in eukaryotes
Multiple Independent HGTs
eukaryotic lineages involved
Fructose-bisphosphate aldolase class II, type B
Pyruvate phosphate dikinase
Translation elongation factor-1 alpha-like protein
Excavata, Chromalveolata, Opisthokonta, Plantae
Shikimate biosynthetic enzyme AroB
Chromalveolata, Opisthokonta (Fungi)
Excavata, Chromalveolata, Amoebozoa, Plantae (Green algae)
Excavata, Chromalveolata, Amoebozoa, Opisthokonta (Fungi)
Excavata, Chromalveolata, Amoebozoa, Plantae (Green algae), Opisthokonta (Fungi)
Excavata, Chromalveolata, Amoebozoa
Two features typical for phylogenetic trees resulting from the analysis of these proteins are: (1) the presence of several prokaryotic-eukaryotic clades within one tree (Fig. 3, Additional file 1) and (2) the presence of several species from distantly related eukaryotic lineages within one clade (Figs. 2, 3, 4, 5, 7, Additional file 1). These tree topologies may be explained by multiple independent inter- and intradomain transfers of genes encoding the same enzyme. The study of HGT in bacteria and parasitic protists demonstrates that adaptation to specific environments is the major force driving HGT [33, 72]. Genes beneficial under certain environmental conditions can independently be acquired by different eukaryotic lineages that occupy different niches. Reconstructing the phylogeny of proteins involved in anaerobic glycolysis in parasitic protists provides an illustration of this scenario. Here, the gene encoding fructose-bisphosphate aldolase class II, type B was acquired independently by Parabasalida and the common ancestor of Oxymonadida and Diplomonadida. Three prokaryote-to-eukaryote transfers explain the occurrence of pyruvate phosphate dikinase in Parabasalida, parasitic Euglenozoa, and Oxymonadida-Diplomonadida lineage. The aerobe-to-anaerobe transition occurred several times in the evolution of excavates. During this transition, different lineages of excavates independently acquired genes associated with anaerobic glycolysis from prokaryotes that had already inhabited corresponding niches.
The results of our study show that this scenario is applicable as well to free-living eukaryotes. Two isoforms of sugar epimerase, epimerase-II and epimerase-III, that originated via an ancient gene duplication event in prokaryotes were independently acquired by dinoflagellates and stramenopiles (Fig. 3). Two independent interdomain HGTs explain the occurrence of structurally distinct isoforms of bacterial MQO in free living haptophytes and dinoflagellates and parasitic apicomplexans. The second feature of phylogenetic trees resulting from the analysis of transferred genes, the monophyly of distantly related eukaryotes, may be explained either by intradomain (eukaryote-to-eukaryote) gene transfer or by several interdomain transfers from the same prokaryotic donor. This type of phylogeny is more likely to reflect specific relationships between microorganisms that occupy (or occupied in their evolutionary past) one ecological niche. Sequential HGTs that involved a prokaryote and two distantly related anaerobic protists have been previously proposed as an explanation for the patchy distribution of alcohol dehydrogenase, alanyl-tRNA synthetase, and fructose-bisphosphate aldolase class II, type B protein among eukaryotes [51, 69, 71] (Table 3). The bacteria-derived isoform of glyceraldehyde-3-phosphate dehydrogenase that functions as a cytosolic protein in free living dinoflagellates and Euglena and as a glycosomal protein in parasitic Euglenozoa provides another example of a protein derived by eukaryotes through sequential HGTs (Table 3). Gene acquisition through sequential HGTs is the most plausible scenario for the distribution of MVIM-WECE, epimerase-II, PBPb, and, possibly, MQO-DH and CAS-like protein presented in this paper. It is believed that HGT is more likely to occur between closely related lineages. Such transfers are hard to identify unless transferred genes have a limited distribution within the studied taxonomic group.
CAS-like protein and MQO-DH identified in this study are present in two chromalveolate lineages, haptophytes and dinoflagellates (Figs. 4, 5). Possible interpretations of a patchy gene distribution between closely related lineages include differential gene loss and gene transfer. The gene loss scenario would assume an independent gene loss from three chromalveolate lineages: stramenopiles, ciliates, and apicomplexans. However, haptophytes provide not only a food source but also a unique pool of temporary plastids (kleptoplastids) for several species of extant dinoflagellates [74, 75] and have contributed the plastid to the common ancestor of Karenia and Karlodinium [32, 76]. This demonstrates that these two relatively distantly related algal lineages have been involved in specific predator-prey interactions over millions of years, which makes sequential HGTs a more plausible scenario for the occurrence of MQO-DH and CAS-like protein in dinoflagellates.
Genes encoding MVIM, WECE, epimerase-II, and PBPb proteins are shared by bacteria and several lineages of chromalveolates and excavates. According to the results of our phylogenetic analyses, genes encoding these proteins were acquired by one eukaryotic lineage through an ancient interdomain HGT and transferred to another via intradomain HGT. Phagotrophy is widespread in chromalveolates and excavates; therefore, this feeding mode may explain an increased rate of HGT in these taxa [11, 52, 72, 77]: many extant species of excavates and chromalveolates feed on bacterial and eukaryotic microorganisms [78–82]. This dynamic process has made it impossible to identify donors and recipients in these eukaryote-to-eukaryote HGTs. An inconsistency between the gene phylogeny and species phylogeny observed in the prokaryotic region of the MVIM, WECE, and epimerase trees suggests that genes encoding these proteins are subjects of frequent HGTs in bacteria. Structural analyses of bacterial gene clusters that include close homologs of the K. brevis WECE (ftdB and qdtB) and epimerase-III (gepiA and wxoA) support this scenario [41, 45]. FtdB and qdtB belong to the large cluster of genes involved in the biosynthesis of surface layer glycoproteins (SLG) in firmicutes. It has been shown that the GC content of the SLG clusters deviates significantly in many bacteria from the GC content of the genome as a whole. This observation, together with the fact that SLG clusters are typically flanked by several transposases or remnants thereof, indicates that the entire SLG region may be a subject of HGT in bacteria. A similar conclusion resulted from the analysis of the bacterial lipopolysaccharide biosynthetic loci that include gepiA and wxoA. This observation completes the proposed scenario of gene distribution by sequential HGTs with an additional feature, prokaryote-to-prokaryote HGT.
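The GC-content comparison mentioned above (a gene cluster versus the genome as a whole) can be illustrated with a minimal Python sketch. This is not the procedure used in the cited studies; the function names and the decision to ignore ambiguous bases such as N are assumptions made for illustration only.

```python
def gc_content(seq: str) -> float:
    """Return GC content (%) of a nucleotide sequence.

    Assumption: only unambiguous A, C, G, T bases are counted;
    any other symbol (e.g. N) is ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def gc_deviation(region: str, genome: str) -> float:
    """Difference (percentage points) between a region and the whole genome."""
    return gc_content(region) - gc_content(genome)
```

A large absolute value of `gc_deviation` for a cluster is only a hint of foreign origin; what counts as a significant deviation depends on the genome and is not specified here.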
Measuring the contribution of HGT to eukaryotic genomes
Several attempts to numerically estimate the contribution of HGT to eukaryotic genomes suggest a substantial inter-taxon variation in the number of horizontally derived genes [7, 12, 52, 83, 84]. Existing studies show that although extremely rare in Plantae and multicellular Opisthokonta, HGT is a common phenomenon in Amoebozoa, Excavata, and chromalveolates. However, the variation in the numbers of HGTs reported for different species within the phagotrophic lineages (see Background) reflects a difference in analytical approaches. Differences in stringency of data screening parameters, the taxonomic composition of databases used for the comparative analysis, and methods of phylogenetic analyses can significantly affect the outcome of the study. Standardization of methods for estimating the contribution of HGT in eukaryotic genomes should be based on the knowledge of tempo and mode of evolution of horizontally transferred genes. To our knowledge, these issues have never been exhaustively studied in eukaryotes.
Studies of prokaryotic genome evolution demonstrate that many recently transferred genes have a very large KA/KS ratio, which suggests directional selection. In addition, the rate of duplications among genes derived through HGTs is significantly higher than among indigenous ones in bacteria. The proposed scenario for the fate of transferred genes in bacteria based on these observations includes their uptake, duplication, rapid diversification of gene copies by mutations, and consequent fixation of the "best" copies and elimination of other duplicates. Is this scenario applicable to eukaryotes?
Analyses of proteins presented in this study show that nine of them are encoded by at least 2–12 genes in the K. brevis genome (Table 1). Following the standard approach of phylogenomics, we excluded from the analysis all proteins represented by multiple paralogs in several eukaryotic lineages. Therefore we investigated only those paralogs that arose from relatively recent duplications. The fact that several dinoflagellate species contain multiple highly divergent (< 50% amino acid identity) paralogs of ATS1 (Additional file 1) suggests that duplication of genes encoding this protein occurred before the divergence of dinoflagellate lineages. Phylogenetic analyses of ATS1 support this statement (see Additional file 1). The amino acid sequence identity of paralogs resulting from duplications that, according to phylogenetic analyses, occurred after the Karenia and Karlodinium divergence varies from 60% (MQO) to 99% (WECE). The comparison of genes encoding WECE protein shows that the amino acid sequence identity of different copies varies from 81 to 99%. Assuming that divergence between two paralogs is proportional to the time elapsed since gene duplication, the observed variation suggests that duplication of WECE genes is a continuous process in K. brevis.
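The identity ranges quoted above are pairwise comparisons of aligned paralogs. A minimal Python sketch of such a calculation follows. It assumes the sequences have already been aligned to equal length and that identity is computed over columns where neither sequence has a gap; the paper does not state which denominator was used, so this convention and all function names are illustrative assumptions.

```python
from itertools import combinations


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def pairwise_identity_range(aligned: dict[str, str]) -> tuple[float, float]:
    """Return (min, max) percent identity over all pairs of aligned paralogs."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    values = [
        percent_identity(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    ]
    return min(values), max(values)
```

For a set of paralogs, `pairwise_identity_range` gives the kind of spread reported in the text (for example 81 to 99% for WECE), where a wide spread is read as evidence of duplications of different ages.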
To summarize, the results of this analysis demonstrate that the impact of HGT on genome evolution in dinoflagellates is reinforced by continuous duplications of the transferred genes and consequent diversification of the resulting paralogs. Additional studies are required to estimate relative duplication rates of foreign and indigenous genes, rates of mutations, and paralog silencing in this group of organisms.
Given that genomic data are available for only a minuscule fraction of bacteria and protists populating our planet, and that we were nevertheless able to identify multiple cases of HGT of genes encoding the same proteins, we reach one simple conclusion: horizontal gene transfer contributes significantly to protist genomes. We believe that in niches where parasitism and phagotrophy are common, beneficial genes may spread rapidly from prokaryotes to eukaryotes and provide a molecular basis for niche-specific adaptations in the latter group. It is clear, however, that not all genes are transferred with equal frequency in eukaryotes, with the majority of HGT candidates being involved in metabolic processes. However, given that foreign DNA fragments from eukaryotes frequently integrate in protist chromosomes, it should not be surprising that occasionally genes encoding proteins of a more universally conserved function such as EF2 and potentially EF-1 alpha-like may also be co-transferred. Apart from the exciting ramifications for post-HGT gene evolution in eukaryotes that includes gene family evolution and selection for novel functions, our work also underlines the great care that needs to be taken when generating eukaryote-wide trees of life that include many phagotrophic or parasitic taxa.
Karenia brevis EST library
In this study, we used EST data generated from clonal K. brevis Wilson cells grown under five different culture conditions: 1) under nitrate depletion, 2) under phosphate depletion, 3) in log phase under replete conditions, harvested during the light phase, 4) in the presence of oxidative metals, and 5) undergoing heat stress. For complete information about generation, sequencing, and processing of the K. brevis EST library see reference . Clustering and assembly of the ESTs were done using default settings of the TGICL computer program [86, 87]. The assembly resulted in 9,786 EST contigs, each representing a unique gene.
Identification of proteins acquired by chromalveolates through ancient HGTs
To identify ancient HGTs in chromalveolates, we analyzed a subset of genes present in the K. brevis EST data and in the EST data of at least one other species of dinoflagellate. This approach allowed us to exclude possible bacterial contaminants of the EST library from the analysis. Genes shared by dinoflagellates were detected using K. brevis ESTs as input for a sequence similarity search (BLAST; e-value ≤ 10⁻¹⁰) against a local database that included available data from the GenBank dbEST database for five dinoflagellate species: Alexandrium tamarense, Amphidinium carterae, Heterocapsa triquetra, Lingulodinium polyedrum, and Karlodinium micrum. This analysis yielded 3,341 EST contigs. To detect potential HGTs, we used this subset of K. brevis DNA sequences as input for a sequence similarity search (BLASTx; e-value ≤ 10⁻²⁰) against the GenBank non-redundant database (nr). Sequences that showed the highest similarity to prokaryotic proteins (three top hits), or to chromalveolate and prokaryotic proteins, were selected for further analyses. Using this approach, we identified 95 K. brevis unigenes encoding proteins from 55 protein families putatively derived from prokaryotes at different time points of chromalveolate evolution.
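The "best hit" screen described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes the BLASTx hits have already been parsed into simple records carrying a taxonomic group label, and the names `Hit`, `candidate_queries`, the group labels, and the contig identifiers are all hypothetical. The rule for "chromalveolate and prokaryotic proteins" is interpreted here as "the top three hits contain at least one prokaryote and nothing outside these two groups".

```python
from collections import defaultdict
from typing import NamedTuple

E_VALUE_CUTOFF = 1e-20  # BLASTx threshold stated in the text
TOP_N = 3               # "three top hits"


class Hit(NamedTuple):
    query: str     # contig identifier (hypothetical names in the example)
    group: str     # "prokaryote", "chromalveolate", or "other_eukaryote"
    evalue: float


def candidate_queries(hits):
    """Return queries whose top hits are prokaryotic, or a mix of
    chromalveolate and prokaryotic proteins only (assumed interpretation)."""
    by_query = defaultdict(list)
    for hit in hits:
        if hit.evalue <= E_VALUE_CUTOFF:
            by_query[hit.query].append(hit)

    selected = []
    for query, query_hits in by_query.items():
        top = sorted(query_hits, key=lambda h: h.evalue)[:TOP_N]
        groups = {h.group for h in top}
        if "prokaryote" in groups and groups <= {"prokaryote", "chromalveolate"}:
            selected.append(query)
    return sorted(selected)


example_hits = [
    Hit("contig_A", "prokaryote", 1e-45),
    Hit("contig_A", "prokaryote", 1e-40),
    Hit("contig_A", "chromalveolate", 1e-38),
    Hit("contig_B", "other_eukaryote", 1e-60),
    Hit("contig_B", "prokaryote", 1e-30),
    Hit("contig_C", "prokaryote", 1e-12),  # fails the e-value cutoff
]
print(candidate_queries(example_hits))
```

In this toy input only `contig_A` passes: `contig_B` has a non-chromalveolate eukaryotic top hit, and the single hit for `contig_C` is weaker than the cutoff.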
In parallel, we performed a high throughput automated analysis of the subset of sequences shared by dinoflagellates. The 3,341 sequences were translated in all six reading frames using the Transeq program in the Emboss package and used as input for the analysis with the PhyloGenie package of computer programs. PhyloGenie serves as an automated pipeline in which the following analyses can be implemented: BLAST search against a local database, extraction of homologous sequences from the BLAST results, generation of alignments, phylogenetic tree reconstruction, and calculation of bootstrap support values for individual phylogenies. We created a local protein database for the PhyloGenie BLAST search by retrieving completed genome sequences and EST data from the National Center for Biotechnology Information (NCBI) genomic projects web site and dbEST, the DOE Joint Genome Institute (JGI), the Cyanidioschyzon merolae genome project, and the Galdieria sulphuraria genome project for the species listed below. EST sequences were translated in all six reading frames and combined with protein sequences. The final fasta file that included all of the data was formatted using the formatdb program in the BLAST package. Our local database included complete genome and EST data for the following species: Oryza sativa, Drosophila melanogaster, and Saccharomyces cerevisiae; the green alga Chlamydomonas reinhardtii; the red algae C. merolae and G. sulphuraria; the chromalveolates Guillardia theta, Emiliania huxleyi, Thalassiosira pseudonana, Plasmodium falciparum, Toxoplasma gondii, A. tamarense, A. carterae, H. triquetra, L. polyedrum, and K. micrum; the excavates Giardia lamblia and Trypanosoma brucei; the archaea Halobacterium sp. NRC-1, Methanothermobacter thermautotrophicus, and Sulfolobus tokodaii; and the eubacteria Clostridium acetobutylicum ATCC824, Escherichia coli 536, Geobacter sulfurreducens, Oceanobacillus iheyensis, Synechococcus elongatus PCC 7942, Trichodesmium erythraeum, and Nostoc sp. PCC 7120.
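For readers unfamiliar with six-frame translation, the step performed here with Transeq can be illustrated by a small self-contained sketch. It is not a reimplementation of Transeq: the function names and frame labels are hypothetical, and the use of the standard genetic code (NCBI translation table 1) is an assumption, since the text does not state which table was used.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations for three forward (+1..+3) and three reverse
    (-1..-3) frames; the frame labels are an illustrative convention."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Translating every EST in all six frames avoids having to guess the strand and reading frame of each transcript before the protein-level BLAST search, at the cost of adding five mostly spurious translations per sequence to the database.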
PhyloGenie was run using default settings except that the minimum expect "e-value" for the BLAST search of the data was set at 10. The hidden Markov model (hmm) alignments were built using all hits with an e-value below 0.01. The program TreeView was used to visualize the resulting trees. We selected sequences represented by trees that contained only chromalveolates and bacteria and trees that contained well-defined chromalveolate-bacterial clades (at least 50% bootstrap support). Using this selection criterion, we excluded from the analyses genes of eukaryotic and mitochondrial origin that are shared by most eukaryotic organisms. In addition, this approach allowed us to exclude genes of red algal and green algal origin acquired by chromalveolates from the genomes of plastid progenitors via EGT (see [25, 26] for detailed analyses of EGT in chromalveolates). This analysis yielded 37 unigenes encoding proteins from 23 different protein families; 22 of these families were a subset of the proteins identified using the "Best hit" approach. A combination of the two methods described above was needed to detect putative HGTs because neither the non-redundant database (nr) nor our local database included all taxa of interest. The local database for the PhyloGenie BLAST search complemented nr with complete genome and EST data for free-living protists. In addition to gene discovery, the PhyloGenie output was used to verify the phylogeny of proteins identified using the "Best hit" approach. Based on the PhyloGenie results, eight proteins represented by 12 unigenes were excluded from the analysis as derived from the genome of the plastid progenitor through EGT. Three proteins represented by five unigenes were rejected as shared by multiple eukaryotic lineages. The remaining set of 45 proteins, represented by 80 unigenes in the K. brevis EST data, was subjected to detailed analyses.
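The tree selection rule stated above (trees containing only chromalveolates and bacteria, or trees with a chromalveolate-bacterial clade supported by at least 50% bootstrap) can be expressed as a simple predicate. The sketch below is hypothetical and does not use PhyloGenie output: it assumes each tree has already been summarized as the set of taxonomic groups it contains plus a list of clades, each given as a set of groups and a bootstrap percentage.

```python
ALLOWED_GROUPS = {"chromalveolate", "bacteria"}
MIN_BOOTSTRAP = 50  # percent, as stated in the text


def is_hgt_candidate(tree_groups, clades):
    """tree_groups: set of group labels for all sequences in the tree.
    clades: iterable of (set_of_group_labels, bootstrap_percent) pairs.
    Both inputs are an assumed, simplified summary of a gene tree."""
    # Case 1: the tree contains only chromalveolates and bacteria.
    if set(tree_groups) == ALLOWED_GROUPS:
        return True
    # Case 2: a well-supported clade uniting chromalveolates and bacteria.
    return any(
        set(groups) == ALLOWED_GROUPS and support >= MIN_BOOTSTRAP
        for groups, support in clades
    )


only_two_groups = is_hgt_candidate({"chromalveolate", "bacteria"}, [])
supported_clade = is_hgt_candidate(
    {"chromalveolate", "bacteria", "plant"},
    [({"chromalveolate", "bacteria"}, 78), ({"plant"}, 100)],
)
weak_clade = is_hgt_candidate(
    {"chromalveolate", "bacteria", "plant"},
    [({"chromalveolate", "bacteria"}, 35)],
)
red_algal_clade = is_hgt_candidate(
    {"chromalveolate", "red_alga", "plant"},
    [({"chromalveolate", "red_alga"}, 90)],
)
print(only_two_groups, supported_clade, weak_clade, red_algal_clade)
```

The last case shows why the rule also filters out EGT-derived genes: a chromalveolate sequence nested with red algae never forms a chromalveolate-bacterial clade, so the tree is not selected.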
Identification of bacteria-derived proteins in K. brevis involved in cell wall biogenesis
In bacteria, genes encoding physiologically coupled reactions are often transferred together, frequently in an operon. To test whether this scenario applies to interdomain prokaryote-to-eukaryote transfers, we screened the K. brevis EST data for the presence of homologs of bacterial proteins involved in cell envelope biogenesis. This category of proteins was chosen to identify genes that may have been co-transferred with the genes encoding MVIM-WECE. The latter represents a rare case of gene fusion in interdomain HGT. Gene clusters involved in surface layer protein biosynthesis in Aneurinibacillus thermoaerophilus and glycopeptidolipid biosynthesis in Mycobacterium avium were retrieved from GenBank and used as input for a sequence similarity search (BLAST; e-value ≤ 10⁻²⁰) against the non-redundant gene set generated from the K. brevis EST data. This analysis allowed us to identify three additional proteins represented by five unigenes in the K. brevis EST data.
Identification of eukaryotic proteins acquired by chromalveolates through intradomain HGT
Genes derived through intradomain eukaryote-to-eukaryote HGT were extremely hard to identify using high throughput phylogenomic analyses because of the limited number of taxa included in our local database. The only candidate for the transfer of a bona fide eukaryotic gene among taxa is EF2, which was identified in the course of another study of potential markers for reconstructing the eukaryotic tree of life. The detailed description of methods used for the EF2 analysis can be found in reference .
To assess the possibility that the EF2 sequence found in the K. brevis data resulted from contamination of the EST library with kinetoplastid DNA, we used the complete K. brevis EST data set as input for a sequence similarity search (BLASTx; e-value ≤ 10⁻¹⁰) against the GenBank non-redundant database (nr). Sequences that showed the highest similarity to kinetoplastid genes were subjected to detailed analyses. With the exception of EF2, this work did not provide any support for a kinetoplastid origin of the identified K. brevis sequences (results not shown).
Building the final alignments
To build the final alignments, we identified homologs of the candidate K. brevis sequences using BLAST searches (e-value ≤ 10⁻¹⁰) against GenBank nr, dbEST, and public protist databases including the JGI database, the French National Sequencing Center (Genoscope), the Protist EST Program database, and the C. merolae [92, 97] and G. sulphuraria [93, 98] databases. The DNA sequences were translated, and the amino acid data for each protein were manually aligned with the identified bacterial and eukaryotic homologs using BioEdit. Only regions that were unambiguously aligned were retained for phylogenetic analysis.
Analysis of sequence structure and phylogeny
The eukaryotic structure of the identified K. brevis transcripts was verified by translating them and aligning the resulting amino acid sequences with their bacterial homologs. The subcellular localization of the studied proteins was predicted using the online protein topology prediction programs SignalP [100, 101], TargetP [102, 103], MITOPROT, PSORT, and TMHMM. Percent amino acid sequence identity between gene copies was inferred using bl2seq, an online tool for pairwise sequence alignment. GC content of nucleotide sequences was determined using BioEdit. The average and range of nucleotide composition of protist genomes were inferred from the analysis of coding regions of 50 sequences from each species.
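The GC content summary described above (an average and a range over a sample of coding sequences) is simple to reproduce. The authors used BioEdit; the sketch below is only an illustrative equivalent with hypothetical function names, and its handling of ambiguous bases (they are ignored) is an assumption rather than documented BioEdit behavior.

```python
def gc_content(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


def summarize_gc(sequences):
    """Mean and range of GC content over a collection of coding sequences."""
    values = [gc_content(seq) for seq in sequences]
    return {
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


# Toy sequences only; these are not data from the study.
toy_sequences = ["ATGC", "GGCC", "ATAT"]
print(summarize_gc(toy_sequences))
```

Comparing the GC content of a candidate gene with the mean and range computed this way for the recipient genome is one way to judge whether a transferred gene has had time to adapt to the nucleotide composition of its new host.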
We used the maximum likelihood (ML) method to reconstruct the gene phylogenies. The ML analysis was done in PHYML V2.4.3 [106, 107] using the WAG + Γ + I evolutionary model and tree optimization. The alpha values for the gamma distribution were calculated using eight rate categories. To test the stability of monophyletic groups in the ML trees, we calculated PHYML bootstrap support values (100 replicates). In addition, we calculated bootstrap values (500 replicates) using the neighbor joining (NJ) method with JTT + Γ distance matrices in PHYLIP V3.63. The NJ analysis was done with randomized taxon addition. Bayesian posterior probabilities for nodes in the ML tree were calculated using MrBayes V3.0b4 and the WAG + Γ model. The Metropolis-coupled Markov chain Monte Carlo, started from a random tree, was run for 1,000,000 generations with trees sampled every 1,000 cycles. The initial 200,000 cycles (200 trees) were discarded as the "burn in." A consensus tree was made from the remaining 800 phylogenies to determine the posterior probabilities at the different nodes.
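The sampling bookkeeping for the Bayesian run can be checked with a few lines of arithmetic. The sketch below takes the run length, the sampling interval, and the number of discarded trees from the text; the helper function is hypothetical and not part of MrBayes, and it ignores the starting tree, which some programs also write to the sample file. At one sample per 1,000 generations, 200 discarded trees span 200,000 generations and leave the 800 trees used for the consensus.

```python
def mcmc_sample_counts(generations, sample_every, burn_in_trees):
    """Relate run length, sampling interval, and burn-in for an MCMC run."""
    sampled_trees = generations // sample_every
    if burn_in_trees > sampled_trees:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return {
        "sampled_trees": sampled_trees,
        "burn_in_generations": burn_in_trees * sample_every,
        "retained_trees": sampled_trees - burn_in_trees,
    }


counts = mcmc_sample_counts(
    generations=1_000_000,  # chain length stated in the text
    sample_every=1_000,     # sampling interval stated in the text
    burn_in_trees=200,      # trees discarded as burn-in
)
print(counts)
```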
The names of the K. brevis proteins are derived from the names of the corresponding protein domains according to the Pfam [29, 30] nomenclature. The K. brevis sequences listed in Table 1 have been deposited in GenBank under accession numbers EF540322–EF540340.
This work was supported by grants from the National Science Foundation and the National Aeronautics and Space Administration awarded to D.B. (EF 04-31117, NNG04GM17G). T.N. was partially supported by an Avis E. Cone Fellowship from the University of Iowa. We are grateful to Frances M. Van Dolah (NOAA, Charleston, SC) for helpful advice and to the US Department of Energy Joint Genome Institute for having generated the K. brevis data for another project.
- Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405: 299-304. 10.1038/35012500.
- Boucher Y, Douady CJ, Papke RT, Walsh DA, Boudreau ME, Nesbo CL, Case RJ, Doolittle WF: Lateral gene transfer and the origins of prokaryotic groups. Annu Rev Genet. 2003, 37: 283-328. 10.1146/annurev.genet.37.050503.084247.
- Dagan T, Martin W: Ancestral genome sizes specify the minimum rate of lateral gene transfer during prokaryote evolution. Proc Natl Acad Sci USA. 2007, 104: 870-875. 10.1073/pnas.0606318104.
- Goldenfeld N, Woese C: Biology's next revolution. Nature. 2007, 445: 369. 10.1038/445369a.
- Prince VE, Pickett FB: Splitting pairs: the diverging fates of duplicated genes. Nat Rev Genet. 2002, 3: 827-837. 10.1038/nrg928.
- Ohno S: Evolution by Gene Duplication. 1970, Berlin: Springer
- Loftus B, Anderson I, Davies R, Alsmark UC, Samuelson J, Amedeo P, Roncaglia P, Berriman M, Hirt RP, Mann BJ: The genome of the protist parasite Entamoeba histolytica. Nature. 2005, 433: 865-868. 10.1038/nature03291.
- Andersson JO, Sjogren AM, Horner DS, Murphy CA, Dyal PL, Svard SG, Logsdon JM, Ragan MA, Hirt RP, Roger AJ: A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution. BMC Genomics. 2007, 8: 51. 10.1186/1471-2164-8-51.
- Carlton JM, Hirt RP, Silva JC, Delcher AL, Schatz M, Zhao Q, Wortman JR, Bidwell SL, Alsmark UC, Besteiro S: Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science. 2007, 315: 207-212. 10.1126/science.1132894.
- Huang J, Mullapudi N, Lancto CA, Scott M, Abrahamsen MS, Kissinger JC: Phylogenomic evidence supports past endosymbiosis, intracellular and horizontal gene transfer in Cryptosporidium parvum. Genome Biol. 2004, 5: R88. 10.1186/gb-2004-5-11-r88.
- Ricard G, McEwan NR, Dutilh BE, Jouany JP, Macheboeuf D, Mitsumori M, McIntosh FM, Michalowski T, Nagamine T, Nelson N: Horizontal gene transfer from bacteria to rumen ciliates indicates adaptation to their anaerobic, carbohydrates-rich environment. BMC Genomics. 2006, 7: 22. 10.1186/1471-2164-7-22.
- Eichinger L, Pachebat JA, Glockner G, Rajandream MA, Sucgang R, Berriman M, Song J, Olsen R, Szafranski K, Xu Q: The genome of the social amoeba Dictyostelium discoideum. Nature. 2005, 435: 43-57. 10.1038/nature03481.
- Keeling PJ, Inagaki Y: A class of eukaryotic GTPase with a punctate distribution suggesting multiple functional replacements of translation elongation factor 1alpha. Proc Natl Acad Sci USA. 2004, 101: 15380-15385. 10.1073/pnas.0404505101.
- Takishita K, Ishida K, Maruyama T: An enigmatic GAPDH gene in the symbiotic dinoflagellate genus Symbiodinium and its related species (the order Suessiales): possible lateral gene transfer between two eukaryotic algae, dinoflagellate and euglenophyte. Protist. 2003, 154: 443-454. 10.1078/143446103322454176.
- Waller RF, Slamovits CH, Keeling PJ: Lateral gene transfer of a multigene region from cyanobacteria to dinoflagellates resulting in a novel plastid-targeted fusion protein. Mol Biol Evol. 2006, 23: 1437-1443. 10.1093/molbev/msl008.
- Archibald JM, Rogers MB, Toop M, Ishida K, Keeling PJ: Lateral gene transfer and the evolution of plastid-targeted proteins in the secondary plastid-containing alga Bigelowiella natans. Proc Natl Acad Sci USA. 2003, 100: 7678-7683. 10.1073/pnas.1230951100.
- The Protist EST Program database (TBestDB). [http://megasun.bch.umontreal.ca/pepdb/pepdb.html]
- NCBI. Expressed Sequence Tags database. [http://www.ncbi.nlm.nih.gov/dbEST/index.html]
- Cavalier-Smith T: Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J Eukaryot Microbiol. 1999, 46: 347-366. 10.1111/j.1550-7408.1999.tb04614.x.
- Parfrey L, Barbero E, Lasser E, Dunthorn M, Bhattacharya D, Patterson D, Katz L: Evaluating support for the current classification of eukaryotic diversity. PLoS Genet. 2006, 2: e220. 10.1371/journal.pgen.0020220.
- Hackett JD, Yoon HS, Li S, Reyes-Prieto A, Rummele SE, Bhattacharya D: Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of 'Rhizaria' with chromalveolates. Mol Biol Evol. 2007.
- Patron NJ, Inagaki Y, Keeling PJ: Multiple gene phylogenies support the monophyly of cryptomonad and haptophyte host lineages. Curr Biol. 2007, 17: 887-891. 10.1016/j.cub.2007.03.069.
- Archibald JM, Keeling PJ: Recycled plastids: a 'green movement' in eukaryotic evolution. Trends Genet. 2002, 18: 577-584. 10.1016/S0168-9525(02)02777-4.
- Hackett JD, Yoon HS, Soares MB, Bonaldo MF, Casavant TL, Scheetz TE, Nosenko T, Bhattacharya D: Migration of the plastid genome to the nucleus in a peridinin dinoflagellate. Curr Biol. 2004, 14: 213-218. 10.1016/S0960-9822(04)00042-9.
- Nosenko T, Lidie KL, Van Dolah FM, Lindquist E, Cheng JF, Bhattacharya D: Chimeric plastid proteome in the Florida "red tide" dinoflagellate Karenia brevis. Mol Biol Evol. 2006, 23: 2026-2038. 10.1093/molbev/msl074.
- Li S, Nosenko T, Hackett JD, Bhattacharya D: Phylogenomic analysis identifies red algal genes of endosymbiotic origin in the chromalveolates. Mol Biol Evol. 2006, 23: 663-674. 10.1093/molbev/msj075.
- Patron NJ, Waller RF, Keeling PJ: A tertiary plastid uses genes from two endosymbionts. J Mol Biol. 2006, 357: 1373-1382. 10.1016/j.jmb.2006.01.084.
- Fleming LE, Backer LC, Baden DG: Overview of aerosolized Florida red tide toxins: exposures and effects. Environ Health Perspect. 2005, 113: 618-620.
- Sonnhammer EL, Eddy SR, Durbin R: Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins. 1997, 28: 405-420. 10.1002/(SICI)1097-0134(199707)28:3<405::AID-PROT10>3.0.CO;2-L.
- Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res. 2002, 30: 276-280. 10.1093/nar/30.1.276.
- TMHMM Server v. 2.0. Prediction of transmembrane helices in proteins. [http://www.cbs.dtu.dk/services/TMHMM-2.0/]
- Yoon HS, Hackett JD, Van Dolah FM, Nosenko T, Lidie KL, Bhattacharya D: Tertiary endosymbiosis driven genome evolution in dinoflagellate algae. Mol Biol Evol. 2005, 22: 1299-1308. 10.1093/molbev/msi118.
- Pal C, Papp B, Lercher MJ: Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nat Genet. 2005, 37: 1372-1375. 10.1038/ng1686.
- Xue Y, Zhao L, Liu HW, Sherman DH: A gene cluster for macrolide antibiotic biosynthesis in Streptomyces venezuelae: architecture of metabolic diversity. Proc Natl Acad Sci USA. 1998, 95: 12111-12116. 10.1073/pnas.95.21.12111.
- Xue Y, Wilson D, Zhao L, Liu H, Sherman DH: Hydroxylation of macrolactones YC-17 and narbomycin is mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae. Chem Biol. 1998, 5: 661-667. 10.1016/S1074-5521(98)90293-9.
- Anzai Y, Saito N, Tanaka M, Kinoshita K, Koyama Y, Kato F: Organization of the biosynthetic gene cluster for the polyketide macrolide mycinamicin in Micromonospora griseorubida. FEMS Microbiol Lett. 2003, 218: 135-141. 10.1111/j.1574-6968.2003.tb11509.x.
- Brikun IA, Reeves AR, Cernota WH, Luu MB, Weber JM: The erythromycin biosynthetic gene cluster of Aeromicrobium erythreum. J Ind Microbiol Biotechnol. 2004, 31: 335-344.
- Awram P, Smit J: Identification of lipopolysaccharide O antigen synthesis genes required for attachment of the S-layer of Caulobacter crescentus. Microbiology. 2001, 147: 1451-1460.
- Bastin DA, Reeves PR: Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111. Gene. 1995, 164: 17-23. 10.1016/0378-1119(95)00459-J.
- Nesper J, Kraiß A, Schild S, Blaß J, Klose KE, Bockemühl J, Reidl J: Role of Vibrio cholerae O139 surface polysaccharides in intestinal colonization. Infect Immun. 2002, 70: 2419-2433. 10.1128/IAI.70.5.2419-2433.2002.
- Novotny R, Pfoestl A, Messner P, Schaffer C: Genetic organization of chromosomal S-layer glycan biosynthesis loci of Bacillaceae. Glycoconj J. 2004, 20: 435-447. 10.1023/B:GLYC.0000038290.74944.65.
- Schaffer C, Messner P: Surface-layer glycoproteins: an example for the diversity of bacterial glycosylation with promising impacts on nanobiotechnology. Glycobiology. 2004, 14: 31R-42R. 10.1093/glycob/cwh064.
- Pfoestl A, Hofinger A, Kosma P, Messner P: Biosynthesis of dTDP-3-acetamido-3,6-dideoxy-alpha-D-galactose in Aneurinibacillus thermoaerophilus L420-91T. J Biol Chem. 2003, 278: 26410-26417. 10.1074/jbc.M300858200.
- Eckstein TM, Belisle JT, Inamine JM: Proposed pathway for the biosynthesis of serovar-specific glycopeptidolipids in Mycobacterium avium serovar 2. Microbiology. 2003, 149: 2797-2807. 10.1099/mic.0.26528-0.
- Patil PB, Sonti RV: Variation suggestive of horizontal gene transfer at a lipopolysaccharide (lps) biosynthetic locus in Xanthomonas oryzae pv. oryzae, the bacterial leaf blight pathogen of rice. BMC Microbiol. 2004, 4: 40. 10.1186/1471-2180-4-40.
- Seifert GJ: Nucleotide sugar interconversions and cell wall biosynthesis: how to bring the inside to the outside. Curr Opin Plant Biol. 2004, 7: 277-284. 10.1016/j.pbi.2004.03.004.
- Lopez AB, Sener K, Jarroll EL, van Keulen H: Transcription regulation is demonstrated for five key enzymes in Giardia intestinalis cyst wall polysaccharide biosynthesis. Mol Biochem Parasitol. 2003, 128: 51-57. 10.1016/S0166-6851(03)00049-5.
- Prescott AG, Lloyd MD: The iron(II) and 2-oxoacid-dependent dioxygenases and their role in metabolism. Nat Prod Rep. 2000, 17: 367-383. 10.1039/a902197c.
- Baggaley KH, Brown AG, Schofield CJ: Chemistry and biosynthesis of clavulanic acid and other clavams. Nat Prod Rep. 1997, 14: 309-333. 10.1039/np9971400309.
- WoLF PSORT. Protein Subcellular Localization Prediction.
- Andersson JO, Hirt RP, Foster PG, Roger AJ: Evolution of four gene families with patchy phylogenetic distributions: influx of genes into protist genomes. BMC Evol Biol. 2006, 6: 27. 10.1186/1471-2148-6-27.
- Andersson JO: Lateral gene transfer in eukaryotes. Cell Mol Life Sci. 2005, 62: 1182-1197. 10.1007/s00018-005-4539-z.
- Kather B, Stingl K, van der Rest ME, Altendorf K, Molenaar D: Another unusual type of citric acid cycle enzyme in Helicobacter pylori: the malate:quinone oxidoreductase. J Bacteriol. 2000, 182: 3204-3209. 10.1128/JB.182.11.3204-3209.2000.
- Molenaar D, van der Rest ME, Drysch A, Yucel R: Functions of the membrane-associated and cytoplasmic malate dehydrogenases in the citric acid cycle of Corynebacterium glutamicum. J Bacteriol. 2000, 182: 6884-6891. 10.1128/JB.182.24.6884-6891.2000.
- Van der Rest ME, Frank C, Molenaar D: Functions of the membrane-associated and cytoplasmic malate dehydrogenases in the citric acid cycle of Escherichia coli. J Bacteriol. 2000, 182: 6892-6899. 10.1128/JB.182.24.6892-6899.2000.
- Gardner MJ, Shallom SJ, Carlton JM, Salzberg SL, Nene V, Shoaibi A, Ciecko A, Lynn J, Rizzo M, Weaver B: Sequence of Plasmodium falciparum chromosomes 2, 10, 11 and 14. Nature. 2002, 419: 531-534. 10.1038/nature01094.
- Uyemura SA, Luo S, Vieira M, Moreno SN, Docampo R: Oxidative phosphorylation and rotenone-insensitive malate- and NADH-quinone oxidoreductases in Plasmodium yoelii yoelii mitochondria in situ. J Biol Chem. 2004, 279: 385-393. 10.1074/jbc.M307264200.
- Chen RD, Gadal P: Structure, function and regulation of NAD and NADP dependent isocitrate dehydrogenase in higher plants and in other organisms. Plant Physiol Biochem. 1990, 28: 411-427.
- Schnarrenberger C, Martin W: Evolution of the enzymes of the citric acid cycle and the glyoxylate cycle of higher plants. A case study of endosymbiotic gene transfer. Eur J Biochem. 2002, 269: 868-883. 10.1046/j.0014-2956.2001.02722.x.
- Lang M, Apt KE, Kroth PG: Protein transport into "complex" diatom plastids utilizes two different targeting signals. J Biol Chem. 1998, 273: 30973-30978. 10.1074/jbc.273.47.30973.
- Patron NJ, Waller RF, Archibald JM, Keeling PJ: Complex protein targeting to dinoflagellate plastids. J Mol Biol. 2005, 348: 1015-1024. 10.1016/j.jmb.2005.03.030.
- Suzuki M, Sahara T, Tsuruha J, Takada Y, Fukunaga N: Differential expression in Escherichia coli of the Vibrio sp. strain ABE-1 icdI and icdII genes encoding structurally different isocitrate dehydrogenase isozymes. J Bacteriol. 1995, 177: 2138-2142.
- Kang CH, Shin WC, Yamagata Y, Gokcen S, Ames GF, Kim SH: Crystal structure of the lysine-, arginine-, ornithine-binding protein (LAO) from Salmonella typhimurium at 2.7-A resolution. J Biol Chem. 1991, 266: 23893-23899.
- Jorgensen R, Carr-Schmid A, Ortiz PA, Kinzy TG, Andersen GR: Purification and crystallization of the yeast elongation factor eEF2. Acta Crystallogr D Biol Crystallogr. 2002, 58: 712-715. 10.1107/S0907444902003001.
- Moreira D, Le Guyader H, Philippe H: The origin of red algae and the evolution of chloroplasts. Nature. 2000, 405: 69-72. 10.1038/35011054.
- Regier JC, Shultz JW: Elongation factor-2: a useful gene for arthropod phylogenetics. Mol Phylogenet Evol. 2001, 20: 136-148. 10.1006/mpev.2001.0956.
- Kullnig-Gradinger CM, Szakacs G, Kubicek CP: Phylogeny and evolution of the genus Trichoderma: a multigene approach. Mycol Res. 2002, 106: 757-767. 10.1017/S0953756202006172.
- Bhattacharya D, Yoon HS, Hackett JD: Chromalveolates unite: endosymbiosis connects the dots. BioEssays. 2004, 26: 50-60. 10.1002/bies.10376.
- Liapounova NA, Hampl V, Gordon PM, Sensen CW, Gedamu L, Dacks JB: Reconstructing the mosaic glycolytic pathway of the anaerobic eukaryote Monocercomonoides. Eukaryot Cell. 2006, 5: 2138-2146. 10.1128/EC.00258-06.
- Andersson JO, Roger AJ: Evolution of glutamate dehydrogenase genes: evidence for lateral gene transfer within and between prokaryotes and eukaryotes. BMC Evol Biol. 2003, 3: 14. 10.1186/1471-2148-3-14.
- Andersson JO, Sarchfield SW, Roger AJ: Gene transfers from nanoarchaeota to an ancestor of diplomonads and parabasalids. Mol Biol Evol. 2005, 22: 85-90. 10.1093/molbev/msh254.
- Andersson J: Genome evolution of anaerobic protists: metabolic adaptation via gene acquisition. Genomics and Evolution of Microbial Eukaryotes. Edited by: Katz L, Bhattacharya D. 2006, Oxford University Press, 109-122.
- Hao W, Golding GB: The fate of laterally transferred genes: life in the fast lane to adaptation or death. Genome Res. 2006, 16: 636-643. 10.1101/gr.4746406.
- Koike K, Sekiguchi H, Kobiyama A, Takishita K, Kawachi M, Koike K, Ogata T: A novel type of kleptoplastidy in Dinophysis (Dinophyceae): presence of haptophyte-type plastid in Dinophysis mitra. Protist. 2005, 156: 225-237. 10.1016/j.protis.2005.04.002.
- Gast RJ, Moran DM, Dennett MR, Caron DA: Kleptoplasty in an Antarctic dinoflagellate: caught in evolutionary transition?. Environ Microbiol. 2007, 9: 39-45. 10.1111/j.1462-2920.2006.01109.x.
- Tengs T, Dahlberg OJ, Shalchian-Tabrizi K, Klaveness D, Rudi K, Delwiche CF, Jakobsen KS: Phylogenetic analyses indicate that the 19'Hexanoyloxy-fucoxanthin-containing dinoflagellates have tertiary plastids of haptophyte origin. Mol Biol Evol. 2000, 17: 718-729.
- Doolittle WF: You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. Trends Genet. 1998, 14: 307-311. 10.1016/S0168-9525(98)01494-2.
- Graham L, Wilcox L, (eds.): Algae. 2000, Upper Saddle River: Prentice Hall
- Sogayar MI, Gregorio EA: Uptake of bacteria by trophozoites of Giardia duodenalis (Say). Ann Trop Med Parasitol. 1989, 83: 63-66.
- Pereira-Neves A, Benchimol M: Phagocytosis by Trichomonas vaginalis: new insights. Biol Cell. 2007, 99: 87-101. 10.1042/BC20060084.
- Flavin M, Nerad TA: Reclinomonas americana N. G., N. Sp., a new freshwater heterotrophic flagellate. J Eukaryot Microbiol. 1993, 40: 172-179. 10.1111/j.1550-7408.1993.tb04900.x.
- Cohen CJ, Bacon R, Clarke M, Joiner K, Mellman I: Dictyostelium discoideum mutants with conditional defects in phagocytosis. J Cell Biol. 1994, 126: 955-966. 10.1083/jcb.126.4.955.
- Watkins RF, Gray MW: The frequency of eubacterium-to-eukaryote lateral gene transfers shows significant cross-taxa variation within amoebozoa. J Mol Evol. 2006, 63: 801-814. 10.1007/s00239-006-0031-0.
- Hall C, Brachat S, Dietrich FS: Contribution of horizontal gene transfer to the evolution of Saccharomyces cerevisiae. Eukaryot Cell. 2005, 4: 1102-1115. 10.1128/EC.4.6.1102-1115.2005.
- Hooper SD, Berg OG: Duplication is more common among laterally transferred genes than among indigenous genes. Genome Biol. 2003, 4: R48. 10.1186/gb-2003-4-8-r48.
- DFCI Gene Indices Software Tools. [http://compbio.dfci.harvard.edu/tgi/software/]
- Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B: TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics. 2003, 19: 651-652. 10.1093/bioinformatics/btg034.View ArticlePubMedGoogle Scholar
- Emboss. [http://emboss.sourceforge.net/]
- Lupas N, Frickey T: PhyloGenie: automated phylome generation and analysis. Nucleic Acids Res. 2004, 32: 5231-5238. 10.1093/nar/gkh867.PubMed CentralView ArticlePubMedGoogle Scholar
- National Center for Biotechnology Information. [http://www.ncbi.nlm.nih.gov/]
- JGI. DOE Joint Genome Institute. [http://www.jgi.doe.gov/]
- Cyanidioschyzon merolae Genome Project. [http://merolae.biol.s.u-tokyo.ac.jp/]
- The Galdieria sulphuraria Genome Project. [http://genomics.msu.edu/galdieria/]
- The National Center for Biotechnology Information Basic Local Alignment Search Tool (BLAST). [http://www.ncbi.nlm.nih.gov/BLAST/]
- TreeView. Tree drawing software for Apple Macintoshand Windows. [http://taxonomy.zoology.gla.ac.uk/rod/treeview.html]
- Genoscope. The French National Sequencing Center. [http://www.genoscope.cns.fr/externe/English/corps_anglais.html]
- Matsuzaki M, Misumi O, Shin IT, Maruyama S, Takahara M, Miyagishima SY, Mori T, Nishida K, Yagisawa F, Nishida K: Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature. 2004, 428: 653-657. 10.1038/nature02398.View ArticlePubMedGoogle Scholar
- Weber AP, Oesterhelt C, Gross W, Brautigam A, Imboden LA, Krassovskaya I, Linka N, Truchina J, Schneidereit J, Voll H: EST-analysis of the thermo-acidophilic red microalga Galdieria sulphuraria reveals potential for lipid A biosynthesis and unveils the pathway of carbon export from rhodoplasts. Plant Mol Biol. 2004, 55: 17-32. 10.1007/s11103-004-0376-y.View ArticlePubMedGoogle Scholar
- BioEdit. Biological sequence alignment editor for Windows 95/98/NT/2000/XP. [http://www.mbio.ncsu.edu/BioEdit/bioedit.html]
- SignalP 3.0 Server. [http://www.cbs.dtu.dk/services/SignalP/]
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004, 340: 783-795. 10.1016/j.jmb.2004.05.028.
- TargetP 1.1 Server. [http://www.cbs.dtu.dk/services/TargetP/]
- Emanuelsson O, Nielsen H, Brunak S, von Heijne G: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol. 2000, 300: 1005-1016. 10.1006/jmbi.2000.3903.
- MITOPROT: Prediction of mitochondrial targeting sequences. [http://ihg.gsf.de/ihg/mitoprot.html]
- Blast 2 Sequences. [http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi]
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003, 52: 696-704. 10.1080/10635150390235520.
- Guindon S, Lethiec F, Duroux P, Gascuel O: PHYML Online – a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res. 2005, 33: W557-559. 10.1093/nar/gki352.
- Felsenstein J: Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985, 39: 783-791. 10.2307/2408678.
- PHYLIP V3.63. [http://evolution.genetics.washington.edu/phylip.html]
- Huelsenbeck JP, Ronquist F: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.
- Bhattacharya lab downloads page. [http://www.biology.uiowa.edu/debweb/downloads/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.