Our hypothesis is that a 34,102 bp fragment, TvLF, encompassing 27 consecutive bacteria-like genes and seven unannotated ORFs, is the result of a recent gene transfer event from a single bacterial donor, presumably a close relative of the firmicute P. harei. This region stands in contrast to other LGTs detected in eukaryotic parasite genomes, which are almost always single-gene occurrences embedded among genes of eukaryotic origin [17, 27].
The possibility that LGTs are initially acquired in clusters in protozoa has been proposed previously, although actual cases are rare. In Giardia intestinalis isolate GS, three consecutive LGTs are found that are absent from Giardia intestinalis isolate WG, which further supports the hypothesis of a recent transfer. In Cryptosporidium, two separate pairs of genes appear to have been acquired at the same time, and the ascomycete Trichoderma appears to have acquired a three-gene cluster resembling part of a nitrate assimilation pathway from a distantly related basidiomycete lineage. Similar phenomena have also been observed in metazoans and in plant mitochondria.
One reason for the apparent scattering of typical LGTs in protozoa could be that, once foreign DNA has been acquired and integrated into the host chromosome, there are two possible outcomes: loss or preservation. A newly acquired gene may degrade through mutational processes and vanish. If preserved, the LGT can relocate via internal recombination events, duplicate, and evolve into a functional gene, possibly with a modified function [37, 38]. This latter process is as yet poorly understood, although amelioration has been shown to act rapidly on LGTs in bacteria, where it would obliterate the compositional differences between the transferred genes and the host genome. Consequently, evaluation of compositional differences has been shown to be a poor marker for LGT.
The so-called “you are what you eat” hypothesis proposed by Doolittle (1998) suggests that genetic material can be incorporated into a unicellular eukaryote's genome. This may happen by chance after phagocytosis of bacteria populating the same habitat, and is supported by the fact that phagotrophs have a higher rate of LGT than non-phagotrophs [10, 11]. Donors of LGTs in protozoa are predominantly bacteria sharing a habitat with the recipient, making Bacteroides, Clostridium, and related species common contributors to T. vaginalis. Our findings support this theory, since P. harei, a firmicute and close relative of the assumed donor, shares a habitat with T. vaginalis in the urogenital tract and reproductive system of women [41–43]. We may further hypothesize that all of the LGTs in T. vaginalis that stem from, for example, the Bacteroides lineage were acquired in one or a few batches, in a mode similar to that of TvLF, but that genomic rearrangements in both donor and recipient have since erased obvious evidence such as synteny.
Thus, we argue that the LGTs left in today’s protozoan genomes are the successfully fixed genes, those that remain after surviving genomic rearrangements and evading gene decay. If the transfer of TvLF is as recent as indicated, it becomes reasonable to assume that this process is still ongoing, and that some genes may be retained under selective pressure, while others evolve under more relaxed constraints and are likely to be lost in the future. In TvLF we identify several instances of the latter situation, for example genes disrupted by internal stop codons, resulting in pseudogenes.
Furthermore, if the physical uptake of genetic material is assumed to be random, while the fixation of genes is under selective pressure [11, 16], it follows that a very recent LGT event may not yet have been cleansed of obsolete material and streamlined to fit the exact requirements of the recipient organism. Such modifications have been shown to take place in bacteria during the amelioration process. If we assume that similar processes are also at work in eukaryotes, then in TvLF they have presumably not yet had time to homogenize the LGTs to resemble the remainder of the genome.
This is supported by results from the codon adaptation index (CAI) analyses for TvLF, where we show that CAI values for genes in TvLF resemble those of the P. harei genome rather than those of the remainder of the T. vaginalis genome. The sequence similarity between the donor and the recipient is also higher than what is usually observed among LGTs. Similarly, calculation of cumulative GC values identifies the region between the T. vaginalis-specific repeated genes and TvLF, immediately adjacent to the locus of the first Peptoniphilus-like gene, as a site for incorporation of foreign DNA. Such strong segmentation points have been described, for instance, in genomic islands found in bacteria with uniform GC content [44, 45].
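To illustrate how a cumulative GC calculation can expose a segmentation point, the following is a minimal sketch in Python. It uses one common formulation, the running sum of each window's deviation from the mean GC fraction, which is not necessarily the exact procedure used in this study. The function names, the window size, and the toy sequence are hypothetical and chosen only for illustration.

```python
def cumulative_gc_deviation(seq, window=100):
    """Running sum of (window GC fraction - mean GC fraction).

    Assumption: non-overlapping windows of fixed size; any trailing
    partial window is ignored.
    """
    seq = seq.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than one window")
    gc = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc.append((chunk.count("G") + chunk.count("C")) / window)
    mean_gc = sum(gc) / len(gc)
    profile, total = [], 0.0
    for value in gc:
        total += value - mean_gc
        profile.append(total)
    return profile


def segmentation_point(profile, window=100):
    """Sequence coordinate at the end of the window where the cumulative
    profile is most extreme, i.e. where its slope changes most clearly."""
    idx = max(range(len(profile)), key=lambda i: abs(profile[i]))
    return (idx + 1) * window


if __name__ == "__main__":
    # Toy input: an AT-only segment followed by a GC-only segment.
    toy = "AT" * 500 + "GC" * 500
    profile = cumulative_gc_deviation(toy, window=100)
    print(segmentation_point(profile, window=100))
```

In a profile of this kind, a region of uniform composition appears as a roughly straight line, and a sharp turning point marks a boundary between segments of different GC content, which is the type of signal interpreted above as a candidate insertion site.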
The functional categorization of previously detected LGTs in bacteria and eukaryotes shows that most LGTs are active in metabolic processes, while informational genes are rare [11, 13, 16]. This pattern may reflect that, although uptake happens by chance, fixation does not, and that LGT is predominantly important for adaptation processes such as utilization of new metabolites, but less important for optimization of informational processes already encoded by the cell.
In TvLF we have identified eight genes known to be active in genetic informational processes, such as transcription and cell cycle control; however, five of these eight have accumulated stop codons in one or more of the strains. This observation strengthens both the assumption that the TvLF region is evolving rapidly to remove undesirable genes and the assumption that informational genes, to a certain extent, fall within this less desirable category. An additional observation consistent with this hypothesis is that TvLF harbors only four genes known to be involved in metabolism, and all of these genes are intact. All four genes have previously been shown to be expressed, indicating that they may be functionally active.
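The kind of check that distinguishes intact genes from those disrupted by stop codons can be sketched as follows. This is an illustrative Python example only, assuming in-frame coding sequences and the three standard stop codons; the function names and the example sequences are hypothetical and do not come from the TvLF data.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard stop codons


def internal_stop_positions(cds):
    """Return 0-based nucleotide offsets of in-frame stop codons that
    occur before the final codon of a coding sequence.

    Assumption: the sequence starts in frame; trailing bases that do not
    form a complete codon are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    # The last codon is the expected terminator, so it is excluded.
    return [i * 3 for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def is_disrupted(cds):
    """True if the coding sequence contains at least one internal stop."""
    return bool(internal_stop_positions(cds))


if __name__ == "__main__":
    intact = "ATGAAAGGGTGA"        # ATG AAA GGG TGA
    disrupted = "ATGAAATAAGGGTGA"  # ATG AAA TAA GGG TGA
    print(is_disrupted(intact), internal_stop_positions(disrupted))
```

Applied per gene and per strain, a check of this kind yields the sort of tally reported above, with each gene counted as disrupted if any strain carries an internal stop.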
More surprisingly, TvLF encompasses seven genes involved in transport, of which six are intact in all of the strains investigated. A closer look at the functional annotation of these transport genes reveals that several are homologous to genes involved in antibiotic resistance in bacteria. Development of antibiotic resistance in bacteria is often achieved via LGT, but so far, to our knowledge, no such cases have been reported in protozoa. Whether some or all of these genes are actually active in T. vaginalis remains to be investigated.
Furthermore, we have observed two transposase genes in TvLF. Transposases are associated with transposition of genetic material and may thus, hypothetically, have been involved in the incorporation of TvLF during its uptake. These transposases are not present in the corresponding region in P. harei, but are found in other Firmicutes of the Peptoniphilus lineage.