The gain and loss of chromosomal integron systems in the Treponema species
© Wu et al.; licensee BioMed Central Ltd. 2013
Received: 26 July 2012
Accepted: 11 January 2013
Published: 22 January 2013
Integron systems are now recognized as important agents of bacterial evolution and are prevalent in most environments. The Treponema spirochetes are among the human pathogens known to harbor chromosomal integrons, and they are the only clade among spirochete species found to carry integrons. With the recent release of many new Treponema genomes, we were able to study the distribution of chromosomal integrons in this genus.
We find that the Treponema spirochetes implicated in human periodontal diseases and those isolated from cow and swine intestines contain chromosomal integrons, whereas the Treponema species isolated from termite guts do not. By examining the species tree of selected spirochetes (based on 31 phylogenetic marker genes) and the phylogenetic tree of predicted integron integrases, and assisted by our analysis of predicted integron recombination sites, we find that all integron systems identified in Treponema spirochetes are likely to have evolved from a common ancestor that entered the clade through a horizontal gain. Subsequent to this event, the integron system was lost in the branch leading to the speciation of T. pallidum and T. phagedenis (the Treponema species implicated in sexually transmitted diseases). We also find that the integron attC sites shortened through Treponema speciation, and that the integron gene cassettes of T. denticola are highly strain specific.
This is the first comprehensive study to characterize the chromosomal integron systems in Treponema species. By characterizing integron distribution and cassette contents in the Treponema sps., we link the integrons to the speciation of the various species, especially to the pathogens T. pallidum and T. phagedenis.
As important agents of bacterial evolution, integrons are genetic elements that aggregate mobile gene cassettes via site-specific recombination. The functional platform of integron systems consists of a site-specific tyrosine recombinase (intI), its primary recombination site (attI), and a transcriptional promoter for the cassettes. Integrons are capable of acquiring, rearranging, and expressing genes contained in gene cassettes sampled from a near-infinite environmental bank of cassettes. Each cassette carries one or a small number of genes (some cassettes lack open reading frames) linked to a recombination site termed attC. The genes in cassettes are highly diverse and mostly of unknown function; they are usually promoterless and hence rely on the integron’s promoter for transcription. Accordingly, insertions of gene cassettes in the cassette array are highly orientation specific such that integron genes are transcribed from the integron promoter. Studies have revealed that gene cassette composition is extremely dynamic within and between environments [2–5] and even between closely related strains, as observed in Vibrio cholerae.
There are two primary types of integrons: mobile integrons and chromosomal integrons. Mobile integrons (including integrons belonging to classes 1, 2 and 3, as defined by their respective intI genes) are found commonly on plasmids and are characterized by frequent lateral gene transfers (LGT); most other classes of integrons are nonmobile and so confined to chromosomes and specific bacterial lineages, and are therefore called chromosomal integrons. First discovered in the 1980s, mobile integrons contain mostly antibiotic resistance genes, and their ability to acquire a number of resistance cassettes leads to most clinical multidrug resistance. By contrast, chromosomal integrons were first found in Vibrio cholerae in the 1990s, and typically carry more gene cassettes than mobile integrons, of more diverse functions. In some species the chromosomal integrons constitute a significant fraction of the genome (for example, the total length of the gene cassette pools from five Vibrio chromosomal integrons is equivalent to a small genome) [1, 2].
Chromosomal integrons have been found in a wide range of bacterial species and environmental samples. An analysis surveying 603 sequenced bacterial genomes revealed that 9% carried integrons. Phylogenetic analysis of the integron integrase (IntI) suggests that integrons are ancient structures that have contributed to the evolution of bacterial genomes for hundreds of millions of years, primarily by vertical inheritance [11, 12]. Horizontal transfer of integrons has also been proposed, as discrepancies are found between the 16S rRNA gene tree and the integron integrase tree for the species Vibrio fischeri, Shewanella denitrificans, Nitrosococcus mobilis, and the Xanthomonas lineage. At a larger scale, two major clades of integrase were identified, and found to be associated with different environments (soil or ocean), consistent with their exchange among bacterial species occupying common environments [7, 11].
Treponema species belong to the spirochete family and many are involved in human diseases: T. pallidum is the cause of syphilis and yaws [14, 15]; and the oral pathogen T. denticola is associated with periodontal diseases [16, 17]. Unlike other chromosomal integron-containing species (Vibrio cholerae, Vibrio vulnificus and Vibrio parahaemolyticus) that cause serious infectious diseases, Treponema species are widely found in healthy human populations. By building and examining the phylogenetic distribution of chromosomal integron-containing species among a wider selection of spirochetes, and a phylogenetic tree of integron integrases (IntI) from genomes that contain chromosomal integrons, assisted by an analysis of attC sites, we hypothesize that the chromosomal integron system has undergone at least one gain and one loss in the evolutionary history of the Treponema species. The gain happened after the speciation leading to the two termite-gut-associated species, Treponema azotonutricium and Treponema primitia. The persistent existence of the chromosomal integrons in the human-associated Treponema species, especially in those associated with oral sites, indicates that these species may gain evolutionary advantages by having integron gene cassettes. The loss happened in the common ancestor of T. pallidum and T. phagedenis: the integron system has been entirely deleted in T. pallidum (which has one of the smallest known bacterial genomes), while in the T. phagedenis genome several attC sites are present, but the intI gene seems to have been lost.
To understand the evolutionary history of the Treponema species, we collected the 31 previously described marker genes for the Treponema species, as well as for species serving as out-groups. The accession numbers and websites for downloading the draft genomes are listed in Additional file 1: Table S1. The 31 reference genes were extracted from the complete genomes according to their annotations. For unannotated draft genomes, a similarity-based search method (BLAST) was employed to find the genes in the contigs. We translated the genes into amino acid sequences using the standard codon table and aligned them using MUSCLE, separately for each gene. The aligned amino acid sequences were then concatenated and imported into MEGA5 for tree construction, with the model set to JTT (Jones-Taylor-Thornton model, the default amino acid model for MEGA5) and 500 bootstrap replicates.
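The concatenation step can be illustrated with a minimal sketch. It assumes each marker gene has already been aligned separately and is held as a mapping from taxon name to aligned sequence; the function name, the taxon labels, and the choice to pad a taxon missing a gene with gap characters are illustrative assumptions, not details reported in this study.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-gene alignments (dicts of taxon -> aligned sequence) into one supermatrix.

    Assumption: a taxon absent from one gene alignment is padded with gaps
    so that every concatenated row has the same length.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input: two hypothetical marker-gene alignments.
gene_a = {"T_denticola": "MK-L", "T_pallidum": "MKAL"}
gene_b = {"T_denticola": "GGS"}
supermatrix = concatenate_alignments([gene_a, gene_b])
```

In this toy input the second gene is missing for one taxon, so that row is completed with three gap characters and both rows end up seven columns wide.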
We looked for integron systems by first detecting the presence of the integron intI. To find intI genes, we used genome annotations as well as performing similarity searches using the intI gene sequence from T. denticola ATCC 35405 as reference. After collecting intI gene sequences we built a tree for intI from all species that contain integrons, to ask how the integron system has evolved, using the same procedure as for the species tree described above.
Besides looking for intI, we also identified attC sites in the genomes, using both similarity-search-based and ab initio approaches. While attI sites are one of the core components of integron systems, they are less conserved than attC sites, and there is only one attI site in each integron. Thus, we focused on the analysis of attC recombination sites, taking advantage of their conservation and the fact that there are often multiple attC sites in an integron. First, we used eight recombination site sequences that represent all attC sites in the T. denticola ATCC 35405 integron and performed similarity searches with all eight sequences, since not all recombination sites are the same (all attC sites in the T. denticola ATCC 35405 genome can be aligned to at least one of the representative sequences with > 85% sequence identity; the sequences of these eight attC sites are shown in Additional file 1: Table S2). Note that these eight representative attC sequences have been used to effectively recover Treponema integron gene cassettes from human metagenomic samples.
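Because eight query sequences are searched against the same genome, several queries can hit the same site. The sketch below shows one way such hits might be filtered and collapsed. It assumes the search output is in the standard 12-column BLAST tabular format (`-outfmt 6`); the function names, the toy rows, and the query lengths are hypothetical, and the cut-offs are supplied by the caller (the example call uses the 70% identity and 50% coverage values that the text reports for cassette extraction).

```python
def filter_hits(lines, query_lengths, min_identity, min_coverage):
    """Keep tabular hits that pass identity (%) and query-coverage (%) cut-offs."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        qseqid, sseqid, pident = f[0], f[1], float(f[2])
        qstart, qend = int(f[6]), int(f[7])
        sstart, send = int(f[8]), int(f[9])
        coverage = 100.0 * (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if pident >= min_identity and coverage >= min_coverage:
            # Subject coordinates are reversed for minus-strand hits.
            kept.append((sseqid, min(sstart, send), max(sstart, send)))
    return kept


def merge_overlapping(hits):
    """Collapse overlapping hits on the same contig into single candidate sites."""
    merged = []
    for contig, start, end in sorted(hits):
        if merged and merged[-1][0] == contig and start <= merged[-1][2]:
            merged[-1] = (contig, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((contig, start, end))
    return merged


# Hypothetical rows and query lengths, for illustration only.
rows = [
    "attC_1\tcontig1\t90.0\t60\t6\t0\t1\t60\t1000\t1059\t1e-10\t80.0",
    "attC_2\tcontig1\t88.0\t62\t7\t0\t1\t62\t1005\t1066\t1e-10\t78.0",
    "attC_1\tcontig1\t60.0\t60\t24\t0\t1\t60\t5000\t5059\t1e-02\t30.0",
    "attC_2\tcontig2\t95.0\t40\t2\t0\t10\t49\t500\t461\t1e-08\t60.0",
]
query_lengths = {"attC_1": 65, "attC_2": 66}
candidate_sites = merge_overlapping(filter_hits(rows, query_lengths, 70.0, 50.0))
```

Merging ignores strand here for brevity; a fuller treatment would keep the strand of each hit, since attC orientation matters for cassette direction.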
Second, we developed an ab initio approach to identifying chromosomal integrons, by searching for arrays of potential attC sites. Existing computational methods, including the INTEGRALL database and a context-driven approach using a computational grammar, focus on the identification of mobile integrons, especially those of class 1, the most widespread and clinically important type. Our method targets chromosomal integrons, utilizing the unique features of the attC sites’ secondary structure: a single-stranded attC sequence forms a structure with two stems, an R box and an L box (with one protruding G), connected by a loop of varying length. Our method first scans an input genome for regions that are capable of forming the typical attC structure, then uses these regions’ sequences to search for additional potential attC sites (which may be degenerate and thus unable to form the typical attC structure), and finally keeps only the candidates that form clusters (with at least two attC sites within a 10 kb region), to reduce false positives.
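The final clustering filter can be sketched as follows. This is not the authors' implementation: it assumes candidate sites are given as (contig, start, end) tuples, reads the stated region size as 10,000 bp, and treats a group of sites as a cluster when the span from the first start to the last end fits inside that window. The function name and the toy coordinates are hypothetical.

```python
from collections import defaultdict


def keep_clustered_sites(sites, window=10_000, min_sites=2):
    """Keep candidate sites that share a window of `window` bp with enough neighbours."""
    by_contig = defaultdict(list)
    for contig, start, end in sites:
        by_contig[contig].append((start, end))

    kept = []
    for contig, spans in by_contig.items():
        spans.sort()
        flagged = [False] * len(spans)
        left = 0
        for right in range(len(spans)):
            # Advance the left edge until the group fits in the window.
            while left < right and spans[right][1] - spans[left][0] > window:
                left += 1
            if right - left + 1 >= min_sites:
                for i in range(left, right + 1):
                    flagged[i] = True
        kept.extend((contig, s, e) for (s, e), ok in zip(spans, flagged) if ok)
    return sorted(kept)


# Toy candidates: two neighbouring sites, one isolated site, one lone site on another contig.
candidates = [
    ("c1", 100, 168),
    ("c1", 1200, 1263),
    ("c1", 50000, 50068),
    ("c2", 10, 78),
]
clustered = keep_clustered_sites(candidates)
```

Only the two neighbouring sites on the first contig survive; isolated candidates are discarded as likely false positives.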
Except for the annotated T. denticola ATCC 35405 genome, for which the gene cassettes have been described, the integron gene cassettes were extracted from the contigs of the draft genomes by identifying the attC recombination sites and extracting genes bounded by these sites. First, the attC recombination sites were detected using similarity searches with the eight representative attC sequences, with the identity threshold set to 70% and coverage set to 50%. We then predicted the cassette genes using FragGeneScan, and extracted genes that were exactly bounded by two attC sites (overlap between attC sites and predicted genes was not allowed). We also set the maximum number of genes between any two attC sites to three, which is the maximum number of genes so far reported between any two integron recombination sites. Note that this approach has been applied to human metagenome samples and successfully recovered integron gene cassettes from the metagenomes, demonstrating that it is able to effectively collect genes from integron gene cassettes for both complete and fragmented genomic sequences.
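The extraction rules in this paragraph (genes bounded by two attC sites, no overlap with an attC site, at most three genes per cassette) can be expressed as a short sketch. It assumes all coordinates lie on one contig, that attC sites are (start, end) pairs and predicted genes are (start, end, gene_id) triples; the function name, gene identifiers, and coordinates are hypothetical, and "bounded" is read as lying strictly between two adjacent attC sites.

```python
def extract_cassettes(attc_sites, genes, max_genes=3):
    """Return one list of gene ids per pair of adjacent attC sites that bound 1..max_genes genes."""
    sites = sorted(attc_sites)

    def overlaps_site(gene):
        return any(gene[0] <= end and start <= gene[1] for start, end in sites)

    # Genes overlapping any attC site are discarded up front.
    clean = sorted(g for g in genes if not overlaps_site(g))

    cassettes = []
    for (_, left_end), (right_start, _) in zip(sites, sites[1:]):
        inside = [g[2] for g in clean if g[0] > left_end and g[1] < right_start]
        if 0 < len(inside) <= max_genes:
            cassettes.append(inside)
    return cassettes


# Toy coordinates for illustration only.
attc = [(100, 168), (1200, 1263), (3000, 3068), (9000, 9068)]
predicted = [
    (200, 1100, "g1"),
    (1150, 1210, "g2"),   # overlaps the second attC site, so it is dropped
    (1300, 2000, "g3"),
    (2100, 2900, "g4"),
    (3100, 4000, "g5"),
    (4100, 5000, "g6"),
    (5100, 6000, "g7"),
    (6100, 7000, "g8"),
]
cassettes = extract_cassettes(attc, predicted)
```

The last interval in the toy data holds four genes and is therefore skipped, which mirrors the three-gene limit described above.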
Integron systems were detected by looking for intI genes in sequenced genomes or contigs (see Additional file 1: Figure S1 for a multiple alignment of predicted integron integrases). Figure 1 shows the phylogenetic tree of the species, with integron-containing species marked with stars, and with paths leading to these species highlighted. From this figure one can clearly see that there is a large gap (more than est. 0.4 MYA) between the two subtrees containing integrons (located at the top and bottom part of the figure): integron systems are missing in all earlier branching spirochetes and only appear in some Treponema species, including all strains of T. denticola, T. vincentii, T. brennaborense, and T. succinifaciens. This suggests that the integron system was acquired by the Treponema lineage after T. azotonutricium and T. primitia—isolated from termite guts—diverged.
While typical integrons are inverted, with the integrase gene transcribed in the opposite direction to that of the cassette genes (the two transcripts diverge from the central attI site), the Treponema integron loci we have identified share the atypical structure previously characterized in T. denticola, with intI transcribed toward the attI (and so in the same direction as the cassette genes), adding evidence to the hypothesis of a single LGT of the integron system into the Treponema species. We note that there are a few transposases identified in the gene cassettes in the T. succinifaciens integron (their insertions into the integron could have interrupted the integron structure), but all the genes between the integrase gene (located between 716056 and 717288 bp) and the closest transposase (located between 736651 and 737865 bp), with the exception of one gene located between 728835 and 728461 bp, are of the same orientation as that of the integrase gene, indicating that the atypical integron structure also applies to T. succinifaciens.
Besides the early gain of integrons by the Treponema genus, we also observe the apparent loss of integrons in some Treponema species. The most parsimonious explanation is that, following a single gain into the Treponema lineage, there was a single loss in the common ancestor of T. pallidum and T. phagedenis. To the best of our knowledge, this loss of an integron system from a lineage is unique: the Treponema chromosomal integron and its flanking regions are not associated with mobile elements and do not appear to be mobile. Thus our result provides the first instance of the loss of a chromosomal integron in evolutionary history.
Perhaps the most striking result is observed in the genome of T. phagedenis. Even though we could not find an intI in this species, we did find integron attC recombination sites at a chromosomal location syntenic to the integron element in other Treponema, and were able to predict integron cassette genes bounded by attC sites (see a comparison between the T. phagedenis genome and the T. denticola genome in Additional file 1: Figure S2, which highlights the presence of attC recombination sites and the absence of intI gene in the T. phagedenis genome). The existence of integron recombination sites and gene cassettes indicates that the integron system did exist in the common ancestor of this species and was lost recently in the evolutionary process. We could not find any traces of the integron structure, including intI integrase gene and attC recombination sites, in T. pallidum, suggesting that the integron system has been entirely deleted in this species.
Note that we detected only a single attC site in the T. brennaborense genome (between 1344413 and 1344528 bp). However, there is an adjacent intI gene (predicted between 1342819 and 1344015 bp; see Additional file 1: Figure S1 for the multiple alignment of the predicted integron integrases, and the conservation of key residues among these sequences). Between the intI gene and the attC site, we found a small segment (between 1344230 and 1344237 bp) of the same sequence (GTTAGGT) as the R’ binding site in the predicted attC site of T. brennaborense, consistent with this segment being the R binding site of the attI site for this integron. Together, these observations suggest that the integron system in this genome is functional (with integrase gene, attI and attC sites). Chromosomal integron systems lacking integron gene cassettes are found in other species, but the lack of a gene cassette array, a feature typical of the chromosomal integron structures found in other Treponema species, suggests that the integron system in T. brennaborense needs to be examined in additional isolates.
We found that some of the predicted attC sites in T. succinifaciens and T. brennaborense are significantly longer than those in T. denticola and T. vincentii (in which attC sites are 63–68 bp): the lengths of the three attC site types in T. succinifaciens are 63 bp, 68 bp, and 132 bp, and the attC in T. brennaborense is 115 bp. Although chromosomal integron arrays, especially those with many gene cassettes, typically contain attC sites of similar sizes, arrays have been found that contain attC sites of varying lengths. Note that there are more long attC sites (6 out of 10) than short ones in the T. succinifaciens genome, and these long attC sites share high sequence similarity (92–96% sequence identity). Since the integron system was inserted into the genome of a common ancestor of the Treponema species (as suggested by Figure 1), and T. succinifaciens and T. brennaborense appeared earlier than T. denticola and T. vincentii (and there are more long attC sites in T. succinifaciens), we hypothesize that the original integron recombination sites were longer (> 100 bp), and were gradually reduced in T. denticola (63–68 bp).
Despite the fact that very closely related T. denticola strains share integron genes, the overall integron-gene-sharing among all the T. denticola strains is not very high (on average two T. denticola strains share 24.79% of their integron genes; also see Figure 4A). Note that Koenig et al. reported even less sharing of integron genes (< 10%) among 12 Vibrio isolates. We also reported in a previous work that T. denticola integron gene contents are very different between different metagenomic samples. The dynamics of integron genes in T. denticola indicate that the integrons of T. denticola are fully active and are undergoing active insertion and deletion of cassettes.
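An average pairwise sharing figure of this kind can be computed as in the sketch below. The text does not state how the percentage was defined, so the sketch makes an explicit assumption: each strain is reduced to a set of gene-family labels, and sharing between two strains is the number of shared labels divided by the size of their union. The strain names and labels are hypothetical, and a different definition (for example, dividing by the smaller set) would give different numbers.

```python
from itertools import combinations


def percent_shared(genes_a, genes_b):
    """Percentage of cassette genes shared by two strains (assumed: shared / union)."""
    union = genes_a | genes_b
    return 100.0 * len(genes_a & genes_b) / len(union) if union else 0.0


def mean_pairwise_sharing(cassette_genes):
    """Average percent_shared over all unordered pairs of strains."""
    pairs = list(combinations(sorted(cassette_genes), 2))
    if not pairs:
        raise ValueError("need at least two strains")
    total = sum(percent_shared(cassette_genes[a], cassette_genes[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical strains and gene-family labels, for illustration only.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam2", "fam3", "fam4"},
    "strain_C": {"fam5"},
}
average = mean_pairwise_sharing(toy)
```

Averaging over pairs rather than pooling all genes keeps one cassette-rich strain from dominating the summary.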
Note that for the study of gene cassette dynamics in T. denticola (Figure 4A), we only considered genes present between identified attC sites, as attC sites are easier to predict than attI sites (attI sites are less conserved). However, we believe that ignoring the very first gene cassettes between the attI site and the adjacent attC site would not change our conclusions, considering that T. denticola isolates contain long arrays of gene cassettes.
Our combined sequence and phylogenetic analysis of the integron integrases and attC recombination sites in Treponema genomes suggests that integron systems have been acquired only once, by the common ancestor of T. brennaborense, T. succinifaciens, T. vincentii and T. denticola, as we could not find traces of integrons in earlier branching Treponema species (T. azotonutricium and T. primitia, which are associated with termites) or other spirochetes. We cannot exclude the possibility that the integrons have been gained independently in these species; however, a single insertion event is more likely than several distinct events, as the chromosomal integrons are not associated with mobile elements and so cannot move freely, and the Treponema intI genes form a common clade (Figure 2).
We also found evidence for the deletion of integrons in one subclade. The absence of a complete integron structure in T. pallidum and T. phagedenis can be explained by a single deletion event in the common ancestor of these two species. Moreover, we found remnants of the integron structure, including attC recombination sites and gene cassettes, in the T. phagedenis genome, but not in T. pallidum. Without the intI gene, the T. phagedenis integron structure is now static and cannot facilitate adaptation to new conditions by acquiring new gene cassettes, shuffling existing cassettes, or deleting cassettes. We could not find traces of an integron structure in T. pallidum, which could be the result of complete degradation of its integron structure.
Our results demonstrate both gain and loss of integron systems in the Treponema species. Even though the horizontal acquisition of integron structures has been suggested by [7, 13], to the best of our knowledge no literature has provided molecular evidence of integron loss—integrons have been found to be evolutionarily stable. The absence of the intI—along with the integron remnants in T. phagedenis—serves as the first instance of an integron deletion event. Since these chromosomal integrons are not associated with mobile elements, the mechanism of integron insertion/deletion is unknown. With the release of ever more bacterial genomes, we may be able to identify more indel events of integron systems and infer the reasons for these events in the foreseeable future.
The genome sequence of T. brennaborense was produced by the US Department of Energy Joint Genome Institute http://www.jgi.doe.gov/ in collaboration with the user community. The authors also thank the Broad Institute for sequencing and releasing the draft genomes of Treponema denticola strains. This work was supported by the National Science Foundation (grant DBI-0845685). We thank the anonymous reviewers for their helpful suggestions.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.