On the alleged origin of geminiviruses from extrachromosomal DNAs of phytoplasmas
© Saccardo et al; licensee BioMed Central Ltd. 2011
Received: 22 May 2011
Accepted: 28 June 2011
Published: 28 June 2011
Several phytoplasmas, wall-less phloem limited plant pathogenic bacteria, have been shown to contain extrachromosomal DNA (EcDNA) molecules encoding a replication associated protein (Rep) similar to that of geminiviruses, a major group of single stranded (ss) DNA plant viruses. On the basis of that observation and of structural similarities between the capsid proteins of geminiviruses and the Satellite tobacco necrosis virus, it has been recently proposed that geminiviruses evolved from phytoplasmal EcDNAs by acquiring a capsid protein coding gene from a co-invading plant RNA virus.
Here we show that this hypothesis has to be rejected because (i) the EcDNA encoded Rep is not of phytoplasmal origin but has been acquired by phytoplasmas through horizontal transfer from a geminivirus or its ancestor; and (ii) the evolution of geminivirus capsid protein in land plants implies missing links, while the analysis of metagenomic data suggests an alternative scenario implying a more ancient evolution in marine environments.
The hypothesis of geminiviruses evolving in plants from DNA molecules of phytoplasma origin contrasts with other findings. An alternative scenario concerning the origin and spread of Rep coding phytoplasmal EcDNA is presented and its implications on the epidemiology of phytoplasmas are discussed.
Geminiviruses are a large group of plant viruses causing several important diseases worldwide, characterized by a nucleic acid genome encapsidated into twinned particles formed by joining two incomplete icosahedra. Geminiviruses differ from most other plant viruses in that they are single-stranded DNA (ssDNA) viruses that multiply through rolling circle replication (RCR). They constitute one of the three recognized groups of episomal replicons that use RCR, the others being circular ssDNA bacteriophages and plasmids of bacteria or archaea. In a seminal paper Koonin and Ilyina  found weak similarities between the replication associated protein (Rep) of geminiviruses and that of the pLS1 family of plasmids of Gram positive bacteria. Despite the limited similarity, the conservation of motif signatures and of the spacing between them led to the conclusion that they constitute a distinct superfamily. On this basis Koonin and Ilyina  advanced the hypothesis that geminiviruses may have actually originated from bacterial plasmids.
In the late 1990s, sequences with a relatively high similarity to Rep were found in some extrachromosomal DNA molecules (EcDNA) borne by a group of phytoplasmas related to the Western-X disease phytoplasma , and then in the EcDNAs of several other phytoplasmas [4–9]. Phytoplasmas are plant pathogenic Mollicutes, wall-less prokaryotes taxonomically related to the Clostridium/Bacillus clade of low G+C Gram positive bacteria. They share with geminiviruses the characteristic of inhabiting the plant phloem and being transmitted from plant to plant by defined groups of insect vectors. The similarity of replication associated protein of phytoplasma EcDNAs and geminiviruses has been a matter for discussion among plant pathologists over the last ten years [10, 11].
On the basis of similarities among replication associated proteins and comparative homology-based structural modeling of viral capsid proteins, Krupovic and coworkers  recently proposed "a plasmid-to-virus transition scenario, where a phytoplasmal plasmid acquired a capsid-coding gene from a plant RNA virus to give rise to the ancestor of geminiviruses". Here we report new experimental data, homology searches and phylogenetic analyses that, together with the results of previous research, conclusively show that this hypothesis, although fascinating, is too simplistic and that other scenarios are more likely.
Phytoplasma strains were maintained in a greenhouse by graft-transmission to healthy Catharanthus roseus. The phytoplasma strains used in this work and their origin are listed in Additional File 1. Nucleic acids from healthy and infected periwinkle plants were isolated using a standard phytoplasma enrichment procedure .
The sequence data used in this work relative to 16S rDNA and single stranded DNA binding (SSB) proteins of various bacteria, plasmid replication protein (rep), phytoplasmal EcDNAs, virus capsid and replication associated proteins, as well as environmental DNA were retrieved from the EMBL database and the community cyberinfrastructure for advanced marine microbial ecology research and analysis (CAMERA, http://camera.calit2.net). The complete EcDNA sequence of New Jersey Aster Yellows (NJAY) phytoplasma was determined in this study. Sequence accessions, genes, organism names, reference databases and labels used in the figures are listed in Additional File 2.
Multiple sequence alignments of 16S rRNA genes, rep and SSB were performed separately using MEGA4 . For rep, the helicase domain was excluded and the alignment was restricted to the replication initiator domain (N-terminal region of about 150-180 aa).
Phylogenetic analysis using parsimony was carried out with the PHYLIP package using the programs SEQBOOT, PROTPAR, DNAPARS and CONSENSE. Bootstrapping with 500 replicates was performed to estimate the stability of, and support for, the inferred clades.
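The bootstrap step can be illustrated with a minimal sketch. It is not the PHYLIP implementation; it only shows the general idea of resampling alignment columns with replacement, which is the kind of pseudo-replicate that a program such as SEQBOOT generates before tree building. The function name, the sequence labels and the toy sequences are hypothetical.

```python
import random

def bootstrap_alignment(alignment, n_replicates, seed=0):
    """Return pseudo-alignments built by resampling columns with replacement.

    alignment: dict mapping a sequence label to its aligned sequence
    (all sequences must have the same length).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        # One draw of column indices per replicate, applied to every sequence,
        # so that each resampled column stays intact across taxa.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        replicates.append(
            {label: "".join(seq[c] for c in cols) for label, seq in alignment.items()}
        )
    return replicates

# Toy alignment with invented labels and residues (illustration only).
toy = {"gemini_Rep": "MPRSGKF", "EcDNA_Rep": "MPKSGRF", "pLS1_rep": "MAK-GEY"}
reps = bootstrap_alignment(toy, n_replicates=500)
```

Each replicate would then be analysed separately, and the proportion of replicate trees containing a given clade taken as its bootstrap support.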
Percent identity and similarity of phytoplasmal EcDNA borne proteins and capsid proteins with other database accessions were calculated using NEEDLE, launched recursively with a BIOPERL script when needed. Principal coordinates analysis was carried out with R . The likelihood-ratio test for monophyly  was carried out with a selection of 14 sequences, taking as the null hypothesis that the Rep of type II EcDNAs, the rep of type I EcDNA and RCR plasmids form one group while the Rep of geminiviruses form another. Likelihoods were estimated with PHANGORN. The significance of the likelihood ratio was estimated by parametric bootstrap according to  by simulation of 1000 replicated datasets generated with INDEL-SEQ-GEN. Tetranucleotide usage patterns were compared with the program TETRA.
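A simplified sketch of a tetranucleotide usage comparison is given below for illustration. It is an assumption-laden stand-in and not the TETRA algorithm: it compares raw tetranucleotide frequencies on one strand by Pearson correlation, without the expectation-based normalization that a dedicated program would apply. All function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide in seq (single strand)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Windows containing characters other than A, C, G, T are ignored.
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        raise ValueError("sequence contains no unambiguous tetranucleotide")
    return [counts[t] / total for t in TETRAMERS]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)
```

In such a comparison, a high correlation between two replicons would indicate similar oligonucleotide composition, whereas a low value would be compatible with a different origin of the sequences.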
Degenerate primer sets (Additional File 3) were designed on conserved EcDNA regions deduced from sequences available from the EMBL database, to PCR amplify the replication associated protein of the EcDNA of "Candidatus Phytoplasma asteris" strain NJAY. Purified PCR products were sequenced and the entire EcDNA of NJAY phytoplasma was sequenced by primer walking using newly designed primers (see Additional File 3).
Amplifications were performed in a 20-μl PCR reaction containing 100 ng of template DNA, 200 μM dNTPs, 1 μM of each primer, 1 U of 5 PRIME DNA polymerase with the recommended PCR buffer containing MgCl2 (5 PRIME, Hamburg, Germany). PCR was carried out with an automated thermal cycler (T-Professional Basic, Biometra, Germany). The reactions included an initial denaturation cycle at 94°C for 2 min, then 30 cycles of 94°C for 20 sec, 53°C for 20 sec and 72°C for 3 min. At the end, the reaction mixtures were incubated at 72°C for 10 min and then stored at 4°C.
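As a worked check of the protocol arithmetic, the sketch below encodes the cycling conditions and reaction concentrations stated above and derives the total programmed hold time and the reagent amounts per reaction. Ramp times between temperatures are not given in the text and are ignored; the variable and function names are hypothetical.

```python
# Hold times in seconds, taken from the cycling conditions stated in the text.
initial_denaturation = 2 * 60
cycle_steps = [
    ("denaturation", 94, 20),    # (step, temperature in °C, seconds)
    ("annealing", 53, 20),
    ("extension", 72, 3 * 60),
]
n_cycles = 30
final_extension = 10 * 60

def total_hold_seconds():
    """Total programmed hold time, excluding ramping and the 4°C storage step."""
    per_cycle = sum(seconds for _, _, seconds in cycle_steps)
    return initial_denaturation + n_cycles * per_cycle + final_extension

def amount_pmol(conc_uM, volume_ul):
    """Amount of a reagent in pmol: µM multiplied by µl gives pmol."""
    return conc_uM * volume_ul

reaction_volume_ul = 20
print(total_hold_seconds() / 60)            # minutes of programmed holds
print(amount_pmol(1, reaction_volume_ul))   # pmol of each primer
print(amount_pmol(200, reaction_volume_ul)) # pmol of dNTPs at the stated 200 µM
```

Under these conditions the programmed holds add up to 7320 s (122 min), and each 20-μl reaction contains 20 pmol of each primer.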
The DNA fragments were sequenced by standard methods and assembled manually using BIOEDIT 7.0.0 (Tom Hall, Carlsbad, CA, USA). Open reading frames were predicted using ORF FINDER (NCBI, http://www.ncbi.nlm.nih.gov/gorf/gorf.html), using the standard genetic code. Homologous sequences were identified from the GenBank database using the BLASTX programme (http://www.ncbi.nlm.nih.gov/blast/Blast.cgi).
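The ORF prediction step can be sketched as a six-frame scan for ATG-to-stop stretches under the standard genetic code. This is a minimal illustration, not the ORF FINDER implementation; it treats the sequence as linear, so an ORF spanning the junction of a circular EcDNA would be missed. Function names and the minimum-length parameter are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def orfs_in_strand(seq, min_codons):
    """Yield (start, end) of ATG...stop ORFs in the three frames of one strand.

    Coordinates are 0-based and half-open; end includes the stop codon.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                # Number of codons from the ATG up to, but excluding, the stop.
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3)
                start = None

def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) for ORFs on both strands.

    Minus-strand coordinates refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    found = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    found += [("-", s, e) for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return found
```

Predicted ORFs would then be translated and compared with database entries, as done here with BLASTX.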
In conclusion, evidence from replication associated protein similarity and from EcDNA gene organization and composition shows that the sequence similarity between the Rep genes of geminiviruses and phytoplasmas does not link geminiviruses to RCR plasmids of Gram positive bacteria. Rather, it indicates the existence in phytoplasmas of recombinant replicons containing a Rep with a phylogenetic history different from that of their host bacteria, presumably acquired horizontally from geminiviruses, i.e. viruses that share the same niche as phytoplasmas, being insect transmitted and inhabiting the plant phloem.
In an attempt to define the origin of the geminivirus capsid, Krupovic and coworkers  hypothesized that phytoplasmal "plasmids" released upon lysis of the bacterial cell in the cytoplasm of the host plant cell obtained a coat protein (CP) coding gene from an unknown plant virus. Through modeling of the geminiviral CP Krupovic and coworkers  found that it fits the eight-stranded β-barrel folding model, like all isometric ssRNA plant viruses and several DNA viruses. Among viruses for which a 3D structure is available, the Satellite tobacco necrosis virus (STNV) was found, with a significant score, to be a suitable template for structural modeling of geminiviral CPs, as was also earlier reported in [27, 28]. Krupovic and coworkers  constructed 3D models of geminiviral CPs and tested the stereochemical quality along with the X-ray structure of the STNV CP. In addition, they found similarity in the primary amino acid sequence between geminiviruses and STNV in a structure-based sequence alignment. On this basis they hypothesized that a phytoplasma "plasmid" may have recruited, through RNA/DNA recombination, the genetic information of a capsid protein from an icosahedral ssRNA virus similar to STNV resulting in the development of virions composed of two incomplete icosahedra large enough to accommodate its genome.
Virion characteristics of virus families including at least one species transmitted by leafhoppers or whiteflies
Virus Family or genus
60-900 × 24-35 nm
1000-2000 × 10-13
Bullet-shaped or bacilliform
130-350 × 45-100
300-900 × 12-15
3-10 nm × 950-1350
T = 2
T = 3
T = 3
30 × 18-20 nm
T = 1
With no suitable donor candidates among the known leafhopper- or whitefly-transmitted viruses, a less parsimonious scenario has to be postulated to accommodate the hypothesis of Krupovic and coworkers : the recruited CP gene conferred transmission characteristics that were different from those of geminiviruses, but at a later time a virus line evolved with infection characteristics and a niche that were, by pure chance, similar to those of the original donors of the Rep gene, i.e. the leafhopper-transmitted and phloem inhabiting phytoplasmas. This scenario would fit with STNV, which was indicated by Krupovic and coworkers  as the most closely related virus acting as a potential ancestral donor of capsid genes. However, if STNV, a virus transmitted by a fungus, was a donor of CP to the nascent geminivirus, then ssDNA viruses with a replication associated protein similar to geminivirus Rep but with transmission characteristics different from those of the present geminiviruses should have formed, a notion that contrasts with the present knowledge of plant virus diversity.
Despite the great diversity of known plant viruses, a non-geminivirus with a Rep-like replication associated protein has never been found. Therefore, the less parsimonious version of the hypothesis of Krupovic and coworkers implies an ancestral Geminiviridae virus taxon that disappeared leaving no trace. On a contrasting line of evidence, a recently discovered geminivirus-related DNA mycovirus from the fungus Sclerotinia sclerotiorum (named SsHADV-1)  greatly differs in its CP from those of geminiviruses and from that of STNV as well. We therefore question whether such a poorly parsimonious hypothesis, which also implies an unlikely RNA/DNA recombination, can be accepted. Indeed, data obtained from recent metagenomic studies suggest an alternative hypothesis.
Amino acid similarities and identities of some protein sequences deduced from entries of metagenomic study with selected geminivirus CPs
Most similar geminivirus CP
Indian cassava mosaic virus
Sweet potato leaf curl virus
Sweet potato leaf curl virus
Mung bean yellow mosaic India virus
Corchorus yellow vein virus
Tomato golden mottle virus
Wheat dwarf virus
In conclusion, although the origin of the geminivirus CP cannot be determined with certainty, an origin from a ssRNA virus such as STNV appears unlikely compared to other hypotheses on the basis of similarity analysis, the absence of any remnant of a non-leafhopper/whitefly-transmitted plant virus encoding Rep, and the requirement of a DNA/RNA recombination event in incongruent cell compartments.
Given the evidence of a distant relationship between the CPs of geminiviruses and STNV, a common origin for both spherical and geminate virions with T = 1 icosahedral symmetry remains an interesting hypothesis; the information reported here only shows that the idea that the evolution from the common ancestor to the present virions occurred in land plants is not sufficiently supported. Several lines of evidence further indicate that geminiviruses evolved earlier, from remote ancestors existing 450 million years ago , and there is molecular evidence that begomoviruses and mastreviruses were already differentiated at the time of the Gondwana separation , i.e. before the phytoplasma phylogenetic branch arose from the insect colonizing AAP (Acholeplasma - Anaeroplasma - Phytoplasma) lineage of Mollicutes (estimated as 180 million years in ). This course of evolutionary events is also compatible with a common origin of ssDNA viruses of plants, in agreement with the results gathered by Gibbs and Weiler  who detected several traits in common between geminiviruses and nanoviruses strongly suggesting their common origin, a notion consistent with both the transmission characteristics and type of replication.
It is tempting to conclude that the apparent evolutionary isolation of geminiviruses deduced by the analysis of RCR replicons in plants is only due to the limitation of our narrow view on life diversity.
Our results from sequence data analysis are consistent with a recombination event between phytoplasma plasmids (type I EcDNAs) and the geminivirus genome giving rise to type II EcDNAs in phytoplasmas. Krupovic and coworkers  have discarded this hypothesis because geminiviruses "maintained features of prokaryotic replicons, such as typical bacterial promoter sequences" and "are in some instances still able to replicate their DNA in bacterial cells". It may be useful to stress that a remote bacterial origin is definitely not in contrast with a hypothesis of a more recent recombination event. There are also reasons to question the putative origin of geminivirus Rep from bacterial plasmids. Kapitonov and Jurka  suggested that geminiviruses might have evolved from plant RC transposons rather than from prokaryotic RC replicons. Plant RC transposons (helitrons) encode their own helicase and SSB. Moreover, some geminiviruses can replicate in the Gram negative Agrobacterium tumefaciens , while, to our knowledge, no RCR plasmid of the pLS1 family has been reported to replicate in Gram negatives. In addition, there is no evidence that geminivirus Rep is functional in a bacterial background that supports replication of RCR plasmids. We have tested the ability of different constructs containing phytoplasmal Rep to replicate in Bacillus subtilis. We inserted the entire NJAY EcDNA into pJM103 (a pUC18 derivative that can replicate in E. coli but not in B. subtilis and contains a chloramphenicol resistance that is expressed in B. subtilis ), but found no evidence of replication of the construct in B. subtilis (results not shown). Thus, the replication in A. tumefaciens does not appear to be strong evidence of a geminivirus relationship with RCR plasmids.
The sequence of the complete genome of several phytoplasmas showed that these organisms have incomplete nucleotide synthesis pathways and therefore depend on their host for nucleotides [8, 45, 46]. No transport system for nucleosides or nucleotides has been identified yet in the phytoplasma genomes, and, since no information on how they obtain the necessary nucleotides for replication is available, uptake and recycling of nucleic acids from the host plant may play a prominent role. It has also been shown that phytoplasmas have a highly active recombination system. Indeed, sequences similar to truncated geminivirus Rep have been found in the chromosome of several phytoplasmas. Thus, geminivirus DNA in the phloem may have been readily available for internalization and incorporation into the phytoplasma chromosomal or extrachromosomal DNA by recombination.
Once acquired by recombination, the survival and sequence conservation  of Rep in phytoplasmas may derive from its contribution to the propagation and spread of plasmid borne functions. Namba and coworkers  have highlighted the possible implication of the phytoplasma plasmid borne ORF3 in determining insect transmissibility and showed that a non-insect-transmissible variant of the same phytoplasma strains lacked ORF3. Thus, a plasmid encoded sequence may have a relevant role in phytoplasma epidemiology.
According to our Southern blot analyses (not shown) and other studies  no EcDNA was detected in phytoplasmas such as "Ca. P. mali", "Ca. P. pyri", "Ca. P. vitis", "Ca. P. prunorum" that are monophagous and have a narrow insect vector range. Conversely, EcDNAs have been reported in strains of the polyphagous species "Ca. P. asteris", "Ca. P. australiense", "Ca. P. pruni" and "Ca. P. trifolii", which are transmitted by a wider range of insect vector species [3, 5–9]. Several reports over the last 15 years of molecular analysis of phytoplasma diversity indicate that infection by two or more polyphagous phytoplasmas is a common event in herbaceous plants; in addition, transmission of phytoplasma strains by different insect species has been found to be the basis of epidemics and outbreaks of new diseases . In this context, an EcDNA carrying ORF3 and propagating among polyphagous phytoplasmas possibly contributed to widening the insect vector range. Our analysis of the untranslated region of NJAY phytoplasma EcDNA revealed that it includes a remnant of ORF3 (figure 4). Since NJAY phytoplasma EcDNA, like several other EcDNA sequences in the database, has been obtained from a phytoplasma strain isolated in an experimental host and propagated for many years by graft transmission rather than insect vectoring, the NJAY EcDNA could have initiated a process of reductive evolution, as recently reported , losing a functional ORF3. A search among other phytoplasmal EcDNA sequences revealed that functional or incomplete ORF3 homologs are present in 19 out of the 30 EcDNAs fully sequenced so far.
The potential contribution in broadening insect vector specificity by propagating ORF3 horizontally among phytoplasmas may be the cause of the conservation of EcDNAs, including type II EcDNAs that may have originated by recombination. Although a search for the canonical nonanucleotide sequence in the untranslated region of NJAY type II EcDNA was unsuccessful, we detected a variant with 8 conserved nts (not shown); the recent report that high-affinity Rep-binding is not required for the replication of a geminivirus DNA  gives ground to the hypothesis that, upon recombination, a geminivirus Rep may have functionally substituted rep in catalyzing the replication of DNA sequences, representing a selective advantage for the host organism. We may speculate that the propagation and spread of ORF3 may have granted conservation of both EcDNA types.
Since phytoplasmas belonging to some phylogenetic clades do not have remnants of Rep that are conversely common in other strains, the phytoplasma type II EcDNA should have appeared after the separation of the major phytoplasma clades, well after the appearance on earth of vascular plants and probably the origin of geminiviruses.
The data presented here explain the origin of phytoplasmal type II EcDNAs and support the rejection of the hypothesis that geminiviruses evolved from phytoplasma plasmids, even though the evolutionary history of geminiviruses remains to be clarified. Nevertheless, in agreement with recent reviews on this topic , more in-depth investigations of environments other than higher plants are expected to provide sound answers.
Rep: viral replication associated protein
rep: bacterial replication associated protein
ssDNA: single stranded DNA
Dr. William Dundon (Istituto Zooprofilattico Sperimentale delle Venezie, Padova) is gratefully acknowledged for the revision of the text.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.