PCR amplification of genomic DNA and cloning
Degenerate primers GDFNAKH (forward) and FKNMKAPG (reverse) (Sigma Genosys) were designed from conserved amino acid sequences that span a 939 bp region identified in an alignment of ORF2 of Juan-A from Ae. aegypti and Juan-C from C. pipiens (Figure 1). Rather than the commonly used RT region, we chose this less conserved region to increase resolution between sequences from closely related species. Genomic DNA was isolated from several individuals of a given species using the DNAzol Genomic DNA Isolation Reagent (Molecular Research Center). PCR was performed on genomic DNA from a total of 30 species of mosquitoes from 10 genera. The calculated Tms of the forward and reverse primers were 54.2°C and 62.7°C, respectively. Each 20 µl PCR reaction consisted of approximately 3 ng of genomic DNA, 1 U of TaKaRa Taq Polymerase (Takara), 1.5 mM MgCl2, and 0.2 mM each dNTP. PCR was performed by denaturation at 95°C for 90 s, followed by 30 cycles of 95°C for 30 s, 48°C for 50 s, and 72°C for 90 s. Amplified products were size-separated on a 0.7% agarose gel and purified using the Sephaglass BandPrep Kit (Amersham Pharmacia Biotech). These products were cloned into the pCR 2.1 TOPO vector using the TOPO TA Cloning Kit version K2 (Invitrogen) or the pGEM-T Easy vector (Promega). Plasmids were purified using the Wizard Plus Minipreps DNA Purification System (Promega).
For construction of the mosquito (host) phylogeny, we used a 987 bp region (excluding intron sequence) of Vg-C, a single-copy yolk protein-encoding gene. This region was amplified from Ae. simpsoni by nested PCR in our lab to add this species to the mosquito phylogeny. The methods below follow Isoe's work. Degenerate primers were designed to amplify a 1.1 kb region that is specific for the Vg-C ortholog and includes the second intron. Primers Vg-C-specific forward (5'-(A/G)A(T/C)(A/G)TNAA(A/G)CA(T/C)CCNAA(A/G)G-3'), Vg-C-specific reverse (5'-TC(A/G)TT(T/C)TG(T/C)TT(A/G)TA(T/C)TG(A/G/T)CC-3'), and Aedes universal reverse (5'-C(A/G)T(A/G)CCA(A/G)CANTCNCCCAT-3') were used in nested PCR. The first PCR used the Vg-C-specific forward and reverse primers for 1 cycle at 94°C for 3 minutes, 32 cycles at 94°C for 1 minute, 50°C for 1.5 minutes, and 1 extension cycle at 72°C for 10 minutes. The second PCR used the Vg-C-specific and Aedes universal reverse primers with the same conditions except that the annealing temperature was increased to 54°C. H2O was used as a no-template negative control for PCR. PCR products for Ae. simpsoni were cloned and sequenced as described above. Cloned inserts were sequenced in our laboratory using a GENE READIR DNA sequencer (LI-COR) with fluorescent-labeled T7 and m13r primers, or by DNA sequencing services (Amplicon Express and VBI-Blacksburg, VA).
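As an illustration of how degenerate the three Vg-C primers listed above are, the following minimal Python sketch counts the number of distinct oligonucleotides each primer represents. It assumes only the notation used in the text (plain bases, N for any base, and parenthesised alternatives such as (A/G)); the function name `degeneracy` is hypothetical and is not part of the original methods.

```python
import re

def degeneracy(primer: str) -> int:
    """Count the distinct sequences encoded by a degenerate primer.

    Accepts plain bases, N (any of four bases), and parenthesised
    alternatives such as (A/G) or (A/G/T).
    """
    total = 1
    # Each token is either a parenthesised group or a single letter.
    for token in re.findall(r"\([ACGT/]+\)|[ACGTN]", primer.upper()):
        if token.startswith("("):
            # Number of alternatives listed inside the parentheses.
            total *= len(token.strip("()").split("/"))
        elif token == "N":
            total *= 4
    return total

# Primer sequences copied from the text (5' to 3').
primers = {
    "Vg-C-specific forward": "(A/G)A(T/C)(A/G)TNAA(A/G)CA(T/C)CCNAA(A/G)G",
    "Vg-C-specific reverse": "TC(A/G)TT(T/C)TG(T/C)TT(A/G)TA(T/C)TG(A/G/T)CC",
    "Aedes universal reverse": "C(A/G)T(A/G)CCA(A/G)CANTCNCCCAT",
}

for name, seq in primers.items():
    print(f"{name}: {degeneracy(seq)}-fold degenerate")
```

Under this counting, the forward primer is 1024-fold degenerate, the Vg-C-specific reverse primer 96-fold, and the Aedes universal reverse primer 128-fold.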
Genome and sequence analysis
Genome analysis was performed on the contig version of the Aedes aegypti genome sequence, which consists of 36206 contigs comprising 1310.1 Mb at 7.6× coverage (Broad Institute). BLAST and other programs developed in our lab (TEpost, FromTEpost) were used to extract and filter sequences from BLAST output. Genome contribution by Juan-A was estimated with RepeatMasker, using a 70% identity cutoff and full-length Juan-A as the query. The Wisconsin Package GCG version 10.2-UNIX (Genetics Computer Group) was used for analysis of cloned and sequenced PCR products. Alignments were produced with ClustalX 1.81. To obtain dS/dN values, substitution analysis was performed using the SNAP program on the web [49, 50]. Only sequences that had intact sequence regions were used for substitution analysis. Mean values were calculated from all pair-wise comparisons within each group.
Phylogenetic inference was performed using MrBayes version 3.1.2 [51, 52]. Sequences were aligned using ClustalX version 1.83 with the following parameters: pair-wise alignment gap opening = 10, gap extension = 0.1; multiple alignment gap opening = 10, gap extension = 0.2. Nucleotide Vg-C sequence data (see above) were used for the host phylogeny. The Modeltest server (version 3.7) [54, 55] was used to determine the best nucleotide evolutionary model (General Time Reversible (GTR) allowing for variable substitution rates among sites) according to the Akaike Information Criterion (AIC) score. The model was implemented in MrBayes and run for 150,000 generations, ending with an average standard deviation of split frequencies below 0.01, which the MrBayes manual suggests as evidence of convergence of two independent tree searches.
Conceptually translated nucleotide sequences and sequences from GenBank (accessions) were used for the non-LTR phylogeny. Sequences were aligned as described above. MrBayes was used to explore 9 fixed-rate amino acid evolutionary models, of which the Jones model had the highest score. Two hundred thousand generations were run, resulting in an average standard deviation of split frequencies below 0.01. For all consensus trees displayed, the clade credibility values given at each node are based on samples taken every 100th generation, after discarding the first 25% of all generations (the "burnin" period). Another analysis performed for 1,000,000 generations produced the same tree topology. See additional files 1 and 2 for alignments used for phylogenetic inference.
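To make the sampling scheme concrete, the short Python sketch below estimates how many sampled trees per run remain after the burn-in is discarded. It is only an approximation under stated assumptions: the starting state at generation 0 is counted as a sample, and the number of discarded samples is rounded down. The function name `retained_samples` is hypothetical.

```python
def retained_samples(generations: int,
                     sample_every: int = 100,
                     burnin_fraction: float = 0.25) -> int:
    """Approximate number of sampled trees kept after burn-in, per run.

    Assumes one sample at generation 0 plus one every `sample_every`
    generations, and that the discarded count is rounded down.
    """
    total = generations // sample_every + 1
    discarded = int(total * burnin_fraction)
    return total - discarded

# Run lengths mentioned in the text.
for gens in (150_000, 200_000, 1_000_000):
    print(gens, retained_samples(gens))
```

Under these assumptions, the 150,000-generation analysis keeps roughly 1,100 samples per run and the 200,000-generation analysis roughly 1,500.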
Juan-A copy number determination in the Ae. aegypti genome sequence
Different regions of the Juan-A sequence were used to determine Juan copy number in the Ae. aegypti genome by database search using BLAST (Figure 1, Table 1). The full-length published Juan-A sequence is 4727 bp. The Juan-A 3' UTR is approximately 240 bp. For copy number determination, we used 0.34 kb of the 3' end as the query to be consistent with the use of the 0.34 kb 5' UTR. Hits were counted if their sequence identity to the query was greater than or equal to 70%, 80%, 95%, 97%, 98%, or 99%. In each case, a hit had to cover at least 90% of the query length.
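The counting rule described above can be sketched in Python as follows. This is not the TEpost/FromTEpost code; it is a minimal illustration that assumes each BLAST hit has already been reduced to a percent identity and an aligned length. The `Hit` type, the `count_hits` function, and the example hits are all hypothetical.

```python
from typing import Dict, Iterable, NamedTuple

class Hit(NamedTuple):
    percent_identity: float  # identity of the hit to the query, in percent
    aligned_length: int      # length of the hit, in bp

# Identity cutoffs listed in the text.
THRESHOLDS = (70, 80, 95, 97, 98, 99)

def count_hits(hits: Iterable[Hit], query_length: int,
               min_length_fraction: float = 0.90) -> Dict[int, int]:
    """Count hits at each identity cutoff, keeping only hits that
    span at least `min_length_fraction` of the query length."""
    min_length = min_length_fraction * query_length
    long_enough = [h for h in hits if h.aligned_length >= min_length]
    return {t: sum(h.percent_identity >= t for h in long_enough)
            for t in THRESHOLDS}

# Invented hits against a 340 bp (0.34 kb) query, for illustration only.
example_hits = [
    Hit(99.5, 340),
    Hit(96.0, 320),
    Hit(85.0, 315),
    Hit(99.0, 200),  # too short: below 90% of the query length
    Hit(72.0, 330),
]
print(count_hits(example_hits, query_length=340))
```

Because the cutoffs are cumulative, a hit at 99.5% identity is counted at every threshold, so the counts can only decrease as the cutoff rises.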
Sequence identity comparisons
In Table 3, values shown for all species except Ae. aegypti are means plus one standard deviation from pair-wise comparisons of nucleotide sequences obtained by PCR. Only sequences from the same lineage were compared. Comparisons were made between each sequence and the consensus derived from the number of sequences given in column 3. Ae. aegypti sequences were obtained by database search using a query that spans the same sequence amplified by PCR (see Figure 1). The number 768 is higher than what is shown in row 1 of Table 1 because the query here is the segment used for PCR.
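A summary of this kind can be sketched in Python as shown below. The sketch makes several assumptions that the text does not specify: sequences are already aligned to the consensus and have equal length, positions with a gap ("-") in either sequence are skipped, and the sample standard deviation is used. The function names and the toy sequences are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length,
    ignoring positions where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def identity_summary(sequences: Sequence[str],
                     consensus: str) -> Tuple[float, float]:
    """Mean and sample standard deviation of the identity of each
    sequence to the consensus (requires at least two sequences)."""
    values = [percent_identity(s, consensus) for s in sequences]
    return mean(values), stdev(values)

# Toy aligned sequences, for illustration only.
consensus = "ACGTACGTAC"
clones = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACGTAA"]
avg, sd = identity_summary(clones, consensus)
print(f"{avg:.1f} +/- {sd:.1f}")
```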
Amplified genomic libraries for Ae. albopictus, Ae. polynesiensis, C. tarsalis, and C. quinquefasciatus, made using the Zap Express or Dash II kits (Stratagene), were screened using Digoxigenin-labeled (Roche Diagnostics) ssDNA probes generated from asymmetric PCR reactions. Two probes, one for screening libraries of the Aedes genus and one for the Culex genus, were made from cloned PCR products amplified from Ae. aegypti and C. tarsalis using the degenerate primers described above. The average insert size was 7 kb for both the Aedes and Culex libraries. Approximately 15,000 to 50,000 plaques were plated on NZY agar plates and lifts were performed with nylon membranes (Osmonics). The membranes were blocked with prehybridization solution containing 5× SSC, 0.1% N-lauroylsarcosine, 0.02% SDS, and 2% nonfat milk for 2 hours at 55.0°C in a rotating hybridization incubator. Hybridization was performed with about 20 ng/ml of Digoxigenin-labeled probe in prehybridization solution for 6 hours to overnight at 55.0°C in a rotating hybridization incubator. Stringency washes were done using 0.5× SSC, 0.1% SDS. Membranes were incubated with an anti-Digoxigenin antibody conjugated to alkaline phosphatase, and then developed with the substrates BCIP and NBT for colorimetric detection. The copy number of Juan was calculated using known values of haploid genome size, average insert size of the library, and the ratio of positives to total number of plaques.
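The text does not give the exact formula, but a common way to combine these three quantities is to multiply the fraction of positive plaques by the number of inserts needed to cover one haploid genome. The Python sketch below implements that estimate as an assumption; it further assumes an unbiased library and one element copy per positive clone. The function name and all example numbers other than the 7 kb insert size are placeholders.

```python
def estimate_copy_number(positives: int, total_plaques: int,
                         haploid_genome_bp: float,
                         average_insert_bp: float) -> float:
    """Estimate copies per haploid genome from a library screen.

    Assumed formula (not stated in the text):
        copies = (positives / total_plaques)
                 * (haploid_genome_bp / average_insert_bp)
    """
    if total_plaques <= 0 or average_insert_bp <= 0:
        raise ValueError("plaque count and insert size must be positive")
    positive_fraction = positives / total_plaques
    inserts_per_genome = haploid_genome_bp / average_insert_bp
    return positive_fraction * inserts_per_genome

# Placeholder values: only the 7 kb insert size comes from the text.
copies = estimate_copy_number(positives=30, total_plaques=20_000,
                              haploid_genome_bp=1.0e9,
                              average_insert_bp=7_000)
print(f"approximately {copies:.0f} copies per haploid genome")
```

Because a single positive clone may carry more than one copy, and amplified libraries can be unevenly represented, an estimate of this form is best read as approximate.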