Molecular polymorphism, differentiation and introgression in the period gene between Lutzomyia intermedia and Lutzomyia whitmani

Background Lutzomyia intermedia and Lutzomyia whitmani (Diptera: Psychodidae) are important and very closely related vector species of cutaneous leishmaniasis in Brazil, which are distinguishable by a few morphological differences. There is evidence of mitochondrial introgression between the two species but it is not clear whether gene flow also occurs in nuclear genes. Results We analyzed the molecular variation within the clock gene period (per) of these two species in five different localities in Eastern Brazil. AMOVA and Fst estimates showed no evidence for geographical differentiation within species. On the other hand, the values were highly significant for both analyses between species. The two species show no fixed differences and a higher number of shared polymorphisms compared to exclusive mutations. In addition, some haplotypes that are "typical" of one species were found in some individuals of the other species, suggesting either the persistence of old polymorphisms or the occurrence of introgression. Two tests of gene flow, one based on linkage disequilibrium and an MCMC analysis based on coalescence, suggest that the two species might be exchanging alleles at the per locus. Conclusion Introgression might be occurring between L. intermedia and L. whitmani in period, a gene controlling behavioral rhythms in Drosophila. This result raises the question of whether similar phenomena are occurring at other loci controlling important aspects of behavior and vectorial capacity.


Background
The Phlebotominae sand flies Lutzomyia intermedia Lutz & Neiva 1912 and Lutzomyia whitmani Antunes & Coutinho 1912 are vectors of cutaneous leishmaniasis in Brazil. These are closely related species that can only be distinguished by a few morphological differences [1] and both show high anthropophily and reported natural infections with Leishmania in different regions of Brazil [2].
Despite their importance as vectors, only a handful of studies have been carried out in these two species using molecular techniques [3][4][5][6]. One of the most important findings from an epidemiological perspective is the evidence obtained for introgression between the two species using mitochondrial DNA [4]. This was particularly interesting because apparently, only lineages of L. whitmani sympatric with L. intermedia have been involved in cutaneous leishmaniasis transmission in the peridomestic environment [4], which suggests that genes controlling aspects of vectorial capacity could be passing from one species to the other. In fact, mitochondrial introgression has been reported in other sand fly species [7,8], suggesting that it might be a common phenomenon in these insect vectors. However, because mitochondrial genes can introgress relatively easily between closely related species [9], it becomes important to examine whether introgression can occur with nuclear genes.
The Drosophila period (per) gene homologue was isolated in sand flies by Peixoto et al. [10]. This circadian clock gene was originally identified using mutagenesis by Konopka and Benzer [11], but is also known to control the differences in the "lovesong" rhythms between D. melanogaster and D. simulans [12], which are important to the sexual isolation between these two species [13][14][15]. In addition, per was implicated in the control of species-specific circadian mating rhythms in Drosophila and Bactrocera, which might also constitute a reproductive isolation mechanism [16][17][18]. Thus per may possibly represent an example of a Drosophila speciation gene [19], and in fact it has been used as a molecular marker in a number of speciation and evolutionary studies, not only in Drosophila (reviewed in [20]) but also in other insects (e.g. [21]) including sand flies [22][23][24].
Because per controls the circadian clock in different insects [25], it is almost certainly involved in the rhythms of activity and biting of sand flies [26], which are very important to leishmaniasis transmission. In addition, per might be involved in reproductive isolation in sand flies, via mating rhythms, or via their "lovesongs" [2,27]. per is thus a particularly interesting marker, among the few available, for an introgression analysis in L. intermedia and L. whitmani. Evidence for introgression in per might suggest that gene flow between these two vector species is occurring at other genes controlling important aspects of behavior and vectorial capacity. It might also suggest that per does not have a strong role in their reproductive isolation. In the current study, we analyzed the molecular variation within the per gene of L. intermedia and L. whitmani in five different localities in Eastern Brazil.

Results
Polymorphism and divergence between L. intermedia and L. whitmani
A total of 68 sequences from L. intermedia and 53 from L. whitmani homologous to a fragment of the period gene were analyzed from populations of five localities in Eastern Brazil (Fig 1). The alignment of 72 variable sites is shown in Fig 2. Although most of the changes are either synonymous or occur within the 58 bp intron, non-synonymous substitutions are observed causing 9 amino acid differences among the sequences (Fig 2). Table 1 shows the number of sequences of each population of the two species, the number of polymorphic sites (S) and the estimates of molecular polymorphism θ (based on the total number of mutations) and π. Table 1 also shows Tajima's [28] and Fu & Li's [29] statistics. Within each species, all populations present similar levels of polymorphism with the exception of L. whitmani from Ilhéus, which seems to be less polymorphic than the others. This population was also the only one presenting a significant value in the Fu & Li test but only at the 5% level. Finally, the last column of Table 1 presents the recombination estimator γ [30], indicating that both species show evidence of intragenic recombination in the per gene.
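As a rough illustration of the polymorphism estimates reported in Table 1, the sketch below computes the number of polymorphic sites (S), the nucleotide diversity π and Watterson's θ per site from a toy alignment. It is not the procedure used in this study: the function names and the toy sequences are hypothetical, and θ is computed here from S rather than from the total number of mutations (the two coincide only when every polymorphic site has exactly two states).

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """pi: average number of pairwise differences, per site."""
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    return total_diffs / n_pairs / len(seqs[0])


def watterson_theta(seqs):
    """theta per site from S: S / a_n / L, with a_n = sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a_n / len(seqs[0])


# Hypothetical toy alignment (equal-length sequences, no gaps assumed).
toy = ["ACGT", "ACGA", "ATGA"]
print(segregating_sites(toy), nucleotide_diversity(toy), watterson_theta(toy))
```

Under neutrality π and θ are expected to be similar, which is the comparison underlying Tajima's statistic.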

Genealogy of period sequences
A phylogenetic analysis of the period gene sequences from L. intermedia and L. whitmani was carried out with the Minimum Evolution method using the Kimura 2-parameter distance (Fig 3). A sequence from L. umbratilis, a related species from the same subgenus Nyssomyia, was used as outgroup [24]. The tree shows L. intermedia and L. whitmani as non-monophyletic. However, despite the low bootstrap values, which are below 50% in most cases, there is a large group that contains most L. intermedia sequences and a second large group with most L. whitmani sequences. [...]
As mentioned before, there is evidence of intragenic recombination in the per gene fragment of both species (see Table 1) and for that reason the bifurcating tree shown in Fig 3 has [...] (Fig 4). A small number of ambiguities were resolved as suggested by Crandall and Templeton [37]. The haplotype network shows connections between sequences from each species, separating most of the sequences of L. intermedia and L. whitmani in two groups. No intraspecific geographical structuring was found. Once again, some of the L. whitmani sequences (WAC2, WAC10, WPO13 and WPO14) appear more closely related to L. intermedia haplotypes. In addition, one L. intermedia allele (ICP16) is connected by a small number of mutations to some of the main L. whitmani haplotypes and IPO13 is a shared haplotype between the two species. These results confirm the same putative introgressed sequences indicated by the phylogenetic reconstructions.
Table legend: Dxy [54] is the average number of nucleotide substitutions per site between the two species and Da [54] is the number of net nucleotide substitutions per site. Both Dxy and Da were calculated using the Jukes & Cantor correction [55]. Standard deviations for Da and Dxy are between parentheses. Fst is the fixation index. The significance of Fst, P(Fst), is based on 1000 permutations as before and Nm is the estimated number of migrants per generation. Sint is the number of sites that are polymorphic in L. intermedia and monomorphic in L. whitmani; Swhit is the number of sites that are polymorphic in L. whitmani and monomorphic in L. intermedia; SS is the number of polymorphic sites shared by the two species and SF is the number of fixed differences.
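The four site categories used to compare the two species (sites polymorphic in only one species, shared polymorphisms and fixed differences) can be illustrated with a minimal sketch. It assumes two gap-free alignments of equal length and at most two states per site, so that any site polymorphic in both species counts as shared; the function name, the dictionary keys and the toy sequences are hypothetical and do not reproduce the software used in this study.

```python
def classify_sites(seqs_a, seqs_b):
    """Count exclusive, shared and fixed sites between two aligned samples.

    Assumes biallelic sites: a site polymorphic in both samples is taken
    to carry the same two variants in each and is counted as shared.
    """
    counts = {"exclusive_a": 0, "exclusive_b": 0, "shared": 0, "fixed": 0}
    for col_a, col_b in zip(zip(*seqs_a), zip(*seqs_b)):
        states_a, states_b = set(col_a), set(col_b)
        poly_a, poly_b = len(states_a) > 1, len(states_b) > 1
        if poly_a and poly_b:
            counts["shared"] += 1          # polymorphic in both samples
        elif poly_a:
            counts["exclusive_a"] += 1     # polymorphic only in sample a
        elif poly_b:
            counts["exclusive_b"] += 1     # polymorphic only in sample b
        elif states_a != states_b:
            counts["fixed"] += 1           # monomorphic in both, different state
    return counts


# Hypothetical toy samples standing in for the two species.
sample_a = ["AAGT", "AAGC", "ATGC"]
sample_b = ["CAGT", "CAGC", "CAGC"]
print(classify_sites(sample_a, sample_b))
```

A pattern with no fixed differences and more shared than exclusive polymorphisms, as reported here for per, is what one expects under either recent divergence or ongoing gene flow, which is why the additional tests are needed.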

LD test of introgression
We tested the hypothesis of gene flow between L. intermedia and L. whitmani using a method based on linkage disequilibrium (LD) developed by Machado et al. [38]. In this test, x is the difference between the average LD found among all pairs of shared polymorphisms (DSS) between the two species and the average LD among all pairs of sites for which one member is a shared polymorphism and the other is an exclusive polymorphism (DSX). In case of gene flow x should tend to be positive [see [38] for more details].
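The statistic x can be sketched as follows, assuming that pairwise LD values have already been computed elsewhere and that each polymorphic site has been labelled as shared or exclusive. The function name, the input layout and the toy values are hypothetical; the LD measure, its sign convention and the simulations used to assess significance follow [38] and are not reproduced here.

```python
from itertools import combinations
from statistics import mean


def ld_introgression_x(site_class, ld):
    """Return x = DSS - DSX.

    site_class: dict mapping site id -> "shared" or "exclusive".
    ld: dict mapping frozenset({site_i, site_j}) -> precomputed LD value.
    """
    dss, dsx = [], []
    for i, j in combinations(sorted(site_class), 2):
        kinds = {site_class[i], site_class[j]}
        value = ld[frozenset((i, j))]
        if kinds == {"shared"}:
            dss.append(value)              # both sites are shared polymorphisms
        elif kinds == {"shared", "exclusive"}:
            dsx.append(value)              # one shared, one exclusive site
        # pairs of two exclusive sites do not enter the statistic
    return mean(dss) - mean(dsx)


# Hypothetical toy input: sites 1 and 2 shared, sites 3 and 4 exclusive.
classes = {1: "shared", 2: "shared", 3: "exclusive", 4: "exclusive"}
ld_values = {
    frozenset((1, 2)): 0.5,
    frozenset((1, 3)): 0.25, frozenset((1, 4)): 0.0,
    frozenset((2, 3)): 0.25, frozenset((2, 4)): 0.0,
    frozenset((3, 4)): 0.75,
}
print(ld_introgression_x(classes, ld_values))
```

In the actual test the observed x is compared with values simulated under a model without gene flow, so a positive x alone is not evidence of introgression.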
Because of limitations on the total number of sequences that could be handled by the WH program we could not perform the simulations with all sequences. Therefore, we carried out the LD test of introgression between each pair of sympatric populations of L. intermedia and L. whitmani from the localities of Posse, Afonso Claudio and Corte de Pedra. The input files were prepared using the values of recombination and linkage disequilibrium calculated by the SITES program [30] for each population (data not shown). Although no significant values were found for the smaller samples of Afonso Claudio and Corte de Pedra, the results (Table 5) present evidence for introgression in the period gene in both directions (from L. intermedia to L. whitmani and vice-versa) in the locality of Posse.

Discussion
There is some evidence that L. intermedia and L. whitmani might represent sibling-species complexes in Brazil. Lutzomyia neivai Pinto 1926, a sibling of L. intermedia, is found in parts of Southern and Western Brazil and some other countries of South America [40]. The present study did not include populations of this species. In the case of L. whitmani, mitochondrial data [3,6] indicate three main lineages in Brazil: an Amazonian group, a North-South group and a Northeast group. We did not find strong evidence of a geographical differentiation in the period gene among populations of L. whitmani although one of the pairwise Fst comparisons (Posse × Ilhéus) was significant at the 5% level.
When we compare L. intermedia and L. whitmani, we find a highly significant Fst value (0.3373), which is however smaller than that observed for the period gene between sympatric siblings of Lutzomyia longipalpis (Fst = 0.3952) [23], a complex of cryptic species that are vectors of American visceral leishmaniasis. Therefore, despite the presence of diagnostic morphological characters to identify L. intermedia and L. whitmani [1], the level of molecular divergence in period is not as high as that between the cryptic L. longipalpis siblings.
Even though it is hard to distinguish introgression from the persistence of ancestral polymorphisms, a test of gene flow based on the signature introgression leaves on the patterns of linkage disequilibrium [38] as well as simulations that fit the "Isolation with Migration" model to the data suggest that L. intermedia and L. whitmani might be exchanging alleles at the per locus. This is further supported by the presence of shared haplotypes between the two species in Posse and very similar sequences in all sympatric populations. There is mounting evidence that introgression plays a major role in the evolution of closely related insect vector species. Introgression among vectors may have important epidemiological consequences. Gene flow in loci that affect vectorial capacity, such as those controlling host preference and susceptibility to parasite infection, can change the transmission patterns and consequently make the disease control a harder task. Introgression of genes that control adaptation to particular types of environment can also have a major impact on the spread of vector-borne diseases as was proposed for the major African malaria vector Anopheles gambiae [41]. The same can be said about genes controlling insecticide resistance. For example, Weill et al. [42] found a kdr mutation responsible for pyrethroid resistance in the Mopti form of Anopheles gambiae, a normally susceptible taxon of this species complex. Sequence analysis reveals that this resistant allele probably originates through introgression from the Savanna form.
Although L. intermedia and L. whitmani are closely related and only distinguished by a few morphological differences, they do show differentiation in some other important traits. For example, in Posse, one of the localities we studied, the two species show differences in abundance during the year. L. intermedia is more abundant in the summer while L. whitmani is more frequent in the winter months [2]. They also show differences in microhabitat preferences, L. intermedia being more common in the peridomestic area while L. whitmani is found mainly in the surrounding forest [2]. In addition, the two species show marked differences in their tendencies to bite humans in the early morning, with L. whitmani showing higher feeding rates than L. intermedia [26]. Therefore, despite the evidence of introgression in the period gene in this locality, there are important ecological and behavioral differences between the two species in Posse, suggesting that gene flow is probably rather limited in loci controlling these traits. Hence, it is not yet clear whether introgression has played an important role in the evolution of L. intermedia and L. whitmani. Further work with other genes might help clarify the issue.
Figure caption: Posterior distribution for migration estimates.
Table legend: In the first line for each species are the observed and mean simulated values of x (see text). The estimated probability of observing a simulated value higher than the observed value of x is presented in brackets below the mean simulated value of x; * less than 5% of simulated values higher than the observed value.

Conclusion
Evidence for introgression between L. intermedia and L. whitmani obtained using mitochondrial DNA [4] seems to be corroborated by our data on the period gene, a nuclear marker. Nevertheless, considering that period is potentially involved in reproductive isolation and might be, therefore, less prone to introgression than the "average" gene [43], it is possible that much higher levels of gene flow between the two species occur at other genes. It might, on the other hand, suggest that this behavioral gene, or at least the fragment we analyzed, did not play a role in speciation between L. intermedia and L. whitmani. In fact the same has been suggested for some Drosophila species [44] despite per's role in controlling lovesong and mating rhythm differences between D. melanogaster and D. simulans [13][14][15][16].
Although the evidence for introgression in the per gene between L. intermedia and L. whitmani is not overwhelming, it does indicate the need to extend this analysis to other loci in the future. We are currently isolating new molecular markers in the two species to carry out a multilocus approach [39] that might help determine how much variation in gene flow and differentiation there is across the genome of these two very important leishmaniasis vectors.
Methods
The progeny of each wild-caught female was raised separately according to Souza et al. [45] and only one F1 male of each female was used for the molecular analysis, which included 68 individuals of L. intermedia (12 from Afonso Claudio, 18 from Posse, 20 from Corte de Pedra and 18 from Jacarepaguá) and 51 individuals of L. whitmani (12 from Afonso Claudio, 17 from Posse, 3 from Corte de Pedra and 19 from Ilhéus). Note that, although the distribution of the two species shows considerable overlap in Eastern Brazil, in many localities only one species is found or is far more abundant than the other. There are also seasonal and microhabitat differences in abundance between them in areas of sympatry [2].

DNA methods
Genomic DNA was prepared according to Jowett [46] with slight modifications and the PCR was carried out for 30 cycles at 95°C for 30 sec, 60°C for 30 sec and 72°C for 30 sec, using Abgene, Amersham Biosciences or Biotools reagents according to the manufacturers' directions. The per primer sequences are: 5llper2: 5'-AGCATCCTTTTGTAGCAAAC-3' (forward) and 3llper2: 5'-TCAGATGAACTCTTGCTGTC-3' (reverse). These primers amplify a 486 bp fragment of the sand fly per gene homologue that includes part of the PAS/CLD domain, an intron (58 bp) and the beginning of the per S domain [24]. The amplified fragments were cloned using the pMOSBlue blunt-ended cloning kit (Amersham Biosciences) and plasmid DNA preparation was carried out using the "Flexiprep" Kit (Amersham Biosciences). Cloned PCR fragments were sequenced at Fundação Oswaldo Cruz and at University of Leicester using ABI 377 sequencers. With the exception of two L. whitmani individuals from Corte de Pedra (see below), only one sequence of each sand fly (representing one of the two possible alleles) was used in the analysis, but an average of three sequences per individual were obtained in order to check for possible PCR-induced mutations. In addition, PCR fragments were also sequenced directly in some cases for the same reason. In the case of the two L. whitmani mentioned above, 6 and 9 clones were sequenced from specimens WCP01 and WCP03, respectively, to determine both alleles simply to increase the size of this small sample.
Negative controls were performed for all amplification reactions. In addition, PCR, cloning and sequencing were repeated for two individuals to confirm putative introgressed sequences and to exclude the possibility that they were the result of PCR contamination. Finally, for at least two individuals with putative introgressed sequences, we could define the other allele from additional clones (not included in the analysis), which proved to be typical of the species, indicating no identification problems.