- Methodology article
- Open Access
Novel intron markers to study the phylogeny of closely related mammalian species
© Igea et al; licensee BioMed Central Ltd. 2010
- Received: 4 May 2010
- Accepted: 30 November 2010
- Published: 30 November 2010
Multilocus phylogenies can be used to infer the species tree of a group of closely related species. In species trees, the nodes represent the actual separation between species, thus providing essential information about their evolutionary history. In addition, multilocus phylogenies can help in analyses of species delimitation, gene flow and genetic differentiation within species. However, few adequate markers are available for such studies.
In order to develop nuclear markers that can be useful in multilocus studies of mammals, we analyzed the mammalian genomes of human, chimpanzee, macaque, dog and cow. Rodents were excluded due to their unusual genomic features. Introns were extracted from the mammalian genomes because of their greater genetic variability and ease of amplification from the flanking exons. To an initial set of more than 10,000 one-to-one orthologous introns we applied several filters to select introns that belong to single-copy genes, show neutral evolutionary rates and have an adequate length for their amplification. This analysis led to a final list of 224 intron markers randomly distributed along the genome. To experimentally test their validity, we amplified twelve of these introns in a panel of six mammalian species. The result was that seven of these introns gave rise to a PCR band of the expected size in all species. In addition, we sequenced these bands and analyzed the accumulation of substitutions in these introns in five pairs of closely related species. The results showed that the estimated genetic distances in the five species pairs was quite variable among introns and that this divergence cannot be directly predicted from the overall intron divergence in mammals.
We have designed a new set of 224 nuclear introns with optimal features for the phylogeny of closely related mammalian species. A large proportion of the introns tested experimentally showed a perfect amplification and enough variability in most species, indicating that this marker set can be very helpful in multilocus phylogenetics of mammals. Due to the lower variability and stronger stochasticity of nuclear markers with respect to mitochondrial genes, studies should be designed to make use of several markers like the ones designed here.
- Species Pair
- ENSEMBL Database
- Snow Leopard
- Tandem Repeat Finder
- Gray Mouse Lemur
Phylogenetic analyses of closely related species are affected by specific problems that are different from those present in phylogenies of more distant species . The first and most obvious difficulty is that nucleotide sequences must have enough variability within and among the studied species to obtain an adequate resolution of the phylogenetic tree. For this reason, mitochondrial genes, which show a high rate of nucleotide substitution, have been the main choice for the reconstruction of phylogenetic trees at the genus and family levels in all animals [2, 3]. Another distinctive feature of the phylogenetic reconstruction of closely related species is that gene coalescence and the stochasticity associated with population genetic processes must be taken into account. For example, incomplete lineage sorting may cause the gene tree to have a different topology than the species tree [4–6]. This leads to the necessity of using multiple, unlinked genes, together with the integration of coalescent-based techniques, in the reconstruction of the species tree [7–10]. Phylogenies based on a broad representation of the genome can also help in species delimitation or analyses of genetic variability. Therefore, in these approaches it is essential to make use of the nuclear genome, where there are plenty of unlinked genes. However, nuclear genes in animals usually have much lower evolutionary rates than mitochondrial genes and, sometimes, they are not informative enough for assessing the variability within species or for phylogenetic reconstruction. In addition, due to the larger population size of nuclear genes with respect to mitochondrial genes, the former are more affected by coalescent stochasticity, so the necessity of using multiple genes is stronger [11, 12]. Thus, the progress of multilocus phylogenetics requires, as a first step, an important effort of developing unlinked nuclear markers able to provide enough differences within and among species .
Systematic efforts to find novel markers for phylogenetic studies have been performed in different groups of organisms [14–21]. Markers selected include exons, introns and intergenic regions. Exons show very little variability for the phylogenies of closely related species, whereas intergenic regions present difficulties for the design of primers of wide-specificity. On the other hand, introns are a part of the genome with large nucleotide variability and, at the same time, they can be easily amplified with primers placed in the adjacent exons [20, 22, 23]. Thus, they are ideal candidates for multilocus phylogenies of closely related species. However, not all introns are equally valid for this task. Some introns are highly conserved due to their involvement in certain functions and they may not show enough differences between closely related species [24, 25]. In fact, introns show a wide range of evolutionary rates. For example, a comparison of human and mouse introns showed genetic distances that ranged between 0.2 and 1.7 substitutions/site . In addition, some introns show disparate rates of evolution in some lineages but not in others , indicating an imperfect molecular clock that may affect the measurement of genetic diversity in some species. Different processes, such as a change in chromosome position of the gene or the development of a new isoform by alternative splicing in certain lineages , may cause such variations in the evolutionary dynamics and thus in the molecular clock of the introns.
In mammals, introns have been mainly used to resolve deep groupings but also to study more recent phylogenies [29–33]. However, no attempt has been made so far to systematically select an optimal set of introns for mammals. In this work, we have devised a protocol to extract the best introns from the complete mammalian genomes of five species: human, chimpanzee, macaque, dog and cow [34–39]. We deliberately did not use the available genomes of rodents (mouse and rat) because they have genomic features that would have made the comparisons of all mammals problematic. For example, rodents have very attenuated isochors and show very fast evolutionary rates when compared to other mammals [40, 41]. This is also true for introns, as previously shown . The high evolutionary rates of rodent introns can complicate the alignments, phylogenetic reconstructions and measurement of genetic distances. To avoid the same problems we also decided not to include marsupials and monotremes. Thus, in this work we concentrated on the analysis of non-rodent eutherian genomes. From these genomes we obtained one-to-one orthologues, constructed alignments and trees from each individual intron and filtered out introns with inadequate features for shallow phylogenies. In addition, introns were selected to come from single-copy genes in order to avoid multiple bands in PCR reactions, to have an adequate length for PCR amplification, and to be surrounded by exons with enough space for primer design. From the resulting introns, we selected a small set that we used to experimentally verify that they work according to the expected features of ease of amplification and high evolutionary rate. Finally, we studied the variability of intron divergence in different species pairs.
Intron set acquisition
Most of these introns had an inadequate length for their amplification and were therefore discarded in our initial filtering processes. First, we restricted the intron length in Homo sapiens to a minimum of 200 and a maximum of 1600 nucleotides. Second, we controlled for intron length conservation among the five species used in this study, taking human as a referent and relaxing the constraints as the phylogenetic distance between the compared species increased. After the application of these size filters, 2750 introns remained. In addition, the introns flanked by small exons (<40 nucleotides) were discarded because the design of PCR primers could be difficult in them. This filter affected 5.2% of the introns available in the previous step.
Mammalian genomes have a large number of duplicated genes due to different gene or genome duplication processes . These genes constitute a severe problem because primers could hybridize in a large number of genomic places, giving rise to multiple PCR products. Thus, a crucial step was to check for the duplication of the introns using BLAST over the different genomes. In fact, we performed the BLAST searches using the flanking exons instead of the introns because the exonic sequences are easier to detect by BLAST, even in divergent genes . Only those introns that had both flanking exons present just once in the five genomes were considered to be single-copy. This step ruled out approximately half of the remaining introns. At this point, there were 1344 introns with a preliminary orthology filter that can be used to study different evolutionary processes. However, for these introns to be most useful in phylogenetic studies of closely related species we applied several additional filters.
The next step was designed to eliminate the most conserved introns, which could be involved in some function and are therefore not variable enough for the phylogeny of closely related species. To do this, we employed two different measures of divergence. First, we calculated with the Ktreedist software the scaling factor from each intronic phylogenetic tree to the global genomic tree. This measure allowed us to discard the introns that showed the slowest overall evolutionary rates. Second, we calculated another measure of divergence, the primates total branch length, by using the corresponding alignments and trees with only the three primate species (human, chimpanzee and macaque). This measure is more accurate and less affected by alignment imperfections possibly generated when comparing sequences that are too divergent. We then selected the introns that had a scaling factor lower than 1.15 and a primates total branch length higher than 0.075 substitutions per position. This rendered a dataset of 225 introns that excluded the most conserved ones.
Finally, a visual inspection step was performed to detect any poorly aligned sequences caused by wrong annotations or clearly incorrect orthology assignments. Only one intron had to be eliminated in this step. The resulting final set was thus composed of 224 new phylogenetic markers (Figure 1). In the additional file 1 we show all relevant information for each marker. This includes the alignments of both the intron and the flanking exons, the phylogenetic tree of the five mammals constructed with the intron sequences, and the genomic location and function of the gene to which the intron belongs. The examination of this information allows the selection of optimal markers for specific purposes and the design of exon primers with different degrees of specificity. Interestingly, one of the introns in our final set turned out to be intron 1 of transthyretin (TTR-1), which is one of the most widely used introns in mammalian phylogenetics [44–48]. To our knowledge, no other markers in our dataset have been used so far.
Analysis of genomic features of the new set of introns
Repetitive sequences present in the intron set were analyzed with the RepeatMasker software. Of the 224 introns, 163 were found to contain repetitive elements. The most frequent elements were SINEs, which were present in 70% of the introns that bore any kind of element, followed by LINEs, which were found in 26% of them. Introns with repeats had, on average, 18% of their sequence corresponding to repetitive elements. These types of repetitive sequences normally evolve in a neutral way, mostly by point mutations, and therefore their presence is not a problem for sequence-based analysis methods that assume a normal nucleotide substitution model. If all species in the alignment have the repeat (which is the normal situation in closely related species) this fragment can be used normally as any other sequence. When only some species have the repetitive element (as it happened often in our set of five mammalian species), we observed that alignment programs do not have problems dealing with these repetitive elements, and therefore the alignments can also be used for further sequence or phylogenetic analyses.
More problematic are microsatellites (defined here as two or more contiguous, approximate copies of a pattern of 1 to 6 nucleotides ). Since microsatellites evolve by a slippage mechanism instead of by point mutations , they cannot be used with phylogenetic or coalescence methods that assume a normal point substitution model. We checked for the presence of microsatellites with the Tandem Repeats Finder program and found that 43 of the 224 introns had at least one microsatellite in some species. Microsatellites are, in general, specific for particular lineages, and their presence in one species does not mean that the intron cannot be used in other taxa. Therefore, introns with some microsatellites were not eliminated from the data set. However, when these sequences are found within the introns in a species to be studied it may be better not to use this intron and rather select a new one from the set designed here. Alternatively, the positions belonging to the microsatellites can be eliminated from the alignment for further processing with sequence analysis programs that assume a point mutation mechanism of evolution.
Analysis of genetic distances and single nucleotide polymorphisms
To compare the mean intronic genetic distance with the divergence of cytochrome b, which is the most popular marker for mammalian phylogenetics , we obtained the cytochrome b distance between human and chimpanzee. To do this, we measured the patristic distance in a maximum-likelihood phylogenetic tree similarly reconstructed with human, chimpanzee and macaque cytochrome b sequences obtained from GenBank . The resulting distance was 0.169 substitutions/position, which implies that the final set of 224 introns has, on average, 12.1 times less divergence than the mitochondrial cytochrome b gene. Furthermore, alignments containing the exons flanking the 224 introns were constructed for the three primates, maximum likelihood trees were built as above, and the resulting trees were used to calculate the distance between Homo sapiens and Pan troglodytes for both the upstream and downstream exons of each intron. The resulting mean genetic divergence was 0.006 substitutions/position, which implies that the 224 selected introns have, on average, 2.3 more divergence than their corresponding flanking exons.
Experimental validation of the newly developed phylogenetic markers: Primer design and PCR amplification
The intron filtering processes carried out above were designed to select a set of optimal introns for the phylogeny of closely related mammalian species. However, many unidentified factors may affect the amplification of these introns in different species. To experimentally test the real performance of these introns and the validity of the designed bioinformatic analysis we randomly selected twelve introns among the largest ones in the final data set for sequencing in different mammals.
In order to design primers of wide spectrum, the flanking exonic sequences already gathered (human, chimpanzee, macaque, cow and dog) were complemented by (whenever possible) a few others from the ENSEMBL database: horse (Equus caballus), Sumatran orangutan (Pongo abelii), little brown bat (Myotis lucifugus), European hedgehog (Erinaceus europaeus), domestic cat (Felis catus), Northern tree shrew (Tupaia belangeri), gray mouse lemur (Microcebus murinus), African bush elephant (Loxodonta africana), lesser hedgehog tenrec (Echinops telfairi), nine-banded armadillo (Dasypus novemcinctus) and common shrew (Sorex araneus). These genomes mostly corresponded to low-coverage genome projects but many of them were valid for certain exons. Exonic alignments, containing between 11 and 15 different species, were constructed. These alignments allowed us to design the 3' end of every primer in the most conserved part of the exonic alignment. Degenerate bases (up to a limit of 48 per primer) were used to make the primers suitable for as many different mammalian species as possible.
Introns selected for amplification and sequencing in six species, and final outcome
Length (H. sapiens)
Primer sequences (Forward and Reverse)
We can conclude that a large part of the markers tested were valid, with a minimum optimization, for a wide variety of mammals. Of course, the success rate will vary depending on the taxa of choice, but, to increase the chances that a primer works, it is important to take into account that this optimization may be necessary in all pilot studies and may include the adjustment of the hybridization temperature or the modification of the primer specificity with a different degree of degeneracy. In our experimental test of selected introns, we designed primers of broad specificity so that they could be used in a wide range of mammalian groups, but primers intended for specific taxa need not be based on such a variety of species and can have less degenerate bases. In addition to the species tested here, selected to include pairs of closely related species, we have successfully amplified other intron markers from our data set in other mammals, mainly belonging to Erinaceomorpha and Soricomorpha, as part of ongoing studies. They also produced the expected PCR band and the corresponding sequence, indicating again the usefulness of this set of introns.
Use of the selected introns as markers for the phylogeny of closely related species
The sequenced introns were added to the introns already downloaded for human, chimpanzee, macaque, dog and cow, and this set was complemented by three additional genomes that had information for the seven successful introns: the Sumatran orangutan (Pongo abelii), the horse (Equus caballus) and the little brown bat (Myotis lucifugus). This extended species set allowed us to assess the variability of our introns in five pairs of closely related mammalian species, which comprised four different mammalian families: Mustelidae, which included Mustela nivalis and M. putorius, with an estimated divergence time of 2.8 million years ; Felidae, with Panthera tigris and Uncia uncia, which diverged 2.9 million years ago ; Hominidae, which included two pairs, namely, Pongo pygmaeus and P. abelii, with 3.8 million years of estimated divergence , and Homo sapiens and Pan troglodytes, with 6 million years of estimated divergence ; and Vespertilionidae, with Myotis lucifugus and M. escalerai, which diverged 12.2 million years ago .
All of the sequenced alleles for the seven introns were different in both members of every pair of closely related species, except, surprisingly, the PNPO-3 sequence obtained for Pongo pygmaeus, which was identical to the one downloaded from ENSEMBL for P. abelii. This can be due to several reasons such as a recent introgression, the existence of a constraint in the evolution of this particular marker in the orangutan lineage or the fact that, by mere chance, no mutations have accumulated in any of the two species since their recent divergence. All other intron pairs showed one or several substitutions between the species pairs selected, thus providing useful information for phylogenetic reconstruction.
The development of multilocus phylogenetics requires the availability of a large number of adequate markers for different taxa. In this study, we have designed 224 intronic markers with optimal conditions for the phylogeny of closely related mammalian species (excluding rodents, marsupials and monotremes). Among the important criteria used to select these markers were that they belonged to single-copy genes, were not highly conserved, and did not show disparate rates of evolution in different lineages. The experimental validation of these introns showed that, after some optimizations, they could be amplified in different mammalian species, yielding a single PCR band. In addition, an analysis of the genetic distances estimated in several pairs of closely related mammalian species revealed that introns may show different degrees of divergence in different pairs, and that it is not possible to predict which introns will be more variable in each group. In any case, it may be necessary to carry out initial pilot studies with several introns to decide which ones perform best for each species or group of closely related species. In addition, the use of several introns will reduce the stochasticity associated to mutational and coalescence processes, which is particularly large for nuclear sequences. The availability of a large set of introns like the one provided here will greatly facilitate this work.
Databases and filters
The following mammalian genomes were downloaded from the ENSEMBL database  in GenBank format: human (Homo sapiens) version NCBI 36 [36, 39], chimpanzee (Pan troglodytes) version Pan_troglodytes-2.1 , rhesus macaque (Macaca mulatta) version Mmul_1 , dog (Canis familiaris) version CanFam2.0  and cow (Bos taurus) version Btau_3.1 . These genomes are sequenced with high coverage and thoroughly annotated. There were also several low-coverage genomes available, but their lack of many annotated features made their use in studies like the present one very inconvenient [64, 65]. The genome of the marsupial Monodelphis domestica  was also available and well annotated at the time this study was performed but we decided not to include it to avoid problems with alignments and phylogenetic reconstruction due to its large distance from the other mammalian species used. In addition, the mouse and rat genomes, which are also very divergent due to their peculiar genomic features [40, 41], were also excluded from this study.
We developed a pipeline of Perl scripts to extract all the introns and exons of each mammalian genome (Figure 1). At the same time, features like the gene description and the genomic location and length of every intron and exon were stored. In this initial step, introns of more than 50,000 nucleotides were discarded.
The ENSEMBL database includes orthology information obtained from a phylogenetic analysis, which makes the determination of the orthology relationships more accurate . Therefore, we used this information to determine preliminary one-to-one orthologous genes between human, chimpanzee, macaque, dog and cow. To achieve this, we downloaded orthology tables containing the whole list of one-to-one orthologous genes between pairs of species. By crossing the information of the human-chimpanzee and human-macaque pairs we obtained a set of one-to-one orthologous genes for primates. From the table of dog and cow we obtained the corresponding set of orthologues for laurasiatherians. Finally, we crossed these two tables to obtain a set of putative one-to-one orthologous genes for the five considered species. The corresponding lists of introns were then assembled from this table. In addition, only those introns belonging to genes with equal numbers of introns and whose relative position inside the corresponding gene was conserved in every compared species were considered to be true one-to-one orthologues.
The size variation of each intron across the studied species was also controlled. To do this, the maximum difference in length allowed between human and chimpanzee, human and macaque, human and cow, human and dog, and cow and dog was established at 10%, 20%, 100%, 100% and 50%, respectively.
Multiple alignments of the orthologous intronic sequences were built using MAFFT version 5.8 . Gblocks version 0.91  was used with relaxed parameters  to discard poorly aligned positions. Maximum-likelihood phylogenetic trees were reconstructed using PhyML version 2.4.4, with the GTR model of evolution and four substitution rate categories .
To detect duplicated genes, BLAST searches  of the pair of flanking exons were performed for every intron. For each species, the exons were used as query against its own genome. The e-value limit was set at 10-4. We selected only the introns with exons that produced a single hit (itself) against one region of the genome. Since each genome is subdivided into large fragments, normally chromosomes, we checked that every single hit was also composed of a single subalignment and therefore corresponded to a single exon.
A reference tree reflecting the global genomic evolution was constructed using all single-copy introns. To do so, the 1344 intron alignments that passed the BLAST filter were concatenated to generate an alignment of 768,745 positions. This alignment was used to build a maximum-likelihood phylogenetic tree using RaxML version 2.4.4, which can handle big alignments . Phylogenetic trees of every individual intron were compared to this reference tree using the K tree score implemented in the Ktreedist program .
Repetitive sequences present in the final set of introns were analyzed with the program RepeatMasker version 3.2.8 using the corresponding library for each species . The intron sequences were also scanned for microsatellites with Tandem Repeats Finder version 4.04  using the following program parameters: 2 7 7 80 10 50 6. Moreover, the presence of human SNPs in these introns was analyzed using information obtained from the ENSEMBL database. To do this, the whole list of annotated intronic SNPs of the NCBI 36 version of the Homo sapiens genome was downloaded using the Biomart platform , and the corresponding SNPs were mapped in our set of introns.
Samples and laboratory procedures
Samples of six mammalian species spanning different orders were obtained. The chosen species were the Iberian mouse-bat (Myotis escalerai); the Bornean orangutan (Pongo pygmaeus), obtained from the INPRIMAT Consortium and the Biomedical Primate Research Centre in Rijswijk; the snow leopard (Uncia uncia) and the tiger (Panthera tigris), obtained from the Animal Tissue Bank of Catalunya; and the least weasel (Mustela nivalis) and the European polecat (Mustela putorius), obtained from Collection of Tissues and DNA of the Museo Nacional de Ciencias Naturales.
DNA extractions were performed using the DNEasy Blood and Tissue Kit (QIAGen), following manufacturer's instructions. PCR reactions were carried out in 25 μl of final reaction volume, containing 50 - 100 ng of template DNA, 0.5 - 1 μM of each primer, and 0.75 units of Promega GoTaq. Amplification conditions were as follows: initial denaturation at 95°C for 5 minutes, followed by 28 to 35 cycles comprising 30 seconds of denaturation at 95°C, 30 seconds of annealing at 52-65°C (this temperature turned out to be highly variable among different species and introns and had to be adjusted for each particular case) and extension at 72°C for 30-60 seconds. The resulting PCR products were loaded in 1% agarose gels stained with SYBR Safe DNA Gel Stain (Invitrogen). Bands were manually excised from the gel and subsequently purified using Illustra GFX PCR and DNA Gel Band Purification Kit (GE Healthcare). The resulting products were then sequenced using different sequencing services. Both heterozygous point mutations and indels that caused difficulties in assigning a clear sequence to each allele of the individuals were observed in several sequences. Therefore, we cloned each PCR product into the pstBlue-1 vector (Invitrogen) to be able to isolate one of the alleles. The plasmid containing one of the alleles was isolated from the bacterial cultures, purified using the GenElute Plasmid Miniprep Kit (Sigma Aldrich), and subsequently sequenced. Resulting sequences were analyzed using Geneious version 4.5.5 . To correct the errors induced by the polymerase and evidenced in the cloning process, both the PCR product and its corresponding cloned sequence were assembled. Intron sequences have been deposited in GenBank under accession numbers HM147892-HM147933.
Phylogenetic analyses of the new markers
For each successfully amplified and sequenced intron, a new multiple sequence alignment was created containing the six newly obtained sequences plus further sequences from ENSEMBL. As above, MAFFT was used to build these alignments, which were then cleaned using Gblocks with the same parameters as above. We constructed the phylogenetic trees using PhyML with the GTR model of evolution and four substitution rate categories. In addition, for each of these introns we analyzed the divergence for five pairs of closely related species. To do this, we constructed new pairwise intronic alignments for each species pair using MAFFT as above, but without applying Gblocks, and estimated maximum-likelihood distance using PAUP  with the GTR substitution model.
We thank the collection of tissues and DNA of the Museo Nacional de Ciencias Naturales (CSIC), the EBD-CSIC's scientific collections, the Animal Tissue Bank of Catalunya (BTAC) of the Universidad Autónoma de Barcelona (UAB), and Dr. Ernst J Verschoor of the INPRIMAT Consortium and the Biomedical Primate Research Centre in Rijswijk for samples. We also thank M. Dolors Piulachs and people in her lab for technical advice. JI is recipient of a pre-doctoral research fellowship (I3P) from CSIC. JC is supported by grant number CGL2008-00434/BOS from the Plan Nacional I+D+I of the Ministerio de Ciencia e Innovación (Spain).
- Arbogast BS, Edwards SV, Wakeley J, Beerli P, Slowinski JB: Estimating divergence times from molecular data on phylogenetic and population genetic timescales. Annu Rev Ecol Syst. 2002, 33: 707-740. 10.1146/annurev.ecolsys.33.010802.150500.View ArticleGoogle Scholar
- Brown WM, George M, Wilson AC: Rapid evolution of animal mitochondrial DNA. Proc Natl Acad Sci USA. 1979, 76 (4): 1967-1971. 10.1073/pnas.76.4.1967.PubMed CentralView ArticlePubMedGoogle Scholar
- Johns GC, Avise JC: A comparative summary of genetic distances in the vertebrates from the mitochondrial cytochrome b. Mol Biol Evol. 1998, 15: 1481-1490.View ArticlePubMedGoogle Scholar
- Takahata N: Gene genealogy in three related populations: consistency probability between gene and population trees. Genetics. 1989, 122 (4): 957-966.PubMed CentralPubMedGoogle Scholar
- Maddison WP: Gene trees in species trees. Syst Biol. 1997, 46: 523-536.View ArticleGoogle Scholar
- Nichols R: Gene trees and species trees are not the same. Trends Ecol Evol. 2001, 16 (7): 358-364. 10.1016/S0169-5347(01)02203-0.View ArticlePubMedGoogle Scholar
- Liu L, Pearl DK, Brumfield RT, Edwards SV: Estimating species trees using multiple-allele DNA sequence data. Evolution. 2008, 62 (8): 2080-2091. 10.1111/j.1558-5646.2008.00414.x.View ArticlePubMedGoogle Scholar
- Carstens BC, Knowles LL: Estimating Species Phylogeny from Gene-Tree Probabilities Despite Incomplete Lineage Sorting: An Example from Melanoplus Grasshoppers. Syst Biol. 2007, 56 (3): 400-411. 10.1080/10635150701405560.View ArticlePubMedGoogle Scholar
- Degnan JH, Rosenberg NA: Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol Evol. 2009, 24 (6): 332-340. 10.1016/j.tree.2009.01.009.View ArticlePubMedGoogle Scholar
- Brito PH, Edwards SV: Multilocus phylogeography and phylogenetics using sequence-based markers. Genetica. 2009, 135 (3): 439-455. 10.1007/s10709-008-9293-3.View ArticlePubMedGoogle Scholar
- Moore WS: Inferring phylogenies from mtDNA variation: mitochondrial-gene trees versus nuclear-gene trees. Evolution. 1995, 49 (4): 718-726. 10.2307/2410325.View ArticleGoogle Scholar
- Zink RM, Barrowclough GF: Mitochondrial DNA under siege in avian phylogeography. Mol Ecol. 2008, 17 (9): 2107-2121. 10.1111/j.1365-294X.2008.03737.x.View ArticlePubMedGoogle Scholar
- Zhang DX, Hewitt GM: Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Molecular ecology. 2003, 12 (3): 563-584. 10.1046/j.1365-294X.2003.01773.x.View ArticlePubMedGoogle Scholar
- Townsend TM, Alegre RE, Kelley ST, Wiens JJ, Reeder TW: Rapid development of multiple nuclear loci for phylogenetic analysis using genomic resources: an example from squamate reptiles. Mol Phylogenet Evol. 2008, 47 (1): 129-142. 10.1016/j.ympev.2008.01.008.View ArticlePubMedGoogle Scholar
- Thomson RC, Shedlock AM, Edwards SV, Shaffer HB: Developing markers for multilocus phylogenetics in non-model organisms: A test case with turtles. Mol Phylogenet Evol. 2008, 49 (2): 514-525. 10.1016/j.ympev.2008.08.006.View ArticlePubMedGoogle Scholar
- Ranwez V, Delsuc F, Ranwez S, Belkhir K, Tilak MK, Douzery EJ: OrthoMaM: a database of orthologous genomic markers for placental mammal phylogenetics. BMC Evol Biol. 2007, 7: 241-10.1186/1471-2148-7-241.PubMed CentralView ArticlePubMedGoogle Scholar
- Peng Z, Elango N, Wildman DE, Yi SV: Primate phylogenomics: developing numerous nuclear non-coding, non-repetitive markers for ecological and phylogenetic applications and analysis of evolutionary rate variation. BMC Genomics. 2009, 10: 247-10.1186/1471-2164-10-247.PubMed CentralView ArticlePubMedGoogle Scholar
- Li C, Ortí G, Zhang G, Lu G: A practical approach to phylogenomics: the phylogeny of ray-finned fish (Actinopterygii) as a case study. BMC Evol Biol. 2007, 7: 44-10.1186/1471-2148-7-44.PubMed CentralView ArticlePubMedGoogle Scholar
- Kimball RT, Braun EL, Barker FK, Bowie RCK, Braun MJ, Chojnowski JL, Hackett SJ, Han KL, Harshman J, Heimer-Torres V, Holznagel W, Huddleston CJ, Marks BD, Miglia KJ, Moore WS, Reddy S, Sheldon FH, Smith JV, Witt CC, Yuri T: A well-tested set of primers to amplify regions spread across the avian genome. Mol Phylogenet Evol. 2009, 50 (3): 654-660. 10.1016/j.ympev.2008.11.018.View ArticlePubMedGoogle Scholar
- Aitken N, Smith S, Schwarz C, Morin PA: Single nucleotide polymorphism (SNP) discovery in mammals: a targeted-gene approach. Molecular ecology. 2004, 13 (6): 1423-1431. 10.1111/j.1365-294X.2004.02159.x.View ArticlePubMedGoogle Scholar
- Backström N, Fagerberg S, Ellegren H: Genomics of natural bird populations: a gene-based set of reference markers evenly spread across the avian genome. Molecular ecology. 2008, 17 (4): 964-980. 10.1111/j.1365-294X.2007.03551.x.View ArticlePubMedGoogle Scholar
- Creer S, Malhotra A, Thorpe R, Pook C: Targeting optimal introns for phylogenetic analyses in non-model taxa: experimental results in Asian pitvipers. Cladistics. 2005, 21 (4): 390-395. 10.1111/j.1096-0031.2005.00072.x.View ArticleGoogle Scholar
- Hellborg L, Ellegren H: Y chromosome conserved anchored tagged sequences (YCATS) for the analysis of mammalian male-specific DNA. Molecular ecology. 2003, 12 (1): 283-291. 10.1046/j.1365-294X.2003.01702.x.View ArticlePubMedGoogle Scholar
- Rodova M, Islam MR, Peterson KR, Calvet JP: Remarkable sequence conservation of the last intron in the PKD1 gene. Mol Biol Evol. 2003, 20 (10): 1669-1674. 10.1093/molbev/msg191.View ArticlePubMedGoogle Scholar
- Gazave E, Marqués-Bonet T, Fernando O, Charlesworth B, Navarro A: Patterns and rates of intron divergence between humans and chimpanzees. Genome Biol. 2007, 8 (2): R21-10.1186/gb-2007-8-2-r21.PubMed CentralView ArticlePubMedGoogle Scholar
- Castresana J: Estimation of genetic distances from human and mouse introns. Genome Biol. 2002, 3: research0028.0021-0028.0027. 10.1186/gb-2002-3-6-research0028.View ArticleGoogle Scholar
- Soria-Carrasco V, Talavera G, Igea J, Castresana J: The K tree score: quantification of differences in the relative branch length and topology of phylogenetic trees. Bioinformatics. 2007, 23 (21): 2954-2956. 10.1093/bioinformatics/btm466.View ArticlePubMedGoogle Scholar
- Plass M, Eyras E: Differentiated evolutionary rates in alternative exons and the implications for splicing regulation. BMC Evol Biol. 2006, 6: 50-10.1186/1471-2148-6-50.PubMed CentralView ArticlePubMedGoogle Scholar
- Bardeleben C, Moore RL, Wayne RK: A molecular phylogeny of the Canidae based on six nuclear loci. Molecular Phylogenetics and Evolution. 2005, 37 (3): 815-831. 10.1016/j.ympev.2005.07.019.View ArticlePubMedGoogle Scholar
- Castillo A, Cortinas M, Lessa E: Rapid diversification of South American tuco-tucos (Ctenomys; Rodentia, Ctenomyidae): contrasting mitochondrial and nuclear intron sequences. J Mammal. 2005, 86 (1): 170-179. 10.1644/1545-1542(2005)086<0170:RDOSAT>2.0.CO;2.View ArticleGoogle Scholar
- DeBry R, Seshadri S: Nuclear intron sequences for phylogenetics of closely related mammals: an example using the phylogeny of Mus. J Mammal. 2001, 82 (2): 280-288. 10.1644/1545-1542(2001)082<0280:NISFPO>2.0.CO;2.View ArticleGoogle Scholar
- Gilbert C, Ropiquet A, Hassanin A: Mitochondrial and nuclear phylogenies of Cervidae (Mammalia, Ruminantia): Systematics, morphology, and biogeography. Mol Phylogenet Evol. 2006, 40 (1): 101-117. 10.1016/j.ympev.2006.02.017.View ArticlePubMedGoogle Scholar
- Yu L, Zhang YP: Phylogeny of the caniform carnivora: evidence from multiple genes. Genetica. 2006, 127 (1-3): 65-79. 10.1007/s10709-005-2482-4.View ArticlePubMedGoogle Scholar
- Bovine Genome Sequencing and Analysis Consortium, Elsik CG, Tellam RL, Worley KC: The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science. 2009, 324 (5926): 522-528. 10.1126/science.1169588.View ArticleGoogle Scholar
- Chimpanzee Sequencing and Analysis Consortium: Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005, 437 (7055): 69-87. 10.1038/nature04072.View ArticleGoogle Scholar
- International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature. 2001, 409 (6822): 860-921. 10.1038/35057062.View ArticleGoogle Scholar
- Lindblad-Toh K, et al: Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005, 438 (7069): 803-819. 10.1038/nature04338.View ArticlePubMedGoogle Scholar
- Rhesus Macaque Genome Sequencing and Analysis Consortium: Evolutionary and biomedical insights from the rhesus macaque genome. Science. 2007, 316 (5822): 222-234. 10.1126/science.1139247.View ArticleGoogle Scholar
- Venter JC, et al: The sequence of the human genome. Science. 2001, 291 (5507): 1304-1351. 10.1126/science.1058040.View ArticlePubMedGoogle Scholar
- Mouse Genome Sequencing Consortium: Initial sequencing and comparative analysis of the mouse genome. Nature. 2002, 420 (6915): 520-562. 10.1038/nature01262.View ArticleGoogle Scholar
- Rat Genome Sequencing Project Consortium: Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004, 428 (6982): 493-521. 10.1038/nature02426.View ArticleGoogle Scholar
- Zhou Y, Mishra B: Quantifying the mechanisms for segmental duplications in mammalian genomes by statistical analysis and modeling. Proc Natl Acad Sci USA. 2005, 102 (11): 4051-4056. 10.1073/pnas.0407957102.PubMed CentralView ArticlePubMedGoogle Scholar
- Alba MM, Castresana J: On homology searches by protein Blast and the characterization of the age of genes. BMC Evol Biol. 2007, 7: 53-10.1186/1471-2148-7-53.PubMed CentralView ArticlePubMedGoogle Scholar
- Gaubert P, Cordeiro-Estrela P: Phylogenetic systematics and tempo of evolution of the Viverrinae (Mammalia, Carnivora, Viverridae) within feliformians: implications for faunal exchanges between Asia and Africa. Mol Phylogenet Evol. 2006, 41 (2): 266-278. 10.1016/j.ympev.2006.05.034.View ArticlePubMedGoogle Scholar
- Ingram CM, Burda H, Honeycutt RL: Molecular phylogenetics and taxonomy of the African mole-rats, genus Cryptomys and the new genus Coetomys Gray, 1864. Mol Phylogenet Evol. 2004, 31 (3): 997-1014. 10.1016/j.ympev.2003.11.004.View ArticlePubMedGoogle Scholar
- Rowe DL, Honeycutt RL: Phylogenetic relationships, ecological correlates, and molecular evolution within the cavioidea (mammalia, rodentia). Mol Biol Evol. 2002, 19 (3): 263-277.View ArticlePubMedGoogle Scholar
- Steiner C, Tilak MK, Douzery EJ, Catzeflis FM: New DNA data from a transthyretin nuclear intron suggest an Oligocene to Miocene diversification of living South America opossums (Marsupialia: Didelphidae). Mol Phylogenet Evol. 2005, 35 (2): 363-379. 10.1016/j.ympev.2004.12.013.View ArticlePubMedGoogle Scholar
- Yu L, Li QW, Ryder OA, Zhang YP: Phylogeny of the bears (Ursidae) based on nuclear and mitochondrial genes. Mol Phylogenet Evol. 2004, 32 (2): 480-494. 10.1016/j.ympev.2004.02.015.View ArticlePubMedGoogle Scholar
- Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27 (2): 573-580. 10.1093/nar/27.2.573.PubMed CentralView ArticlePubMedGoogle Scholar
- Tautz D, Trick M, Dover GA: Cryptic simplicity in DNA is a major source of genetic variation. Nature. 1986, 322 (6080): 652-656. 10.1038/322652a0.View ArticlePubMedGoogle Scholar
- Castresana J: Cytochrome b phylogeny and the taxonomy of great apes and mammals. Mol Biol Evol. 2001, 18 (4): 465-471.View ArticlePubMedGoogle Scholar
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW: GenBank. Nucleic Acids Res. 2009, D26-31. 10.1093/nar/gkn723. 37 DatabaseGoogle Scholar
- Koepfli KP, Deere KA, Slater GJ, Begg C, Begg K, Grassman L, Lucherini M, Veron G, Wayne RK: Multigene phylogeny of the Mustelidae: resolving relationships, tempo and biogeographic history of a mammalian adaptive radiation. BMC Biol. 2008, 6: 10-10.1186/1741-7007-6-10.PubMed CentralView ArticlePubMedGoogle Scholar
- Johnson WE, Eizirik E, Pecon-Slattery J, Murphy WJ, Antunes A, Teeling E, O'Brien SJ: The late Miocene radiation of modern Felidae: a genetic assessment. Science. 2006, 311 (5757): 73-77. 10.1126/science.1122277.View ArticlePubMedGoogle Scholar
- Raaum RL, Sterner KN, Noviello CM, Stewart CB, Disotell TR: Catarrhine primate divergence dates estimated from complete mitochondrial genomes: concordance with fossil and nuclear DNA evidence. J Hum Evol. 2005, 48 (3): 237-257. 10.1016/j.jhevol.2004.11.007.View ArticlePubMedGoogle Scholar
- Kumar S, Filipski A, Swarna V, Walker A, Hedges SB: Placing confidence limits on the molecular age of the human-chimpanzee divergence. Proc Natl Acad Sci USA. 2005, 102 (52): 18842-18847. 10.1073/pnas.0509585102.PubMed CentralView ArticlePubMedGoogle Scholar
- Stadelmann B, Lin LK, Kunz TH, Ruedi M: Molecular phylogeny of New World Myotis (Chiroptera, Vespertilionidae) inferred from mitochondrial and nuclear DNA genes. Mol Phylogenet Evol. 2007, 43 (1): 32-48. 10.1016/j.ympev.2006.06.019.View ArticlePubMedGoogle Scholar
- Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, Springer MS: Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science. 2001, 294 (5550): 2348-2351. 10.1126/science.1067179.View ArticlePubMedGoogle Scholar
- Murphy WJ, Pevzner PA, O'Brien SJ: Mammalian phylogenomics comes of age. Trends Genet. 2004, 20 (12): 631-639. 10.1016/j.tig.2004.09.005.View ArticlePubMedGoogle Scholar
- Springer MS, Stanhope MJ, Madsen O, de Jong WW: Molecules consolidate the placental mammal tree. Trends Ecol Evol. 2004, 19 (8): 430-438. 10.1016/j.tree.2004.05.006.View ArticlePubMedGoogle Scholar
- Waddell PJ, Kishino H, Ota R: A phylogenetic foundation for comparative mammalian genomics. Genome informatics International Conference on Genome Informatics. 2001, 12: 141-154.PubMedGoogle Scholar
- Waddell PJ, Okada N, Hasegawa M: Towards resolving the interordinal relationships of placental mammals. Systematic Biology. 1999, 48 (1): 1-5. 10.1093/sysbio/48.1.1.View ArticlePubMedGoogle Scholar
- Edwards SV, Beerli P: Perspective: gene divergence, population divergence, and the variance in coalescence time in phylogeographic studies. Evolution. 2000, 54 (6): 1839-1854.PubMedGoogle Scholar
- Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Overduin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E: Ensembl 2007. Nucleic Acids Res. 2007, 35: D610-D617. 10.1093/nar/gkl996.PubMed CentralView ArticlePubMedGoogle Scholar
- Green P: 2x genomes--does depth matter?. Genome Res. 2007, 17 (11): 1547-1549. 10.1101/gr.7050807.View ArticlePubMedGoogle Scholar
- Mikkelsen TS, Wakefield MJ, Aken B, Amemiya CT, Chang JL, Duke S, Garber M, Gentles AJ, Goodstadt L, Heger A, Jurka J, Kamal M, Mauceli E, Searle SM, Sharpe T, Baker ML, Batzer MA, Benos PV, Belov K, Clamp M, Cook A, Cuff J, Das R, Davidow L, Deakin JE, Fazzari MJ, Glass JL, Grabherr M, Greally JM, Gu W, Hore TA, Huttley GA, Kleber M, Jirtle RL, Koina E, Lee JT, Mahony S, Marra MA, Miller RD, Nicholls RD, Oda M, Papenfuss AT, Parra ZE, Pollock DD, Ray DA, Schein JE, Speed TP, Thompson K, VandeBerg JL, Wade CM, Walker JA, Waters PD, Webber C, Weidman JR, Xie X, Zody MC, Graves JA, Ponting CP, Breen M, Samollow PB, Lander ES, Lindblad-Toh K: Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature. 2007, 447 (7141): 167-177. 10.1038/nature05805.View ArticlePubMedGoogle Scholar
- Katoh K, Kuma K, Toh H, Miyata T: MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005, 33 (2): 511-518. 10.1093/nar/gki198.PubMed CentralView ArticlePubMedGoogle Scholar
- Castresana J: Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000, 17 (4): 540-552.View ArticlePubMedGoogle Scholar
- Talavera G, Castresana J: Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007, 56 (4): 564-577. 10.1080/10635150701472164.View ArticlePubMedGoogle Scholar
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003, 52 (5): 696-704. 10.1080/10635150390235520.View ArticlePubMedGoogle Scholar
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.PubMed CentralView ArticlePubMedGoogle Scholar
- Stamatakis A: RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006, 22 (21): 2688-2690. 10.1093/bioinformatics/btl446.View ArticlePubMedGoogle Scholar
- Smit AFA, Hubley R, Green P: RepeatMasker Open-3.0. 1996-2004. 2004, [http://repeatmasker.org]Google Scholar
- Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, Kasprzyk A: BioMart--biological queries made easy. BMC Genomics. 2009, 10: 22-10.1186/1471-2164-10-22.PubMed CentralView ArticlePubMedGoogle Scholar
- Drummond AJ, Ashton B, Cheung M, Heled J, Kearse M, Moir R, Stones-Havas S, Thierer T, Wilson A: Geneious v4. 2009, [http://geneious.com]Google Scholar
- Swofford DL: PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. 2003, Sunderland, Massachusetts.: Sinauer AssociatesGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.