Comparative mitochondrial genomics within and among species of killifish
© Whitehead. 2009
Received: 20 August 2008
Accepted: 13 January 2009
Published: 13 January 2009
Skip to main content
© Whitehead. 2009
Received: 20 August 2008
Accepted: 13 January 2009
Published: 13 January 2009
This study was motivated by the observation of unusual mitochondrial haplotype distributions and associated physiological differences between populations of the killifish Fundulus heteroclitus distributed along the Atlantic coast of North America. A distinct "northern" haplotype is fixed in all populations north of New Jersey, and does not appear south of New Jersey except in extreme upper-estuary fresh water habitats, and northern individuals are known to be more tolerant of hyposmotic conditions than southern individuals. Complete mitochondrial genomes were sequenced from individuals from northern coastal, southern coastal, and fresh water populations (and from out-groups). Comparative genomics approaches were used to test multiple evolutionary hypotheses proposed to explain among-population genome variation including directional selection and hybridization.
Structure and organization of the Fundulus mitochondrial genome is typical of animals, yet subtle differences in substitution patterns exist among populations. No signals of directional selection or hybridization were detected. Mitochondrial genes evolve at variable rates, but all genes exhibit very low dN/dS ratios across all lineages, and the southern population harbors more synonymous polymorphism than other populations.
Evolution of mitochondrial genomes within Fundulus is primarily governed by interaction between strong purifying selection and demographic influences, including larger historical population size in the south. Though directional selection and hybridization hypotheses were not supported, adaptive processes may indirectly contribute to partitioning of variation between populations.
Fundulus heteroclitus is a euryhaline killifish that occupies diverse habitats along steep osmotic and thermal gradients; large populations are found in extreme upper-estuary fresh water sites through brackish estuarine habitats to marine environments, and across a steep latitudinal temperature gradient along the Atlantic coast of North America from Nova Scotia (Canada) to Florida (USA). This species has long been of interest to science in part because of their extraordinary physiological resilience in the face of osmotic stress , and as a model for studying adaptive evolutionary variation along ecological clines . To date, the bulk of evolutionary studies on this species have focused on adaptive variation related to temperature and pollution (see review in Burnett et al. ). However, other studies reveal unusual patterns of mitochondrial haplotype distributions that covary with differences in osmotic habitat  and physiological tolerance to osmotic stress [5, 6]. Together, results from these studies converge on the hypothesis that natural selection has helped shape mitochondrial genomic variation as an adaptive response to physiological challenges posed by life in fresh water habitats. The purpose of this study was to compare mitochondrial genome variation within and among Fundulus species to test this and related hypotheses.
For coastal populations of F. heteroclitus, a phylogeographic break that centers on the Hudson River separates "northern" from "southern" types, with a narrow zone of introgression found in northern New Jersey. This phylogeographic break is supported by data from mitochondrial haplotype variation [4, 7, 8], but also from studies of variation in nuclear proteins [2, 7, 9], microsatellites [10, 11], and gene expression . Few studies in F. heteroclitus have examined adaptive variation associated with potential selective agents other than temperature and pollution, such as strong osmotic gradients in estuarine environments.
Hyposmotic tolerance varies widely among Fundulus species . Though F. heteroclitus is widely tolerant of osmotic extremes , populations are distributed along steep salinity gradients in nature and within-species variation in hyposmotic stress tolerance exists. For example, adult fish from northern populations are more tolerant of hyposmotic transfer than southern populations , and individuals from northern populations have higher reproductive success (fertilization success and larval survival) in low salinities than those from southern populations . Intriguingly, mitochondrial haplotype data show that the "northern" haplotype is never found south of the phylogeographic break, except in extreme upper estuary habitats that are exclusively fresh water . In these southern fresh water habitats the northern haplotype is fixed, but only dozens of miles downstream, after transition to slightly brackish water habitats, the southern haplotype is fixed (unpublished data).
I hypothesize (as have others [4, 6]) that different mitochondrial genotypes are favored by natural selection in different osmotic environments, and that the "northern" haplotype provides an adaptive advantage in hyposmotic habitats. One model that could account for the unusual contemporary distribution of mitochondrial haplotypes is as follows: As Pleistocene glaciers receded, successful invaders of newly emergent northern habitats were likely those that could tolerate low salinities leading to fixation of those genotypes during northward expansion, and those same genotypes remain fixed in southern habitats that are fresh water. It is physiologically plausible that mitochondrial genes may be involved in adaptation to fresh water habitats since hyposmotic acclimation is very energetically demanding [14, 15] requiring the recruitment of mitochondria-rich cells in gill epithelia . An alternate hypothesis is that mitochondrial haplotypes from the exclusively fresh water species F. diaphanus have introgressed into F. heteroclitus populations accounting for the unique haplotype distribution and hyposmotic tolerance in "northern" F. heteroclitus. This possibility has been recognized by others [6, 17] and is plausible since F. diaphanus live exclusively in fresh water, co-occur with F. heteroclitus in fresh water and northern habitats, these two species are known to hybridize where they co-occur, and hybrids are only successful from pairings of female F. diaphanus with male F. heteroclitus [17, 18]. Another alternative is that southern fresh water populations are remnants from a northward expansion of northern haplotypes following Pleistocene glaciation.
To test these hypotheses I sequenced complete mitochondrial genomes (mitogenomes) from northern, southern, and fresh water populations of F. heteroclitus, from F. grandis (the sister taxon to F. heteroclitus) and from F. diaphanus. Mitogenome sequences were used to test for the influence of neutral drift, directional selection, and selective constraint on intraspecific and interspecific variation, and to define phylogenetic relationships among populations and species.
Complete mitochondrial genomes were sequenced from two individuals from each of five taxa including three populations of Fundulus heteroclitus, one population F. grandis (Fgds; Cocodrie, Louisiana) which is the sister species to F. heteroclitus, and one population of F. diaphanus (Fdia; Piscataway Park, Maryland) which is a co-occurring congener known to sometimes hybridize with F. heteroclitus . F. heteroclitus were sampled from Maine (Fhet-ME; north of phylogeographic break, northern haplotype), Georgia (Fhet-GA; south of phylogeographic break, southern haplotype), and from an extreme upper-estuary fresh water habitat (Fhet-MDPP; south of phylogeographic break, northern haplotype). This fresh water F. heteroclitus population is from the Potomac River at Piscataway Park in Maryland. This is the same site as site "26" in Duvernell et al.  and site "PS" in Smith et al. , and is the same site from which F. diaphanus were collected for this study.
Genomic DNA was extracted from fresh livers using Qiagen DNeasy extraction kits, and PCR primers were from Lee et al.  or Miya & Nishida  or specifically designed for this study (see Additional file 1). Long-distance PCR (TaKaRa LA taq) was first used to amplify three overlapping regions covering the complete mitogenome sequence. Long PCR amplifications were performed in 50 μl reactions including 10 μM final concentration of each primer under cycling conditions including 40 cycles of 94°C denaturation for 30-s followed by 15-min annealing/extension at 68°C, with a final 72°C hold for 10-min. Long amplification products were used as the template for a series of 39 overlapping PCR reactions. PCR was performed in 20 μl reactions including 0.25 μM final concentration of each primer under cycling conditions including 5 cycles of 20-s denaturation at 94°C, 15-s annealing at 37°C, and 45-s extension at 72°C, followed by 25 cycles of 20-s denaturation at 94°C, 15-s annealing at 45°C, and 45-s extension at 72°C. Amplification products from both long-distance and nested PCR reactions were electrophoresed on 1% agarose to verify size and cleaned using Ampure magnetic beads (Agencourt Bioscience).
Both strands of PCR products were directly sequenced using Big-Dye Terminator cycle sequencing (Applied Biosystems) in 10 μl reactions including 0.5 μl Big Dye and 0.32 μM final concentration of sequencing primers (forward and reverse PCR primers). The cycling profile consisted of 40 cycles of 15-s denaturation at 94°C, 20-s annealing at 37°C, and 4-min extension at 60°C. Sequencing reaction products were cleaned using CleanSeq magnetic beads (Agencourt Bioscience) before electrophoresis on an Applied Biosystems 3130XL Genetic Analyzer.
A consensus sequence was determined for each individual genome following assembly of individual reads in Gap4 . Complete mitogenome consensus sequences were aligned using ClustalW in MEGA 4.0  and alignments annotated in SEQUIN http://www.ncbi.nlm.nih.gov/Sequin/sequin.hlp.html. Transfer RNAs were identified using tRNAscan  and protein coding genes by alignment with Kryptolebias marmoratus and Gambusia affinis mitogenome sequences (GenBank accession numbers AF283503 and AP004422, respectively) coupled with determination of in-frame start and stop codon positions.
Genome-wide characteristics were enumerated and compared among lineages. These characteristics included genome-wide, gene-specific, and taxon-specific differences in G/C content, synonymous and amino acid replacement substitution rates, transition/transversion substitution ratios, and codon usage bias. Topological predictions of transmembrane segments of electron transport proteins were determined from the consensus prediction from multiple models [24–28] using the server-based software TOPCONS http://topcons.cbr.su.se/ with the sequence of individual Fhet-ME2 as the query sequence. Radical amino acid polymorphisms were identified according to the criteria outlined in Woolley et al. .
Phylogenetic analyses were used to test for possible hybridization between F. diaphanus and co-occurring F. heteroclitus populations. These analyses examined the relationship of Fundulus populations and species (Acanthopterygii; Atherinomorpha; Cyprinodontiformes; Fundulidae) to each other and to other closely related fishes for which complete mitogenomes exist including Cyprinodon rubrofluviatilis (Acanthopterygii; Atherinomorpha; Cyprinodontiformes; Cyprinodontidae), Gambusia affinis (Acanthopterygii; Atherinomorpha; Cyprinodontiformes; Poeciliidae), Kryptolebias marmoratus (Acanthopterygii; Atherinomorpha; Cyprinodontiformes; Rivulidae), Oryzias latipes (Acanthopterygii; Atherinomorpha; Beloniformes), Cololabis saira (Acanthopterygii; Atherinomorpha; Beloniformes), Cypselurus hiraii (Acanthopterygii; Atherinomorpha; Beloniformes), Exocoetus volitans (Acanthopterygii; Atherinomorpha; Beloniformes), and Mugil cephalus (Acanthopterygii; Mugilomorpha) (GenBank accession numbers EF442803, AP004422, AF283503, AP004421, AP002932, AB182653, AP002933, and AP002930, respectively). Protein-encoding gene sequences were concatenated and aligned using ClustalW and Bayesian methods were used to reconstruct phylogenetic relationships using MrBayes 3  with third codon positions excluded (Nst = 6, Rrates = Invgamma, Ngen = 10,000,000, sampling every 10 gen, nchains = 8, nruns = 2, burnin = 0.25, outgroup = M. cephalus, all other parameters set at default values). These analyses were run on the parallel version of MrBayes 3.1.2 compiled on the LONI high performance computing system at LSU http://www.hpc.lsu.edu/loni/.
Fundulus mitochondrial genomes (16,524–16,531 bp long; GenBank:FJ445394–FJ445403) are typical in that they contain the identical gene complement (22 transfer RNA genes, 2 ribosomal RNA genes, and 13 protein-coding genes) and gene order as found in most vertebrate mitochondrial genomes. The control region is long (865–870 bp) with many conserved blocks and found between tRNA-Pro and tRNA-Phe. The origin of light strand replication is a short highly conserved block (36 bp) found between tRNA-Asn and tRNA-Cys. Gene start sites are typically immediately downstream of the end of the previous gene, and several genes start before the end of the preceding gene. For example, the end of ATPase8 and beginning of ATPase6 overlap 10 bp, ND4L and ND4 overlap 7 bp, and the ends of ND5 and ND6 (coded on the opposite strand) overlap 4 bp. Indeed, the genomes are highly compact as only 73 bp of non-coding intergenic spacer DNA is found. All genes are encoded on the heavy chain except for ND6 and eight tRNA genes (tRNA-Gln, tRNA-Ala, tRNA-Asn, tRNA-Cys, tRNA-Tyr, tRNA-Ser (UCN), tRNA-Glu, and tRNA-Pro).
Only five amino-acid replacement mutations are fixed exclusively in the northern lineage (including Maryland fresh water and Maine individuals: green lineage in Figure 3). Two occur in ND1 (Pro→Ser at amino acid site 73, Thr→Ala at site 264) and one in each of ND2 (Ala→Thr at site 344), ATPase8 (Glu→Asp at site 42), and ND5 (Ile→Val at site 555). All ND subunits are located in the membrane-bound portion of the NADH dehydrogenase complex; ND1 is thought to be located at the junction between the matrix arm and membrane-embedded arm, and ND2 and ND5 are located in the membrane-embedded arm where proton pumping occurs . ATPase subunit 8 is a core component of the F0 complex of ATP synthase and may serve an important role in subunit assembly . Though these substitutions could represent candidate loci for directional selection, statistical support is not strong. Branch-site tests indicate low posterior probabilities (less than 50%) for a model of positive selection in the northern lineage for each of these five sites. However, given the relatively limited phylogenetic sampling included in this study (only five taxa) the branch-sites tests may have insufficient power to detect positive selection , and more thorough phylogenetic sampling within Fundulus may increase statistical support for positive selection at these candidate loci. Only one amino acid replacement is shared between F. diaphanus and the northern populations of F. heteroclitus (Ile→Val at site 592 at edge of inner membrane region in ND5), offering little support of convergent mutation conferring hyposmotic tolerance between these fresh water tolerant and co-occurring groups. Over 160 synonymous mutations distinguish northern from southern populations of F. heteroclitus, though no lineage-specific codon usage biases were evident.
Comparisons of fixed and variable synonymous and replacement mutations between each F. heteroclitus population and F. grandis allow calculation of the neutrality index (NI) which is the ratio of synonymous fixed sites to synonymous polymorphic sites divided by the ratio of replacement fixed sites to replacement polymorphic sites (Figure 3) . Note that only two genomes were sequenced per taxon, so rates of within-group polymorphism are certainly underestimated but should not be biased among taxa being compared. NI values greater than 1.0 are common for animal mitochondrial genes [33, 45], where amino acid variation is higher within taxa than expected relative to variation between taxa. This observation rejects neutral expectations, and excess amino acid polymorphism may reflect either the influence of balancing selection or selection against slightly deleterious mutation. All F. heteroclitus populations show a general pattern of NI values significantly greater than 1.0, though this pattern is most striking in the Maine population which is the only population in which the NI remains significantly greater than 1.0 after Bonferroni adjustment of p-values. This observation is consistent with the slightly deleterious model since historical population sizes are considered much larger in Georgia compared to northern populations [9, 11], and excess deleterious mutation should be harder to detect in larger populations because of the proportionately greater efficiency of purifying selection. These results are consistent with NI values from Drosophila taxa of different historical population sizes , but do not reject the alternative explanation that balancing selection may be maintaining functional polymorphism in the Maine population.
Georgia individuals harbor twice the number of synonymous mutations compared to northern populations (Figures 3 and 6), again likely reflecting much larger historical population size in which synonymous mutations are more likely to be maintained but mildly deleterious replacement mutations are efficiently purged. Indeed, synonymous polymorphic sites are consistently fewer across genes within the Maine population compared to the Georgia population (Figure 6), and chi-squared tests of 2 × 4 contingency tables indicate that the Georgia mutation ratios are significantly different from those of both northern populations (p > 0.01 for both comparisons), but northern ratios are not different from each other (p = 0.77).
In agreement with very low dN/dS ratios, branch-site tests indicate that over 95% of substitutions appear to be evolving according to a model of strong selective constraint (substitutions with posterior probabilities > 90% for site class 0 using BEB in CODEML) across lineages. In the lineage leading to F. heteroclitus northern clade only 33 sites (out of 3,813 possible sites) reject the evolutionary model of strong selective constraint, and all 33 of these sites best fit a model of neutral evolution. Similarly, the vast majority of substitutions along all other branches in the phylogeny best fit a model of strong selective constraint.
Polymorphism in transcriptional control regions may affect protein expression levels, and therefore be phenotypically important. Presumed functional domains within the teleost control region, such as conserved sequence blocks (CSB) in the central conserved region, CSB-II, and CSB-III (as identified in Lee et al. ), are all highly conserved across Fundulus species. The only mutation unique to northern populations within these regions is at the beginning of the central conserved region (C→T at mitogenome position 15958 in F. heteroclitus ME-2), but the functional importance of this polymorphism has yet to be determined.
Together, these lineage-specific patterns of substitution, dN/dS ratios, and branch-site tests offer little support for the influence of direct selection on mitogenome evolution within Fundulus heteroclitus. Though five amino acid substitutions are unique to the northern clade, lineage-specific mitogenome differences more likely reflect differences in historical population size. Low dN/dS ratios coupled with results from branch-site tests indicate the influence of strong selective constraint, and an excess of synonymous polymorphisms within the Georgia population (relative to northern populations and relative to replacement polymorphisms within Georgia) is consistent with larger historical population size. Nuclear data also indicate greater genetic diversity in southern versus northern populations of F. heteroclitus [9, 11]. Larger populations can harbor more purely neutral mutations and are more efficient at purging slightly deleterious mutations [47, 48]. These results support the model that mitogenome evolution in killifish is primarily governed by interaction between strong selective constraint and demographic parameters, which is consistent with broader patterns of mitogenome evolution in other organisms including flies , copepods , rats , and humans .
Phylogenetic analyses of concatenated mitogenome protein sequences indicate that within Fundulus the Maine population and the Maryland fresh water population mitogenome types are very similar, all three F. heteroclitus populations are monophyletic and form a clade with sister species F. grandis, and F. diaphanus is the outgroup (Figure 1). Posterior probabilities for branching nodes were high, and sequence from two nuclear genes supports the same topology (data not shown). Broader analysis of all complete mitochondrial genomes (concatenated proteins) that are currently available for the clade indicate that cyprinodontiformes and beloniformes fishes are reciprocally monophyletic (Figure 1), in contrast to the findings reported in Miya et al. . Furthermore, Fundulus taxa appear to be relatively recently derived among the cyprinodontiformes lineages. These data provide no evidence for hybridization between fresh water F. heteroclitus and F. diaphanus in the upper Chesapeake estuary, and thus does not contribute to explaining the unusual distribution of mitochondrial genotypes in extreme upper-estuary habitats.
These comparative data illuminate the relative influence of evolutionary forces that shape mitochondrial genome variation in killifish, but do not offer strong support for hypotheses proposed to explain unusual distributions of mitochondrial types in nature. Patterns of synonymous and amino acid replacement substitutions and branch-site models of codon substitution provide little evidence that directional selection has directly influenced mitochondrial genome sequence variation between southern, northern, and fresh water populations of F. heteroclitus. However, five amino acid replacements and one mutation in a conserved and presumably functional block of the control region are unique to the northern clade and may be candidates for further study. Also, since all F. heteroclitus mitogenomes form a monophyletic group with F. diaphanus as the out-group, there is no evidence for introgression with F. diaphanus in northern and fresh water sites. Substitution patterns indicate the important role of purifying selection governing mitochondrial genome evolution, and the most apparent lineage-specific patterns of variation within F. heteroclitus is an excess of synonymous relative to replacement polymorphism in the Georgia population (Figure 3). This could reflect a larger historical population size, in which synonymous (purely neutral) polymorphisms are retained but slightly deleterious replacement variants are purged.
Since these data do not support two explicit hypotheses, the evolutionary mechanisms responsible for maintaining "northern" mitochondrial types in southern fresh water habitats remains unresolved. One possible explanation is that fresh water types are relicts from an historical northward expansion following Pleistocene glaciation. An alternate explanation is that southern types that reside in brackish habitats are competitively excluded from contributing to the gene pool in nearby fresh water habitats where northern types reside. We know that transition from fresh water to only slightly brackish demands considerable physiological adjustment . We also know that northern types have higher fitness  and better osmotic compensatory ability  than southern types in fresh water. One hypothetical mechanism keeping gene pools distinct may be through breakdown of intergenomic coadaptation where unmatched nuclear-mitochondrial hybrids exhibit lower fitness (reviewed in Rand et al. ). One way to test the competitive exclusion and relict population hypotheses would be to densely sample populations along the salinity cline to test for sharp inflections (indicating competitive exclusion) or gradual transitions (indicating introgression) in allele frequencies across the ecological boundary between fresh water and slightly brackish habitats. This work is ongoing in the Whitehead laboratory.
Special thanks to Dr. Douglas Crawford (University of Miami) and Dr. Mark Batzer (Louisiana State University) for access to sequencing facilities, and to Dr. Scott Herke (Louisiana State University) and Ms. Jen Roach (Louisiana State University) for assistance with sequencing. Earlier drafts of the manuscript benefitted from helpful comments from Dr. Michael Hellberg (Louisiana State University), and reviews from Dr. David Rand (Brown University) and a second anonymous reviewer. This research was supported by NSF grants BES-0652006 and EF-0723771 to AW.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.