Shifting patterns of natural variation in the nuclear genome of caenorhabditis elegans
© Solorzano et al; licensee BioMed Central Ltd. 2011
Received: 10 December 2010
Accepted: 16 June 2011
Published: 16 June 2011
Genome wide analysis of variation within a species can reveal the evolution of fundamental biological processes such as mutation, recombination, and natural selection. We compare genome wide sequence differences between two independent isolates of the nematode Caenorhabditis elegans (CB4856 and CB4858) and the reference genome (N2).
The base substitution pattern when comparing N2 against CB4858 reveals a transition over transversion bias (1.32:1) that is not present in CB4856. In CB4856, there is a significant bias in the direction of base substitution. The frequency of A or T bases in N2 that are G or C bases in CB4856 outnumber the opposite frequencies for transitions as well as transversions. These differences were not observed in the N2/CB4858 comparison. Similarly, we observed a strong bias for deletions over insertions in CB4856 (1.44: 1) that is not present in CB4858. In both CB4856 and CB4858, there is a significant correlation between SNP rate and recombination rate on the autosomes but not on the X chromosome. Furthermore, we identified numerous significant hotspots of variation in the CB4856-N2 comparison.
In both CB4856 and CB4858, based on a measure of the strength of selection (ka/ks), all the chromosomes are under negative selection and in CB4856, there is no difference in the strength of natural selection in either the autosomes versus X or between any of the chromosomes. By contrast, in CB4858, ka/ks values are smaller in the autosomes than in the X chromosome. In addition, in CB4858, ka/ks values differ between chromosomes.
The clear bias of deletions over insertions in CB4856 suggests that either the CB4856 genome is becoming smaller or the N2 genome is getting larger. We hypothesize the hotspots found represent alleles that are shared between CB4856 and CB4858 but not N2. Because the ka/ks ratio in the X chromosome is higher than the autosomes on average in CB4858, purifying selection is reduced on the X chromosome.
All variation found in natural populations is ultimately derived from evolutionary forces acting upon mutation and recombination. By studying genome-wide patterns of variation among closely related species, it is possible to make inferences about the evolutionary process of that organism. Although the model organism Caenorhabditis elegans has been well studied in biology, our knowledge of the evolution of natural populations of C. elegans is limited . In the past, molecular variation in C. elegans has been studied using assays involving single mitochondrial and nuclear genes , transposons [3, 4], amplified fragment length polymorphisms (aflps), and microsatellites [5–8].
In this study, we investigate genetic variation in three strains of C. elegans: N2, CB4856, and CB4858 using large-scale DNA sequence datasets [9–11]. N2 (from Bristol, England) is the most commonly used lab strain. It is the strain for which the genome sequence has been completely determined. CB4856 is a strain of C. elegans from Hawaii while CB4858 is a strain of C. elegans from Pasadena, CA. Strains CB4856 and CB4858 were isolated in 1972 and 1971, respectively. Since then, they have been cultured in the lab. In their C. elegans population genomics study, Rockman and Kruglyak  distinguished 41 haplotypes among 125 wild isolates in their genotype data from 1460 N2-CB4856 SNPs. They indicated that CB4858 shares haplotype 20 with strains from other localities, including AB2-4, from Adelaide, Australia, and CB4855, from Palo Alto, California. The genotypic similarity among CB4858, CB4855, and AB2-4 suggests they may share an ancestor in a laboratory.
Previous analysis of multiple Caenorhabditis lines shows extensive variation in nuclear single-nucleotide polymorphism (SNP) density , with the Hawaiian strain CB4856 having the highest SNP density and most unique SNPs . Additional polymorphisms in CB4856 were found by Swan et al. , and analysis of the distribution of these polymorphisms by Cutter et al.  revealed that the polymorphisms occurred more frequently in the arms than in the core in chromosomes one to five, but are randomly distributed in chromosome X.
The patterns of natural variation found in a genome are the result of mutational events that are influenced by both drift and selection. Under neutral theory, a majority of mutations that arise spontaneously are neutral, and their fate (loss or fixation) is primarily determined by drift. In rare cases, a spontaneous mutation will confer a selective advantage (positive selection) or disadvantage (purifying selection). If by chance this mutation is not lost by drift, in consequent generations the mutation will spread throughout the population, and variation at linked sites will either accumulate or be purged. The combination of neutral mutations fixed through drift and the mutations that are accumulated or purged determine the genomic landscape of the organism . By surveying the patterns of natural variation between the closely related Caenorhabditis strains CB4856 and CB4858, it is possible to investigate the evolutionary processes of drift and selection occurring in this lineage.
Although previous studies provide insights into C. elegans SNP patterns and densities in the natural isolates, a large-scale genomic analysis of the C. elegans polymorphic spectrum has not been accomplished until now. Other studies of C. elegans have not focused on examining the patterns of variation that we present here, but have instead focused on other important patterns. For example, Weber et al.  reported the genome sequence of the LSJ1 (Bristol) strain and compared it to the reference wild type N2 (Bristol). LSJ1 is a sibling of the reference wild type N2. The authors indicated that comparing the genomes of N2 and LSJ1 highlight genetic changes associated with lab domestication and errors in the C. elegans reference genome. They confirmed that LSJ1 and N2 are derived from the same Bristol isolate. In addition, they identified 3246 predicted differences between the LSJ1 and N2 genome sequences. Finally, by sequencing two recently isolated wild C. elegans strains, CB4856 (Hawaii) and ED3054 (Kenya), they inferred the ancestral state of the original Bristol isolate. All of this makes CB4856 an extremely valuable resource not only for C. elegans gene mapping  but also for the study of the distribution and spectrum of Caenorhabditis SNPs.
Graustein et al.  compared the levels of DNA sequence polymorphism among three closely related species in the genus Caenorhabditis: two self-fertilizing species, C. elegans and C. briggsae, and one cross-fertilizing species, C. remanei. Jovelin et al.  analyzed nucleotide variation within and between species across two olfactory pathways in Caenorhabditis to examine the relationship between evolutionary rate and gene function and their position within the pathway. The authors found that transcription factors specifying chemosensory neuron subtype identity exhibit higher levels of nucleotide divergence and polymorphism than the structural genes in the signaling pathways. They found much higher levels of polymorphism within C. remanei than within the related species C. elegans and C. briggsae. Their findings suggest behavioral differences between species in response to olfactory stimuli are likely to involve differences in gene regulation due to the divergence of regulatory genes.
We report the entire spectrum of events critical for understanding the fundamental divergence patterns of these natural isolates. In addition, we compare our results to the N2 MA line results from Denver et al.  because we wish to compare the differences in mutation patterns when selection is present (in our study) versus when it is nearly absent (in MA lines).
Results and Discussion
Overview of SNP variation
Overview of genome wide differences between N2 and CB4856 or CB4858
CB4856 × 10-4
SNPs per base
CB4858 × 10-4
SNPs per base
The total number of bases assayed in CB4856 was 4658922 (4.6465% of the C. elegans reference genome) while the number of SNPs observed in CB4858 was based on the assay of 77% of the C. elegans reference genome . Even though 4.6465% is a small portion of the C. elegans reference genome, we assume that the sequences used to obtain the polymorphisms are randomly selected from the entire genome. It is therefore reasonable to compare the results of CB4856 to the results of CB4858.
The base substitution/indel ratio is 0.6883:1 (5402/7848) for CB4856 and 7.3369:1 (37132/5061) for CB4858 (Table 1). Two observations suggest that the differences in the observed base substitution/indel ratios in CB4846 vs CB4858 do not stem from read length differences in the two assays. First, clusters of SNPs (see CB4856 Statistically Significant Hotspots) do not contribute to a large fraction of the total variation. Therefore, sequential indels which may be undetectable using short-read technology do not greatly influence the base substitution/indel ratio. Second, when we limit the comparisons to single base substitutions and indels which are fully assayable using short read technology in CB4858, we find that the discrepancy between these two natural isolates remains (Table 1). Based on the above observations, the CB4856 SNPs have a higher proportion of indels than the CB4858 SNPs. These observations are consistent with those reported by Denver et al.  in which C. elegans MA lines showed a base substitution/indel ratio of 0.7059:1 (12/17).
Either the CB4856 genome is getting smaller or the N2 genome is getting larger
Among the SNPs predicted in the comparison of CB4856 and N2, we observed a strong and significant bias for deletions (4627) over insertions (3221). This bias was reflected in single event indels as well as larger indels (see Table 1 and Additional File 1). Although the direction in which the indel occurred is unknown, this pattern differs from the CB4858 ratio (1:1), suggesting two possibilities: (1) the CB4856 genome is becoming smaller or (2) the N2 genome is getting larger because it fails to purge insertions. This result differs from the baseline mutation spectrum in C. elegans MA lines in Denver et al. . They observed a high frequency of indels where the mutations appeared to be biased toward insertions. There are two possible explanations for this observation: (1) there is a change in the mutation spectrum between N2 and CB4856 or (2) selection constraints between N2 and CB4856 have changed.
Sequential vs single SNPs
Sequential SNPs can arise from single mutational events involving multiple base pairs or as a consequence of individual events at adjacent positions. There is a larger proportion of sequential SNPs in CB4856 compared to CB4858 (Table 1). The high proportions of observed sequential base substitutions [3.6% (476/13360)] and indels [24% (3206/13360)] in CB4856 are not consistent with expectations based on a hypothesis of independent adjacent polymorphisms. This suggests a large fraction of the adjacent SNPs result from mutations involving multiple adjacent base pairs or that the adjacent SNPs are not independent. The distribution of the sizes for sequential SNPs, however, suggests that the occurrence of sequential SNPs is negatively correlated with the number of bases involved.
Due to the limitations of short read sequence alignment, the analysis between N2 and CB4858 is limited to sequential base substitutions, indels, and putative inversions with a length of 2 bp. In particular, inversions are a special subclass of sequential changes that occur through a single event. We identified all size classes of inversions in CB4856 and in CB4858 (Additional File 2). These account for only a small fraction [0.82% (110/13360) in CB4856 and 0.18% (76/42269) in CB4858] of all SNPs and do not affect the overall conclusions of our study. We note that in CB4858, the inversions may be downwardly biased due to the CB4858 sequencing method.
Shifting patterns of Base substitution
Base substitutions (bs) in CB4856 and CB4858 relative to the N2 reference genome*.
Bayesian Intervals for difference in rates between different types of transitions and transversions in CB4856 and CB4858.
CB4856 Base sub
95% Bayesian Interval for difference using MCMC
no sign difference
no sign difference
CB4858 Base sub
95% Bayesian Interval for difference using MCMC
no sign difference
no sign difference
no sign difference
no sign difference
no sign difference
no sign difference
The results shown here differ from the expectation of a significant mutation bias from G:C to A:T and G:C to T:A based on mutation accumulation lines . Because the base substitution pattern is consistent with the N2/CB4858 comparison, but different in CB4856, the change is likely either a change in the mutational mechanisms or shifting selective forces. CB4856 is a prime example of base compositional differences (percentage of the CB4856 genome that consists of the various nucleotides A, T, C, G) that can occur within a species.
CB4856/CB4858 SNP rate of chromosomes
SNP rate by chromosome in CB4856 and CB4858
CB4856/CB4858 arms versus core SNP rate
Comparison of arms vs core SNP rates in CB4856 and CB4858*
SNP rate comparison
overall arms vs core across chromosomes
arms across chromosomes
core across chromosomes
arms vs core in each chromosome
There are many possible reasons why the SNP rates differ between the arms and the core in both CB4856 and CB4858. The first possibility is that there are more genes in the core. Since purifying selection removes mutations from coding regions, if the gene density is higher in the core, fewer SNPs will occur there [9, 20, 22]. The second possibility is that there may be more recombination away from the core. If recombination is mutagenic , higher rates of recombination will increase SNP density. On the other hand, recombination interrupts linkage disequilibrium reducing the efficacy of selection. As stated in Cutter et al. , chromosome arms have moderate to high rates of crossover recombination and low gene density and cores have strikingly little recombination despite comprising nearly half of each autosome. Rockman and Kruglyak  and Barnes et al.  provided evidence of these observations in their studies. Finally, there may be an inherent difference in mutation rates in the arms and the core.
As observed in this paper and by Stein et al. , there is a difference in the rate of SNPs in the arms vs the core. Observations by Denver et al  have suggested that the baseline rate of mutation in MA lines is not increased in the arms of autosomes. This suggests that selection, not mutation, plays a larger role in the biased physical distribution of polymorphisms. We then conclude that the reason the SNP rate differs in the arms and core regions is primarily due to natural selection.
Recombination and SNP rate
Relationship between SNP rate and recombination rate in CB4856 and CB4858*
The discrepancy with the X chromosome can be explained by the X chromosomes' many functional differences from the autosomes . Recombination is almost non-existent in the right terminal arm of the X chromosome . Furthermore, sex is determined by X chromosome dosage and this dosage is compensated in hermaphrodites . Observations of correlation between SNP rate and recombination on the autosomes in both CB4856 and CB4858 have been proposed to either reflect an inherent mutagenic effect of recombination or conditions where selection results in the loss (negative selection) or fixation (positive selection) of new mutations along with all linked polymorphisms. The data in this study shows us that in the autosomes, the observed levels of polymorphisms are positively correlated with levels of recombination in both isolates.
Using the chromosomal domains in Rockman and Kruglyak , we found that 1) in the arms, the recombination rate is negatively correlated with chromosome size, although the correlation was not significant (p = 0.303003); 2) the two tips of all autosomes and the X chromosome have an effectively zero recombination; 3) the C. elegans autosomes and the X chromosome have nearly constant recombination rates within each chromosomal domain (the slope of the recombination rate over the chromosomal domains is 0). Findings 2) and 3) agree with the findings in Rockman and Kruglyak  but finding 1) is different from the finding in Rockman and Kruglyak .
CB4856 Hotspot analysis
The level of SNP variation across the genome in any comparison reflects two possible mechanisms. A high frequency of local polymorphisms can be a consequence of high localized mutation rate or a mutational hotspot. Alternatively, localized patterns of divergence can reflect the age of the allele and historic patterns of recombination. As a consequence of recombination and interbreeding of the ancestors that gave rise to these isolates, each allele can have a unique history of coalescence.
To further investigate the pattern of variation across the C. elegans genome and specifically identify regions of significantly high localized polymorphisms, we identified clusters of SNPs with a significant number of polymorphisms (Additional File 3). In general, these hotspots are DNA segments that have an unusual number of polymorphisms considering the length of the DNA (in bp). A more detailed description of the hotspots is given in the Methods section. These hotspots of polymorphisms may be mutational hotspots or may indicate old alleles. We excluded the CB4858 datasets due to the inherent inability of short reads to represent loci with highly clustered polymorphisms.
Summary Statistics of CB4856 Statistical Hotspots per Chromosome
Hotspots Number of hotspots
Average Number of bases
Average Number of Events
Proportion of Base Substitutions
Proportion of Nonsequential Indels
Proportion of Sequential Indels
Proportion of Inversions
Distribution of CB4856 Hotspots
Length of hotspot (bp)
Number of Events
Proportion of Base substitutions
Proportion of non sequential indels
Proportion of sequential indels
Proportion of Inversions
Total Number of Small and Large Hotspots in CB4856 per Chromosome
Small (<100 bp)
Large (> 100 bp)
In some taxa, it has been shown that the rate of mutation at specific loci is strongly correlated with GC content [26–31]. To test this possibility, we assayed the GC content in all clusters and compared the %GC content in the clusters to the N2 genome as a whole (%GC).
There is sufficient evidence to indicate the %GC content in the statistical hotspots differs from the %GC content in the N2 genome as a whole (p-value <2.2e-16). The %GC content in different statistical hotspots is displayed in Additional File 4. Some of the hotspots have a higher %GC content than N2 and some of them have a lower %GC content than N2.
We also compared these hotspots of polymorphism to mutations identified in a genome wide analysis of N2 mutation accumulation lines . Denver et al  found a random distribution of mutations in their MA lines. The 391 mutations identified among ten N2 derived MA lines did not fall within any of the hotspots identified here. Finally, in a comparison of CB4856 and CB4858, there were 37 SNPs that were shared between CB4856 and CB4858 (Additional File 5), which is larger than the expected number of SNPs shared between CB4856 and CB4858 (approximately 7).
We hypothesize that the hotspots of polymorphisms in this study are alleles that are shared between CB4856 and CB4858 but not by N2 and are not mutational hotspots because (1) the %GC content in the hotspots differs from the %GC content in N2 and (2) because the SNPs found in CB4856 vs N2 were not in the MA lines.
Distribution of SNPs by function
Frequencies of SNPs by function and chromosome.
Frequencies SNPs by function in CB4856 and CB4858 across the genome.
Mean k-values and p-values testing if ka/ks = 1 in each chromosome*
p-value for ka/ks = 1
p-value for ka/ks = 1
In CB4856, the autosomes have ka/ks values that are not significantly different from X in either direction (p-value = 0.3924). In addition, when considering all of the chromosomes, the ka/ks values are not significantly different in the different chromosomes in either direction (p-value = 0.2343). However, the SNP rate is lower in chromosome X (0.002406) compared to the autosomes (0.00289) in CB4856 (p < 2.2e-16).
Differences in the length or the number of cell divisions in the male germline compared to the hermaphrodites in CB4856 may explain the lower SNP rate in the X chromosome . When the X chromosome is passed through the male germline, it is in a hemizygotic state and exposed to increased purifying selection . In addition, the mitotic cell cycle length has been shown to be shorter in males than hermaphrodite germ cells . Taken together, both of these factors can contribute to a lower SNP rate in the X chromosome than in the autosomes in CB4856. These observations are dependent upon how often males are involved in reproduction from generation to generation. While direct observations of C. elegans populations show a low frequency of males and low levels of heterozygosity, it is possible that some lineages of Caenorhabditis have historically undergone much more outbreeding involving males .
CB4858 chromosomes whose ka/ks values are significantly different and their p-values
II vs IV
II vs V
III vs IV
III vs V
IV vs X
Because the ka/ks ratio in the X chromosome is higher than the autosomes on average in CB4858, purifying selection appears to be reduced on the X chromosome in this strain compared to CB4856. The relative difference in ka/ks ratios in CB4858 could reflect differences in the life history of CB4858. We suggest that males (XO) could contribute to a smaller degree in natural populations of CB4858 than in CB4856 populations, resulting in a lower rate of protein sequence evolution for X-linked mutations in the Hawaiian isolate CB4856. Previous studies have presented varying levels of the genetic contribution of males in different natural isolates of Caenorhabditis elegans [5, 6, 38, 39].
This study provides a large scale genomic portrait of the C. elegans polymorphisms in CB4856 and CB4858. The patterns observed here give us insight into the evolutionary spectrum of the Caenorhabditis lineage. There is a clear bias of deletions over insertions in CB4856, suggesting two possibilities. The first one is that the CB4856 genome is becoming smaller. Other studies have argued that possible reasons that genomes could become smaller in general are gene loss, natural selection, or the occurrence of long terminal repeat (LTR) sequences of transposable elements . The second possibility is that the N2 genome is getting larger because it fails to purge insertions. Interestingly, direct sequencing of C. elegans insertion deletion rates in mutation accumulation lines  observe a bias towards insertions. Taken together, we conclude that CB4856 is evolving under different mutation or selection pressures than N2 or CB4858.
Our results show that in both CB4856 and CB4858, SNP rates are higher in the chromosome arms than in the chromosome cores in agreement with previous studies . Direct genome sequencing of C. elegans mutation accumulation lines  show that the mutation rate is not higher in arm regions compared to the core regions. We hypothesize SNP rates differ in the core regions of N2, CB4856, and CB4858 in our study due to higher levels of purifying selection in the core regions.
We found several statistically significant hotspots of polymorphisms in our CB4856-N2 comparison. Some of the CB4856 hotspots contained significantly higher levels of G/C when compared to the genome average, consistent with observations of higher rates of deamination at methylated cytosines [26–28, 30, 31]. None of these hotspots coincided with C. elegans mutations identified in Denver et al. , and are likely to be older alleles shared between CB4856 and CB4858 but not N2.
The ka/ks ratios in both CB4856 and CB4858 are consistent with the effect of purifying selection on SNPs in most protein coding regions. The differences in ka/ks in the autosomes vs the X chromosome in CB4858 suggest selection may be more efficient for X-linked genes. In addition, these differences may be attributed to male-driven evolution of the X chromosomes. We hypothesize that natural populations of CB4858 strains have fewer males than CB4856 populations, resulting in a lower rate of protein sequence evolution for X-linked mutations. The finding in CB4858 of a significant difference in selection on the X chromosome compared to the autosomes is consistent with a biological role for males in the lineage giving rise to CB4858, consistent with previous published results [5, 6, 38, 39].
Pipeline for analysis of CB4856 data
The reads for CB4856 were trimmed for quality using the default parameters of LUCY , and individually assembled at 90% identity using the AMOS  reference assembler against the WS170 C. elegans assembly. To reduce the amount of false calls based upon paralogy, each of the polymorphic sites and 70 flanking base pairs was extracted from the reference. These sequences were BLASTed reciprocally at E-5 (~85%) against the genome, and any position that matched multiple sites in the genome was removed from the analysis. The CB4858 base calls followed an identical procedure to remove false polymorphic calls based upon paralogy.
Due to the low coverage of CB4856 reads, it was necessary to set a stringent cutoff of 40 sanger quality score (.0001 base call error rate). CB4856 deletions were scored based on the lowest neighboring quality score. The coding context of both CB4856 and CB4858 polymorphisms were determined using the WS170 General Feature Format (gff).
Discovery of the statistical hotspots of polymorphisms in CB4856
We did a three step procedure to finalize our statistical hotspots. First, we found initial clusters of SNPs. Then, we decided on an optimal window size to consider for our preliminary hotspots. Finally, we used Gibbs sampling to determine the final statistically significant hotspots.
Initial Clusters of SNPs
We initially considered windows of SNPs ranging from 50 bp to 600 bp in increments of 50 bps. For each window, we calculated the mean and standard deviation of the number of SNPs. We then calculated the mean + 3 standard deviations. If there were SNPs outside the window whose location was within 3 standard deviations of the mean, then those SNPs belonged to this cluster. Otherwise, the first SNP which was outside the mean + 3 standard deviations was the beginning of the next cluster. We did this by considering all of the SNPs between N2 and CB4856.
Decision on the Size of the Window of the Initial Clusters
A window size of 300 bp yielded the highest number of initial clusters. Assume the number of events in the initial cluster (X) is distributed as Poisson with mean λ = 2.305315 (the average number of events in the 300 bp clusters). To find the total number of events required to be a statistical hotspot, we find number xo such that P(X ≥ xo) = .0015. This signifies the number of events in the initial cluster (xo) is an outlier and a putative hotspot. The number .0015 corresponds to an event which is multiple standard deviations above the mean in our Poisson distribution. The number of events required to be a statistical hotspot (xo) in a given bp region is in Additional File 6.
Gibbs Sampling Methodology
Gibbs Sampling is an algorithm used to generate a sequence of samples from the joint probability distribution of multiple random variables. It is used as a means of statistical inference. In this work, we used Gibbs Sampling to find posterior confidence intervals to detect statistical hotspots and to detect differences between base substitution rates. Gilks et al.  contains a useful description of Gibbs Sampling.
Gibbs Sampling Methodology to find Final Statistically Significant Hotspots
Let X i = total number of events in any 300 bp window i, θ i = the event rate for window i, and t i = total number of bp between the first event and the last event in a 300 bp window. Then, X i ~ Poisson(θ i t i ) for i = 1,..., n, where n is the total number of 300 bp windows being considered. We assume a conjugate gamma prior distribution for the event rate θ i ~ Gamma(α, β) for i = 1, ... n. In addition, we assume the prior specification for the hyperparameters as α ~ Exponential(1.0) and β ~ Gamma(0.08,1.0). We find the posterior distribution of all θ i | x i , α, β for i = 1, ..., n to determine the statistical hotspots and construct 95% credible intervals for the rate of events in a region (λ = θi ti). If the lower bound of the 95% credible interval for θiti is above the number xo (in Additional File 6), the initial cluster in window i is a statistical hotspot. We do this simultaneously for all initial clusters.
Gibbs Sampling Methodology to find the Difference in the Direction of Base Substitution Rates in CB4856-N2 and CB4858-N2 comparisons
The model to test the difference between base substitution rates is described below and is reasonable given the large number of bases assayed in both CB4856 and CB4858. Let θj represent the true difference between base substitution rates and yj represent the sample difference between base substitution rates. For example, θ1 represents the difference between the G →A and A →G base substitution rates. Then, yj||θj~ Normal for j = 1,...,6 for the 6 possible base direction changes. The prior distribution θj|μ,τ2 ~ Normal (μ, τ2). Finally, we have an improper noninformative hyperprior distribution p(μ,τ) α 1. From this, we can obtain the posterior confidence intervals for the direction of base substitution rates given in Table 3. If the posterior confidence intervals do not contain 0, then we conclude there is a difference in the direction of base substitution rates in the isolate considered.
Polymorphic loci identified by pairwise comparisons between N2 and CB4856 were selected at random and satisfied the criteria of two or more polymorphic loci within 300-500 bp. Forward primers were designed specific to CB4856. PCR amplification of the 300-500 bp fragments containing the candidate polymorphic base(s) were directly sequenced in the forward direction using standard florescent sequencing technology on an ABI3130 Genetic Analyzer at the University of New Hampshire Hubbard Center for Genome Studies. We found 51 out of 55 polymorphisms were present in the CB4856 genome. The rate of confirmation was 93%.
ES thanks Abraham E. Tucker for his genetics insight and helpful discussions. ES also thanks Sanjuro Jogdeo for his bioinformatics insight and helpful discussions. This work was supported by NSF IGMS Grant 0527342.
- Stein LD, Bao Z, Blasiar D, Blumenthal T, Brent MR, Chen N, Chinwalla A, Clarke L, Clee C, Coghlan A, Coulson A, D'Eustachio P, Fitch DHA, Fulton LA, Fulton RE, Griffiths-Jones S, Harris TW, Hillier LW, Kamath R, Kuwabara PE, Mardis ER, Marra MA, Miner TL, Minx P, Mullikin JC, Plumb RW, Rogers J, Schein JE, Sohrmann M, Spieth J, Stajich JE, Wei C, Willey D, Wilson RK, Durbin R, Waterston RH: The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics. PLoS Biol. 2003, 1 (2): 166-192.View ArticleGoogle Scholar
- Thomas WK, Wilson AC: Mode and tempo of molecular evolution in the nematode Caenorhabditis: cytochrome oxidase I and calmodulin sequences. Genetics. 1991, 128: 269-279.PubMedPubMed CentralGoogle Scholar
- Egilmez NK, Ebert RH, Shmookler Reis RJ: Strain evolution in Caenorhabditis elegans: transposable elements as markers of interstrain evolutionary history. J Mol Evol. 1995, 40: 372-381. 10.1007/BF00164023.View ArticlePubMedGoogle Scholar
- Hodgkin J, Doniach T: Natural variation and copulatory plug formation in Caenorhabditis elegans. Genetics. 1997, 146: 149-164.PubMedPubMed CentralGoogle Scholar
- Sivasundar A, Hey J: Sampling from Natural Populations with RNAi Reveals High Outcrossing and Population Structure in Caenorhabditis elegans. Current Biology. 2005, 15 (17): 1598-1602. 10.1016/j.cub.2005.08.034.View ArticlePubMedGoogle Scholar
- Barriere A, Felix M: High Local Genetic Diversity and Low Outcrossing Rate in Caenorhabditis elegans Natural Populations. Current Biology. 2005, 15 (13): 1176-1184. 10.1016/j.cub.2005.06.022.View ArticlePubMedGoogle Scholar
- Barriere A, Felix M: Temporal Dynamics and Linkage Disequilibrium in Natural C. elegans Populations. Genetics. 2007, 176: 999-1011.View ArticlePubMedPubMed CentralGoogle Scholar
- Cutter AD, Félix MA, Barrière A, Charlesworth D: Patterns of nucleotide polymorphism distinguish temperate and tropical wild isolates of Caenorhabditis briggsae. Genetics. 2006, 173 (4): 2021-31. 10.1534/genetics.106.058651.View ArticlePubMedPubMed CentralGoogle Scholar
- Koch R, van Leuenen HG, van der Horst M, Thijssen KL, Plasterk RH: Single nucleotide polymorphisms in wild isolates of Caenorhabditis elegans. Genome Res. 2000, 10: 1690-1696. 10.1101/gr.GR-1471R.View ArticlePubMedPubMed CentralGoogle Scholar
- Wicks SR, Yeh RT, Gish WR, Waterston RH, Plasterk HA: Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nature Genetics. 2001, 28: 160-164. 10.1038/88878.View ArticlePubMedGoogle Scholar
- Hillier LW, Marth GT, Quinlan AR, Dooling D, Fewell G, Barnett D, Fox P, Glasscock JI, Hickenbotham M, Huang W, Magrini VJ, Richt RJ, Sander SN, Stewart DA, Stromberg M, Tsung EF, Wylie T, Schedl T, Wilson RK, Mardis ER: Whole-genome sequencing and variant discovery in C. elegans. Nature Methods. 2008, 5: 183-188. 10.1038/nmeth.1179.View ArticlePubMedGoogle Scholar
- Rockman MV, Kruglyak L: Recombinational Landscape and Population Genomics of Caenorhabditis elegans. PLOS Genetics. 2009, 5 (3): 1-16.View ArticleGoogle Scholar
- Denver DR, Lynch M, Thomas WK: High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome. Letters to Nature. 2004, 430: 679-682. 10.1038/nature02697.View ArticleGoogle Scholar
- Swan KA, Curtis DE, McKusick KB, Voinov AV, Mapa FA, Cancilla MR: High-Throughput Gene Mapping in Caenorhabditis elegans. Genome Research. 2002, 12: 1100-1105.PubMedPubMed CentralGoogle Scholar
- Cutter AD, Payseur BA: Selection at Linked Sites in the Partial Selfer Caenorhabditis elegans. Molecular Biological Evolution. 2003, 20 (5): 665-673. 10.1093/molbev/msg072.View ArticleGoogle Scholar
- Lynch M: The Origins of Genomic Architecture. 2007, Sunderland MA: Sinauer Associates IncGoogle Scholar
- Weber KP, De S, Kozarewa I, Turner DJ, Babu MM, deBono M: Whole Genome Sequencing Highlights Genetic Changes Associated with Laboratory Domestication of C. elegans. PLOS One. 2010, 5 (11): 1-10.View ArticleGoogle Scholar
- Graustein A, Gaspar JM, Walters JR, Palopoli MF: Levels of DNA Polymorphism Vary With Mating System in the Nematode Genus Caenorhabditis. Genetics. 2002, 161: 99-107.PubMedPubMed CentralGoogle Scholar
- Jovelin R, Sung FS, Phillips PC: High Nucleotide Divergence in Developmental Regulatory Genes Contrasts With the Structural Elements of Olfactory Pathways in Caenorhabditis. Genetics. 181: 1387-1397.
- Barnes TM, Kohara Y, Coulson A, Hekimi S: Meitic recombination, noncoding DNA and genomic organization in Caenorhabditis elegans. Genetics. 1995, 141 (1): 158-79.Google Scholar
- Denver DR, Dolan PC, Wilhelm LJ, Sung W, Lucas-Lledo JI, Hower DK, Lewis SC, Okamoto K, Thomas WK, Lynch M, Baer CF: A genome-wide view of Caenorhabditis elegans base-substitution mutation processes. PNAS. 2009, 106 (38): 16310-163134. 10.1073/pnas.0904895106.View ArticlePubMedPubMed CentralGoogle Scholar
- Denver DR, Morris K, Thomas WK: Phylogenetics in Caenorhabditis elegans: An analysis of divergence and outcrossing. Mol Biol Evol. 2003, 20: 393-400. 10.1093/molbev/msg044.View ArticlePubMedGoogle Scholar
- Cutter AD, Dey A, Murray RL: Evolution of the Caenorhabditis elegans Genome. Molecular Biology and Evolution. 2009, 26 (6): 1199-1234. 10.1093/molbev/msp048.View ArticlePubMedGoogle Scholar
- Kelly WG, Schaner CE, Dernburg AF, Lee M, Kim SK, Velleneuve AM, Reinke V: X-chromosome silencing in the germline of C. elegans. Development. 2002, 129: 479-492.PubMedPubMed CentralGoogle Scholar
- Hodgkin J: X chromosome dosage and gene expression in Caenorhabditis elegans: Two unusual dumpy genes. Mol Gen Genet. 1983, 192: 452-458. 10.1007/BF00392190.View ArticleGoogle Scholar
- Li WH: A model for the correlation of mutation rate with GC content and the origin of GC-rich isochors. Journal of Molecular Evolution. 1994, 38 (5): 468-475. 10.1007/BF00178846.View ArticlePubMedGoogle Scholar
- Fullerton SM, Carvalho AB, Clark AG: Local Rates of Recombination are Positively Correlated with GC Content in the Human Genome. Molecular Biol Evol. 2001, 18 (6): 1139-1142.View ArticleGoogle Scholar
- Filipski J: Correlation between molecular clock ticking, codon usage fidelity of DNA repair, chromosome banding and chromatin compactness in germline cells. FEBS Lett. 1987, 217: 184-186. 10.1016/0014-5793(87)80660-9.View ArticlePubMedGoogle Scholar
- Sueoka N: Directional mutation pressure and neutral molecular evolution. Proc Natl Acad Sci. 1988, 85: 2653-2657. 10.1073/pnas.85.8.2653.View ArticlePubMedPubMed CentralGoogle Scholar
- Wolfe KH, Sharp PM, Li WH: Mutation rates differ among regions of the mammalian genome. Nature. 1989, 337: 283-285. 10.1038/337283a0.View ArticlePubMedGoogle Scholar
- Francino MP, Ochman H: A comparative genomics approach to DNA asymmetries. Ann NY Acad Sci. 1999, 870: 428-431. 10.1111/j.1749-6632.1999.tb08919.x.View ArticlePubMedGoogle Scholar
- Kosakovsky Pond SL, Frost SDW: Not So Different After All: A Comparison of Methods for Detecting Amino Acid Sites Under Selection. Molecular Biology and Evolution. 2005, 22 (5): 1208-1222. 10.1093/molbev/msi105.View ArticlePubMedGoogle Scholar
- Zhang Z, Li J, Zhao X, Wang J, Wong G, Yu J: KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging. Genomics, Proteomics & Bioinformatics. 2006, 4 (4): 259-263. 10.1016/S1672-0229(07)60007-2.View ArticleGoogle Scholar
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series (Methodological). 1995, 57 (1): 289-300.Google Scholar
- Vicoso B, Charlesworth B: Evolution on the X chromosome: unusual patterns and processes. Nat Rev Genet. 2006, 7: 645-653. 10.1038/nrg1914.View ArticlePubMedGoogle Scholar
- Morgan D, Crittenden SL, Kimble J: The C. elegans adult male germline: Stem cells and sexual dimorphism. Developmental Biology. 2010, 346: 204-214. 10.1016/j.ydbio.2010.07.022.View ArticlePubMedPubMed CentralGoogle Scholar
- Felix M, Braendle C: The natural history of Caenorhabditis elegans. Current Biology. 2010, 20: R965-R969. 10.1016/j.cub.2010.09.050.View ArticlePubMedGoogle Scholar
- Mayden JS, Lorch A, Edgley ML: Copy number variation in the genomes of twelve natural isolates of Caenorhabditis elegans. BMC Genomics. 2010, 11: 12-10.1186/1471-2164-11-12.View ArticleGoogle Scholar
- Haber M, Schungel M, Putz A, Muller S, Hasert B, Schulenburg H: Evolutionary History of Caenorhabditis elegans Inferred from Microsatellites: Evidence for Spatial and Temporal Genetic Differentiation and the Occurrence of Outbreeding. Molecular Biology and Evolution. 2005, 22 (1): 160-173.View ArticlePubMedGoogle Scholar
- Ridley M: Evolution. 2003, Malden MA: Wiley-BlackwellGoogle Scholar
- Chou H, Holmes MH: DNA sequence quality trimming and vector removal. Bioinformatics. 2001, 17 (12): 1093-1104. 10.1093/bioinformatics/17.12.1093.View ArticlePubMedGoogle Scholar
- Pop M, Philippy A, Delcher A, Salzberg SL: Comparative Genome assembly. Briefings in Bioinformatics. 2004, 5 (3): 237-248. 10.1093/bib/5.3.237.View ArticlePubMedGoogle Scholar
- Gilks WR, Richardson S, Spiegelhalter D: Markov Chain Monte Carlo in Practice. 1995, Boca Raton, FL: Chapman and Hall/CRCGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.