### Standard null model

To test the hypothesis that Muller's ratchet does not threaten the Amazon molly with extinction during the known time of its existence, we used the null model for quantifying the threat of extinction from Muller's ratchet, as described elsewhere in detail [7]. This model is a simple extension of the standard model of Haigh [104] combined with mutational meltdowns [105]. In short, we combined a multiplicative fitness model with an upper limit for *R*
_{
max
}, the maximal effective number of offspring that can be produced by an individual. This allows computing *C*
_{
mm
}, the number of clicks of Muller's ratchet that are needed to start mutational meltdown *C*
_{
mm
}= log(1/*R*
_{
max
})/log(1 - *s*), (1)

where each click is the stochastic extinction of the genotype carrying the fewest deleterious mutations that have the constant positive selection coefficient *s* [see [7, 105]]. While this extinction does not require the fixation of a new mutation in the population, such a fixation typically follows shortly after the loss of the best class [106, 107], unless there are special circumstances. Combining the effective population size *N*
_{
e
}with *U*
_{
sdm
}, the genomic rate of the origin of s lightly d eleterious m utations with effect *s*, allows us to compute *T*
_{
cl
}, the average effective time between two clicks of Muller's ratchet in generations that are assumed to be discrete. Since computation of this click time is rather difficult, we combine two of the best analytical approximations [9, 27, 28] with extensive individual-based computer simulations that were distributed over the Internet using evolution@home, the first global computing system for evolutionary biology [7, 29, 30, 31]. We do not quantify mutational meltdown, because this demographic process at the end of genomic decay is so fast that it can be neglected. Therefore, we approximate *T*
_{
ex
}, the time to extinction by *T*
_{
ex
}= *C*
_{
mm
}* *T*
_{
cl
}* *T*
_{
gen
} (2)

where *T*
_{
gen
}is the time between generations that are assumed to be discrete [7]. Extinction times are computed for a large number of parameter combinations that span the whole realistic range of parameters, to assess how many parameter combinations could lead to an extinction by *T*
_{
age
}, the known age of existence of the asexual lines of descent. Our simplifying assumption here is that no significant mutation accumulation had occurred in the sexual ancestors, since regular recombination would have facilitated the selective removal of all slightly deleterious mutations of critical effects (i.e. effects that could endanger the long-term survival of the Amazon molly in the presence of Muller's ratchet). To account for the distribution of mutational effects we scale the total genomic mutation rate appropriately to obtain *U*
_{
sdm
}[7] and we estimate *N*
_{
e
}from diversity data [108]. This approach is justified in the corresponding sections below.

### Mutation rate estimates

Before we can estimate *U*
_{
sdm
}, we need to estimate *U*
_{
tot
}, the mutation rate at all potentially deleterious sites in a haploid genome per generation. To do this we focus on synonymous point mutations for estimating the rate per base pair and later extrapolate to potentially harmful sites that change amino acids or affect regulatory functions. We focus on synonymous point mutations for several reasons.

- (i)
Synonymous mutations are expected to be mostly effectively neutral (or only under weak selection) and therefore their rate of fixation can be predicted from the neutral theory [109] (the rate is slightly lower if there is very weak selection). Thus synonymous substitution rates are probably very close to the true mutation rates of neighbouring non-synonymous sites that accumulate the actual mutational damage.

- (ii)
We cannot apply observations from microsatellites, since their mutational model is very different from that of normal point mutations and it is not clear, how an estimate of one rate could be converted to a direct estimate of the other rate.

- (iii)
One might want to ignore chromosome rearrangements and frame-shift mutations like indels or TE insertions, if one wants to obtain a conservative estimate of extinction time. Many of these mutations are so strongly selected against that they have no chance of accumulating in a large population (their selection coefficients are firmly behind the 'wall of background selection', see Figure 1 and [7]). Thus, such drastic mutations are probably only of importance in the context of corrections for polyploidy that were discussed above and if mutation rates are already very high.

In addition to mutation rates in the nuclear genome, one might want to consider mutation rates in the mitochondrial genome, since mitochondria are probably as essential to fish as they are to humans [7].

### Distribution of mutational effects

To quantify Muller's ratchet in the presence of a distribution of mutational effects on fitness we partition this distribution in three as suggested elsewhere [7, 108]:

- (i)
Very deleterious mutations are implicitly dealt with by our method of determining *N*
_{
e
}, see below.

- (ii)
Slightly deleterious mutations with effects in the critical range or close to the critical range are the main focus of our attention here, as these contribute most to extinction, see below.

- (iii)
Effectively neutral mutations that accumulate like neutral mutations are ignored, as their effects are too small for impacting fitness.

In order to quantify Muller's ratchet we need a good estimate of *U*
_{
sdm
}, the slightly deleterious mutation rate, which is given by *U*
_{
sdm
}= *U*
_{
tot
}* *f*
_{
sdm
}, (8)

where *U*
_{
tot
}is the genomic mutation rate at sites that are in a functional category with potentially deleterious effects like non-synonymous mutations, and *f*
_{
sdm
}is the fraction of sites with slightly deleterious, critical selection coefficients *s*
_{
c
}among all potentially deleterious mutations (for more details on this approach, see [7]). The corresponding range of critical selection coefficients approximates all *s* that lead to extinction within a minimal time or within a given time *T*
_{
age
}; this can be determined from Figure 1. To compute *f*
_{
sdm
}we need to combine the range of all *s*
_{
c
}with estimates of the distribution of mutational effects on fitness, which traditionally have been difficult to obtain. However recent progress has been substantial and we now know that the distribution of non-synonymous mutational effects is very leptokurtic and spans many orders of magnitude in many different species [46, 47, 110, 111, 112].

### Effective population size

There are three population sizes that are potentially relevant here:

- (i)
total census population size *N*
_{
t
},

- (ii)
effective population size in the absence of any deleterious mutations *N*
_{e 0},

- (iii)
effective population size in the presence of background selection *N*
_{
eb
}.

Here we argue that *N*
_{
e
}= *N*
_{
eb
}for the purposes of quantifying the rate of Muller's ratchet; this is how we use *N*
_{
e
}outside of this section. In order to see this, we need to consider how to approximate the rate of Muller's ratchet in the presence of a wide distribution of mutational effects. New work by Söderberg & Berg [108] has shown that the rate of the ratchet in the presence of such a distribution can be approximated reasonably by dividing mutational effects into different categories (see above). Here we consider the category that contains mutations with very strong effects (see background selection theory [113, 114]). Simulations [108] show that the effect of these mutations on the rate of the ratchet is well approximated by using a scaled effective population size *N*
_{e 0}that is appropriately reduced by the factor *N*
_{
eb
}/*N*
_{e 0}so that it only contains individuals that are free from strongly deleterious mutations (see equation 7 in [108]). This is consistent with classical Hill-Robertson effect theory, which states that a locus that is linked to another locus under selection will experience a reduction in effective population size [109].

In the absence of deleterious mutations, estimates of *N*
_{
e
}= *N*
_{e 0}can be made from diversity data and mutation rates, but we cannot estimate *N*
_{e 0}in our system, since deleterious mutations are present. Surprisingly, we do not need such an estimate, since background selection theory predicts that any estimate of *N*
_{
e
}based on diversity and mutation rates will in reality be an estimate of *N*
_{
eb
}, the effective size of the class that is free from strongly deleterious mutatons [113, 114]. This means that two difficulties in our analysis cancel each other out: we do not need to estimate *N*
_{e 0}and we do not need to simulate background selection for a population of size *N*
_{e 0}to quantify the rate of the ratchet for mutations with critical effects. We only need to use standard approaches for measuring *N*
_{
e
}from DNA sequence diversity; this will approximately give us *N*
_{
eb
}for our simulations, which then have to ignore all mutations with effects in the background selection range [108]. Thus our analysis uses information from two categories of sites:

- (i)
Neutral sites are important for estimating *N*
_{
e
}= *N*
_{
eb
}under background selection, but not for the ratchet itself (ignore when computing *U*
_{
sdm
}).

- (ii)
Selected sites contribute towards operating the ratchet (use for computing *U*
_{
sdm
}), but cannot be used to estimate *N*
_{
e
}.

Since mtDNA and all nuclear chromosomes are completely linked with the same selective unit in the absence of outcrossing, *N*
_{
e
}is the same for both systems like in pure selfers [115, 116].

### Simplifications overview

Here we approximate the effects of Muller's ratchet in a real diploid genome by proposing to choose (i) effective deleterious mutation rates that are adjusted for the corresponding ploidy level and exclude mutations with effects that are either too large to accumulate or too small to cause effective harm and (ii) effective selection coefficients that are adjusted for arbitrary levels of ploidy, dominance and mitotic recombination. The corresponding adjustments are compiled in Table 1 and explained below. Given the large degree of uncertainty about mutation rates and selection coefficients in our study organism, the errors from these approximations seem insignificant. Also, it is not possible to convincingly decide between the "Core-Genome-Model" and the alternative "Equal-Contribution-Model", although the latter seems to be much more realistic (both models are explained below). Therefore all possible options are explored by including a wide range of possible mutation rates and effects.

### Extensions for diploids and polyploids

To apply the model above to non-haploid organisms requires some adjustments to account for the fact that duplicate copies of genes can buffer deleterious effects. To avoid the complexity associated with recent models of the evolution of gene duplicates [117–121], we propose two simple models that allow the reduction of polyploids to an effectively haploid genome model: (i) The *Core-Genome-Model* assumes that one functional copy of each essential gene is sufficient and all other alleles can be discarded without problems, so that fitness degradation from Muller's ratchet can only start when the 'backup alleles' have been deactivated. (ii) The *Equal-Contribution-Model* assumes that each allelic copy contributes equally to fitness, so that selection coefficients are reduced and mutation rates are increased proportionally to the ploidy level. Both models are discussed below for diploid and triploid Amazon molly along with the impact of a variable dominance coefficient *h* on quantifications of Muller's ratchet.

### Core-Genome-Model

One might assume that only one copy of each important gene is really needed and that all additional copies from higher ploidy levels of the genome could be regarded as 'neutral backups' that can be deleted without any negative effect, as long as the last functional copy is still working. An extreme interpretation of this assumption implies that all deleterious mutations are completely recessive (dominance coefficient *h* = 0), which is rather unrealistic in many settings [122–130]. However, in less extreme cases, recessivity may be strong enough to reduce the effects of a complete gene knockout to effective selective neutrality (*N*
_{
e
}
*sh* < 1). Under the Core-Genome-Model otherwise strongly deleterious mutations like indels or transposable element insertions can disrupt gene function in all additional copies and thus accumulate effectively like neutral mutations. Eventually the core genome will be distributed across all the different haploid copies of the original genome, as it is really only a combination of all the least damaged parts of the genome. Since one frame-shifting mutation is enough to transform a copy of gene *i* into a pseudogene, we can approximate its time to inactivation, *T*
_{
ko
}, by *T*
_{
ko
}≈ 1/(*l*
_{
i
}* μ_{
bp
}* *f*
_{
ko
}) (3)

where *f*
_{
ko
}is the factor that specifies the frequency of frame-shift or other knock-out mutations relative to μ_{
bp
}, the synonymous point mutation rate/basepair/generation, and *l*
_{
i
}is the total length of gene *i*. Lets assume a typical gene has 2,000 base pairs and frame-shifts occur at a rate of μ_{
bp
}* *f*
_{
ko
}≈ 1*10^{-9} (probably a lower limit if compared to ≈ 1*10^{-8} mutations/bp/generation in mice and *f*
_{
ko
}≈ 1/3 as observed in bacteria [see the 'C' parameter in [131, 132]]). In this case one may have to wait on average for about 500,000 generations, before inactivation of such a gene can be expected. Since these events most probably follow a Poisson distribution, it can be expected that a substantial fraction of essential genes would be inactivated very quickly, leaving only one copy that is then maintained by selection. It is these last copies that are then slowly degraded by Muller's ratchet, since they can most probably mutate to such slightly deleterious states that allow the operation of the ratchet. Early inactivation of enough 'backup copies' will give Muller's ratchet extensive periods of time for degrading the core genome (late inactivation may allow for slightly deleterious point mutations in the non-deactivated allele). For example, the rates above suggest that frame-shift accumulation in 20,000 diploid genes would inactivate on average about 1,000 genes (sd = 32) over 25,000 generations. A precise calculation of extinction time in this model would require the integration of the effects of increasing mutation accumulation over time, where the increase is caused by rising deleterious mutation rates due to the progressive inactivation of genes. To avoid these complex calculations, the following two extreme simplifications are proposed.

A lower limit of *T*
_{
ex
}can be obtained by ignoring all additional copies of the essential core genes. In this case the resulting *U*
_{
sdm
}to use for the computation of click time is that of the haploid core copy of the genome. An upper limit can be obtained by partitioning time into (i) the accumulation of knockouts while ignoring Muller's ratchet and (ii) the operation of Muller's ratchet while neglecting any further increase of mutation rates due to additional knockouts. Testing several possible ratios of these two times allows minimization of extinction time between the extreme of no time for frameshifts (no extinction due to Muller's ratchet, since mutation rates are too low) and all time for frame-shifts (no extinction, because Muller's ratchet will always need some time to cause an extinction, even with high mutation rates).

In the case of triploids or higher ploidy levels, the Core-Genome-Model needs correspondingly longer for genomic decay, but in no case can additional copies stop genomic decay. Higher ploidy levels are not discussed here, since we assume that the first Amazon molly was diploid. Triploids lose their advantage in the Core-Genome-Model, if they occasionally lose one of their three sets of chromosomes, as they might do [49]. The fact that this is possible indicates that the lost set of chromosomes does not carry any last functional copy of a vital gene. For more details on triploids see the Equal-Contribution-Model.

### Equal-Contribution-Model

#### Diploids

Many mutations of small effects do not seem to be completely recessive, but rather close to codominance, especially if their effects are very small [122–130]. If that is approximately true for most genes, then both copies in a diploid genome are actually needed to produce the required dose of proteins to maintain fitness. In this case, all copies of the genome of the Amazon molly will contribute approximately the same amount of functionality, implying that the size of the mutational target increases two-fold, while effective selection coefficients decrease two-fold, relative to the haploid case (there are twice as many sites to hit and selection only operates against heterozygotes with strength *sh*, where *h* = 0.5).

#### Triploids

It is not clear whether such reasoning can be extended to triploids, as ancestral genomes may have been fine-tuned for diploidy. The fact that triploids can have significant disadvantages if compared directly to diploids [133, 134] cautions against a positive functional role of the additional copy of the genome. On the other hand, one could argue that the surprising flexibility of fish in tolerating various ploidy levels [133] allows for an easy incorporation of an additional dosage of proteins provided by a third set of chromosomes. This is not contradicted by the fact that Amazon molly triploids can be observed in the wild [23], as strong purifying selection would predict vanishing frequencies, since the rates of origin of triploids are rather low in this fish [23]. Existing data on expression patterns of muscle proteins in Amazon molly triploids are inconclusive [50]. These complications suggest that triploids may face purifying selection and therefore they are not likely to reach 100% in a population. Depending on the specific mechanism of how triploids incur a possible disadvantage, the inactivation of one copy might occasionally even be beneficial for triploids. In addition, triploids may revert back to diploidy by occasionally losing one of their three sets of chromosomes [49]. The fraction of triploids in the wild seems to fluctuate substantially, but no population with 100% triploids has been observed. Frequencies of triploids for two different samples were in the range of 4%–15% (Lamatsch et al., unpublished) and 3% – 46% [135], indicating that selection against triploids is probably weak. Therefore, equal-contribution-corrections to haploid estimates of *U*
_{
sdm
}(and *s* if available) have to be treated with caution in the case of triploids, but appear to be reasonable approximations for diploids.

### Varying dominance

The two models above assume either complete recessivity or codominance. Observed dominance levels in other species are frequently somewhere in-between, where mutations with smaller effects tend to be closer to codominance [122–130]. While we know very little about the precise values of the dominance coefficient, *h*, and the selection coefficient, *s*, in the Amazon molly, it appears to be highly probable that general findings from other species apply here too. Of special interest here is the fact that the distribution of mutational effects on fitness appears to be very wide on a log scale [46, 47]. Thus we can propose a simplification that allows for arbitrary dominance coefficients between *h* = 0 and *h* = 1 by making slight adjustments to the effective *s* used in simulations. The simplification is based on the observation that

- (i)
deleterious mutations with smaller effects have a higher probability of accumulating in the population than those with larger effects and

- (ii)
mutations with large enough effects do not accumulate (see the 'wall of background selection' in Figure 1 and in [7]).

To simplify the treatment we speak of the ratchet as if it fixes mutations, although strictly speaking the ratchet will only cause the mutation-free class to go extinct and fixing happens shortly after that by genetic drift (see [106, 107]). The following qualitative analysis is supported by published simulation results [106].

We can treat *dominant* mutations as follows. If and only if the ratchet can fix the first mutation at a site (i.e. individuals that are homozygous for that size go extinct and all individuals in the population become heterozygous) then it will eventually also fix the second mutation (i.e. all individuals become homozygous for the deleterious allele). If a heterozygote cannot be fixed, there is no way the homozygote state can be fixed and the effective mutation rate used to compute the speed of the ratchet should be reduced accordingly to exclude such sites.

In the case of *recessive* deleterious mutations, there are the two following extremes. If *s* is small (most mutations), then click times are hardly affected and almost no correction is necessary. If *s* is large enough to prevent fixation of heterozygotes, then again the effective mutation rate may be reduced, as fixation of the homozygous state is impossible. Only few intermediate *s* can become heterozygous but not homozygous, as they have to be close to the switch-like transition between 'can accumulate' and 'cannot accumulate' (see the 'wall of background selection' in Figure 1 and [7]) and their homozygous effects have to be too deleterious to accumulate. This simplification relies on the almost switch-like transition between mutation accumulation by Muller's ratchet and complete mutation removal by background selection.

We ignore *overdominant* sites, assuming that these stay in their optimal heterozygous state. This will lead to a conservative estimate of extinction time, as these sites do not contribute to fitness decay under this assumption. Future models will have to investigate to what extent such an approximation is justified in the presence of a small fraction of overdominant sites [129, 130].

### Extensions for mitotic recombination

Non-meiotic recombination can play a substantial role in reducing diversity in asexual lineages [40]. To assess the extent to which the rate of Muller's ratchet could be reduced by this process [43, 44, 136], the following approach was used. We assume that the maximal possible rate of mitotic recombination is equivalent to selfing with free recombination between loci. The true reduction of the rate of Muller's ratchet is probably much smaller, since mitotic recombination is certainly not as effective as free recombination in a selfing population. Thus the strongest possible reduction of the rate of Muller's ratchet from mitotic recombination can be estimated from the analytical work of Heller and Maynard Smith [45]. As explained below, a simple scaling of *U* and *s* is enough to reduce predictions to the haploid case.

### Muller's ratchet with selfing

Heller and Maynard Smith [45] derived an equation that describes the distribution of deleterious mutations in a selfing population under Muller's ratchet. We assume here that the expectation of the size of the 'best class' in mutation-selection balance mainly determines the rate of Muller's ratchet. This appears to be supported by some simulation results [43, 44] although Gordo & Charlesworth [137] found that the ratchet could click at different rates for the same size of the best class if other parameters were different. Our assumption allows us to use a simple scaling of *U* and *s* to reduce predictions for diploid selfers to the haploid case.

In detail, Heller and Maynard Smith [

45] show that in a selfing population

*x*
_{
ij
}, the fraction of individuals carrying

*i* homozygous and

*j* heterozygous mutations is given by

where

*U*
_{sdm, d}is the slightly deleterious mutation rate/diploid genome/generation that changes wildtype homozygotes into heterozygotes,

*h* is the dominance coefficient and

*s* is the homozygous selection coefficient (positive for harmful mutations). This equation assumes that (i) recombination is free between sites, so that the ratchet only operates at homozygous sites, since selfing can restore the best class on a heterozygous site, (ii)

*U*
_{sdm, d}is constant and independent from the number of mutations per individual, (iii) mutations at heterozygous sites are negligible, since selfing produces many more homozygotes, (iv) the standard model of Muller's ratchet without back mutations is valid [

7,

104]. Then the number of individuals in the 'best class',

*N*
_{0}, can be computed by adding up all heterozygotes that are free from homozygous deleterious mutations and then scaling by the effective population size

Using the well known result that

equations (4) – (6) yield

For complete recessivity (*h* = 0) this reduces to the haploid asexual case (
), because the diploid mutation rate *U*
_{sdm, d}equals twice the haploid rate *U*
_{
sdm
}. It is easy to see from (7) that the degree of dominance of mutations with small selection coefficients does not significantly affect numerical values and hence the rate of the ratchet. Since the ratchet does not click for large *s*, one might as well neglect dominance altogether.