The multispecies coalescent model [1] is preferred to the ‘super-matrix’ method for phylogenetic inference when population sizes are large relative to the ages of the species being considered, because considerable differences are expected between individual gene trees and the species tree they evolve within [2, 3]. This is understood both theoretically [2] and by simulation [3–5]. Recent developments have produced a number of methods and software packages for estimating species trees under the multispecies coalescent model [4–8]. Of these methods, the full Bayesian implementations are expected to perform best because they use all available information, and this is borne out in simulation [5, 9].

All of these implementations make the standard multispecies coalescent assumption of strict divergence. Under strict divergence, a species is a perfectly mixing Wright-Fisher population until the moment of splitting, and from that point onwards the two sub-species evolve in total isolation. Strict divergence is a simplifying assumption, one which is violated by horizontal gene transfer, reassortment, migration or any other means of gene flow. Such simplifying assumptions are common in scientific models because of incomplete understanding of the processes involved, the unavailability of analytical solutions or limited computational resources.

Here we focus on the effect of violating the central assumption of strict divergence. We model one specific type of gene flow – migration – and investigate its effects on the Bayesian inference of multispecies phylogenies. Several software packages infer species trees from multiple loci [4, 7, 10]. We explore the impact that migration has on the posterior distribution of species trees using the ⋆BEAST package [5].

Models of genetic differentiation in subdivided populations go back more than 70 years. In 1943 Wright introduced the “Island Model” in which “*the total population is assumed to be divided into subgroups, each breeding at random within itself, except for a certain proportion of migrants drawn at random from the whole*” [11]. Wright views the model as one extreme of the more general case of *Isolation by distance*, and also investigates an alternative, “local embedding in a continuous area”, where the population is distributed in a metric space and the probability of contact is inversely proportional to distance. Other intermediate models include Kimura’s “Stepping Stones” model [12, 13] and the more general “Migration Matrix” which encapsulates geographic (or other) barriers to migration [14].

A large number of coalescent simulators already exist [15–21], allowing for varying degrees of flexibility in modeling migration between related and unrelated populations. A common assumption behind the standard island models implemented in these simulators is that the rate of migration between two populations is either constant or piecewise constant. If two populations become isolated only slowly after divergence, then gene flow (migration) will decline gradually rather than drop suddenly. For this reason we extend the standard migration models to allow the migration rate to change continuously over time. This modification adds some complexity to the simulation algorithms.
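To illustrate why a continuously changing migration rate complicates simulation, the following is a minimal sketch of one standard way to draw the next migration event when the rate is a function of time: thinning of an inhomogeneous Poisson process. It is not the algorithm used by the simulator described here. The function names, the exponential form of the decline and all parameter values are hypothetical, and time is assumed to be measured backwards from the present towards the divergence time.

```python
import math
import random


def declining_rate(t, t_split, rate_at_split, decay):
    """Hypothetical migration rate at backward time t (0 = present).

    Forward in time the rate decays exponentially after the split, so
    backward in time it rises towards rate_at_split as t nears t_split.
    """
    return rate_at_split * math.exp(-decay * (t_split - t))


def next_migration_time(rate, rate_max, t_start, t_end, rng):
    """Draw the first event time in [t_start, t_end) by thinning.

    rate(t) must satisfy 0 <= rate(t) <= rate_max on the interval.
    Returns None if no migration event occurs before t_end.
    """
    t = t_start
    while True:
        # Propose a candidate from a homogeneous process with rate rate_max.
        t += rng.expovariate(rate_max)
        if t >= t_end:
            return None
        # Accept the candidate with probability rate(t) / rate_max.
        if rng.random() * rate_max < rate(t):
            return t


if __name__ == "__main__":
    rng = random.Random(42)
    t_split = 10.0
    rate = lambda t: declining_rate(t, t_split, rate_at_split=2.0, decay=0.5)
    print(next_migration_time(rate, 2.0, 0.0, t_split, rng))
```

With a constant rate the waiting time is a single exponential draw. With a time-dependent rate, each candidate must be accepted or rejected against the current rate, or the cumulative rate must be inverted analytically. In a full coalescent simulator this draw would also have to compete with coalescence events.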

Given that a gradual decline of gene flow after divergence may well be common, we consider the effect this migration has on the inference of species trees. It has been previously shown [22] that IMa [23, 24] estimates are quite robust to moderate model violation, in which there is a “realistic” level of population structure within each species. However, IMa assumes that the ranked topology of the species tree is known, which may be hard to determine in advance in the many cases where migration is present. Here we examine the effect of migration on the posterior distribution of species trees without prior constraints on topology or divergence times.

Wright [25] showed that, in island models, it takes only one migrant per generation into each population (i.e. *Nm* > 1) to prevent differentiation of two populations, given neutral markers [26]. Wright’s rule has clear implications for the inference of species trees, even with lower levels of migration. In order to test Wright’s rule for species trees, we wrote the simulator so that migration is determined in terms of the expected number of migrants, irrespective of population size. We then examine the relationship between the species tree as inferred from simulated sequence data and the true species tree.
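As a small worked illustration of Wright’s rule, the sketch below uses the classic infinite-island approximation for the equilibrium fixation index, F_ST ≈ 1/(4*Nm* + 1), which assumes diploid nuclear loci and a small per-capita migration rate. The conversion from an expected number of migrants to a per-capita rate is an assumption about how such a parameterisation could work, not a description of the simulator itself, and both function names are hypothetical.

```python
def per_capita_rate(expected_migrants, pop_size):
    """Per-capita migration rate m that yields the given expected number
    of migrants per generation (N * m) in a population of size N.
    Assumed parameterisation, for illustration only."""
    return expected_migrants / pop_size


def wright_fst(expected_migrants):
    """Equilibrium F_ST under Wright's infinite-island approximation,
    F_ST ~ 1 / (4 * N * m + 1)."""
    return 1.0 / (4.0 * expected_migrants + 1.0)


if __name__ == "__main__":
    # Same expected number of migrants, very different per-capita rates.
    for pop_size in (100, 10_000):
        print(pop_size, per_capita_rate(1.0, pop_size), wright_fst(1.0))
```

Under this approximation, one migrant per generation gives an F_ST of 0.2 whatever the population size. Specifying migration as an expected number of migrants therefore keeps the expected degree of differentiation comparable across populations of different sizes.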