Time-calibrated molecular phylogeny
We compiled a near-complete set of samples for species and candidate species of the Mantellidae. Only the described mantellid species Spinomantis brunae, S. nussbaumi, and S. tavaratra are missing ("all-taxa data set"). We are aware that new mantellid species will continue being discovered in the future . Along with other thorough molecular phylogenetic studies of species-rich tropical amphibian radiations [e.g., [26, 27] we are confident to have assembled the most complete such data set to date. To define candidate species, we followed an integrated approach that combined genetic divergence with bioacoustic and morphological characters [24, 48, 54, 55]. For most mantellid species, genetic data was available for more than one population and individual, and these as well as bioacoustic and morphological data support a status as valid species for the undescribed species included in this study .
We compiled two sets of DNA sequences: (i) a combined mitochondrial/nuclear gene data set to reconstruct the deep phylogenetic relationships among 46 species representing all major mantellid lineages, including 3760 basepairs (bp) of the mitochondrial gene fragments 12SrRNA (12S, 538 bp), 16SrRNA (16S, two fragments of 582 bp and 505bp), cytochrome b (cob, 988 bp), cytochrome oxidase subunit I (cox1, 625 bp), and of the nuclear rhodopsin exon 1 (289 bp) and RAG2 (816 bp) gene fragments; (ii) a 1172 bp mitochondrial data set of 16S, cob and cox1 sequences from all but three described mantellid species species (187 of the 190 described and valid species) plus 67 undescribed confirmed candidate species. The reduced-taxa data set was largely based on sequences used in other studies [31, 56], complemented by additional sequences for crucial species (see Additional file 1, Table S1). In the all-taxa dataset, 16S sequences were mainly taken from previous work , whereas most sequences of cob and cox1 were newly determined.
PCRs were performed according to the reaction conditions and thermocycling protocols described previously [44, 54, 56, 57]. Primers for the various genes were as specified in these previous publications (see Additional file 1, Table S2 for primer sequences). Sequencing reactions were performed with the forward primers and resolved on automated sequencers by Macrogen Inc., Korea for mitochondrial markers. Nuclear markers and ambiguous sequences of the mitochondrial markers were additionally sequenced with the reverse primers, with several sequencing reactions being repeated numerous times to obtain unambiguous results. The obtained electropherograms were manually edited and verified as mantellid DNA via BLAST searches. Alignments were generated with MEGA using the CLUSTALW algorithm  and refined manually. Gapped and hypervariable regions of the rRNA gene sequences were excluded from the analysis. Newly determined DNA sequences were submitted to Genbank (accession numbers JN132821-JN133276; for a complete list of GenBank accession numbers and a list of voucher specimens see Additional file 1, Table S1. For the reduced-taxa data set we used Heterixalus variabilis (a representative of the Hyperoliidae) as outgroup and included Polypedates spp. (Rhacophoridae, the sister group of the Mantellidae) as a hierarchical outgroup. For the all-taxa data set we defined Polypedates spp. as outgroup.
Best-fit models of evolution were constructed for the various character sets used in the partitioned analysis with MrModeltest  (Additional file 1, Table S3). To obtain the optimal partitioning strategy for the dataset, Bayesian tree searches were run for 20,000,000 generations for the reduced-taxa dataset and for 5,400,000 - 10,000,000 generations in the all-taxa dataset each with 2 runs and 4 chains (MrBayes V.3.1.2 ). Harmonic means were calculated using the sump command in MrBayes, with a conservative burn-in corresponding to the first 500,000 generations after assessing that stability of likelihood values had been reached in each case after much fewer generations. The partitioning strategies that explained the data set with the least random error were the maximum partition datasets in both analyses .
For the reduced-taxa data set, ML searches were conducted using the software RaxML V.7.0.0 (under the estimated best substitution model GammaInvar) , and the rapid bootstrapping algorithm was used with an estimated number of bootstraps . Heuristic searches under MP were conducted for the reduced-taxa dataset in PAUP* , with 2000 bootstrap replicates. Characters in the MP searches were treated as unordered with equal weight. Gaps were treated as "missing"; multistate characters were interpreted as "uncertain". Trees were computed with random stepwise addition of taxa, and branch swapping was performed with the TBR (Tree-Bisection-Reconnection) algorithm, without limitation in the number of retained trees.
We then constrained the major lineages (subfamilies and relationships among some genera and subgenera) in the all-taxa dataset according to the optimal topology found for the reduced-taxa dataset, to enable a stable run of MrBayes under optimized computation time. A final Bayesian analysis was then performed for the all-taxa dataset running 30,000,000 generations for 58 days on a computing cluster at UC Berkeley and the 50%-majority consensus tree from this run after discarding the burnin was used as the preferred estimate of mantellid relationships.
For the all-taxa data set, we subsequently computed an ultrametric tree to estimate evolutionary node ages for ARC from the preferred Bayesian tree topology. We used the software Pathd8 to estimate a time-calibrated phylogeny  which computes ultrametric trees for large data sets. The rationale for using Pathd8 instead of more commonly used software like BEAST  or MultiDivtime  was two-fold: first, both alternative software crashed on our dataset during repeated trials. Second, the Pathd8 software has been introduced as especially suited for large datasets, being only less precise compared with penalized likelihood methods, but giving more sensible answers for extreme data sets. The reason for its faster performance with large data sets is that substitution rates are being smoothed locally, rather than simultaneously over the whole tree . We used the estimated evolutionary split of the two undescribed species endemic to the Comoro island of Mayotte (Blommersia sp. 4, Boophis sp. 1) as fixed calibration points, which is estimated at 8.7 ma based on the age of the volcanic island of Mayotte ; validated in . For adjustment of the deeper branches, we furthermore applied secondary age constraints, with divergence time estimates between mantellid genera based on the confidence intervals calculated in a previous study  (Additional file 1, Table S4). These secondary constraints were obtained from a estimation of divergence times on the basis of external calibration points and in general were fully congruent with other estimates of mantellid ages [e.g., . We also estimated divergence times without these secondary constraints and obtained a largely similar time frame for mantellid diversification, leaving us confident that these secondary constraints have not introduced any major bias in our analysis. In any case, our subsequent analyses of body and range size influences on diversification depend on relative, not absolute age of nodes, and thus are independent from possible inaccuracies that may remain in our estimates of absolute ages of diversification events.
We identified 53 pairs of mantellid sister species in the all-taxa phylogeny that were supported by high Bayesian posterior probabilities (>98). We did not consider sister species that had low Bayesian support values, or unknown ranges. Due to severe uncertainties in taxonomy and thus range estimations we also excluded well-supported sister species of the subgenera Ochthomantis (Mantidactylus) and Pandanusicola (Guibemantis). Because the two Comoroan species (Boophis sp. 1 and Blommersia sp. 4) likely originated by overseas dispersal (Vences et al., 2003), they were also excluded from subsequent calculations. The ages of all nodes separating pairs of sister species were extracted from the ultrametric tree obtained with Pathd8 and were tested for normal distribution with a one-tailed Kolmogorov-Smirnov test using STATISTICA (© StatSoft, Tulsa, OK).
Geographic data and analysis
We used point locality information for 242 species and candidate species in our phylogeny obtained from an extensive GIS-referenced database  to construct distribution maps with ArcView GIS (V.3.2a, Esri ©1992-2000). For species that were only known from one or two localities, we estimated species distribution areas by assigning buffer zones around these localities. The estimation of small range sizes is crucial for our paper, so we conducted all analyses using two different estimates for these buffer zones: a rather large one of 25 km radius versus a small quasi-zero one of 17 m radius. Comparing results obtained with both buffer zone estimates ensures the robustness of our results without knowing the exact extent of range size of these species, although according to our own observations , the larger estimate is probably an overestimation of the range sizes of microendemic species. However, we emphasize that one or two-locality species do not equal limited sampling effort: In most cases, these are well-identifiable species that have not been found elsewhere despite important survey efforts in Madagascar over the past 20 years. Subclades with taxonomic or range uncertainties (e.g., in the subgenera Ochthomantis and Pandanusicola) were excluded from analysis). For species that are known from more than two localities, minimum convex polygons (MCPs) were taken as estimate for real species distribution area size. MCPs span the whole area between two occurrence records of a species, disregarding climatic and habitat differences and possible range discontinuities. However, for the understanding of past evolutionary processes, we consider these analyses as adequate because the current distribution of a species alone may be misleading. For instance, if a species has currently a very fragmented range, in the past it must have dispersed from one of its current range fragments to the other and thus occupied a much larger and more continuous range in which it may have co-occurred broadly with other species. Especially in the instance of limited extent of range size and locality records MCPs may therefore be more realistic than fine-scale mapping or modeling on the basis of habitat data as has been applied for other purposes . Furthermore, distribution area modeling is not advised for species with less than five locality records. We call the dataset containing MCPs and large buffer zones "Range Size A dataset" (RSA) and the dataset containing MCPs and small buffer zones "Range Size B dataset" (RSB). For each MCP, we furthermore determined the centroid using the xtools® extension in ArcView. Centroids for two-locality species were estimated half way between these localities. Range proximity among all mantellid species (measured as distance between polygon centroids in km) was determined by calculating a Euclidean distance matrix of polygon centroids using the ArcView "distance matrix" extension (© Jenness, J., 2005), and age-range correlation  was calculated based on range overlap in km2, transformed to overlap in percent of the smaller of each two polygons. We automatized the computation of range overlap in km2 and the resulting percentage of range overlap with a script in ArcView (© Schmalstieg, K.J., 2007). Centroid distances can be measured in species pairs characterized by fully allopatric distributions as well as in those with partly overlapping ranges. We furthermore preferred using centroid distances because these are less heavily affected by possible sampling gaps than are distances between range borders.
Under the hypothesis of peripatric speciation, range size differences are thought to be initially large, a pattern that can but does not necessarily have to dilute over time. Under initially large range size contrasts, we expect either a negative or no correlation between range size differences and evolutionary age of sister species. We tested for this expected, possibly triangular pattern of peripatric speciation in mantellids using quantile regression in R [72, 73] on sister species pairs. Quantile regression accounts for the fact that more than a single slope can describe the relationship between a response variable and a predictor and can discover predictive relationships between variables in cases where there is no or only a weak relationship between the variable means. It allows computing regressions of different sets of the data (e.g., the 90% quantile is the regression slope above 90% of the data points). Multiple slopes are used to describe the relationship between variables that would be missed by other regression models .
Due to limited number of localities and small extent of distribution area for many species, environmental niche modeling did not make sense for our dataset. Instead, 21 climatic variables for each locality per species were extracted from the WORLDCLIM climatic maps (1 km × 1 km resolution, interpolated from lower resolution) [71, 74]. To correct for co-variation among these 21 climatic variables, a Principal Component Analysis (PCA) was performed in Varimax-rotated coordinate system, yielding four factors (PCs) with Eigenvalues >1 (Additional file 1, Tables S5, S6). The highest Eigenvalues were 10.4 for PC1 and 5.9 for PC2, followed by lower Eigenvalues for PCs 3 and 4 (2.4, and 1.1, respectively). From these four factors we calculated squared Mahalanobis distances (which we chose because of multiple locality information per species) between all mantellid species and extracted the data for mantellid sister species with significant branch support from this triangular matrix. These bioclimatic distances served as a covariate of spatial distance.
Testing MEP hypotheses: body size, range size, and diversification
Maximal Snout-Vent Length (SVL) of males has been used as a proxy for body size of frog species before [75, 76]. We compiled values for maximal male SVL for 249 mantellid species and candidate species from Glaw and Vences (2007) and complemented these with own, unpublished data. For the complete list of SVL data, see Additional file 1, Table S1. We computed range size frequency distributions for all mantellids, for mantellid sister species using STATISTICA (Tulsa, OK), and counted the number of species with non-overlapping ranges (allopatric species pairs) and partially or fully overlapping ranges (sympatric species pairs).
A first set of statistical tests was carried out based on the 53 well-supported pairs of sister species as identified in our phylogenetic analysis. We used these pairs as independent data points in tip-based phylogenetic contrast method (PCM) approaches. We computed standardized tip contrasts for range size and body size between them ((RSA1 - RSA2)/SQRT(branch length 1 + branch length 2); ((RSB1 - RSB2)/SQRT(branch length 1 + branch length 2); (SVL1 - SVL2)/SQRT(branch length 1 + branch length 2)) . The branch lengths were taken from the Bayesian phylogeny before ultrametric correction.
We tested whether contrasts in SVL are also a predictor for contrasts of range size in mantellid sister species pairs using univariate regression analyses through the origin. We furthermore correlated range size contrasts with body size contrasts to infer whether small values in both are associated with each other.
To infer whether small range size and/or body size contrasts are pronounced in proximate and recently diverged lineages we performed a factorial regression analysis through the origin. This analysis estimated the effect of evolutionary age, range proximity (expressed as centroid distance), bioclimatic distance (as a covariate to spatial distance, for spatial structure of the climatic niche) and their respective interaction terms on SVL and range size (RSA and RSB) contrasts between sister species pairs. Older species pairs with larger range sizes and/or larger body size were expected to more likely have larger range size contrasts and larger body size contrasts, accounting for the possibility of (potentially asymmetric) post-speciation range shifts or range size changes.
Additional to the results obtained by the sister species tip contrasts we applied phylogenetic comparative methods (PCM) that utilize the whole tree. Effects of SVL on RSA/RSB as well as the effect of these characters on the mitochondrial substitution rate of the all-taxa phylogeny were determined using the software CoEvol . The approximated synonymous substitution rate (dS) and the continuous characters were jointly modeled as a multivariate Brownian diffusion process of unknown covariance matrix  on the concatenated dataset with fixed divergence times. The covariance matrix, phylogenetic variation of the substitution rates, and the continuous characters are then jointly estimated by a Bayesian MCMC process. Because all parameters are modeled in a single multivariate distribution process, substitution rates and morphological traits can be analyzed in a single statistical framework .
We tested the effect of body size and range size on clade diversity using a second PCM implemented in the software MacroCAIC , which is a modified version of comparative analysis by independent contrasts [22, 77]. Species richness contrasts (RRD) as implemented in MacroCAIC are positive, when clades containing species with large values of the inherited character in question are more species-rich than their sister clade, leading to a positive correlation between variable contrasts and richness contrasts . In our example we expect negative richness contrasts in clades with high SVL, leading to a negative correlation between SVL contrast and richness contrast (defining the MEP). To determine the effect of SVL changes on species richness, we performed a regression through the origin for the contrasts produced by MacroCAIC using STATISTICA .