The Trichoderma harzianum demon: complex speciation history resulting in coexistence of hypothetical biological species, recent agamospecies and numerous relict lineages

Background The mitosporic fungus Trichoderma harzianum (Hypocrea, Ascomycota, Hypocreales, Hypocreaceae) is an ubiquitous species in the environment with some strains commercially exploited for the biological control of plant pathogenic fungi. Although T. harzianum is asexual (or anamorphic), its sexual stage (or teleomorph) has been described as Hypocrea lixii. Since recombination would be an important issue for the efficacy of an agent of the biological control in the field, we investigated the phylogenetic structure of the species. Results Using DNA sequence data from three unlinked loci for each of 93 strains collected worldwide, we detected a complex speciation process revealing overlapping reproductively isolated biological species, recent agamospecies and numerous relict lineages with unresolved phylogenetic positions. Genealogical concordance and recombination analyses confirm the existence of two genetically isolated agamospecies including T. harzianum sensu stricto and two hypothetical holomorphic species related to but different from H. lixii. The exact phylogenetic position of the majority of strains was not resolved and therefore attributed to a diverse network of recombining strains conventionally called 'pseudoharzianum matrix'. Since H. lixii and T. harzianum are evidently genetically isolated, the anamorph - teleomorph combination comprising H. lixii/T. harzianum in one holomorph must be rejected in favor of two separate species. Conclusions Our data illustrate a complex speciation within H. lixii - T. harzianum species group, which is based on coexistence and interaction of organisms with different evolutionary histories and on the absence of strict genetic borders between them.


Background
The unique nature of fungi, when the closely related organisms exploit incomparably different strategies for reproduction (mostly sexual vs. exclusively asexual vs. sexual and asexual), leads to existence of a variety of overlapping species concepts. Some fungal species can be differentiated based on sexual compatibility of their members, which are in turn reproductively isolated from the other species (i.e. the biological species concept). The situation is more complicated with those fungi which either do not mate in vitro or have lost the ability to reproduce sexually in nature. In addition, the taxonomy of almost every large group of fungi suffers from certain historical biases which usually come from the applied value of the organisms. The introduction of molecular methods in evolutionary mycology has already resulted in the dramatic taxonomic changes [1] though the effort is still needed to rich the clarity for individual genera. The mitosporic genus Trichoderma (Hypocrea, Ascomycota, Hypocreales, Hypocreaceae), a fungus beneficially used in agriculture, is a striking example for this: the number of its morphologically recognized species (i.e. the morphological species concept) is still around 30 [2][3][4], while the application of genealogical concordance between unlinked DNA loci (i.e. the phylogenetic species concept) resulted in 100 phylogenetic species recognized by 2006 [5] and this number is growing quickly.
The special scientific interest to this genus is largely connected with the modern pace towards the development of the market for organic farm products, which already covers approximately 2% of total world farmland and more than 10% in some European countries such as Austria, Switzerland and Sweden [6]. The management of organic farms requires the integration of biological pest control (biocontrol) in agricultural practices. The control or prevention of some plant diseases by such mycoparasitic fungi or 'biofungicides' such as Trichoderma is an attractive alternative to the use of chemical fungicides [7] and therefore is an important component of modern organic farming. These fungi not only antagonize plant pathogens [8,9], but are also rhizosphere competent and can enhance plant growth in endophytelike associations [10]. However, as the prerequisite to applying biofungicides to farm fields, the biology of every active agent should be understood. Some molecular aspects of Trichoderma mycoparasitism and endophytism -such as the role and regulation of formation of cell wall hydrolytic enzymes and antagonistic secondary metabolites -have been intensively investigated [8].
On the other hand, the genetic stability of the fungus in the environment, its population structure, reproduction strategies and geographic distribution, have received little attention and remain poorly studied.
The asexual (or anamorphic) T. harzianum is the most familiar Trichoderma species as it is the most frequent Trichoderma sp. in the majority of samples worldwide [11][12][13]. It has often even been synonymzed with Trichoderma biocontrol agents in general [14], as it is the principal component in several commercial biofungicide formulations. Although the biology of T. harzianum has not been studied in detail, it was studied taxonomically. It was originally defined as a "species aggregate" [15], but Chaverri et al. [16] reported that it consists of seven genetic lineages which would fulfil the basic criteria of cryptic phylogenetic species within a large morphological species. The latter authors also stated that H. lixii is a teleomorph (sexual stage) of T. harzianum thus raising the probability of genetic recombination in nature. Despite the detected genetic polymorphism, the authors proposed the existence of a single although complex H. lixii/T harzianum species. However, no more definitive data have been published on the reproduction, global geographic distribution and speciation within the H. lixii/T. harzianum.
In this paper we reveal the complex speciation history within H. lixii/T. harzianum sensu Chaverri et al. [16] and show that it consists of several hypothetical biological species, some recent agamospecies and numerous relict lineages with a monophyletic origin altogether.
We also show that T. harzianum sensu stricto and H. lixii s.s. are genetically isolated and therefore are two separate species.

Sample set
As a first prerequisite for this study we established a representative sample set, originally comprising > 300 strains of worldwide origin which were identified as H. lixii/T. harzianum by the ITS1 and 2 barcode [11] and diagnostic morphophysiological characteristics [15,16]. The sample was condensed to 93 strains which were representative for the total genetic and geographic variation observed. The final sample set included 10 strains which were collected as Hypocrea teleomorphs from decaying wood (Table 1), whereas the strains isolated as Trichoderma anamorphs were predominantly from various soil ecosystems. Representative strains of all neighbour species (the Harzianum-Catoptron Clade; [17,18]) were included in order to determine the speciation history within the clade, to set up genetic borders for morphological H. lixii and T. harzianum species, and to be used as a negative control for recombination analyses. The spans of the morphological species (separately considered for teleomorphs and anamorphs) are shown on the right inset in Figure 1. Species names abbreviations presented in Figure 1

Detection of phylogenetic species
We screened among five potential phylogenetic markers for their ability to differentiate within our sample. Besides the finally analyzed loci the initial set included exon fragments of RNA polymerase (rpb2) and translation elongation factor 1-alpha (tef1) genes. However, only three unlinked loci were sufficiently variableintron containing regions of the calmodulin (cal1) and tef1 genes and the coding fragment of the GH18 chitinase gene (chi18-5). Their nucleotide characteristics are given in Additional file 1. We analyzed them as a combined dataset as well as individually.
As seen in Figures 1 and 2, all phylograms confirmed a monophyletic origin for selected 93 strains. Despite the monophyletic origin, the level of intraspecies genetic polymorphism (deduced from the branch lengths) within H. lixii/T. harzianum sensu Chaveri et al. [16] was comparable with that of interspecific variation within the  whole Harzianum-Catoptron Clade (Figure 1 left inset). The application of the genealogical concordance phylogenetic species recognition concept [19], using the criteria of Dettman et al. [20] (i.e. that a clade is an independent evolutionary lineage [phylogenetic species] if it is supported in at least one gene tree and not contradicted in any of the others) identified two distinct phylogenetic species -one was a clade that contained the ex-type strain of T. harzianum (hz.07, = CBS 226.95), which is supported by high posterior probabilities (PP) in tef1, cal1 and in the combined phylograms and not contradicted in the chi18-5 tree (Figures 1 and  2). It includes 15 strains isolated as Trichoderma anamorphs from North America, Europe, Siberia and South Africa. Because it contains the ex-type strain, we consider it to represent T. harzianum sensu stricto and will further term it "T. harzianum" throughout the manuscript. The second phylogenetic species, supported with high significance by all three individual tree topologies, contains anamorphic strains of tropical, mainly African origin, and we thus term it T. sp. nov.'afroharzianum nom. prov.' (for convenience T. sp.  Figure 1), which is statistically supported in the combined phylogram. However, a comparison of individual gene trees ( Figure 2) reveals that this separation is mainly caused by the chi18-5 topology, whereas it is contradicted in the tef1 and the cal1 trees. Nevertheless, a small subclade, which includes both teleomorphs and anamorphs from North America, is supported in the cal1 and tef1 tree and not contradicted in the chi18-5 tree, and we therefore consider this holomorphic subclade as a hypothetical phylogenetic species (H. sp. 'paralixii nom. prov.', Figure 1).
The phylogenetic position of the 14 strains in our sample isolated as Hypocrea teleomorphs proved to be diverse. One of European origin (li14) clustered in the vicinity of H. sp. 'paralixii'. The other European strains were found as terminal branches of Subclade III in the combined phylogram, but this relationship was not supported by individual loci. Two other teleomorph isolates from Vietnam (X.12) and the USA (X.17) could not be attributed to any clade either. Three teleomorph isolates from Cameroon (li07, li08 and li09) formed a supported terminal clade in the tef1 and cal1 trees, and this was not contradicted by chi18-5. We ranked them as another hypothetic phylogenetic species H. sp. "cameroonense nom. prov." (Figure 1).
The phylogenetic position of the remaining strains was discordant in trees derived from single loci. We will use the term 'pseudoharzianum matrix' to specify them along with hypothetical phylogenetic species throughout the rest of the paper (Figure 1).
In order to test whether they would be co-specific with any of the "H. lixii" lineages identified by Chaverri et al. [16], we also included the twelve teleomorph strains investigated by them. The chi18-5 allele was omitted in this analysis. The results showed that 9 of these 12 teleomorph isolates also clustered within the Lixii subclade or in Subclade III (data not shown). Of the remaining strains, one (G.J.S. 85-119) was found in Subclade IV, one (G.J.S. 97-106) formed a sister lineage to T. sp. 'afroharzianum' within Subclade IIa, and one  strain had an unresolved position in the 'pseudoharzianum matrix'. The detected phylogenetic species, hypothetical phylogenetic species and newly introduced terms with and without taxonomic significance are listed in Table 2.
Recombination between and within T. harzianum, T. sp. 'afroharzianum' and strains of the "pseudoharzianum matrix" The uneven distribution of teleomorph isolates in the H. lixii -T. harzianum species complex, particularly their absence in T. harzianum s.s. and T. sp. 'afroharzianum' but their presence in the Lixii subclade, may suggest that either the tested strains are virtually genetically identical and likely clonal (= clonal sterility) or that all teleomorph stains within the sample were completely reproductively isolated (= interspecific sterility). As these fungi can not be crossed in vitro, we applied three alternative computational methods for detection of genetic recombination from sequence data. To this end we used the T. harzianum strains as a control sample for clonal sterility and seven representative strains of species from the Harzianum-Catoptron Clade which were genetically most distant to H. lixii and T. harzianum as a control for interspecific sterility.
First, the partition homogeneity test (PHT; [21]) was used to examine the congruence between gene trees. This test produces artificial datasets by multiple (10 000) re-sampling and random swapping of observed datasets and subsequent construction of maximum-parsimony trees for every newly sampled 'gene' sequence. For clonally reproducing populations (= no sexual recombination), the sums of the lengths of the gene trees for the observed and re-sampled data should be similar. However, under recombination the sums of the tree lengths should be longer than that for the actual data because of introduction of homoplasy into unlinked sites. This test confirmed our analysis of topologies of single locus trees -the clades containing T. harzianum and T. sp. 'afroharzianum' showed congruence of data suggesting their clonality (Figure 1). Other clades, which were incongruent in different trees, provided evidence for recombination. However, recombination was also not detected with strains occurring at unresolved positions on the combined tree (X.01 -X.16) including the ex-type strain of T. inhamatum (X.15 = CBS 273.78) and two strains isolated as teleomorphs ( Figure 1). As none of these strains fulfilled the criteria of a phylogentic species, we assume that they represent multiple relict lineages which are incompatible of recombination either due to the loss of sexuality or due to mating incompatibility evolved in a course of habitat isolation. The relatively long genetic distances between them support this view. The controls (= representatives of Harzianum-Catoptron Clade) produced essentially negative results ( Figure 1, right inset).
As a second means, we used the index of association (IA) test on a subset of 'clone corrected' data (i.e. individuals with identical alleles of the three loci were excluded so that each haplotype was represented only once; cf. [22]). In this test, complete panmixia (sexual compatibility resulting in recombination) would be indicated by a value of 0 (= the null hypothesis). This value was neither obtained with the complete dataset nor with any of the individual clades (data not shown). Yet the Lixii subclade, and Subclades III and IV gave values lower than 1, thus supporting the null hypothesis of sexual compatibility between strains within them. In contrast T. harzianum and T. sp. 'afrohazianum' yielded values above 1, rejecting the recombination. Subclade V was not analyzed as it consisted of only two concatenated haplotypes. In accordance with results from PHT, IA values for strains at unresolved positions (X.01 -X. 16) also showed no evidence for recombination.
Finally we applied the Phi-test, which uses the pairwise homoplasy index (PHI, Ö) to detect refined incompatibility [23]. This method assumes the infinite sites model of evolution [24] in which the detection of incompatibility for a pair of sites indicates recombination. Application of this test to the same subsamples based on phylogenetic species and clades of the combined tree confirmed the results of the previous analyses, and also detected no recombination in T. harzianum, T. sp. 'afroharzianum' and between representative strains of the Harzianum-Catoptron Clade (P = 0.1897, P = 0.2773 and P = 0.3406, respectively). All other subclades showed positive recombination signals (P < 0.05). The corresponding network, produced by Splitstree, is shown on the inset on the right bottom of Figure 1.
Since the Phi-test is a very robust means which can detect recombination even in the presence of recurrent mutation, we decided to use this method to define the borders of recombining populations. To this end, we first set up a non-recombining sample consisting of the most terminal strains of a clade to be investigated, and the most distant strains from other phylogenetic species. Then we gradually added phylogenetically closer strains until evidence for recombination was detected (P < 0.05). To determine the outer border of the recombining population, we excluded the first strain which indicated recombination and started to add phylogenetically more distant isolates from the same subclade or clade. Such approach has the advantage that it also allows the analysis of small subclades which can not be analysed alone due to insufficient data. This technique was applied to all meaningful combinations of species/clades/subclades which were present on the phylograms shown (around 90 individual tests). The positive results (recombination, P < 0.05) obtained are shown by shadowed areas on the main part of Figure 1, and provided interesting insights into the reproduction history of the clades found. For example, although the isolates of T. harzianum were proven to be clonal by all three methods used, they still revealed a recombination history with strains X.01 -X.03 from Northern Africa which occupied an otherwise unresolved phylogenetic position in the tef1, cal1 and chi18-5 trees. Another presumably recent agamospecies, T. sp. 'afroharzianum', exhibits a history of recombination with five phylogenetically plesiotypic strains of different geographic origins (Figure 1). Not all strains of the Lixii subclade showed a recombination history in this test. Nevertheless two groups with positive recombination signal were detected -one includes strains of H. sp. 'paralixii' and two temperate strains from North America and Europe; the second covers the latter two strains, H. sp. 'cameroonense' and the ex-type strain of H. lixii. A limited recombination was also detected for strains from Subclade III occupying terminal positionsthree Austrian teleomorphs and two anamorphic strains isolated from Sardinian soils were recombining with each other but not with strains occupying basal branches of the subclade. Similarly, the four terminal strains from a Subclade IV (all from Africa) display a history of recombination. In addition, all strains of the panmictic subclade V showed evidence for recombination. This result corresponds to the fact that they occupy conflicting positions in the single gene trees but belong to the same subclade of ech18-5 phylogram indicating their common evolutionary path. The most genetically polymorphic isolates of the 'pseudoharzianum matrix', which formed almost no phylogenetic structure (Subclade X), showed no evidence for recombination signal when they were confronted either with each other or with other subclades. However, when some of them were analysed together with representatives of the Harzianum -Catoptron Clade, a significant recombination signal was detected indicating traces of interspecific sexual compatibility (open shadow area on Figure 1).

Population divergence and stability
To estimate the degree of differentiation within the 'pseudoharzianum matrix', we applied methods for analysis of populations and computed the F ST values [25] for the main subclades of the combined phylogram. Qualitatively, an F ST value in the range close to 0 indicates low differentiation (= fixation of characters) between populations (compared populations are composed of equally different individuals) and close to 1 indicates great differentiation when populations are composed of similar individuals but there is a big difference between populations [26]. F ST values between T. harzianum and other subclades were in the range of 0.67 -0.83 indicating essential genetic separation of this species. The F ST value between Subclades III and IV of the 'pseudoharzianum matrix' was low (0.17) which may indicate that some genetic exchange still occurs between their strains. The population parameters θ (for haploids 2 Nm, where N is effective population size and m is the mutation rate per site and generation) and population growth rates were calculated using LAMARC package (see Methods). Consistent with the occurrence of H. lixii -T. harzinum species complex as one of the most frequent taxon of the genus, the growth rate of most subclades is large, which is indicative of a large effective population size and population expansion (data not shown). The recombining subclades had the highest growth rate. Only T. harzianum exhibited a significantly smaller value for population growth (5.90) and also a 5-20-fold smaller θ-value than the other subclades. This suggests that this species apparently occupies a niche with a limited potential for expansion.

Discussion
The perception of fungal species as dynamic entities which arise, persist for a longer or shorter period, modify, decline and then become extinct and replaced by other species leads to a variety of existing species concepts in mycology. In addition, the taxonomy of almost every large fungal genus is biased either because of its importance to mankind (e.g. for convenient differentiation of pathogenic or industrial organisms) or by its history of taxonomic description. Here we have analysed H. lixii -T. harzianum species complex, one of the most commonly sampled groups of fungi because of its dominant presence in the majority of soil ecosystems worldwide and its occupation of a broad diversity of ecological niches. From the results of this paper, the evolutionary success of the H. lixii -T. harzianum species aggregate may be attributed to the very complex structures of the contemporary populations of these fungi, which can be differentiated into nearly all possible 'types' of fungal species: reproductively isolated biological species, sympatric and allopatric phylogenetic species, recent agamospecies and numerous relict lineages with unresolved phylogenetic positions. This complexity comes from the fact that many of these species 'types' are overlapping and therefore it frequently happens that two closely related organisms become attributed to different species recognized based on incomparable criteria. Our data illustrate a speciation history within H. lixii -T. harzianum species complex, which is based on coexistence and interaction of organisms with different evolutionary strategies and on the absence of strict genetic borders between them.
The genealogical concordance phylogenetic species recognition [GCPSR, [19]] concept is the most advanced approach to defining species in modern fungal taxonomy. Yet our results demonstrate that even this method has its limits within a population of isolates with different recombination histories. Basically, the whole of the H. lixii -T. harzianum species complex might be considered as a single species as its monophyletic origin is supported by all three loci tested in this paper. Chaveri et al. [16] used the indistinguishable morphology of these fungi in favour of such a conclusion. However, this approach appears to be invalid when T. harzianum sensu Chaveri et al. [16] is compared with its genetically closest neighbouring species. The genetic distances calculated within H. lixii -T. harzianum species complex are comparable to the distances between well diverged species of the Harzianum-Catoptron Clade, which are characterized by different phylogenies, morphologies, physiologies and ecological characteristics [17]. Within the H. lixii -T. harzianum complex, a conservative application of the GCPSR concept justifies two anamorphic phylogenetic species (T. harzianum and T. sp. 'afroharzianum'), and does not contradict postulating two further holomorphic phylogenetic species (H. sp. 'paralixii' and H. sp. 'cameroonensis'). No consistent phylogenetic structure could be detected, however, in the remaining strains.
The results from this work also shed new light on the concept of the H. lixii/T. harzianum holomorph and the synonymization of T. inhamatum with T. harzianum. All methods used clearly differentiated T. harzianum sensu stricto from the remaining isolates as an agamospecies with global distribution in temperate ecosystems. Since T. harzianum and H. lixii are genetically isolated and evidently do not appear in the same life cycle, the holomorph H. lixii/T. harzianum must therefore be rejected. Also, the unclear phylogenetic position of the ex-type strain of H. lixii, and the fact that teleomorphs with 'H. lixii' morphology are also present in subclade III which is reproductively isolated from H. lixii sensu stricto, render the naming of most of the isolates investigated here as "H. lixii" doubtful. Finally, we show that T. inhamatum is a separate phylogenetic lineage only distantly related to H. lixii or to T. harzianum, and its name should therefore be maintained.
Recombination is a powerful evolutionary force that merges historically distinct genotypes. Yet the extent of recombination within many organisms is unknown, and even determining its presence within a set of homologous sequences is a difficult question. Molecular traces of sexual recombination were detected for the majority of tested strains and phylogenetic species, and were correlated with the occurrence of most of the teleomorph isolates within recombining clades. This finding is the opposite of what has been seen with other apparently asexual fungi such as the human pathogens Coccidioides posadasii [27], C. immitis [28] and Paracoccidioides brasiliensis [29], the facultative pathogen Aspergillus fumigatus [22], and the mycotoxin producer A. flavus [30]. In all these cases, teleomorphs were not found, whereas phylogenetic evidence for recombination was obtained. In our sample, such a situation was seen only for Subclade V which is composed of evidently recombining strains exclusively isolated as anamorphs.
In this study, we introduced a trial-and-error approach to detect borders of recombination within a sample. The limited distribution of sexual recombination within the phylogenetic clades underlines that these fungi pass through periods of sexual and asexual recombination. Based on the unresolved structure of the "pseudoharzianum matrix" in the individual and combined phylogenetic trees, we originally expected sexual compatibility between all of its strains. Yet our data showed that distantly related strains from the same subclades had already lost their ability for genetic exchange. The most striking example was the absence of recombination between the H. lixii ex-type strain and two other teleomorphic strains from the same subclade, which further supports the postulation of H. sp. 'paralixii' as a separate albeit closely related taxon.
With the exception of H. sp. 'paralixii' and Subclade IV (whose isolates were exclusively derived from Africa, South-East Asia, Australia and New Zealand), all other clades exhibited a global geographic distribution. While a worldwide distribution of fungi was at one time believed to be the rule, the application of molecular genetic methods has recently shown that most of these globally distributed taxa actually consist of several allopatric cryptic species. To the best of our knowledge, the only other similar finding of panmixis for a mitosporic fungus has recently been presented for A. fumigatus [22]. The fact that most of the clades detected in this study could maintain (in part) a panmictic population is interesting, because it is known (e.g. for insects) that dispersal and subsequent allopatric speciation can occur in very short times, even in the absence of severe population bottlenecks [31,32]. This lack of allopatric speciation -which otherwise seems to occur in several other species of Hypocrea/Trichoderma [cf. [33,34]] -must be due to a continuous, unrestricted gene flow, whose mechanism warrants being identified.
T. harzianum sensu Chaveri et al. [16] is one of several Trichoderma species which are successfully used in biological control of plant pathogenic fungi. The results from this study show that the respective strains must be members of one of several phylogenetic species with different recombination frequencies. In a preliminary test of a few commercially used "T. harzianum" biocontrol agents, none of them showed the gene sequences characteristic for the strains from the clonal T. harzianum and T. sp. 'afroharzianum' clades, and therefore may be members of the recombining populations and phylogenetic species (C.P. Kubicek, unpublished data). Recombination of a biocontrol strain clearly could have a significant impact on its stability in the field. A study aiming at clarifying this situation is currently in progress.
Fungi belonging to 'pseudoharzianum matrix' are abundantly isolated from various environments and are frequently selected as biocontrol strains. Therefore their correct identification is of great importance. This study shows that because some of them are able to recombine while others are probably sexually incompatible, their attribution to a single taxon would be very doubtful. We suggest the introduction of the provisional temporary name H. sp. 'pseudoharzianum nom. prov. dub.' in order to separate strains belonging to 'pseudoharzianum matrix' from phylogenetic species such as T. harzianum s.s., T. sp. 'afroharzianum', H. lixii s.s. and T. inhamatum, which can be correctly identified by the analysis of tef1 4 th large intron sequence using either the similarity search tools or phylogeny. The development of an integrated bioinformatic tool for the haplotype-based identification of fungi within H. lixii -T. harzianum species complex and for the potential prediction of their mycoparasitic abilities has been started in the laboratory of the corresponding author.

Conclusions
In this work we did not defeat the 'harzianum demon'. However, we demonstrated its power, span and perhaps its limits, and believe that the current study provides some understanding of the forces driving speciation in these fungi. A major challenge of future work will be the elaboration of standardized methods by which the phylogenetic species and the 'pseudoharzianum matrix' detected here can be conveniently differentiated with predictable biological activities. Preliminary data in our laboratory showed that the use of 95-carbon source phenotype microarrays may be helpful in this regards (J. Bissett and I.S. Druzhinina, unpublished data).

Strains and gene sequences
The strains, sequences and NCBI GenBank accession numbers are listed in Table 1, and are maintained in long-term storage facilities at Vienna University of Technology (Austria) and ECORC (Canada) laboratories.

DNA extraction, PCR amplification and sequencing
Mycelia were harvested after 2 -4 days of growth on malt extract agar at 25°C and genomic DNA was isolated using QIAGEN DNeasy® Plant Maxi Kit following the manufacturer's protocol. Amplification of DNA fragments comprising ITS1 and 2 and the 5.8S rRNA gene, the endochitinase chit18-5 (= ech42; [35]) and the large 4 th intron of tef1, amplicon purification and sequencing was performed as described in detail previously [36]. Previously published sequences used for phylogenetic analyses in this study were retrieved from GeneBank and are identified by their respective accession numbers.

Phylogenetic analyses
For the phylogenetic analysis DNA sequences were aligned with Clustal X 1.81 [37]. As the 4 th large intron of tef1 is highly polymorphic, the alignment contained several ambiguous areas which could contain homoplasiuos characters resulted from multiple substitutions and/or saturation. In order to detect such areas we have processed the concatenated alignment using the Gblocks server [ [38], URL: http://molevol.cmima.csic.es/castresana/Gblocks_server.html]. When the default stringent parameters of Gblocks were applied, nearly complete locus of the tef1 intron has been removed (not shown); when the less stringent options were used, only 10% of the alignment was removed leaving the considerable part of tef1 intron in the analysis (Additional file 2). The poorly aligned areas of tef1 selected by Gblocks have been carefully edited manually by inserting extra gap columns in order to reduce the difference between sequences originating from hypothetically homoplasious characters. The original and edited alignments are available from Additional file 3. The possibility of intragenic recombination, which would prohibit the use of the respective loci for phylogenetic analysis, was tested by linkage disequilibrium based statistics as implemented in DnaSP 4.50.3 [39]. The neutral evolution of coding fragments (cal1 and chi18-5) was tested by Tajima test implemented in the same software. The interleaved NEXUS files were formatted using PAUP*4.0b10 [40]. The best nucleotide substitution model for the each locus was determined using jMODELTEST [41]. As Akaike and Bayesian Information criteria (AIC [42] and BIC [43], respectively) selected different nucleotide substitution models for every locus and due to the relatively small size of individual datasets (1379 characters per 107 sequences for the biggest) the unconstrained GTR + I + G substitution model was applied to all sequence fragments (Additional file 1). Metropolis-coupled Markov chain Monte Carlo (MCMC) sampling was performed using MrBayes v. 3.0B4 with two simultaneous runs of four incrementally heated chains that performed 5 million generations. The length of run (number of generations) for each dataset was determined using AWTY graphical system [44] to check the convergence of MCMC. Bayesian posterior probabilities (PP) were obtained from the 50% majority rule consensus of trees sampled every 100 generations after removing the first trees using the "burnin" command. Number of discarded generations was determined for every run based on visual analysis of the plot showing generation versus the log probability of observing the data. According to the protocol of Leache and Reeder [45] PP values lower than 0.95 were not considered significant while values below 0.9 are not shown on the resulting phylograms. Phylograms calculated based on alignments trimmed in Gblocks are given in Additional file 4. The phylogenetic tree inferred based on the complete corrected alignment was used for the data interpretation. Model parameters summaries after MCMC run and burning first samplings as well as nucleotide characteristics of used loci are collected in Additional file 1.

Detection of Recombination
The congruence or incongruence of the three gene genealogies was used to infer recombination between isolates, thereby excluding isolates of the "sensu stricto" group (see Results). To this end, three different tests were employed: the incongruence length difference/partition homogeneity test (ILD/PHT) [21,46] using a score of P < 0.05 to reject the null hypothesis of congruence between loci; the Index of Association [IA; [47]] test, in which the data were compared to the IAs of artificially recombined datasets [27,48]; and the Phi-test implemented in SplitsTree, which uses the pairwise homoplasy index, PHI (= Φ) statistic, to detect refined incompatibility indicating recombination [23].
In addition we applied split decomposition implemented in the SplitsTree program, version 4.0 [49], using pairwise distances under the Kimura 3-ST model [24]. This method visualizes recombination events by depicting the shortest pathway linking sequences, rather than forcing them into a bifurcating tree [49].

Population divergence and stability
Population growth parameters and theta (θ) values were inferred using the program LAMARC 2.0 [50]. All polymorphic sites were used to assess the population parameters. To estimate the population growth parameter, we used 10 initial chains with 2,000 genealogies sampled and two final chains with 20,000 genealogies sampled. Population parameters were inferred using both Bayesian and Maximum Likelihood criteria. Submit your manuscript at www.biomedcentral.com/submit