Population structure of Venturia inaequalis, a causal agent of apple scab, in response to heterogeneous apple tree cultivation

Background Tracking newly emergent virulent populations in agroecosystems provides an opportunity to increase our understanding of the co-evolution dynamics of pathogens and their hosts. On the one hand host plants exert selective pressure on pathogen populations, thus dividing them into subpopulations of different virulence, while on the other hand they create an opportunity for secondary contact between the two divergent populations on one tree. The main objectives of the study were to explore whether the previously reported structure between two Venturia inaequalis population types, virulent or avirulent towards Malus x domestica cultivars carrying Rvi6 gene, is maintained or broken several years after the first emergence of new virulent strains in Poland, and to investigate the relationship between ‘new’ and ‘native’ populations derived from the same commercial orchards. For this purpose, we investigated the genetic structure of populations of the apple scab fungus, occurring on apple tree cultivars containing Rvi6, Rvi1 or Rvi17 resistance gene or no resistance at all, based on microsatellite data obtained from 606 strains sampled in 10 orchards composed of various host cultivars. Results Application of genetic distance inferring and clustering methods allowed us to observe clear genetic distinctness of the populations virulent towards cultivars carrying Rvi6 gene from the Rvi6-avirulent populations and substructures within the Rvi6-group as a consequence of independent immigration events followed by rare, long-distance dispersals. We did not observe such a structuring effect of other genes determining apple scab resistance on any other populations, which in turn were genetically homogenous. However, in two orchards the co-occurrence of strains of different virulence pattern on the same trees was detected, blurring the genetic boundaries between populations. Conclusions Among several resistance genes studied, only Rvi6 exerted selective pressure on pathogens populations: those virulent toward Rvi6 hosts show unique and clear genetic and virulence pattern. For the first time in commercial Malus x domestica orchards, we reported secondary contacts between populations virulent and avirulent toward Rvi6 hosts. These two populations, first diverged in allopatry, second came into contact and subsequently began interbreeding, in such way that they show unambiguous footprints of gene flow today. Electronic supplementary material The online version of this article (10.1186/s12862-018-1122-4) contains supplementary material, which is available to authorized users.


Background
Fungal pathogens may affect plant physiology and reduce plant fitness by reducing the quality and quantity of seeds or fruits. Consequently, diseases caused by fungal pathogens are a vital component of global yield losses and therefore require intensive crop protection practices against pathogens, including intensive chemical use. However, the increasing demand for organic and chemical residue-free food requires the development of alternative protection practices, such as genetic pest control via plant breeding. This is even more important for agrosystems in which crop rotation is impossible, such as the perennial apple, grape or citrus tree crops. Venturia inaequalis (Cooke) G. Winter is a haploid fungus from the Ascomycotina class that is responsible for the most damaging apple disease reported in almost all apple-growing regions -apple scab. Its annual cycle includes sexual reproduction on infected apple leaf litters in the winter followed by several cycles of asexual reproduction during the apple growing season, what makes disease management challenging. Disease control predominantly relies on leaf litter management and repeated fungicide applications in the spring [1,2].
Most genetic pest controls for V. inaequalis rely on gene-for-gene interactions [3], such that the infection outcome is determined by the interaction between the products from a specific locus in the plant (the main resistance R gene) and a gene from the pathogen (avirulence gene). In this kind of interaction, the product of a given avirulence gene is recognized by the plant harbouring the matching resistance allele of the corresponding R gene. If a plant harbours the susceptibility allele or if the pathogen has the alternative virulence allele, infection occurs. To date, 17 main R genes that determine resistance against apple scab have been identified within the Malus species [4]. For example, in Poland, the first scab resistant cultivars were imported from the USA in the early 1970s. The R gene Rvi6 (previously named Vf, [4], originating from M. floribunda) has been widely introgressed in cultivars of M. x domestica since that time, while the Rvi1 (previously named Vg, [4], originating from 'Golden Delicious' cultivar of M. x domestica) and Rvi17 (previously named Va1, [4], originating from ' Antonovka' cultivar of M. x domestica) cultivars have been registered in Poland in the 1990s. On the other hand, a number of apple resistance genes that were recently introgressed into M. x domestica cultivars from wild genetic resources of Malus were overcome by the V. inaequalis fungus, as reported from many areas of apple cultivation in Europe ( [5][6][7] and others). However, in Poland, the Rvi6 cultivars (mainly Topaz, Rubinola and Freedom) were considered as free from V. inaequalis infections until the first report of apple scab symptoms in the Topaz cultivar in 2009 [8].
In the subsequent years, apple scab was observed on other Rvi6 cultivars, but the severity of the disease varied.
Previous population genetic studies have shown that apple resistance genes might induce specialization in pathogen populations, resulting in host-related adaptations [9,10]. The presence or absence of a single resistance gene in the host plant (Rvi6) reportedly has a very strong effect on the structure of populations that infect both genotypes that can be generally maintained, even at the orchard scale [11,12]. Recent demographic history studies [13] have revealed that the Rvi6 and non-Rvi6 V. inaequalis populations diverged several thousand years ago (8000-50,000 years) with no detectable gene flow between the two lineages. Based on extensive studies comprising populations at the European scale, Lemaire et al. found strong statistical support for a scenario in which the populations infecting the current Rvi6 varieties present in the agroecosystem emerged from existing populations present on the wild progenitor Japanese crabapple M. floribunda, which was further followed by several migration events of Rvi6-virulent individuals to European orchards. In a French orchard composed of both Rvi6 and non-Rvi6 hosts, Leroy et al. demonstrated that secondary gene flow between both populations is possible in nature, leading to both genetic and epidemiological changes [14].
Based on observations of disease symptoms in Rvi6 cultivars, we concluded that this mechanism of resistance has also been overcome in Poland. Since little is known about the genetic structure of the newly observed pathogen populations affecting Rvi6 cultivars in Poland or their relationship with 'native' V. inaequalis populations, we wanted to assess the occurrence and characterize these pathogens in several organic orchards. One of our main goals was to assess the maintenance of reproductive isolation between populations in different commercial organic orchards, including those in which Rvi6 and non-Rvi6 cultivars occur sympatrically. This assessment would elucidate the evolutionary potential of fungal populations which diverged in allopatry for some period of time, what should be reflected in their genetic distinctness, yet now are observed in the same orchards, experiencing the secondary contacts. Moreover, we wanted to explore whether monogenic resistances other than the Rvi6 present in Malus x domestica cultivars in Poland could differentiate pathogen populations. We wanted to explore the genetic relationship between the Rvi6-virulent populations and other virulent pathogen populations, particularly those affecting apple cultivars containing Rvi17 (Va1), Rvi1 (Vg) or those without any known R genes by tracking the recombination possibilities between them. To reach this goal, we analysed 633 infected leaves, collected in 10 orchards across Poland and genotyped these samples at 11 microsatellite loci, since microsatellite markers were found as a good method used for V. inaequalis genotyping, suitable for inference about differentiation at fungal population level [2,10,11,15,16].

Fungal material and DNA isolation
During the late spring and early summer of 2012-2014, two commercial chemically controlled, six organic fungicide-free and two with mixed type of control apple tree orchards located in the central, southern and northern parts of Poland (Fig. 1, Table 1) were sampled, including the orchard in which the first Rvi6-virulent strains were reported in Poland (Lublin). In all the orchard trees of Malus x domestica were sampled, and one population was collected from crab apple trees -F1 seedling of Malus x zumi var. 'Colocarpa' originated from open pollination (OZD, Dabrowice). Infected leaves with clear scab lesions were randomly collected from mono-R-genic cultivars (Rvi1, Rvi6 or Rvi17) and from cultivars without any known sources of apple scab resistance (hereafter Rvi0). Once the leaves were air dried, single and distinct scab lesions were cut from the leaves, placed into Eppendorf tubes and frozen at − 20°C. prior to DNA extraction, the samples were submerged in liquid nitrogen and homogenized using a sterile mortar and pestle. The total DNA was extracted using the GeneMatrix Plant & Fungi DNA Purification Kit (EURx, Gdańsk, Poland) according to the manufacturer's instructions and eluted with 100 μl of the provided elution reagent. In total, DNA from 20 orchard populations, each comprising of 30-35 single scab lesions, were subjected to further analyses.

Amplification of V. inaequalis microsatellite markers
The DNA from each sample was genotyped at 11 microsatellite loci. The amplifications were performed in one simplex and five multiplexes PCRs. Each forward primer in the applied primer sets (1tc1a, 1tc1b, 1aac3b, 1tc1g [17], Vitcca7/P, Vica9/152, Vitg11/70, Viga7/116, Vica9/ X, Vicacg8/42 [15] and M42 [11]) was labelled with a non-radioactive fluorescent dye (HEX, TAMRA or FAM, Genomed SA, Warsaw, Poland). The reactions were performed in a total volume of 15 μl containing 1-10 ng of DNA, 0.2 mM of each dNTP, 333.3 nM of each primer, 0.6 U of GoTaq® G2 Flexi DNA Polymerase (Promega Corp., Madison, USA), 1 x optimized reaction buffer (Promega), 1.5 mM MgCl 2 and double distilled water. The conditions for each reaction consisted of an initial denaturation at 95°C for 3 min, followed by 40 cycles of denaturation at 95°C for 30 s, annealing at 58°C for 30 s (50°C for the 1tc1g primer set), polymerisation at 72°C for 30 s and a final extension step at 72°C for 5 min. The amplicons sizes were scored using an ABI PRISM 310 Genetic Analyzer with GeneScan™ 500

Dataset preparation
Because the samples were collected during the asexual phase of the V. inaequalis life cycle, repeated identical multilocus genotypes were identified using the Geno-Type application [18] and discarded from the data set. A clone corrected dataset was used for the analyses.

Clustering methods
To determine the most likely number of populations represented by the samples and to assess the levels of differentiation, five clustering and assignment methods were used. [19] was created from the dataset using MSA programme [20] that presented the geometric distance between the data without biological assumptions. Based on the matrix, principal coordinate analysis (PCoA) was performed using GenAlEx 6.5 programme [21] for the whole data set and for the Rvi6 and non-Rvi6 populations separately. 2. Individuals were assigned to regional sample groups using GeneClass 2.0 [22]. The probability of individuals coming from a group other than the one in which they were primarily assigned (i.e. Rvi6-group, Rvi1-group, Rvi17-group or Rvi0group) was calculated using the standard Bayesian criteria described by Rannala and Mountain [23] and by simulating 1000 individuals per population using the method provided by Paetkau et al. [24]. Individuals were assigned to the population group that had the highest probability of being their source. 3. The STRUCTURE 2.3.4 programme [25], which implements the Bayesian model-based clustering method for inferring population structures, was used to assign individuals to a specific number of clusters (K). The implemented models assumed that the loci within populations are at Hardy-Weinberg equilibrium and linkage equilibrium. To estimate the optimal number of clusters, K values ranging from 1 to 15, with ten repetitions for each value, were tested to examine the likelihood convergence values for each K value. After the initial burn-in period of 5 × 10 4 iterations, 1 × 10 4 Monte Carlo Markov Chain schemes were performed for each run. No prior information regarding individual samples related to their location or their host was assumed, and a model with correlated allele frequencies and admixture among the populations was applied. The number of populations that best represented the observed data under the model implemented was determined by maximizing the estimated lnlikelihood of the results obtained for the different K values and the ΔK index, which is based on the rate of change in the ln-likelihood of the data between successive K values, to find the best K value estimates [26] using the online programme STRUCTURE HARVESTER. The ΔK values were plotted against K values ranging from 2 to 15, and the optimal cluster solution indicated by the programme represents the K value for which the highest peak was observed. The proportion of ancestry represented by the Q-value matrices obtained for K values ranging from 2 to 5 were summarized using the CLUMPAK package [27] to obtain the average individual membership coefficients in the inferred clusters. 4. Another programme TESS 2.3.1, implementing an individually based spatially explicit Bayesian clustering algorithm for population genetic studies [28,29], was used to determine the admixture proportion for the individuals. In the analysis, individual assignments to the different genetic clusters (K values from 2 to 10) were simulated 100 times using an admixture model [30] with 5000 Markov Chain Monte Carlo (MCMC) sweeps and 1000 burn-in MCMC sweeps. The deviance information criterion (DIC), a statistical measure of the model prediction capabilities, was computed by TESS for each simulation. A comparison of the best simulations based on the DIC values per K value was used to determine the most likely number of genetic clusters for each analysis. Mean DIC values, obtained for the ten runs with the lowest DIC values, were plotted against K values ranging from 2 to 10. The optimal cluster solution indicated by the programme represents the K value for which the inflection point was observed. The proportion of ancestry represented by the Q-values matrices of the top 10% simulations per K value (i.e., the lowest DIC value per K value) were summarized using the CLUMPAK package.

A chord distance matrix between the populations
Based on the CLUMPAK outputs obtained from TESS and STRUCTURE Q-values, graphical presentations of the average individual membership coefficients were performed using R script available with TESS software [31]. The estimated membership probabilities for each individual are presented in a bar chart, allowing for the inference of whether individuals shared ancestry with multiple clusters and to verify the fraction of individual genotypes that were assigned to each cluster according to the implemented admixture model. 5. For the purpose of comparison with Bayesian-based clustering algorithms, the clustering method based on distance matrix was applied. GDA program [32] was used to obtain matrices of pairwise estimates of genetic distance among 20 populations (according to Nei, 1978, [33]). Based on the obtained matrices and using Neighbour-joining (NJ) and UPGMA algorithms, the program constructed dendrograms, which were subsequently visualized using TreeViewX free software.

Standard genetic diversity analyses
For each population, the mean number of alleles per locus and the number of private alleles (i.e., alleles that were found only in a single population among a broader collection of populations) were estimated using the GDA, allelic richness (calculated as the average number of alleles per locus) and unbiased average gene diversity (H D , [34]) using FSTAT 2.9.3.2 [35], and allele frequency using ARLEQUIN 3.5.1.2 [36] programmes.

Analysis of genetic variation between populations and groups of populations
Population genetic structure was examined by means of a hierarchical analysis of molecular variance (AMOVA) as implemented in ARLEQUIN. Two-way AMOVA was conducted to estimate the variation associated with the differences among groups of populations, populations within groups and individuals within populations. For the comparison, one-way AMOVA was conducted by considering each group of populations separately to compare the distribution of genetic variation among vs within the populations within groups. GenAlEx was used to compute and test the statistical significance of Fistatistics using a nonparametric approach based on 1000 random permutations of haplotypes.
The levels of differentiation between all populations and among populations within the assumed groups of populations (i.e. Rvi6-group, Rvi1-group, Rvi17-group, Rvi0group or general non-Rvi6-group) were estimated using Weir and Cockerham's pairwise estimator Ɵ [37], which is an analogue of Wright's F ST , and its statistical significance was assessed using a permutation test (1000 permutations) implemented in the GENETIX [38] programme.
Nei's unbiased [33] pairwise estimates of genetic distance between the assumed groups of populations were calculated using GenAlEx 6.5.
Pairwise gene flow (Nm) was estimated in GenAlEx for all populations, where Nm < 1 indicated low levels of gene flow, Nm = 1 indicated that the drift effects were exactly counterbalanced by the effects of gene flow so that populations neither diverged nor converged, and Nm > 1 suggested high levels of gene flow [39].

Multilocus linkage equilibrium analysis
Allele associations among different loci were examined using a generalized measure of multilocus linkage equilibrium (rBAR d ) [40,41], which is a modification of the index of association (I A ) and is independent from the number of loci included in the analysis. The null hypothesis of the random association among alleles at different loci (rBAR d = 0), indicating random mating, was tested across all assumed groups of populations using the programme MULTILOCUS 1.2.2. [41]. The observed value of the statistic was compared to that obtained after 1000 randomizations to simulate recombination.

Isolation by distance
To test whether the genetic structures of the populations were influenced by Wright's isolation by distance theory, the Mantel test was performed using the GenAlEx and Isolation By Distance Web Service 3.23 [42] programmes. The programmes computed the correlation between the pairwise genetic distance matrix (e.g., linearized F ST by Slatkin or Rouset's) and the pairwise Euclidean geographical distance matrix, both of which were log or not transformed.

Preparation of the dataset and allele frequency estimations in the populations
The single, clear amplicons comprising 11 microsatellite loci were obtained for 633 samples. Clone correction using GenoType software revealed the presence of repeated haplotypes in 8 populations, while the highest clone percentage was observed for the END (Rvi6) population (Table 1). After removing clones, the data from 606 individuals belonging to 20 populations (each comprising 25-32 individuals) were analysed. A total of 171 alleles were found across the 11 microsatellite loci, with the number of alleles per locus ranging from 5 for 1aac3b to 32 for 1tc1g (mean = 15.6). For the groups of populations, the lowest allelic richness was observed for the Rvi6-group (3.563), while the values for the other groups were higher and at a similar level (Rvi17-group = 6.527, Rvi1-group = 6.275 and Rvi0-group = 6.370). The highest value for a population was for the ASI population (from the Rvi17group, Table 1). The frequency of private alleles was the highest in two Rvi17 populations, ASI and ABR, while the lowest frequency or total absence was observed mostly in the Rvi6 populations ( Table 1). The highest number of alleles per locus was observed for the 1tc1g locus (32 alleles), while the lowest was observed for the 1aac3b locus (5 alleles).

Clustering and assignment methods
PCoA analysis separated the V. inaequalis populations into two distinct groups, with the first and second principal coordinates accounting for 25.06% and 10.51% of the total variation, respectively. The first principal coordinate separated the populations into two main groups, in which one group contained only the Rvi6 populations (Fig. 2a). When these groups of populations were tested separately, the first and second principal coordinates divided eight Rvi6 populations from Lublin, Jeziorsko, Nowy Dwor and Brzeziny according to their geographical origin (Fig. 2b), whereas no spatial structures were observed for the non-Rvi6 populations (Fig. 2c).
The exclusion-based method implemented in GeneClass produced an accurate assignment rate (the probability that an individual belongs to the reference population within which it was sampled) of 22.8% when all the individuals and populations were tested. Of the Rvi6, Rvi17, Rvi1 and Rvi0-groups of populations, the highest rate of accurate assignment was observed for the Rvi6-group (61.5%), while the lowest rate of accurate assignment was observed for the miscellaneous group comprising populations from cultivars without known apple scab resistance genes (Rvi0, 37.2%). Overall, the rate of misassignment (an individual having a genotype that is most likely to occur in a population other than the one in which it was sampled) was high, as many individuals from one host cultivar group tended to be assigned with high probability to all the other host groups, but the individuals from the Rvi6-group were mostly assigned to the other populations from this group.
To estimate the most likely number of ancestral populations (K), individual-based Bayesian clustering analyses were performed using the STRUCTURE and TESS programmes. Using STRUCTURE, we observed the highest Evanno's ΔK index for K = 4, but this index had low statistical support (ΔK = 14.73) (Additional file 1: Fig. S1, left graph). Using TESS, we determined that the average DIC values for the K values between 2 and 10 clearly gradually decreased from K = 2 to K = 5 and then marginally varied (lowest DIC for K = 7), supporting 5 to 7 clusters (Additional file 1: Fig. S1, right graph). Taken together, we did not find evidence for an 'optimal K' number of clusters. To avoid overinterpretation, we elected to interpret a series of admixture plots for K values from 2 to 5 (Fig. 3). At K = 2, both programmes gave very similar results and divided all the individuals according to their virulence towards Rvi6 hosts. Indeed, most samples collected from the Rvi6 trees were assigned to a single group, while samples collected from the other trees (Rvi1, Rvi17 and Rvi0) were grouped together. In both analyses, the algorithms detected some admixed individuals. From K = 3 to K = 5, STRUCTURE and TESS found evidence of additional groups, but they were irrespective from any other sources of resistances, suggesting that the algorithms actually indicated different levels of substructure within both the Rvi6 and the non-Rvi6 cluster. At K = 3, STRUCTURE found support for a substructure within the non-Rvi6 gene pool, while TESS found a third group within the Rvi6 cluster that preferentially comprised individuals of populations from Lublin and non-Rvi6 populations. At K = 4 using STRUCTURE, all the Rvi6 populations were divided into two clusters that each comprised populations from two distant geographical locations, whereas the rest of the populations were separated to two other clusters. At K = 4, the fourth cluster in TESS (green colour) favoured both Rvi6 and non-Rvi6-virulent populations from Lublin. Individual clustering observed for K = 5 in STRUC-TURE was very similar to that at K = 4. Four subgroups among the Rvi6 populations were distinguished at K = 5 in TESS, corresponding with the geographical origin of the populations, whereas non-Rvi6 populations were in one heterogeneous cluster, with different levels of admixture for the individuals. For K = 7, indicated as the optimal number of clusters using TESS, some substructures were observed among the Rvi6 populations. However, generally most individuals were poorly assigned to the inferred clusters (Additional file 2: Fig. S2). Due to some factors that can substantially affect clustering and ancestral population inference, such as effective sample size or linkage between markers, there is a risk that clusters indicated by the Bayesian algorithms may not correspond to real populations [25]. In practice, rigorous estimation of K is a difficult statistical problem, even when all the assumptions of the underlying model are assumed to hold [43]; thus, the clustering results should be interpreted with care. This could be the possible explanation for why the same individuals were divided into various numbers of optimal clusters dependent upon the method used, although this had only a low level of support. In spite of that, all cluster analyses clearly indicated the existence of two main clusters, one comprising Rvi6virulent individuals and the other comprising Rvi6-avirulent individuals. Interestingly, at K = 2, a higher average level of admixture was observed in the DND and MGL populations, which were collected in the orchards in Nowy Dwor and Lublin, respectively, with co-occurrence of the Rvi6 and non-Rvi6 cultivars (Fig. 4 a and b).
In cluster analysis conducted in GDA, on UPGMA dendrogram three clades were designated: two with only Rvi6 populations and one with the rest of the populations, while on NJ dendrogram all Rvi6 populations were in one clade, and other populations were scattered, although the support for the nodes was low (< 50) (Additional file 3: Fig. S3).

AMOVA
The two main groups of populations revealed in clustering methods (one containing only Rvi6 populations and the other containing the remaining non-Rvi6 populations were analysed using two-way AMOVA. Analysis of molecular variance for the two groups revealed that differentiation was low, albeit significant, at all hierarchical levels (among populations in a group F SC = 0.076, among groups F CT = 0.076, among total genetic distance between all populations F ST = 0.1123). The largest genetic variation was partitioned between individuals within populations (85%), while variation among groups (7.71%) was comparable to those among populations within groups (6.97%). When each group of populations was considered separately, the largest genetic variation was again partitioned between individuals within populations (84.0% for the Rvi6-group and 96.5% for the non-Rvi6-group), but the variation among populations was much higher in the Rvi6-group (16%) than in the non-Rvi6-group (3.50%). The Fi-test conducted using GenA-lEx confirmed a significant difference between these two groups (Φ' PT = 0.013, p = 0.001).  (Table 1). When the groups of populations were tested, assumed according their virulence towards the Rvi6, Rvi1, Rvi17 and Rvi0 hosts, the lowest mean H D value (0.501) from all the analysed population groups was noted in the Rvi6-group, while mean values for the other groups were determined to be 0.684 (Rvi17), 0.678 (Rvi1) and 0.698 (Rvi0).

Gene diversity and pairwise estimates among populations and groups of populations
Nei's unbiased genetic pairwise estimates identity or distance calculates the genetic identity or diversity between populations across all loci simultaneously with the assumption that differences can arise due to both mutation and genetic drift. The values obtained for the two  Table S1).
F ST values can range from 0 for identical populations (no population differentiation) to 1 for populations sharing no common alleles (complete differentiation in which populations are fixed for different alleles) [44].  Table S2). Apart The observed H D and Nei's values and the F ST estimates clearly showed that the Rvi6 populations were genetically closer to other Rvi6 populations, especially those derived from the same orchard. However, they were subdivided within the whole Rvi6-group. Moreover, the Rvi6 populations were clearly distanced from the non-Rvi6 populations that came from other orchards, which were in turn more closely related to each other. An exception was observed between the Rvi6 and non-Rvi6 populations occurring in sympatry, for which the F ST values indicated very close (Lublin) to the closest (Nowy Dwor) genetic relationship among the populations.
These data are consistent with those from the clustering and assignment methods and explain the differences among the Rvi6 and non-Rvi6 populations, the substructures within the Rvi6-group, and the genetic heterogeneity among the Rvi17, Rvi1 and Rvi0 groups of populations.
Gene flow, considered as a movement of genes, can be detected in populations by observing the movement or dispersal of whole organisms or genomes from one population to another [45]. Pairwise estimates of gene flow (Nm) between all populations (Additional file 6: Table S3) and groups of populations (Table 2) were always higher than 1, suggesting gene flow between populations, although with different intensities. The highest values were observed among pairs of the Rvi6 populations (2.321-33.836, mean 11.960), derived from the same orchard. The mean Nm value between the Rvi6-group and the non-Rvi6-group was 2.441, while the mean values between the other groups (Rvi17, Rvi1 and Rvi0) were between 7.063 and 8.495. These data may suggest that gene flow between the Rvi6 and non-Rvi6 populations can be limited but not excluded.

Random mating hypothesis testing
The index of association relies on the variance of genetic distances between pairs of individuals. Low variance occurs in a population with free recombination across loci, while high variance occurs when recombination is rare [46]. The null hypothesis of random alleles association (rBAR d = 0) from random mating can be rejected only when the value is significantly different from zero after 1000 randomizations. The null hypothesis of random mating could not be rejected for any of the tested groups of populations (i.e. Rvi6, Rvi1, Rvi17, Rvi0 groups) or for the whole non-Rvi6-group. Thus, we assumed that among the populations within the groups, free recombination was possible. When the groups were tested in pairs, the null hypothesis was rejected only in pairs with the Rvi6-group, indicating that random mating did not occur between the Rvi6 and the non-Rvi6 populations. However, when the Rvi6 and non-Rvi6 populations sampled from the same orchard (Lublin or Nowy Dwor) were tested, the null hypothesis of random mating between these populations could not be rejected, indicating that random mating was possible between the sympatric populations.

Isolation by distance
Because the regression correlation analysis between genetic and geographic distances among all populations was not significant (p > 0.05), the genetic variation among populations cannot be explained by the geographical origin of the populations. When the Rvi6 and non-Rvi6 groups of populations were tested separately, the same tendency was observed. This indicates that genetic population structures are not influenced by the geographic origin of the populations.

Discussion
Resistance genes introgressed by humans to crops are generally broken down soon after they are deployed into the agroecosystem as a result of the emergence of a virulent pathogen population. For quickly evolving organisms, such as plant pathogens, new adaptations occur easily in just a few generations after an environmental change [47]. New virulent strains are generally assumed to originate from new de novo mutations at the avirulent locus from populations already present in agroecosystems. Pathogens can rapidly adapt to changes since they generally have higher fecundity, larger effective sizes and shorter generation times than their hosts [48]. As an alternative scenario of virulence emergence, Leroy et al. suggested that virulent strains can be selected from pre- existing virulent populations in non-agricultural habitats. The latter scenario of adaptation from standing genetic variation can be favoured, especially when pathogens infect various non-agricultural hosts, including weeds and wild plants growing near cultivated plants or those widely spread by humans. All these factors can explain the rapid breakdown of resistance genes in agroecosystems [49], including the Rvi6 gene [13].
Genetic uniqueness of V. Iniaequalis populations derived from Rvi6 hosts The pathosystem V. inaequalis -Malus spp. was found to be suitable for studying both gene flow and host adaptation processes, which are expected to shape the genetic structure of pathogen populations overall [12]. The previous findings of V. inaequalis populations overcoming Rvi6 gene-related resistance in apple cultivars showed that newly emerging fungal populations were genetically different from populations from plants without Rvi6 and demonstrated the clonal structure [9,10,15]. The results obtained in our study also indicate the genetic distinctness of Rvi6-virulent populations from the populations infecting Rvi17-, Rvi1and cultivars without any known R genes. First, based on results obtained in the clustering methods that implemented various algorithms (Bayesian inference, PCoA and distance matrix-based), the division of all populations into two main groups was evident. The Rvi6 populations formed a separate cluster or clusters depending on the algorithm used, which did not contain any non-Rvi6 populations. However, in each Rvi6 or non-Rvi6 cluster, individuals showing admixed ancestry from both clusters were observed (STRUCTURE, TESS). In parallel, the probability of assignment of Rvi6 individuals to non-Rvi6 populations was low (less than or equal to 10%). Second, the 1tc1g locus, which had the highest number of alleles in the Rvi6 populations, was represented by only 7 alleles out of a total of 32 alleles observed for this locus. However, these alleles were also present in almost all the other populations. Since the 1tc1g locus is considered to be linked to the AvrRvi6 locus of V. inaequalis [11], the observed tendency for a reduced number of alleles at this locus in Rvi6-virulent populations is consistent with the genomic signature of the selective sweeps hypothesis [50]. This hypothesis states that when a selectively favourable mutation occurs in a population and subsequently becomes fixed in that population, selective sweeps change the allelic frequency at closelylinked loci, which can result in a local reduction of genomic variation. Moreover, the observed reduced diversity in multiple regions of the genome is also consistent with other studies [9,11]. The fact that the Rvi6 populations exhibited the lowest allelic richness, frequency of private alleles and mean genetic diversity across the 11 loci and the highest clonal fraction show that the genetic structure of these populations is the response to a main apple resistance gene breakdown. On the one hand, the observed reduced genetic diversity at the population level fits the primary concept of gene-for-gene interaction [3], which can be realized by the 'mutation-to-virulence' scenario in which a molecular event occurring in a compatible avirulence gene in a single individual leads to the selection of a new subpopulation and the emergence of virulence within the agroecosystem. On the other hand, the second scenario considered here is the possibility of the migration of a few virulent individuals from preexisting virulent populations in non-agricultural reservoirs. In such case, maintenance of a high population differentiation in several genomic regions is expected between the 'native' and immigrating populations because of population divergence in allopatry, isolation for long periods and current restricted gene flow between them [49]. Based on the observed reduced genomic diversity as well as the results of the assignment and clustering analyses, Guérin and co-workers [11] raised the hypothesis that the origin of Rvi6-virulent individuals can be external, and these individuals did not emerge from the selection of a mutated avrRvi6 allele in any of the sampled non-Rvi6 populations. They assumed that the Rvi6 populations derived from a single ancestral population after a substantial reduction of variability caused by a demographic bottleneck, and the current Rvi6 populations were founded by just a few related individuals. Lemaire et al. used demographic modelling to demonstrate that the two distinct Rvi6 and non-Rvi6 lineages diverged several thousands of generations ago without subsequent detectable gene flow [13]. Their studies proved the hypothesis that populations infecting Rvi6 varieties emerged from pre-existing populations present on the non-agricultural progenitor M. floribunda, donor of Rvi6 gene, thus supporting the second scenario regarding the migration of virulence between ecosystems. Secondary contact between subpopulations is currently observed in some European apple tree orchards, representing the opportunity for adaptive pathogenic evolution [14], including Poland. Whereas the overall level of genetic polymorphism was low in the Rvi6 populations, we observed that the level of variation among these populations was almost five times higher than that within the non-Rvi6 group (AMOVA). Moreover, the Rvi6-group exhibited the highest F ST values among groups, within itself, suggesting that Rvi6 populations are genetically subdivided. The lack of correlation between the genetic and geographic distances for the populations indicated that the genetic structure of the populations was not influenced by their geographic origin. However, some grouping related to geographic origin was observed in the segregation performed by the TESS, STRUCTURE (for K = 5) and PCoA algorithms. The strong evidence of substructures among the Rvi6 populations can be explained by the hypothesis assuming several recent instances of migration of virulent individuals from the non-agricultural to the agricultural pathosystem and then undergoing further distribution by rare, long-distance dispersal events. The orchards with Rvi6 cultivars are not very common in Poland; the distance between the examined neighbouring orchards containing Rvi6-cultivars is in range from 50 to 220 km, while the experimental orchard with various mono-and poli-R-gene apple cultivation is located in Dabrowice, in a distance from 18 to 220 km to the nearest commercial orchards containing Rvi6 cultivars. Therefore, we assume that there is a limited exchange of virulent strains between orchards, and the emergence of disease in Rvi6 hosts in previous decade is most likely a result of human activities regarding the transportation of infected apple tree leaves over long distances. In our study, it was possible to observe a structure related to the host only for the Rvi6 populations that was maintained over years because of impeded gene flow between immigrants and local populations. In the non-Rvi6 populations, neither significant population structure nor correlation between the assigned cluster and the host cultivars were detected. Even if the geographic distance between the non-Rvi6 populations was significant, they shared common genotypes. Here, free dispersal of the fungus through large apple tree areas both by wind or human-mediated actions together with gene flow lead to genetic homogeneity among populations within regions. Similarly, the lack of substructure linked to different hosts was observed for the populations derived from various host in single orchards [9,12]. The findings of higher allelic and private allelic richness as well as higher genetic diversity in the non-Rvi6 populations compared to the Rvi6 populations together with the small or absent clonal fractions support the hypothesis that non-Rvi6 populations are evolutionarily older than the Rvi6 populations [51]. The other possibility is that the Rvi6 individuals observed currently in the agroecosystem are only a small representation of the whole Rvi6-virulent population.
Genetic distinctness of V. Inaequalis populations derived from Rvi6 and non-Rvi6 hosts The results obtained in this study revealed clear genetic distinctness among populations that were virulent or non-virulent towards the Rvi6 apple tree cultivars, which was detectable in most of the sampled orchards. The assignment test performed using GeneClass showed that the Rvi17, Rvi1 and Rvi0 groups exchanged individuals quite randomly, causing relatively low genetic distance between these groups as detected by other programmes, whereas the individuals from the Rvi6-group were frequently assigned to other populations within the same group. Genetic distance, inferred from Nei's unbiased measure of genetic distance and F ST values, was always the highest between the Rvi6-group and other groups of populations. The Rvi6 populations were clearly distanced from non-Rvi6 populations coming from other orchards and genetically closer to other Rvi6 populations, especially those derived from the same orchard. Moreover, the F ST values estimated for the groups was the highest within the Rvi6-group, indicating higher differentiation between populations in the Rvi6-group than between populations in each of the non-Rvi6 groups, which agrees with the observed genetic subgroups among the Rvi6 populations. Conversely, the non-Rvi6 populations, even those derived from distant orchards, were genetically mixed but with an overall low level of genetic differentiation and variation. The random mating hypothesis testing showed that random mating was not possible between populations from the Rvi6-group and other groups, but was possible among individuals from the Rvi17, Rvi1 and Rvi0 groups. This probably explains why the clustering programmes assumed that non-Rvi6 populations originated from a single panmictic population. Our finding regarding random mating within the Rvi6group differs from that of Guérin et al.; their linkage disequilibrium testing results were not consistent with random mating among Rvi6 populations [11].
The high structuring effect of the presence or absence of a single resistance gene in host plants, Rvi6 in this study, led to a split of the pathogen population into two subpopulations. This phenomenon was previously observed in commercial orchards between infected M x domestica cultivars carrying or not carrying the Rvi6 gene on large scale [9,11,13]. Reported studies demonstrated the lack of gene flow between individuals from Rvi6 and non-Rvi6 hosts, what was observed between populations sampled at multiple sites on multiple host species and varieties [13], but also between sympatric populations [52]. Gladieux et al. [10] confirmed very high host specificity of V. inaequalis strains virulent toward Rvi6 cultivars, that correlated with strong selection against immigrants in V. inaequalis populations. The strong genetic differentiation between the two populations may indicate, that the pathogenic capability of Rvi6-virulent individuals on non-Rvi6 host and their ability to compete with Rvi6-avirulent individuals on non-Rvi6 hosts may have been reduced. This in turn reduces the probability of meeting and mating within the host, resulting in detectable reproductive isolation between sympatric populations [10]. Therefore, the population structure between Rvi6 and non-Rvi6 populations is usually maintained in agroecosystems, even when Rvi6 and non-Rvi6 cultivars are planted in the same orchard [10,12].
On the other hand, the genetic distinctness between Rvi6 and non-Rvi6 populations was broken between the sympatric populations derived from orchards located in Nowy Dwor and Lublin, which was supported by five observations. First, the evidence of admixture between the two types of populations, expressed as Rvi6-virulent and avirulent individuals grouped in one cluster, was detected using TESS. This was most clear for populations from Lublin (K = 4, green colour) and was slightly detected for populations from Nowy Dwor (K = 5, all sympatric populations have the admixture of green colour, which is 'dedicated' to Rvi6 populations on this plot, Fig. 3). Among them, the DND (Nowy Dwor) and MGL (Lublin) populations showed the highest average level of admixture between Rvi6 and non-Rvi6 populations from all non-Rvi6 populations. Second, the F ST pairwise estimates of differentiation were usually lower between the Rvi6 and non-Rvi6 populations sampled from the same orchard than between these populations from other orchards. Third, the misassignment rate in the populations in Nowy Dwor and Lublin orchards was high, revealing that individuals were assigned with high probability to more than one population. Fourth, the null random mating hypothesis could not be rejected for sympatric populations, neither in Lublin nor in Nowy Dwor, indicating that random mating was possible between the sympatric populations. Moreover, comparisons of the general pairwise gene flow estimates between the Rvi6 and non-Rvi6 groups revealed that gene flow was detectable, although the values were low. These results allowed us to assume that virulent strains towards Rvi6 cultivars could infect non-Rvi6 cultivars and that they coexisted on host trees without the Rvi6 gene during the sampling time in Nowy Dwor and Lublin.
The coexistence of strains of two types of virulence on one host was reported previously in glasshouse conditions [10], but evidence of gene flow between the populations has thus far only been observed in one experimental dessert apple orchard [14]. To our knowledge, this study is the first report of secondary contact followed by gene flow between strains infecting Rvi6 and non-Rvi6 cultivars in commercial organic orchards in Poland that were sampled from various Malus x domestica hosts. The coexistence of two strains with different virulence patterns that diverged a long time ago and were then isolated from one another by ecological barriers on one apple tree creates the possibility for mating and recombination. Expected recombination between the two populations might promote the evolution of aggressiveness, a major component of fitness in fungal pathogens, by combining aggressiveness factors from each type of population in hybrids [14,47]. This helps hybrids adapt to monogenic resistant hosts that are newly introduced into the agroecosystem. The consequences of the rapid evolution of V. inaequalis that arose from secondary contact need to be investigated by tracking not only aggressiveness but also the different stages of the life cycle of the pathogens to explore whether admixed individuals show higher or lower fitness than their ancestors. The discovery of new Rvi6virulent populations and the evidence of gene flow occurring between the new and the existing 'native' populations in organic orchards in Poland and other European countries is important from an epidemiological aspect since Rvi6 cultivars are not protected in these regions by fungicide sprays, thus the virulent populations and possible hybrids are maintained in the ecosystem. Planting fully susceptible cultivars in the same orchard facilitates the breakdown of ecological reproductive isolation between agricultural and nonagricultural pathogen populations [14], which can seriously influence sustainable disease management.
The structure of Rvi6-virulent populations agrees with the hypothesis that gene-for-gene relationships exert a structuring effect for the genetic shape of compatible fungal populations. We expected to observe the same effect of main R gene of other host plants on V. inaequalis populations, however, this was only evident in newly emerging virulent populations and not for those that have been present for a long time in the pathosystem. Considering the possibility that the two populations -virulent and avirulent to Rvi6 -can infect the same hosts and mate, borders between them are expected to melt, especially in the orchards in which these populations occur in sympatry. Based on the observations of migrants and F1 hybrids at the European scale and hybrid swarms in some orchards, Lemaire et al. and Leroy et al. concluded that the immigrant inviability barrier is eroding because of possible mating events between Rvi6 and non-Rvi6 strains on host varieties susceptible to both kind of pathogens [13,14]. Assuming that the gene flow between divergent individuals might promote evolution of aggressiveness in pathogen populations [14], the risk of spread of hybrids with transgressive traits over other apple growing regions cannot be excluded.

Conclusions
Population structure analysis of V. inaequalis populations sampled from several orchards across Poland concluded that there are two main population groups, one infecting Rvi6 cultivars and one infecting cultivars that have other sources of resistance or no known resistance at all. Based on the Bayesian analysis of 20 fungal populations and distance based methods, we also found support for subgroups within the Rvi6 populations. This is consistent with several independent introduction events