Genetic diversity of Ophiocordyceps sinensis, a medicinal fungus endemic to the Tibetan Plateau: Implications for its evolution and conservation

Background Ophiocordyceps sinensis (syn. Cordyceps sinensis), endemic to alpine regions on the Tibetan plateau, is one of the most valuable medicinal fungi in the world. Huge commercial demand has led to excessive harvest and a dramatic decline in its numbers. The diversity of terrains and climates on the Tibetan Plateau and the broad insect host range (more than 50 species in the family Hepialidae) may have resulted in substantial intraspecific genetic diversity for this fungus. The objective of this study was to evaluate the population distribution of O. sinensis from geographically diverse regions of the Tibetan Plateau based on nrDNA ITS and MAT1-2-1 gene sequences. Understanding of the genetic diversity and genesis of O. sinensis will provide important information for the evolution and conservation of this fungus. Results Significant sequence variations in the ITS and MAT1-2-1 genes (27 and 23 informative sites, eight and seven haplotypes, respectively) were observed. Phylogenetic analysis based on ITS sequences, MAT1-2-1 sequences, or their combined data set, clustered isolates from northern regions in one clade (clade I), whereas isolates from southern regions were dispersed in all four clades (clade I-IV). Single-strand conformation polymorphism (SSCP) analyses of 2639 ITS clones from seven samples revealed 91 different SSCP patterns that were subsequently sequenced. ITS heterogeneity was found in XZ-LZ07-H1 (Nyingchi population), and 17 informative sites and five haplotypes were detected from 15 clones. The five haplotypes clustered into three clades (clade I, II, and IV). Conclusions Significant genetic divergence in O. sinensis was observed and the genetic diversification was greater among southern isolates than that among northern isolates. The polymorphism of nrDNA ITS sequences suggested that O. sinensis spread from a center of origin (the Nyingchi District) to southern regions and subsequently to northern areas. These results suggest that southern populations are important reservoirs of genetic diversity and should be taken into account in conservation programs.

suggest that southern populations are important reservoirs of genetic diversity and should be taken into account in conservation programs.

Background
The ascomycete Ophiocordyceps sinensis (Berk.) Sung, Sung, Hywel-Jones and Spatafora (syn. Cordyceps sinensis) [1], commonly known as the Chinese caterpillar fungus, has been widely used in traditional Chinese medicine for the treatment of asthma, bronchial, lung inflammation, and other diseases [2]. The medicinal activity is from the complex of the moth caterpillar parasitized by the fungus and the fungal stroma biomass. The caterpillars, which belong to the family Hepialidae, live underground in soil burrows in the Tibetan plateau and feed on plant roots. When an infection is established, the fungus grows within body cavity of the insect larvae and eventually kills and mummifies them in the underground burrows. Fungal fruiting body emerges from the front end of the caterpillar and pokes out of the ground in spring and early summer [3].
The natural distribution of this fungus, however, is limited to alpine regions on the Tibetan Plateau and huge commercial demand has led to excessive harvest and a dramatic decline in its numbers [4]. As a result, the price for natural O. sinensis has increased rapidly and it cost as much as $32,000 (USD) per kg for top quality in late 2006 [5]. Although several fermented mycelial products from the anamorph (Hirsutella sinensis) of the fungus have been commercialized in China [6], the teleomorph, which is the component for traditional Chinese medicine, has not yet been commercially cultivated. The natural population of this fungus needs to be preserved for the fragile Tibetan Plateau ecosystem and for sustainable supply of this natural resource.
Ophiocordyceps sinensis occupies a diverse habitat on the Tibetan Plateau, which is the largest and highest plateau on Earth, covering more than 2.5 million km 2 at an average elevation of over 4.5 km [7]. It is surrounded by and interspersed with towering mountain ranges. Its complex geomorphology and climate generate substantial interregional variation which results in remarkable faunal and floral diversity and high levels of endemism [8][9][10][11]. O. sinensis is endemic to the Tibetan Plateau and occurs at elevations ranging from 3000 m up to the snow line [4]. This distribution pattern suggests that the evolution of O. sinensis was significantly influenced by the tectonic movements during the strong uplift of the Tibetan Plateau. The spread of O. sinensis, depending on the shooting of ascospores, is limited to defined areas in a mountain and is unlikely through discontinuous mountains. Therefore, gene flow between different populations on discontinu-ous mountains is expected to be low, and divergence may have occurred [12].
Intraspecific genetic diversity of O. sinensis may also be generated by its wide host range. O. sinensis infects soilborne larvae of more than 50 species of ghost moths, mostly in the genus Hepialus and to a lesser extent in the genera Hepialiscus, Forkalus, and Bipectilus [13,14]. Coevolution between the fungus and its host insects may have originated a long time ago [15]. Although the host insects in general have a limited distribution in the Tibetan Plateau and their distribution varies among different mountain ranges and even from different sides and habitats of the same mountain [16], the greater mobility of moths may also drive the genetic diversity of the fungus.
To date, genetic diversity of O. sinensis has only been studied at the local level of limited geographic areas. Three populations found in the north (Menyuan, Maqu, and Luqu), the middle (Yushu and Chengduo), and the south (Baima Snow Mountain, Renzhi Snow Mountain, Dacaodi, and Chongcaoxiawa) of the Tibetan Plateau, have been recognized from 29 samples based on RAPD analyses [12]. Inter-simple sequence repeat (ISSR) analyses of O. sinensis from 11 counties in Qinghai Province also resulted in three groups (around the Qinghai Lake, central eastern Qinghai, and southern Qinghai) [17]. Based on ITS sequences of samples from 11 localities in southwestern China, O. sinensis was divided into two populations (one from southwestern Sichuan to the northern slopes of the Himalayan Mountains, the other from northern Sichuan to Qinghai and Gansu) [18]. However, ITS sequence analyses in another study did not indicate any subgroup for 17 O. sinensis samples from different geographical regions [19]. Overall, the number of O. sinensis samples used in earlier studies was limited, and more samples from Tibet, a major production area, should be surveyed. In addition, other molecular markers, especially protein-encoding genes, should be employed to clarify the genetic diversity and to refine the phylogenetic origin of this fungus.
Mating-type genes play an important role in the evolution of fungal species. Sexual development is controlled by a single mating-type locus (MAT) in the fungi of Ascomycota. The term "idiomorph" instead of "allele" is usually used to describe the two alternate forms (MAT1-1 and MAT1-2) at the mating-type locus [20]. Mating-type genes evolve at a faster rate than other sequences, such as ITS and glyceraldehyde 3-phosphate dehydrogenase [21]. Although MAT1-2-1 sequences have been used to investigate the phylogenetic relationships among closely related species and usually provide high resolution [22], other studies found that the variability of MAT1-2-1 within species is low [21]. A putative mating-type gene (MAT1-2-1) of O. sinensis was cloned in our laboratory. Similar to MAT1-2-1 orthologs of other ascomycetes, the O. sinensis MAT1-2-1 gene contained the conserved HMG motif. The sequence homologies to Cordyceps militaris and Cordyceps takaomontana were ~50% between cDNA sequences and 40% between amino acid sequences (Zhang et al., unpublished data). During our preliminary experiment, several distinctive base changes within the MAT1-2-1 sequences were found among some O. sinensis isolates. Therefore, this gene was selected to investigate whether it can provide reasonable information on the genetic diversity of O. sinensis.
The objective of this study was to evaluate the population distribution of O. sinensis from geographically diverse regions of the Tibetan Plateau based on nrDNA ITS and MAT1-2-1 gene sequences. Understanding of the genetic diversity and genesis of O. sinensis should provide valuable information for the protection and sustainable utilization of this natural resource.

Sequence variations of nrDNA ITS and MAT1-2-1 genes among O. sinensis isolates
Fifty-six O. sinensis isolates from different geographical regions were used in this study (Additional file 1). Their nrDNA ITS and MAT1-2-1 sequences were analyzed. For the 531-535 bp ITS sequences, the detectable nucleotide variations occurred at 35 sites, including 27 parsimony informative sites and eight singleton variable sites. Among the 27 parsimony informative sites, 13 were T/C base transitions; eight were A/G transitions; five were transversions of T/G, T/A, or A/C; and one was T/A/C substitution (see Additional file 2). Most of these informative sites were located in the ITS1 (11 sites) and ITS2 regions (14 sites), plus one each in the 5.8S and 28S regions (at sites 319 and 514, respectively). The singleton sites were not included in the subsequent haplotype analysis because one polymorphism at such a site was represented by only one isolate and such variation might be introduced artificially during PCR or sequencing steps. In addition, a 4-bp deletion was detected within the ITS2 region of two isolates (XZ-LZ07-H1 and XZ-LZ07-H2).
Detectable nucleotide variations within the 877-882 bp MAT1-2-1 sequences occurred at 28 sites, including 23 parsimony informative sites and five singleton variable sites. There were 10 T/C transitions, seven A/G transitions, and six C/G transversions (see Additional file 3). Nucleotide variations in nine sites resulted in amino acid changes. Interestingly, the two isolates that had base deletions within the ITS region contained a 5-bp insertion within the first intron of the MAT1-2-1 gene.

Haplotypes within the 56 O. sinensis isolates
Haplotype usually refers to a group of alleles at multiple loci that are transmitted together on a single chromosome or to a set of single nucleotide polymorphisms (SNPs) [23]. Ascosporic fungi are haploid and haplotypes were identified in this study according to base change patterns at informative sites among individual isolates. For the 56 isolates used in this study, eight ITS haplotypes, seven MAT1-2-1 haplotypes, and 14 combined haplotypes were detected ( Figure 1, Additional file 1). The results showed that one haplotype might include isolates from different geographical regions (e.g., ITS type 7 and MAT1-2-1 type G), and isolates from the same location (e.g., XZ-NQ-176 and XZ-NQ-180) might contain different haplotypes.
For the eight ITS haplotypes (1-8), type 1 was the most dominant and it was represented by 28 isolates mainly from northern regions (Nagqu, Chamdo, Yushu, and Laji). Type 2 included nine isolates mainly from southern regions (Nyingchi and Baima). Other haplotypes were represented by two to five isolates ( Figure 1, Additional file 1).
For the seven MAT1-2-1 haplotypes (A-G), type A was the most dominant and widespread; and it was represented by 33 isolates mainly from northern regions (Nagqu, Chamdo, Yushu, Laji, and Qilian). The next most common haplotype was type B, which included seven isolates from southern regions, such as Nyingchi, Baima, and Garze. Type C was represented by seven isolates from southern Tibet (Nyingchi and Mila). Other haplotypes were represented by one to three isolates ( Figure 1, Additional file 1).
For the 14 combined haplotypes (Figure 1), type 1A was the most dominant and it included 28 isolates. The next most common combined haplotype was type 2C, which was represented by six isolates from Nyingchi. Other haplotypes were represented by one to three isolates.

Phylogenetic analyses
Minimum evolutionary phylogenetic analyses were performed with sequences of the nrDNA ITS region, the MAT1-2-1 gene, and their combined data set. In these phylogenetic trees, two major clades (I and II) and two small clades (III and IV) were distinguished ( Figure 1
Clade II was composed of southern isolates from southern Tibet (Nyingchi and Mila), Yunnan, and Sichuan. It had bootstrap support of 91%, 60%, and 64% in the three phylogenetic trees, and included three ITS haplotypes, three MAT1-2-1 haplotypes, and eight combined haplotypes ( Figure 1, Additional file 4).
The phylogenetic trees constructed with ITS sequences and with MAT1-2-1 sequences had very similar topological structures (see Additional file 4). Four clades were dis-criminated in both trees, and composition of isolates for each clade was almost identical between the two trees. However, there were some discrepancies between the two trees. For example, isolate SC-3 clustered in clade II in the ITS tree, but in clade I in the MAT1-2-1 tree. Clade II was the sister group of clade III in the ITS tree, but the sister group of clade I in the MAT1-2-1 tree. The parameters of nucleotide diversity and haplotype diversity were usually higher in the ITS sequences than in the MAT1-2-1 sequences (Table 1). Therefore, the ITS sequence is relatively more polymorphic than the MAT1-2-1 sequence.

Population structure and gene flow
For both ITS and MAT1-2-1, significant genetic differentiation was detected among isolates in Nyingchi versus Yushu, Nagqu, Chamdo, and Garze; Garze versus Nagqu, Chamdo, and Nyingchi; and Baima versus Yushu, Nagqu, and Chamdo (Table 2). In addition, there was also significant differentiation for ITS sequences among isolates in Nagqu versus Qilian; Nyingchi versus Qilian; and Garze versus Yushu (  (Table 3). Significant difference (P < 0.01) was observed for these grouping patterns. These results indicate that an organized population structure exists among O. sinensis isolates. a Fifty-six individuals were sequenced in this study (Additional file 1). "N" or "S" inside parentheses represents northern or southern populations, respectively (the same hereinafter). Numbers outside parentheses are values of ITS, and numbers inside parentheses are values of MAT1-2-1 (the same hereinafter). b Genetic parameters of the Shannan population and the unknown population (Additional file 1) are included in the "southern regions" and "overall" statistics, but they are not given individually because they each possess one isolate. c The "polymorphic sites" include both informative sites and singleton sites. For both ITS and MAT1-2-1, there is one polymorphic site shared by northern and southern isolates.

ITS heterogeneity by PCR-SSCP analyses
In total, 2639 ITS clones (1483 clones from four northern samples and 1156 clones from three southern samples) were analyzed by SSCP. Ninety-one clones with different SSCP patterns (60 for the northern samples and 31 for the southern samples) were sequenced (Table 4). Compared with the corresponding original sequence from direct sequencing, no ITS heterogeneity was detected for clones from each of these specimens except for XZ-NQ-166 and XZ-LZ07-H1, which had three (23 clones) and 17 informative sites (15 clones), respectively (Table 4). High nucleotide diversity (1.66 ± 0.34) and haplotype diversity (0.88 ± 0.07) were observed among clones of XZ-LZ07-H1. The 4-bp sequence deletion in XZ-LZ07-H1 from direct sequencing existed within some clones of XZ-LZ07-H1 (see Additional file 5). Among these 17 informative sites from 15 clones of XZ-LZ07-H1, 15 changes occurred at known informative sites and one at known singleton sites (see footnotes of Additional file 5). Every possible nucleotide change at these informative sites was shared by at least two to five clones. According to base changes at these informative sites, clones of XZ-LZ07-H1 can be divided into five haplotypes (I-V). The five haplotypes contained eight, three, two, one, and one clone(s), respectively (Fig-

Qilian (N) Laji (N) Yushu (N) Nagqu (N) Chamdo (N) Nyingchi (S) Mila (S) Garze (S) Baima (S)
Qilian (  Notes: F SC estimates the variation among populations relative to a regional grouping of populations. F ST estimates the proportions of genetic variation within populations relative to the genetic variation for the whole samples. F CT estimates the proportion of genetic variation among groups of populations relative to the whole species. F CT values that are statistically different are indicted. * P < 0.05; ** P < 0.01. ure 2). Clones from the four northern isolates were clustered together in agreement with their original sequences from direct sequencing ( Figure 2). Similarly, clones from two of the three southern samples (XZ-LZ07-30 and YN-6) were clustered together correspondingly with their original sequences from direct sequencing ( Figure 2). However, the 15 clones from isolate XZ-LZ07-H1 were scattered among three clades (clade I, II, and IV). It is noteworthy that HT-94P clustered with the northern isolates (clade I).

Discussion
The distance value (K value) of nrDNA ITS sequences from one fungal species typically ranges from 0 to 0.05 [24,25]. The maximal distance value among O. sinensis populations in this study was 0.043. Although five putative new species related to O. sinensis were reported based on morphological characters and geographical distribution [26][27][28], they should not be treated as different species based on nrDNA ITS sequences [29]. A high degree of intraspecific genetic diversity of O. sinensis has been detected, but it was still highly dissimilar even from the most closely related lineages within same genus, having about 10% of ITS difference with O. robertsii (see Additional file 6).
In general, mating-type genes are more variable between species than within species [21] and the high degree of variations between species has been used to study the phylogeny of some closely related fungal groups [22]. In contrast to the low within-species variation (0.15-0.31% nucleotide sites different) of Cochliobolus heterostrophus [21] and Phaeosphaeria nodorum [30] (Figure 3). O. sinensis might have spread eastward and westward first among southern regions on the Tibetan Plateau. This spread in southern Tibetan Plateau may be initiated along two large and parallel mountain ranges, the Himalaya Mountains (south of Nyingchi) and the Nyainqentanglha Mountains (north of Nyingchi), which run in east and west directions. The spread from south to north on the Tibetan Plateau might Phylogenetic analyses based on ITS sequences of 91 clones and their original sequences from direct sequencing   [13,14,16].
Host species of the fungal isolates used in this study and the genetic diversity of the hosts were not determined. The whole picture of O. sinensis genetic differentiation and the coevolution between O. sinensis and host insects needs to be further studied [15].
Two southern isolates, XZ-LZ07-H1 and XZ-LZ07-H2, had relatively distant genetic relationships from other isolates. Compared to the ITS or MAT1-2-1 sequences of other isolates, the sequences of these two isolates had two fixed 4or 5-bp indels and 13 unique base changes. According to the newly proposed "indel-associated substitution" theory, the high substitution frequency in these two isolates may be related to the existence of the two indels. The theory asserts that the single-nucleotide mutation rate increases in regions surrounding the insertion or deletion sites [31]. Furthermore, ITS heterogeneity was detected in XZ-LZ07-H1, but the potential sources of heterogeneity are still unknown. The fact that one clone of XZ-LZ07-H1 clustered in the northern clade suggests that hybridization is one of the possible mechanisms. Inherent mechanisms involving slippage events during DNA duplication may  Liao, 1990 [41].
also be one of the reasons for the existence of base indels in the sequence of XZ-LZ07-H1 and XZ-LZ07-H2.  ) and it has kept a trend of rapid uplifting since then [33].

Conclusions
Based on two DNA sequences (nrDNA ITS and MAT1-2-1), significant genetic divergence was detected among 56 isolates of O. sinensis collected from different regions of the Tibetan Plateau. The genetic diversification was greater among southern isolates than that among northern isolates, indicating that southern populations were important genetic reservoirs of this species to be considered in conservation programs. The polymorphism of nrDNA ITS sequences suggests that the Nyingchi District is the center of origin of O. sinensis; and that the fungus may have first spread from this center first to other southern regions and then to northern areas.

Ophiocordyceps sinensis
Fifty-six natural specimens representing 11 populations of O. sinensis were collected from the Tibet Autonomous Region, Qinghai Province, Sichuan Province, and Yunnan Province in southwestern China during 2005 and 2008 (Additional file 1). After dug out from the soil, each specimen was placed into a new zip plastic bag immediately. They were taken back to our laboratory by keeping at low temperature in a portable refrigerator and stored at 4°C after arrival. Anamorphic strains were isolated from stromatal tissues within three weeks according to the protocol of Liu et al [34]. The remaining fruiting bodies were stored at -20°C. Either axenic isolates or the stromatal section of natural specimens were used for DNA analysis.

Amplification of sequences from the nrDNA ITS region
Genomic DNA was extracted with the CTAB method [35]. ITS regions of nrDNA were amplified with the primers ITS5 (5'-GGAAGTAAAAGTCGTAACAAGG-3') and ITS4  . PCR products were purified, and sequencing was done in both directions using both the forward and reverse primers following the protocol described above.

SSCP analysis
SSCP is the electrophoretic separation of single-stranded nucleic acids based on subtle differences in sequence (often a single base pair). We used this technology to identify ITS heterogeneity of O. sinensis. Based on the results from analyses of ITS sequences, four samples from northern regions (QH-YS-197B, QH-YS-189, XZ-NQ-154, and XZ-NQ-166) and three samples from southern regions (XZ-LZ07-30, XZ-LZ07-H1, and YN-6) were selected for SSCP analyses. Each sample was divided into two or three sections (stromata, sclerotia, and/or external mycelial vela), and genomic DNA was extracted separately from each section. ITS sequences were amplified with the primers ITS1 (5'-TCCGTAGGTGAACCTGCGG-3') and ITS4 using the same conditions described for amplification of nrDNA ITS sequences. The exception was that Taq DNA polymerase (TaKaRa, Japan) instead of Pfu was used for the convenience of subsequent cloning. The PCR products were ligated into the TA plasmid pMD18-T (TaKaRa, Japan) and transformed into Escherichia coli DH5α. Col-ony PCR was then conducted to amplify the ITS1 region with the primers ITS1 and ITS2 (5'-GCTGCGTTCTTCATC-GATGC-3'). Clones with expected amplicons were used for SSCP analyses following the procedure of Wang et al [36]. Different migration profiles between clones were compared using the Image J software. Representative clones of distinct patterns were sequenced with the universal primer M13-47.

Sequence analyses
Sequences were aligned with Clustal X [37], and ambiguous regions in both sides were excluded from the subsequent analyses. For the ITS region, 531-535 bp of sequences including partial 18S (5 bp), ITS1 (160 bp), 5.8S (157 bp), ITS2 (171-175 bp), and partial 28S (38 bp) were used. For the MAT1-2-1 gene, 877-882 bp of sequences from 14 bases upstream of the start codon to six bases downstream of the stop codon were used. Pairwise distance matrices and minimum evolutionary (ME) phylogenetic analyses were conducted with the Kimura 2-Parameter model using MEGA 4 software [38]. To assess the confidence of phylogenetic relationships, the bootstrap tests were conducted with 1000 resamplings. Ophiocordyceps robertsii was used as the outgroup. DnaSP software (version 4.50.3) was used to estimate the genetic parameters of nucleotide diversity [39]. Genetic differentiation between populations and the analyses of molecular variance (AMOVA) were assessed using the program Arlequin 3.11 [40].