- Research article
- Open Access
Recurrent adenylation domain replacement in the microcystin synthetase gene cluster
BMC Evolutionary Biologyvolume 7, Article number: 183 (2007)
Microcystins are small cyclic heptapeptide toxins produced by a range of distantly related cyanobacteria. Microcystins are synthesized on large NRPS-PKS enzyme complexes. Many structural variants of microcystins are produced simulatenously. A recombination event between the first module of mcyB (mcyB1) and mcyC in the microcystin synthetase gene cluster is linked to the simultaneous production of microcystin variants in strains of the genus Microcystis.
Here we undertook a phylogenetic study to investigate the order and timing of recombination between the mcyB1 and mcyC genes in a diverse selection of microcystin producing cyanobacteria. Our results provide support for complex evolutionary processes taking place at the mcyB1 and mcyC adenylation domains which recognize and activate the amino acids found at X and Z positions. We find evidence for recent recombination between mcyB1 and mcyC in strains of the genera Anabaena, Microcystis, and Hapalosiphon. We also find clear evidence for independent adenylation domain conversion of mcyB1 by unrelated peptide synthetase modules in strains of the genera Nostoc and Microcystis. The recombination events replace only the adenylation domain in each case and the condensation domains of mcyB1 and mcyC are not transferred together with the adenylation domain. Our findings demonstrate that the mcyB1 and mcyC adenylation domains are recombination hotspots in the microcystin synthetase gene cluster.
Recombination is thought to be one of the main mechanisms driving the diversification of NRPSs. However, there is very little information on how recombination takes place in nature. This study demonstrates that functional peptide synthetases are created in nature through transfer of adenylation domains without the concomitant transfer of condensation domains.
Planktonic cyanobacteria often form heavy scums or blooms in freshwater lakes, ponds and reservoirs worldwide . Cyanobacterial blooms constitute a health-risk for human beings via recreational or drinking water through the production of a range of hepatotoxins and neurotoxins . Microcystins are a diverse group of low molecular weight cyclic heptapeptides and are the most common hepatotoxins produced by cyanobacteria. They are potent inhibitors of eukaryotic protein phosphatases 1 and 2A  and are linked to the deaths of wild animals and livestock worldwide .
There are over 65 structural variants of microcystins differing in modifications to the peptide backbone or the type of amino acids incorporated into the microcystin . The general structure of microcystins can be summarized as cyclo-D-Ala1-X2-D-MeAsp3-Z4-Adda5-D-Glu6-Mdha7where X and Z are variable L-amino acids (Figure 1). Many of these microcystin variants are synthesized simultaneously by the producing cyanobacterium . Structural variation has been encountered at all seven positions, but the highest degree of structural variation is found at the X and Z positions (Figure 1). The two most common microcystin variants, microcystin-LR and microcystin-RR, contain L-Leu or L-Arg at the X position and L-Arg at the Z position in the final cyclic heptapeptide. However, microcystins may also contain other proteinogenic, non-proteinogenic and dicarboxylic acids at these positions . Structural variants of microcystin do not have the same toxicities and microcystin-LR is an order of magnitude more toxic than microcystin-RR .
Microcystins are mainly produced by planktonic strains of the distantly related cyanobacterial genera Anabaena, Microcystis and Planktothrix . Microcystin production is also known from a small number of planktonic, benthic and terrestrial strains of the genera Nostoc [3–5], Hapalosiphon , and Phormidium . Insertional mutagenesis has demonstrated that all microcystin variants produced by Microcystis aeruginosa S-70, K-139 and PCC 7806 are synthesized by an enzyme complex encoded in a single 55-kb gene cluster [8–10]. The enzyme complex which directs the biosynthesis of microcystins includes peptide synthetases, polyketide synthases, mixed peptide synthetases-polyketide synthases, and tailoring enzymes [9–13]. Phylogenetic analyses suggest that the microcystin synthetase gene cluster was present in the last common ancestor of all present-day producer organisms . The sporadic distribution of microcystin synthetase gene clusters among cyanobacteria is proposed to be the result of gene loss rather than recent horizontal gene transfer [14, 15].
Many important antibiotics, siderophores and toxins are synthesized on NRPS enzyme complexes . NRPSs possess a highly conserved modular structure with each module being comprised of catalytic domains responsible for the adenylation, thioester formation and in most cases condensation of specific amino acids . The arrangement of these domains within the multifunctional enzymes determines the number and order of the amino acid constituents of the peptide product . Additional domains for the modification of amino acid residues such as epimerization, heterocyclisation, oxidation, formylation, reduction or N-methylation may also be included in the module [16–18]. The modular structure of NRPSs allows the rational design of novel peptides by targeted replacement of these catalytic domains . The adenylation domain appears to be the primary determinant of substrate selectivity in NRPSs [ and others]. High structural conservation of the adenylation domain allows prediction of amino acids lining the putative binding pocket which determines substrate specificity . However, recent studies predict an editing function for the condensation domain suggesting that condensation and adenylation domains in artificial junctions may be incompatible and block peptide synthesis . This finding lead to the hypothesis that in nature condensation and adenylation domains may act as an inseparable couple and be transferred together during natural rearrangements of NRPS gene clusters [18, 21].
The amino acids incorporated at the X and Z positions in structural variants of microcystin are recognized and activated by the McyB1 and McyC adenylation domains (Figure 2). Recombination between the adenylation domains of mcyB1 and mcyC is linked to the production of microcystin-RR by strains of the genus Microcystis . Recombination is thought to be an important factor contributing to the genetic diversity of the microcystin synthetase gene cluster in strains of the genus Microcystis . However, it is not clear how widespread this phenomenon is in other microcystin producers or if the condensation domain is also transferred with the adenylation domain. Here we undertake a multigene phylogenetic study in order to investigate the number and timing of recombination events during the evolution of the microcystin synthetase gene cluster in a variety of microcystin producing cyanobacteria. We show clear evidence for the recurrent exchange and replacement of the adenylation domain without the concomitant transfer of the condensation domain in a broad range of microcystin producing cyanobacteria.
Structural characterization of the identified microcystins
We documented the simultaneous production of 3 to 47 microcystin variants in these strains (see additional file 1). The microcystin variants produced by these strains differed in the methylation of the α-amino group of Mdha, the β-carboxyl of D-MeAsp and the C9 hydroxyl of Adda. However, most structural differences lay in the type of amino acid incorporated at the X position. Most strains produced microcystins that contained L-Leu at the X position (Figure 3a). The strains included in this study also produced microcystins which contained L-Arg, L-Hil, L-Hph, L-Hty, L-Phe, L-Try, L-Tyr, or L-Val at the X position (see additional file 1). Most strains produced microcystins that contained L-Arg at the Z position (Figure 3b). However, almost half of the microcystin variants contained L-Har at the Z-position in Nostoc sp. 152 (Figure 3b). Almost all variants produced in Hapalosiphon hibernicus BZ-3-1 contained L-Ala at the Z-position but this strain also produced minor microcystins variants in trace amounts that contained L-Leu or L-Val at this position (see additional file 1). The strains included in this study produced a wide range of common and rare microcystins. A large number of minor microcystin variants were identified through characteristic UV spectra. However, in most cases the low amounts of microcystins produced prevented characterization of the total structures.
Recombination breakpoints in mcyB1 and mcyC
Phylogenetic-compatibility analysis indicated extensive incongruence between the adenylation and condensation domains of mcyB1 and mcyC (Figure 4). Analysis of the nucleotide sequences of mcyB1 and mcyC identified recombination breakpoints in the adenylation and thiolation domains using six different methods to detect recombination (Figure 5). The recombination area extended across the entire adenylation domains spanning conserved core motifs A1–10 into the middle of the thiolation domain (Figure 5). Additional sets of breakpoints were identified within the adenylation domain in Anabaena sp. 18B6 replacing the adenylation domain elements A2–A10 (data not shown). In the case of Microcystis aeruginosa PCC7806 a second set of breakpoints were also identified spanning the substrate conferring portions of the adenylation domain between the core motifs A3–A8 (data not shown).
Phylogenetic analysis of McyB1 and McyC condensation and adenylation domains
Maximum-likelihood trees based over 7,000 bp of nucleotide data from 5 housekeeping genes and 3 microcystin synthetase genes were congruent and each topology received robust bootstrap support (Figure 6). Maximum-likelihood trees based on the amino acid sequence of the condensation domain from McyB1 and McyC are also broadly congruent with the phylogeny of the producer organism (Figure 7a). McyB1 condensation domains were monophyletic and grouped together with condensations domains with a D-peptidyl donor (Figure 7a). Interestingly the McyC condensation domains were also monophyletic but grouped with condensation domains with an L-peptidyl donor (Figure 7a) despite having D-MeAsp as the donor amino acid.
Maximum-likelihood trees based on the amino acid sequence of the condensation domains and the A3–A8 substrate conferring portion of the adenylation domain of McyB1 and McyC differed considerably (Figure 7b). The mcyB1 and mcyC adenylation domain nucleotide sequences of Anabaena spp. 90, 18B6, 66A and Hapalosiphon hibernicus BZ-3-1 were all more similar to one another than they were to other mcyB1 or mcyC sequences (Figure 7b). The nucleotide sequence similarity between each of these pairs of condensation domains was very low and ranged from 27 to 28% (Table 1). However, the nucleotide sequence similarity between each of these pairs of adenylation domains was very high and ranged from 93 to 97% (Table 1). There was no clear evidence for such recent recombination between mcyB1 and mcyC in Microcystis viridis NIES 102, Planktothrix agardhii NIVA 126/8, 213 or Nostoc sp. 152 (Figure 7b). The sequence divergence between mcyB1 and mcyC from these strains is higher than the sequence divergence between housekeeping genes and other microcystin synthetase genes within these genera [5, 23, 24]. In the case of Nostoc sp. IO-102-I and Microcystis aeruginosa PCC7806 the amino acid sequence of the McyB1 adenylation domain differed considerably from other McyB1 adenylation domains. This region of dissimilarity extended across the entire adenylation domain (A1–A10) in Nostoc sp. IO-102-I but was limited to the A3–A8 region of the adenylation domain in Microcystis aeruginosa PCC7806. This is reflected in the phylogenetic position of these two adenylation domains in maximum-likelihood trees based on the A3–A8 portions of the adenylation domain (Figure 7b).
Substrate specificities of the mcyB1 and mcyC adenylation domains
The L-Asp residue at position 235 and the L-Lys residue at 517 which interact with the α-amino and the carboxyl groups, respectively, to lock orientation of the L-α-amino acid upon activation  were conserved in all strains included in this study (Table 2). The McyB1 and McyC adenylation domain binding pockets differed between 1 and 6 amino acids in pairwise comparisons in most strains (Table 2). However, the amino acids lining the putative binding pockets of McyB1 and McyC in Anabaena sp. 18B6 and Hapalosiphon hibernicus BZ-3-1 were identical (Table 1).
We did not find separate congruent clusters of McyB1 and McyC adenylation domain sequences (Figures 6, 7) as might have been anticipated under an evolutionary scenario in which all microcystin synthetase genes share the same evolutionary history . Instead we found intermixed clusters of McyB1 and McyC adenylation domains (Figure 7b). In some instances we identified very low levels of sequence divergence in pairwise comparisons of the nucleotide sequences of mcyB1 and mcyC adenylation domain from the same strain (Table 1). This discordance together with the low levels of sequence divergence is consistent with multiple recent independent recombination events. Recombination would lead to the overwriting of the mcyB1 and mcyC adenylation domains contributing to sequence homogenization and explain the low divergence of the mcyB1 and mcyC adenylation domains relative to the mcyB1 and mcyC condensation domains in Anabaena spp. 90, 18B6, 66A and Hapalosiphon hibernicus BZ-3-1 (Table 1). However, in addition to these recent recombination events our phylogenetic analysis reveals evidence for replacement of the mcyB1 adenylation domain in Nostoc sp. IO-102-I and Microcystis aeruginosa PCC7806 (Figure 7). The high sequence divergence between the adenylation domains of mcyB1 in these two strains and other mcyB1 adenylation domain sequences included in this study (Table 1) could be explained by two independent replacement events involving a non-homologous adenylation domain from another peptide synthetase gene cluster. A recombination event has been proposed to replace the adenylation domain of mcyB1 and mcyC of Planktothrix agardhii NIVA 126/8 . However, the McyB1 and McyC adenylation domains of Planktothrix agardhii NIVA 126/8 and 213 as well as Nostoc sp. 152 clustered separately suggesting that the recombination event precedes the divergence of these two genera. Together, our results indicate that these two adenylation domains are recombination hotspots within the microcystin peptide synthetase gene cluster.
The recombination events at the mcyB1 and mcyC are limited to the adenylation domain and the condensation domains in mcyB1 and mcyC are highly divergent and group in separate clusters (Figure 7a). Recombination breakpoints are all limited to the adenylation domain (Figure 4, 5). The phylogenetic discordance between the adenylation and condensation domains is inconsistent with the hypothesis that adenylation and condensation domains are transferred together as a unit. Two rounds of peptide chain elongation are catalyzed by McyB, which typically activates and condenses L-Leu and D-MeAsp into the growing peptide chain . This protein directs the transfer of D-peptidyl intermediates involving a carboxy-terminal epimerase domain of McyA and the condensation domain of McyB1 . Peptide bond formation is achieved between the α-amino group of D-Ala and the α-carboxyl group of L-Leu . In keeping with this the McyB1 condensation domain clusters with domains involved in D-L peptide bonds (Figure 7a). The final condensation reaction is performed between the β-carboxyl group of β-MeAsp and the α-amino group of L-Arg by McyC prior to cyclisation and the resulting peptide bond is atypical . Interestingly, the condensation domain of McyC does not group with previously described D-L condensation domains but group instead with condensation domains with L-peptidyl amino acids as donors (Figure 7a). However, the McyC condensation domain also lacks the typical HHxxxDG his motif in its active site typically present in the condensation domains with D- and L-peptidyl donors . Although adenylation domains are the primary determinants of substrate specificity in NRPSs condensation domains are also reported to exhibit moderate to high substrate selectivity . It may be that differences in the substrate specificities of the condensation domains from McyB1 and McyC mean that the co-transfer of the adenylation and condensation domains would result in a non-functional peptide synthetase. Non-compatible adenylation and condensation domains are predicted to cause a drastic reduction of catalytic competence or even a complete failure to synthesize the desired peptide by the engineered NRPS . Replacement of condensation domains in mcyB1 and mcyC may lead to a disruption of the overall integrity of the peptide assembly process, in particular the order and timing of condensation reactions.
The simultaneous production of the microcystin -LR and -RR variants has been interpreted as a lack of specificity at the McyB1 adenylation domain [12, 13, 26]. We predicted the 10 amino acids lining the putative binding pocket in the adenylation domain of McyB1 and McyC though alignment against the GrsA adenylation domain . The 8 amino acids lining the binding pocket which interact with the side chain and functional group and were highly variable in McyB1 and to a lesser extent McyC (Table 2). Single amino acid changes in the amino acids lining the putative binding pocket of the adenylation domain are known to have an effect on the type of amino acid that is recognized and activated by the adenylation domain . Single amino acid change (V→I) in McyC could be responsible for shifting the incorporation of L-Arg towards L-Har in Nostoc sp. 152 (Table 2). Similarly a single amino acid change in (D→Y) in McyB1 could be responsible for shifting the incorporation of L-Arg towards L-Hty in Anabaena sp. 66A. Experimental expression and mutation of the adenylation domains in each case could verify this hypothesis.
Determining the amino acids lining the substrate conferring portions of the adenylation domains of non-ribosomal peptide synthetase gene clusters can yield invaluable predictions of substrate specificities of unknown peptide synthetases . The putative binding pocket of the adenylation domains of McyB1 and McyC in Hapalosiphon hibernicus BZ-3-1 are identical (Table 2). However, this strain incorporated 91% L-Leu at the X position and 99% L-Ala at the Z position (Figure 3). Our results suggest that caution should be taken when inferring substrate specificity given the general lack of knowledge about how widespread adenylation domain replacement is in nature.
Many important antibiotics, antimicrobial compounds, siderophores and toxins are synthesized on non-ribosomal peptide synthetase enzyme complexes . There is much current interest in engineering non-ribosomal peptide synthetases in order to create new peptides with potential biological activities . It has been suggested that peptide synthetase would gain most effectively through transfer of entire modules [18, 21]. Some artificial combinations of adenylation and condensation domains result in non-functional products . This led to the hypothesis that non-ribosomal peptide synthetase modules evolve as a unit . Here we have clear evidence for the exchange and replacement of the adenylation domain without the concomitant transfer of the condensation domain.
Our results demonstrate that the mcyB1 and mcyC adenylation domains are recombination hotspots in the microcystin synthetase gene cluster. We show clear evidence for the recurrent exchange and replacement of the adenylation domain in a broad range of microcystin producing cyanobacteria. Our results show that functional peptide synthetases can be created in nature through transfer of adenylation domains without the concomitant transfer of condensation domains.
Taxon sampling and LC-MS
We selected representative producers of microcystins from the genera Anabaena, Hapalosiphon, Microcystis, Nostoc, and Planktothrix (see additional file 1). To obtain sufficient biomass for LC-MS analysis 10 cyanobacterial strains were grown at a photon irradiance of 20–27 μmol m-2 sec-1 in 2.7 liters of Z8 medium aerated with filter sterilized compressed air. Cells from 21 day old cultures were homogenized with 425–1180 μm diameter glass beads and 1 ml of 85% acetonitrile. The mixture was shaken in a FP120 FastPrep cell disruptor (Savant Instruments Inc.) and then centrifuged at 10,000 × g for 3 min. The supernatant was passed sequentially through two-solid phase extraction cartridges (StrataX Polymeric Sorbent) equilibrated with 1 ml of 85% acetonitrile and a 0.2 μm pore-size filter (GHP Acrodisc).
Microcystins were analyzed by injecting 10 μl of this extract into an Agilent 1100 series modular HPLC system (Agilent technologies) equipped with a diode array detector and a mass spectrometer (Agilent XCT Plus Ion Trap). A Luna-C18 (2) column (5 μm, 2 × 150 mm, Phenomenex) at 40°C and a mobile phase gradient of 5% (0 min) to 100% (50 min) isopropanol (+0.1% formic acid) in 0.1% formic acid at a flow rate of 0.15 ml min-1 were used. Microcystins were distinguished from other peptides based on their characteristic UV maximum absorbance at 238 nm and on their mass spectral characteristics as MH+ values corresponding to the range of published microcystins, loss of neutral fragment 134 in the ion source, occurrence of ions m/z 585/599/627/641 [(MeAsp)-(H)ar-(DM/ADM)Adda-(Glu)+H+] and m/z 375/361 (Adda -134-Glu-(M)dha) in the MS2 spectrum. Comparison of LC-MS properties of reference strain microcystins aided assignment of structure to the microcystin. Electrospray ionization was performed in positive ion mode. Nebulizer gas (N2) pressure was 30–50 psi (lb/in2) (207–345 kPa), drying gas flow and temperature 8–12 L min-1 and 350°C, respectively. The capillary voltage was set to 3270 V, capillary exit offset to 317 V, skimmer 1 potential to 41.5 V with a trap drive value 82.8. Spectra were recorded as averages of 4 using ultra scan mode and a scan range from 50 to 1200 m/z. MS2 spectra were recorded as averages of 3 with manual and auto MS mode with fragmentation amplitude of 0.50 V. In auto MS mode 5 precursor ions from ion range 800 – 1200 m/z were detected with an isolation width of 4.0 m/z.
PCR and sequencing
Total genomic DNA was extracted from 40 ml of cyanobacterial cultures using a hot phenol method . We amplified portions of the 16S rRNA, rpoC1, rpoB, tufA, and rbcL genes using sets of specific oligonucleotide primers (see additional file 1). These 5 housekeeping genes are present in all cyanobacteria and are thought to be largely unaffected by horizontal gene transfer. PCR reactions were performed in a 20 μl final volume containing approximately 20–100 ng of DNA, 1 × DynaZyme II PCR buffer, 250 μM of each deoxynucleotide, 0.5 μM of each oligonucleotide primer, and 0.5 units of DynaZyme II DNA polymerase (Finnzymes, Espoo, Finland). The following protocol was used: 95°C for 3 min; 30 cycles of denaturation at 94°C for 30 sec, annealing at 56°C for 30 sec and elongation at 72°C for 1 min, followed by a final elongation of 72°C for 10 min. To study the evolution of the microcystin biosynthetic system in these strains we chose 5 regions of the microcystin synthetase gene cluster, mcyD, mcyE, mcyG, mcyB and mcyC using sets of specific oligonucleotide primers (see additional file 1). PCR reactions were performed as before but with primer concentration increased to 0.7 μM and a 3-minute elongation time to amplify the 3.5 kb mcyB and mcyC PCR products. The size of the PCR amplification products was checked in agarose gels and PCR products were purified using Montage™ PCR Centrifugal Filter Devices (Millipore, Billerica, MA, USA). The purified PCR products were Sanger sequenced with the external primers used in PCR and where necessary sets of internal primers (see additional file 1). Cycle sequencing products were purified and separated on an ABI PRISM 310 Genetic Analyzer. Chromatograms were checked and edited with CHROMAS 2.2 program (Technelysium Pty Ltd.). Contig assembly and alignment of the sequences were performed with the BIOEDIT Sequence Alignment Editor.
Detection of recombination
We screened mcyB1 and mcyC sequences using the program TREEORDERSCAN . The TREEORDERSCAN program provides a rapid method to detect intergenotype recombination among individual sequences. Based on the alignment of mcyB1 and mcyC genes from 10 strains of toxic cyanobacteria, the phylogenetic compatibility matrix was constructed through comparing congruence between subtrees of whole alignment. At first, 67 alignment fragments were obtained by moving a 300 nucleotide window along the alignment with a step of 50 bases, and neighbor-joining tree of each fragment was constructed by PHYLIP . Then phylogenetic violations of any two different subtrees were calculated by TREEORDERSCAN (Simmonic 2005 version 1.5), and presented proportionally as a colour gradient.
Detection of potential recombinant sequences, identification of likely parent sequences, and localization of possible recombination breakpoints were done in RDP3 . The RDP3 package uses a mixture of statistical and phylogenetic methods to both identify probable recombination events within individual sequences and a minimal subset of unique events detectable within an entire alignment. To investigate the extent of recombination within the data set, the aligned sequences were examined using RDP3 , GENECONV , BOOTSCAN , MAXIMUM CHI SQUARE , CHIMAERA , and SISTER SCAN  recombination detection methods as implemented in RDP3 . Standard settings in RDP3 for all methods were that sequences were considered as linear, the P-value cutoff was set to 0.05, the standard Bonferroni correction was used, consensus daughters were found and breakpoints were polished. With the set of unique recombination events identified by these 6 detection methods a breakpoint map containing the positions of all positively identified breakpoints was constructed by moving a 200 nucelotide window and counting all the identified breakpoints falling within each window. A breakpoint density graph was created by plotting these numbers at the position of the centre of the window. For each window, a permutation test was made for breakpoint clustering analysis and to define the thresholds areas.
We investigated competing hypotheses concerning the origin and timing of the recombination events in the microcystin synthetase gene cluster by reconstructing the evolutionary history of both the microcystin synthetase gene cluster and the producer organisms. We amplified and sequenced portions of genes from both the microcystin synthetase gene cluster and housekeeping genes. In order to reconstruct the evolutionary history of the microcystin synthetase gene cluster we assembled a 3199 bp data set comprised of a mixed polyketide synthase/peptide synthetase gene (mcyE) and polyketide synthase genes (mcyD and mcyG) (see additional file 1). The phylogenetic analysis was rooted as described previously using homologues identified in BLAST (blastp) searches . We used random taxon addition (10 replicates), tree-bisection-reconnection branch-swapping, and heuristic searches with 100,000 repartitions of the data. The data from all 3 genes was concatenated in order to increase the amount of information available in phylogenetic analyses. We reconstructed the evolutionary history of the producer organisms by assembling a 3586 bp data set comprised of 16S rRNA, rpoB, rpoC1, tufA and rbcL gene sequences (see additional file 1). These genes are involved in carbon fixation, transcription and translation, conserved and widely used as tools for phylogenetic classification. The 16S rRNA, rpoB, tufA, rpoC1 and rbcL gene sequences of the early branching cyanobacteria Gloeobacter violaceus PCC 7421 (BA000045) and Thermosynechococcus elongatus BP-1 (BA000039) were used as outgroups. The 16S rRNA, rpoB, tufA, rpoC1 and rbcL sequence data were concatenated into a single data set. Phylogenetic analyses of these two datasets were conducted using PAUP*4.0 . Priming sites and ambiguous regions of the alignment were excluded. Phylogenetic trees were inferred using maximum-likelihood optimization criteria. Maximum-likelihood analyses were performed with ten heuristic searches, random addition-sequence starting trees, and tree bisection and reconnection branch arrangements. The GTR model of DNA substitution with a gamma distribution of rates and constant sites removed in proportion to base frequencies was used in maximum-likelihood analyses. We analyzed 1000 bootstrap replicates to test the stability of monophyletic groups.
In order to investigate recombination between the adenylation domain of mcyB1 and mcyC we obtained sequence data from mcyB1 (3494–3566 bp) and mcyC (3581–3593 bp). The mcyB1 PCR product contained the condensation, adenylation and thiolation domains of the first module as well as a fragment of the condensation domain from the second module. The mcyC gene PCR product contained the condensation, adenylation, thiolation domain as well as part of the thioesterase domain. Sequence data was partitioned into adenylation and condensation domain sequences and analyzed separately. We obtained a selection of condensation and adenylation domains from NCBI and aligned them against the McyB1 and McyC adenylation and condensation domains amino acid sequences in BIOEDIT (see additional file 1). Regions of ambiguous alignment were excluded and we considered 352 aa of the condensation domain and 197 aa of the adenylation domain (A3–A8) for phylogenetic analyses. Protein maximum-likelihood phylogenies of each dataset were inferred using PROML implemented in the PHYLIP 3.6 package  with a JTT substitution model. Ten random additions with global rearrangements were used to find the optimal tree. We performed 1,000 distance bootstrap replicates using the SEQBOOT, PROTDIST (JTT substitution model), and CONSENSE programs of the PHYLIP 3.6 package .
Substrate specificities of the mcyB1 and mcyC adenylation domains
Manual alignment against the GrsA primary amino acid sequence between the core motifs A4 and A10 allowed extraction of the 10 amino acids predicted to line the binding pocket of the adenylation domain for both McyB1 and McyC . According to this model, the L-Asp residue at position 235 and the L-Lys residue at 517 interact with the α-amino and the carboxyl groups, respectively, to lock orientation of the L-α-amino acid upon activation . This configuration projects the side chain of the amino acid into the binding pocket where it is bound by the remaining 8 amino acids lining the pocket. Manual substrate specificities predictions were confirmed using the automated NRPSpredictor tool .
(2S,3S,8S,9S) 3-amino-9-methoxy-2,6,8-trimethyl-10-phenyl-(4E), (6E)-decadienoic acid
Sivonen K, Jones G: Cyanobacterial toxins. Toxic cyanobacteria in water. Edited by: Chorus I, Bartram J. 1999, London, E&FN Spon, 41-111.
MacKintosh C, Beattie KA, Klumpp S, Cohen P, Codd GA: Cyanobacterial microcystin-LR is a potent and specific inhibitor of protein phosphatases 1 and 2A from both mammals and higher plants. FEBS Lett. 1990, 264: 187-192. 10.1016/0014-5793(90)80245-E.
Sivonen K, Carmichael WW, Namikoshi M, Rinehart KL, Dahlem AM, Niemelä SI: Isolation and characterization of hepatotoxic microcystin homologs from the filamentous fresh-water cyanobacterium Nostoc sp. strain 152. Appl Environ Microbiol. 1990, 56: 2650-2657.
Sivonen K, Namikoshi M, Evans WR, Färdig M, Carmichael WW, Rinehart KL: Three new microcystins, cyclic heptapeptide hepatotoxins, from Nostoc sp. strain 152. Chem Res Toxicol. 1992, 5: 464-469. 10.1021/tx00028a003.
Oksanen I, Jokela J, Fewer DP, Wahlsten M, Rikkinen J, Sivonen K: Discovery of rare and highly toxic microcystins from lichen-associated cyanobacterium Nostoc sp. strain IO-102-I. Appl Environ Microbiol. 2004, 70: 5756-5763. 10.1128/AEM.70.10.5756-5763.2004.
Prinsep MR, Caplan FR, Moore RE, Patterson GML, Honkanen RE, Boynton AL: Microcystin-LA from a blue-green alga belonging to the Stigonematales. Phytochemistry. 1992, 31: 1247-1248. 10.1016/0031-9422(92)80269-K.
Izaguirre G, Jungblut AD, Neilan BA: Benthic cyanobacteria (Oscillatoriaceae) that produce microcystin-LR, isolated from four reservoirs in southern California. Water Res. 2007, 41: 492-498. 10.1016/j.watres.2006.10.012.
Dittmann E, Neilan BA, Erhard M, von Döhren H, Börner T: Insertional mutagenesis of a peptide synthetase gene that is responsible for hepatotoxin production in the cyanobacterium Microcystis aeruginosa PCC 7806. Mol Microbiol. 1997, 26: 779-787. 10.1046/j.1365-2958.1997.6131982.x.
Nishizawa T, Asayama M, Fujii K, Harada K, Shirai M: Genetic analysis of the peptide synthetase genes for a cyclic heptapeptide microcystin in Microcystis spp. J Biochem (Tokyo). 1999, 126: 520-529.
Nishizawa T, Ueda A, Asayama M, Fujii K, Harada K, Ochi K, Shirai M: Polyketide synthase gene coupled to the peptide synthetase module involved in the biosynthesis of the cyclic heptapeptide microcystin. J Biochem (Tokyo). 2000, 127: 779-789.
Tillett D, Dittmann E, Erhard M, von Döhren H, Börner T, Neilan BA: Structural organization of microcystin biosynthesis in Microcystis aeruginosa PCC 7806: an integrated peptide-polyketide synthetase system. Chem Biol. 2000, 7: 753-764. 10.1016/S1074-5521(00)00021-1.
Christiansen G, Fastner J, Erhard M, Börner T, Dittmann E: Microcystin biosynthesis in Planktothrix: genes, evolution, and manipulation. J Bacteriol. 2003, 185: 564-572. 10.1128/JB.185.2.564-572.2003.
Rouhiainen L, Vakkilainen T, Lumbye-Siemer B, Buikema W, Haselkorn R, Sivonen K: Genes coding for hepatotoxic heptapeptides (microcystins) in the cyanobacterium Anabaena strain 90. Appl Environ Microbiol. 2004, 70: 686-692. 10.1128/AEM.70.2.686-692.2004.
Rantala A, Fewer DP, Hisbergues M, Rouhiainen L, Vaitomaa J, Börner T, Sivonen K: Phylogenetic evidence for the early evolution of microcystin synthesis. Proc Natl Acad Sci USA. 2004, 101: 568-573. 10.1073/pnas.0304489101.
Kurmayer R, Christiansen G, Fastner J, Börner T: Abundance of active and inactive microcystin genotypes in populations of the toxic cyanobacterium Planktothrix spp. Environ Microbiol. 2004, 6: 831-841. 10.1111/j.1462-2920.2004.00626.x.
Marahiel MA, Stachelhaus T, Mootz HD: Modular peptide synthetases involved in nonribosomal peptide synthesis. Chem Rev. 1997, 97: 2651-2674. 10.1021/cr960029e.
Sieber SA, Marahiel MA: Molecular mechanisms underlying nonribosomal peptide synthesis: approaches to new antibiotics. Chem Rev. 2005, 105: 715-738. 10.1021/cr0301191.
Lautru S, Challis GL: Substrate recognition by nonribosomal peptide synthetase multi-enzymes. Microbiology. 2004, 150: 1629-1636. 10.1099/mic.0.26837-0.
Stachelhaus T, Schneider A, Marahiel MA: Rational design of peptide antibiotics by targeted replacement of bacterial and fungal domains. Science. 1995, 269: 69-72. 10.1126/science.7604280.
Stachelhaus T, Mootz HD, Marahiel MA: The specificity-conferring code of adenylation domains in non-ribosomal peptide synthetases. Chem Biol. 1999, 8: 493-505. 10.1016/S1074-5521(99)80082-9.
Mootz HD, Schwarzer D, Marahiel MA: Construction of hybrid peptide synthetases by module and domain fusions. Proc Natl Acad Sci USA. 2000, 97: 5848-5853. 10.1073/pnas.100075897.
Mikalsen B, Boison G, Skulberg OM, Fastner J, Davies W, Gabrielsen TM, Rudi K, Jakobsen KS: Natural variation in the microcystin synthetase operon mcyABC and impact on microcystin production in Microcystis strains. J Bacteriol. 2003, 185: 2774-2785. 10.1128/JB.185.9.2774-2785.2003.
Tanabe Y, Kaya K, Watanabe MM: Evidence for recombination in the microcystin synthetase (mcy) genes of toxic cyanobacteria Microcystis spp. J Mol Evol. 2004, 58: 633-641. 10.1007/s00239-004-2583-1.
Kurmayer R, Christiansen G, Gumpenberger M, Fastner J: Genetic identification of microcystin ecotypes in toxic cyanobacteria of the genus Planktothrix. Microbiology. 2005, 151: 1525-1533. 10.1099/mic.0.27779-0.
Rausch C, Hoof I, Weber T, Wohlleben W, Huson DH: Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evol Biol 2007, 7:78. 2007, 7: 78-10.1186/1471-2148-7-78.
Kurmayer R, Dittmann E, Fastner J, Chorus : Diversity of microcystin genes within a population of the toxic cyanobacterium Microcystis spp. in Lake Wannsee (Berlin, Germany). Microb Ecol. 2002, 43: 107-118. 10.1007/s00248-001-0039-3.
Lautru S, Deeth RJ, Bailey LM, Challis GL: Discovery of a new peptide natural product by Streptomyces coelicolor genome mining. Nat Chem Biol. 2005, 1: 265-269. 10.1038/nchembio731.
Giovannoni SJ, Britschgi TB, Moyer CL, Field KG: Genetic diversity in Sargasso Sea bacterioplankton. Nature. 1990, 345: 60-63. 10.1038/345060a0.
Simmonds P, Midgley S: Recombination in the genesis and evolution of hepatitis B virus genotypes. J Virol. 2005, 79: 15467-15476. 10.1128/JVI.79.24.15467-15476.2005.
Felsenstein J: Phylogeny Inference Package (PHYLIP). Version 3.5. 1993, University of Washington, Seattle
Martin D, Rybicki E: RDP: detection of recombination amongst aligned sequences. Bioinformatics. 2000, 16: 562-563. 10.1093/bioinformatics/16.6.562.
Padidam M, Sawyer S, Fauquet CM: Possible emergence of new geminiviruses by frequent recombination. Virology. 1999, 265: 218-225. 10.1006/viro.1999.0056.
Martin DP, Posada D, Crandall KA, Williamson C: A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res Hum Retroviruses. 2005, 21: 98-102. 10.1089/aid.2005.21.98.
Maynard Smith J: Analyzing the mosaic structure of genes. J Mol Evol. 1992, 34: 126-129.
Gibbs MJ, Armstrong JS, Gibbs AJ: Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics. 2000, 16: 573-582. 10.1093/bioinformatics/16.7.573.
Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. 1998, Massachusetts: Sinauer Associates, Sunderland
Rausch C, Weber T, Kohlbacher O, Wohlleben W, Huson DH: Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Res. 2005, 33: 5799-5808. 10.1093/nar/gki885.
This work was supported by grants from the European Union PEPCY (QLK4-CT-2002-02634) and projects as well as grants from the Academy of Finland to DF (1212943) and KS (53305 and 214457). We thank Dr. Christina Lyra and Anne Rantala M.Sc. for valuable comments on this manuscript. We are grateful to Lyudmila Saari for her valuable help in handling the cultures.
The author(s) declares that there are no competing interests.
DPF conceived of the study, carried out the molecular genetic studies, participated in the sequence alignment, performed the phylogenetic analysis and drafted the manuscript. LR conceived of the study and drafted the manuscript. JJ performed the LC-MS analysis and drafted the manuscript. MW performed the LC-MS analysis. HW performed the recombination analysis. KL carried out the molecular genetic studies and drafted the manuscript. KS participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.