Fused eco29kIR- and M genes coding for a fully functional hybrid polypeptide as a model of molecular evolution of restriction-modification systems

Background
The discovery of restriction endonucleases and modification DNA methyltransferases, key instruments of genetic engineering, opened a new era of molecular biology through the development of recombinant DNA technology. Today, the number of putative proteins assigned to type II restriction enzymes alone exceeds 6000, which probably reflects the high diversity of their evolutionary pathways. Here we present experimental evidence that new type IIC restriction and modification enzymes, which carry both activities in a single polypeptide, could result from fusion of the appropriate genes of preexisting bipartite restriction-modification systems.

Results
Fusion of the eco29kIR and M ORFs gave a novel gene encoding a fully functional hybrid polypeptide that carried both restriction endonuclease and DNA methyltransferase activities. It has been placed into a subclass of type II restriction and modification enzymes, type IIC. Its MTase activity, 80% of that of the M.Eco29kI enzyme, remained almost unchanged, while its REase activity decreased threefold, concurrently with changed reaction optima, presumably because of increased steric hindrance in interaction with the substrate. In vitro the enzyme preferentially cuts DNA, with only a low level of DNA modification detected. In vivo the new RMS provides 10²-fold less protection of host cells against phage invasion than the wild-type RMS.

Conclusions
We propose a molecular mechanism for the appearance of type IIC restriction-modification and M.SsoII-related enzymes, as well as other multifunctional proteins. As shown, gene fusion could play an important role in the evolution of restriction-modification systems and be responsible for enzyme subclass interconversion. Based on the proposed approach, hundreds of new type IIC enzymes can be generated using head-to-tail oriented type I, II, and III restriction and modification genes. These bifunctional polypeptides can serve as a basis for enzymes with altered recognition specificities. Lastly, this study demonstrates that protein fusion may change the biochemical properties of the involved enzymes, thus giving a starting point for their further evolutionary divergence.

Background
DNA restriction-modification systems (RMS) are prokaryotic tools against invasion of foreign DNA into cells [1]. They play an important evolutionary role as subcellular barriers that restrict horizontal gene transfer and thereby contribute to microbial biodiversity. Usually, an RMS comprises a restriction endonuclease (REase) and a modification DNA methyltransferase (MTase) that recognize the same short sequence of 4-8 nucleotides. The MTase methylates the recognition sequences, and all non-modified sites can be cut by the cognate REase [1]. Type II REases are indispensable tools for creating recombinant DNA molecules [2]. Their widespread practical application has stimulated research to discover and characterize more of these systems. Currently, more than 6000 different sequences corresponding to type II REases alone are listed in REBASE, the database holding all known and many putative RMS [3].
The high number of known RMS is also reflected in the high diversity of their organization and functioning and, hypothetically, in the multiplicity of their evolutionary pathways. One of these pathways could be fusion of preexisting ORFs to form a gene capable of producing a protein with an array of new activities and functions. It has been suggested that type IIC RMS, which carry both REase and MTase in a single polypeptide, might appear by this mechanism [4]. Here we report direct evidence of how a fully functional type IIC REase could appear by fusion of the appropriate genes as a result of a few point mutations.
The Eco29kI RMS was chosen as the object of our experiment. This RMS is carried by the natural plasmid pECO29 found in the clinical isolate E. coli 29kI [5]. The plasmid carries genes coding for a 214 aa REase and a 382 aa DNA MTase [6,7]. By nomenclature, the Eco29kI RMS belongs to the IIP group because it recognizes a palindromic site [4]. Its REase contains a GIY-YIG nuclease domain, which has been identified in homing endonucleases, DNA repair and recombination enzymes, and restriction endonucleases [8,9]. Recently its mechanism of action was established: R.Eco29kI monomers dimerize on a single cognate DNA molecule, forming the catalytically active complex [10]. The Eco29kI RMS is organized in such a way that the REase ORF precedes the MTase ORF, and the REase Stop codon and the MTase Start codon overlap (Figure 1A and 1B). A similar organization is also characteristic of such well-known RMS as SalI and HindIII. Using site-directed mutagenesis, the eco29kI R and M ORFs were fused to give a fully functional hybrid protein. It was purified and characterized. Its biochemical properties, namely REase and MTase specificities and the optima of reaction conditions, were compared with those of the original enzymes.

Construction of overproducing strain and purification of RM.Eco29kI
To construct an overproducing strain for the protein RM.Eco29kI, the ORFs of Eco29kI REase and MTase were amplified from the natural plasmid pECO29, where they are oriented as shown in Figure 1A [5]. On this plasmid the Stop codon of Eco29kI REase and the Start codon of Eco29kI MTase overlap. Subsequently they were cloned in the same orientation into the pET19mod vector. By site-directed mutagenesis the Stop and Start codons separating them were replaced with 2 Glycine codons, as shown in Figure 1B, thus forming the eco29kI.RM gene. The resulting RM.Eco29kI polypeptide contains both the REase and MTase enzymes joined by a flexible 2-Glycine hinge and carries a 6×His tag at its N terminus. The purification scheme for RM.Eco29kI was based on affinity chromatography (Ni-CAM, nickel chelate affinity matrix, Sigma). RM.Eco29kI was eluted from the Ni-CAM column in steps of 20, 50, 75, 100 and 150 mM imidazole. Finally, 200 ml of cell culture gave about 0.4 mg of >98% pure enzyme (Figure 1C) with a molecular weight of ~67 kDa.
When the Stop codon of the Eco29kI REase gene was replaced with Glycine without modifying the first Methionine codon of the Eco29kI MTase gene, expression of M.Eco29kI, but not RM.Eco29kI, was detected (unpublished observations). Hypothetically, this might be due to a pronounced secondary structure of the mRNA in the region corresponding to the R.Eco29kI ORF or, as personally communicated by M. Nagornykh, to transcription initiation from an alternative promoter closely preceding the M.Eco29kI ORF [13].

Characterization of RM.Eco29kI REase activity
Two enzymes, R.Eco29kI and RM.Eco29kI, purified to homogeneity, were assayed for their recognition specificity and catalytic reaction optimum at varied concentrations of NaCl, KCl, temperature and pH. R.Eco29kI was purified as described previously [14]. Figure 2 shows hydrolysis patterns of 80vir DNA with R.Eco29kI (lane 4) and with RM.Eco29kI (lane 3). The patterns are the same, so the hybrid protein RM.Eco29kI retained the specificity of R.Eco29kI.
Then the enzyme RM.Eco29kI was assayed for its catalytic reaction optimum. The maximal catalytic REase activity of the hybrid protein was observed at 0-50 mM NaCl; 0-25 mM KCl; 10 mM MgCl2; pH 7.0, at 30-37°C (Figures 3 and 4). Table 1 compares the reaction optima of R.Eco29kI and RM.Eco29kI. As seen, the optima were slightly changed: 100 mM less for NaCl, 50 mM less for KCl, and 1 unit less for pH. This means that after fusion the biochemical properties of the REase part of the protein were changed, despite its intact amino acid sequence. Under optimal conditions R.Eco29kI had a specific activity of 60 AU/pmol, whereas RM.Eco29kI had 20 AU/pmol, amounting to 33% of the native value. Thus, after fusion with M.Eco29kI its activity decreased about threefold.

Characterization of RM.Eco29kI MTase activity
The biochemical characterization of Eco29kI MTase is presented in [15]. The enzyme methylates the second cytosine in the sequence CCGCGG. The specificity of the RM.Eco29kI MTase activity was shown to be the same, because there was no incorporation of labeled methyl groups into substrates pretreated with the M.Eco29kI enzyme and non-labeled AdoMet. The optima of reaction conditions also remained unchanged: both enzymes showed their maximal activities at 50 mM NaCl; 5 mM EDTA; pH 7.0-8.5, and 37°C. Under optimal conditions M.Eco29kI had a specific activity of 10 AU/pmol, whereas RM.Eco29kI had 8 AU/pmol, amounting to 80% of the native value. Thus, after fusion with R.Eco29kI, the activity of its MTase part was almost unchanged.
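The comparison of native and fused enzymes reduces to simple ratios of the reported specific activities. The following minimal sketch recomputes them from the values stated in the text (60 and 20 AU/pmol for REase, 10 and 8 AU/pmol for MTase); the function and variable names are illustrative and not part of the original work.

```python
# Specific activities (AU/pmol) as reported in the text.
specific_activity = {
    "REase": {"native": 60.0, "fusion": 20.0},  # R.Eco29kI vs RM.Eco29kI
    "MTase": {"native": 10.0, "fusion": 8.0},   # M.Eco29kI vs RM.Eco29kI
}


def relative_activity(native, fusion):
    """Return (percent of native activity retained, fold decrease)."""
    return 100.0 * fusion / native, native / fusion


for name, values in specific_activity.items():
    percent, fold = relative_activity(values["native"], values["fusion"])
    print(f"{name}: {percent:.0f}% of native, {fold:.2f}-fold decrease")
```

The REase part retains about one third of the native activity (a threefold decrease), while the MTase part retains 80% (a 1.25-fold decrease).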

Characterization of RM.Eco29kI functioning in vivo
To characterize the in vivo functioning of the new RMS, we performed phage restriction experiments. In these experiments 100-fold dilutions of phage λvir (10⁰, 10⁻², 10⁻⁴, 10⁻⁶) were spotted on lawns of bacterial cells. The results are presented in Figure 5. BL21(DE3)xp29k11 cells carry only the gene coding for Eco29kI MTase and lack Eco29kI REase activity, which allows the total concentration of infective phage λvir virions to be evaluated. BL21(DE3)xpECO29 cells carry the natural pECO29 plasmid and have both MTase and REase activities of the wild-type Eco29kI RMS. BL21(DE3)xpECO29RM cells carry gene   more effectively, giving a 10⁴-fold restriction, so that only one out of 10,000 virions could infect the cells. Thus, the new RMS protected its host cells 10² times less effectively than the wild-type RMS. Nevertheless, the 10²-fold phage restriction value reflects the ability of the new RMS to protect cells against foreign DNA invasion, so this function of the wild-type RMS was conserved in the RM.Eco29kI RMS.
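The restriction values quoted above are ratios of phage titers. As a minimal sketch, assuming that fold restriction is defined as the titer on the non-restricting (MTase-only) lawn divided by the titer on the restricting lawn, the comparison can be written as follows. The titers in the example call are hypothetical and serve only to illustrate the function; the 10⁴ and 10² values are those stated in the text.

```python
def fold_restriction(titer_permissive, titer_restricting):
    """Titer on the non-restricting lawn divided by the titer on the restricting lawn."""
    if titer_restricting <= 0:
        raise ValueError("titer on the restricting lawn must be positive")
    return titer_permissive / titer_restricting


# Fold restriction values stated in the text.
WILD_TYPE_RESTRICTION = 1e4  # cells carrying pECO29 (wild-type Eco29kI RMS)
FUSION_RESTRICTION = 1e2     # cells carrying pECO29RM (RM.Eco29kI)

# How much less effectively the fused RMS protects its host.
relative_protection = WILD_TYPE_RESTRICTION / FUSION_RESTRICTION
print(f"fused RMS protects {relative_protection:.0f}-fold less than wild type")

# Hypothetical titers (pfu/ml), NOT measured values from this study.
example = fold_restriction(titer_permissive=1e9, titer_restricting=1e7)
print(f"example fold restriction: {example:.0f}")
```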

Characterization of RM.Eco29kI behavior in vitro
To assess the in vitro interaction of the bifunctional enzyme RM.Eco29kI with DNA and the effect of AdoMet on its REase activity, we incubated this protein with phage 80vir DNA under conditions optimal for REase (Figure 6A) and MTase (Figure 6B) in the absence or presence of excess AdoMet (10 μM). In the MTase reaction mixture we replaced 5 mM EDTA with 10 mM MgCl2 to supply magnesium ions to the REase part of the enzyme. While in REase buffer the hydrolysis patterns looked identical both in the presence and absence of AdoMet, in MTase buffer with AdoMet a slightly incomplete hydrolysis could be observed even at the lowest enzyme dilutions (Figure 6B), which could be explained by DNA methylation by the MTase part of the enzyme. It follows from these results that in vitro the RM.Eco29kI enzymatic reaction is strongly biased towards DNA hydrolysis, while only a small portion of DNA can be modified in the same reaction mixture, and that AdoMet does not influence the REase activity of the enzyme, unlike other type IIC proteins known so far.

Discussion
Nomenclature of the RM.Eco29kI enzyme
To date, the following 12 proteins have been shown to carry both REase and MTase activities in one polypeptide chain: AloI, BcgI, BseMII, BseRI, BsgI, BspLU11III, CjeI, Eco57I, HaeIV, MmeI, PpiI, and TstI [16-26]. They can be considered members of the type IIC RMS group. Their properties are given in Table 2. All of them also belong to the IIB (cutting on both sides of their recognition sequences) or IIG (stimulated or inhibited by AdoMet) groups of RMS [4]. The RM.Eco29kI protein also falls into the category of IIC enzymes, but differs from its regular members in many features. It recognizes a true non-interrupted palindromic sequence; it cuts within the recognition site; its REase is not influenced by AdoMet; and its MTase belongs to the m5C type, while the others contain m6A type MTases. Altogether, the properties of RM.Eco29kI expand the limits of type IIC enzymes.

The role of gene fusion in molecular evolution of RMS
In our study a fully functional hybrid polypeptide was generated by fusion of the Eco29kI REase and MTase proteins. The new protein had the REase and MTase specificities of the original enzymes. Its MTase activity was almost unchanged and amounted to 80% of that of M.Eco29kI under optimal reaction conditions for both enzymes: 50 mM NaCl; 5 mM EDTA; pH 8.0, and 37°C [15]. Its REase activity decreased threefold, which could be attributed to increased steric hindrance in interaction with the substrate. Besides, the reaction optimum for the REase activity of RM.Eco29kI differed from that of R.Eco29kI as follows: 100 mM less for [NaCl], 50 mM less for [KCl], and 1 unit less for pH (Table 1). The particular reason for this shift is unclear because many physical and chemical properties of the protein, such as molecular weight, isoelectric point, total charge, geometry, surface charge distribution, etc., were different from those of the original proteins. Hypothetically, natural selection acting on this difference in properties could, after many generations, yield a novel protein with altered biochemical properties and different functions in the cell. This work presents experimental evidence for molecular evolution of RMS or multifunctional proteins in general. It has been directly shown that a few point mutations can result in a protein with a novel combination of activities and altered properties. On the one hand, this may be considered a molecular mechanism for the appearance of type IIC RMS enzymes. Figures 7 and 8 show a schematic gene organization of the described type IIC RMS. As seen, all of them could result from fusion of head-to-tail oriented endonuclease and methyltransferase ORFs. Possible involvement of gene fusion was proposed for the formation of AloI, BseMII, CjeI, Eco57I, HaeIV, and MmeI. The enzymes AloI, CjeI, and HaeIV probably resulted from fusion of HsdS, HsdM and REase domains; Eco57I, from the Mod and Res subunits of type III enzymes; and BseMII and MmeI, from REase and MTase domains [16,18,23,25].
It can be predicted that novel type IIC RMS may appear via fusion of genes from bipartite RMS having head-to-tail gene organization. This type of gene organization is quite common for RMS of all types. Type I RMS such as CfrA, EcoA, EcoB, EcoD, EcoE, EcoK,   [12]. These possibilities could be facilitated by the increased affinity of DNA MTases for hemimethylated substrates in comparison with non-methylated ones, unlike their REase counterparts. Therefore, if a fused RMS appeared from a preexisting bipartite RMS and its MTase activity was lower than that of the original enzyme, its propagation would be facilitated by two facts: all recognition sites in the host genome were already methylated or, after replication, hemimethylated by the preexisting enzyme, and the new enzyme had increased affinity for hemimethylated substrates. Otherwise, decreased MTase activity of a new fused RMS could lead to the appearance of unprotected recognition sequences in the host genome; these would be cut by the REase, the host cells would die, and the RMS would not propagate in a bacterial population. On the other hand, this study provides an example of a more general mechanism by which existing proteins gain new functions. Hypothetically, any pair of ORFs can be joined in frame by point mutations, deletions, insertions, inversions, translocations or their combinations (Figure 9). The newly generated polypeptide may then serve as an evolutionary intermediate on which natural selection acts to improve old functions or accommodate new ones in the cell. For example, M.SsoII-related bifunctional enzymes, including its isoschizomers, with regulatory and MTase domains could have arisen by natural fusion of the appropriate ORFs at some stage of their evolutionary history [11,27]. The latter suggestion is supported by the existence of the NlaX MTase, a close homolog of M.SsoII without the regulatory N-terminal domain. As shown in Figure 10, both polypeptides display high identity after the first 70 amino acids of M.SsoII, its N-terminal domain known to be involved in autoregulation of its gene [11]. Similar fusion of preexisting head-to-tail oriented ORFs coding for C-protein and endonuclease in the BamHI, Eco72I, MunI, PvuII, and SmaI RMS could give REases with transcription regulatory functions. Depending on gene organization, there are also possibilities for fusion of two MTases from preexisting RMS, such as DpnII and HgaI, leading to bifunctional MTases like FokI and LlaI [12].
In principle, any other adjacent gene of appropriate orientation could be fused with the REase or MTase part of an RMS. In this case the fact that RMS elimination is lethal for cells would protect the fused ORF from being lost [28]. This occurs because long-lasting endonucleolytic activity results in multiple cuts in non-protected genomic DNA, thereby killing cells that have lost the RMS. Thus, joining to RMS genes would promote maintenance of a fused ORF and its spread in bacterial populations, provided their functioning is not disturbed.

Practical applications of gene fusion for generating REases with novel recognition sites
Restriction enzymes are robust tools for recombinant DNA technology. Despite the fact that more than 200 enzymes with different recognition sites have been isolated from various bacterial strains, many specificities have not been discovered [23,26]. To create enzymes with altered recognition specificities, methylation activity-based selection (MABS) and target recognition domain (TRD) reassortment approaches were proposed [23,26]. Using these techniques, bifunctional enzymes of type IIC such as Eco57I, AloI, PpiI, and TstI were manipulated to yield variants with novel specificities. Most importantly, both of their activities operate on the same target sequence, which makes it possible to use the DNA-modification activity of these enzymes for the selection of mutants with altered sequence specificity. To enlarge the list of enzymes available for these manipulations, a gene fusion approach similar to the one used in this work could be applied. By this procedure, hundreds of bifunctional enzymes could be created, e.g., from type I and III head-to-tail oriented RMS, which could significantly extend the set of existing recognition specificities.

Conclusions
Altogether, our work presents an example of a molecular mechanism for the appearance of type IIC restriction-modification and SsoII MTase-related enzymes as well as other multifunctional proteins. It demonstrates that gene fusion could play an important role in the evolution of restriction-modification systems and be responsible for enzyme subclass interconversion. Based on the proposed approach, hundreds of novel type IIC enzymes could be generated from head-to-tail oriented type I, II and III restriction and modification genes. These new bifunctional polypeptides could be useful for creating enzymes with altered recognition specificities. Lastly, our work shows that protein fusion can change the biochemical properties of the involved enzymes, thus giving a starting point for their further evolutionary divergence, which, after many generations, could give a novel protein with both altered biochemical properties and different functions in the cell.

Bacterial Culture
All strains used were grown in LB liquid medium with high aeration. For selection ampicillin (100 μg/ml) and kanamycin (25 μg/ml) were added to LB agar plates [29].

DNA Manipulations
All genetic engineering methods were performed as described [29]. Selection of transformants with plasmids carrying Eco29kI RMS genes and screening of recombinant clones were performed as described elsewhere [30,31].

Construction of Eco29kI REase and MTase fusion RM.Eco29kI protein overproducing strain and biomass production
Eco29kI REase and MTase genes were amplified from the natural plasmid pECO29 with the primers TTTGTCGACATGCACAATAAGAAATTTGATA (forward; containing the start codon ATG) and CCTGGATCCCTTTTAATTGAAGTTAGAGCACAA (reverse; containing the stop codon as TTA), carrying SalI and BamHI sites, respectively, for subsequent cloning [5]. The appropriately digested PCR product was cloned into the pET19mod vector. Site-directed mutagenesis was performed to fuse the Eco29kI REase and MTase ORFs, using the high-fidelity Pfu DNA polymerase and the oligonucleotide primers Fus1: AAGAGTAATTTTACAGGAGGGAGATCATTAGAG and Fus2: CTCTAATGATCTCCCTCCTGTAAAATTACTCTT.
A novel plasmid containing staggered nicks was generated. Following the thermal cycling, the reaction mixture was treated with DpnI REase that digests hemimethylated parental DNA, leaving behind the newly amplified nicked DNA with the mutation of interest. The mutation and eco29kI.RM gene were confirmed by sequencing. The resulting construct was referred to as p29RM.
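The primer design described above can be checked with a few string operations. The sketch below is illustrative only: it confirms that the cloning primers carry the stated SalI and BamHI sites, that Fus1 and Fus2 are exact reverse complements of each other (as expected for a complementary mutagenic primer pair), and that Fus1 contains two adjacent Glycine codons. The primer sequences are copied from the text, with the hyphen inside Fus1 assumed to be a line-break artifact; the recognition sequences GTCGAC (SalI) and GGATCC (BamHI) are the standard ones, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer sequences from the text (hyphen in Fus1 assumed to be a line-break artifact).
FORWARD = "TTTGTCGACATGCACAATAAGAAATTTGATA"
REVERSE = "CCTGGATCCCTTTTAATTGAAGTTAGAGCACAA"
FUS1 = "AAGAGTAATTTTACAGGAGGGAGATCATTAGAG"
FUS2 = "CTCTAATGATCTCCCTCCTGTAAAATTACTCTT"

# Standard recognition sequences of the cloning enzymes.
SITES = {"SalI": "GTCGAC", "BamHI": "GGATCC"}

GLY_CODONS = {"GGA", "GGC", "GGG", "GGT"}


def adjacent_gly_codons(seq):
    """Return (offset, six-nucleotide window) wherever two Gly codons are adjacent."""
    hits = []
    for i in range(len(seq) - 5):
        if seq[i:i + 3] in GLY_CODONS and seq[i + 3:i + 6] in GLY_CODONS:
            hits.append((i, seq[i:i + 6]))
    return hits


print("SalI site in forward primer:", SITES["SalI"] in FORWARD)
print("BamHI site in reverse primer:", SITES["BamHI"] in REVERSE)
print("Fus1/Fus2 complementary:", reverse_complement(FUS1) == FUS2)
print("Adjacent Gly codons in Fus1:", adjacent_gly_codons(FUS1))
```

The single Gly-Gly window found in Fus1 (GGAGGG, starting at offset 15) is consistent with the two Glycine codons that replace the Stop and Start codons in the fused gene.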
The plasmid p29RM was introduced into the E. coli BL21(DE3) strain. Cells were grown in LB medium supplemented with ampicillin and kanamycin at 20°C. After the culture reached OD590 = 0.6, synthesis of the recombinant protein RM.Eco29kI was induced by addition of isopropyl β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.1 mM, followed by overnight incubation at 20°C. Cells were harvested by centrifugation, frozen and stored at -70°C. All subsequent steps were carried out at 4°C.

Construction of natural pECO29 plasmid carrying fused Eco29kI REase and MTase genes
To introduce the Eco29kI REase and MTase gene fusion into the natural pECO29 plasmid, its BclI-PvuII fragment was replaced with the BclI-PvuII fragment of the p29RM plasmid, which contains the gene fusion mutations. The resulting plasmid was referred to as pECO29RM.

Methylation Assay
To test the DNA methyltransferase activity of RM.Eco29kI, we used a DE-filter assay as described in [35].

Analysis of Protein Concentration
Protein concentration was determined from the absorption at 280 nm on a Shimadzu UV-1601 spectrophotometer (Japan). The extinction coefficient, 94 240 M⁻¹ cm⁻¹, was calculated with the ProtParam tool (http://www.expasy.ch).
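Determining concentration from absorption at 280 nm relies on the Beer-Lambert law, A = ε·c·l. The following minimal sketch shows the conversion, assuming that the extinction coefficient above refers to RM.Eco29kI, a 1 cm path length, and the approximate molecular weight of ~67 kDa reported for the purified protein. The absorbance reading in the example is hypothetical, and the function name is illustrative.

```python
EXTINCTION_COEFF = 94_240  # M^-1 cm^-1, ProtParam value given in the text
MOLECULAR_WEIGHT = 67_000  # g/mol, approximate (~67 kDa reported for RM.Eco29kI)


def protein_concentration(a280, path_cm=1.0, dilution=1.0):
    """Return (molar concentration in M, mass concentration in mg/ml) from A280.

    Beer-Lambert law: A = epsilon * c * l, hence c = A / (epsilon * l).
    """
    molar = a280 * dilution / (EXTINCTION_COEFF * path_cm)
    mg_per_ml = molar * MOLECULAR_WEIGHT  # g/L is numerically equal to mg/ml
    return molar, mg_per_ml


# Hypothetical absorbance reading, NOT a value from this study.
molar, mg_ml = protein_concentration(a280=0.5)
print(f"{molar * 1e6:.2f} uM, {mg_ml:.3f} mg/ml")
```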

Unit definition
The amount of enzyme required to transfer 1 pmol of ( 3 H)-methyl groups to DNA per minute with saturating concentrations of substrates at 37°C has been taken as 1 AU of the RM.Eco29kI DNA methyltransferase activity. The amount of enzyme required to cut 1 μg of phage 80vir DNA during 1 h at 37°C has been taken as 1 AU of the RM.Eco29kI restriction endonuclease activity. Both reactions were carried out under predefined optimal reaction conditions.