Interference with histidyl-tRNA synthetase by a CRISPR spacer sequence as a factor in the evolution of Pelobacter carbinolicus

Background Pelobacter carbinolicus, a bacterium of the family Geobacteraceae, cannot reduce Fe(III) directly or produce electricity like its relatives. How P. carbinolicus evolved is an intriguing problem. The genome of P. carbinolicus contains clustered regularly interspaced short palindromic repeats (CRISPR) separated by unique spacer sequences, which recent studies have shown to produce RNA molecules that interfere with genes containing identical sequences. Results CRISPR spacer #1, which matches a sequence within hisS, the histidyl-tRNA synthetase gene of P. carbinolicus, was shown to be expressed. Phylogenetic analysis and genetics demonstrated that a gene paralogous to hisS in the genomes of Geobacteraceae is unlikely to compensate for interference with hisS. Spacer #1 inhibited growth of a transgenic strain of Geobacter sulfurreducens in which the native hisS was replaced with that of P. carbinolicus. The prediction that interference with hisS would result in an attenuated histidyl-tRNA pool insufficient for translation of proteins with multiple closely spaced histidines, predisposing them to mutation and elimination during evolution, was investigated by comparative genomics of P. carbinolicus and related species. Several ancestral genes with high histidine demand have been lost or modified in the P. carbinolicus lineage, providing an explanation for its physiological differences from other Geobacteraceae. Conclusions The disappearance of multiheme c-type cytochromes and other genes typical of a metal-respiring ancestor from the P. carbinolicus lineage may be the consequence of spacer #1 interfering with hisS, a condition that can be reproduced in a heterologous host. This is the first successful co-introduction of an active CRISPR spacer and its target in the same cell, the first application of a chimeric CRISPR construct consisting of a spacer from one species in the context of repeats of another species, and the first report of a potential impact of CRISPR on genome-scale evolution by interference with an essential gene.


Conclusions:
The disappearance of multiheme c-type cytochromes and other genes typical of a metal-respiring ancestor from the P. carbinolicus lineage may be the consequence of spacer #1 interfering with hisS, a condition that can be reproduced in a heterologous host. This is the first successful co-introduction of an active CRISPR spacer and its target in the same cell, the first application of a chimeric CRISPR construct consisting of a spacer from one species in the context of repeats of another species, and the first report of a potential impact of CRISPR on genome-scale evolution by interference with an essential gene.

Background
Clustered regularly interspaced short palindromic repeats (CRISPR), which consist of direct repeats of a short sequence (21-47 bp) separated by nonrepetitive sequences of similar size, have been identified in the genome sequences of almost all archaea and numerous bacteria, with a variable complement of adjacent CRISPR-associated (cas) genes [1][2][3][4][5][6][7][8][9]. A fraction of the spacer sequences between repeats have been found to match sequences termed "proto-spacers" within genes, from which they may be derived [8,10,11], and the fact that many of these genes belong to phage or plasmid entities led to the hypothesis that CRISPR and the Cas proteins may function as an RNA interference-based immune system [6]. The link between specific CRISPR spacers and proto-spacers and phage resistance has been established by mutational analysis in Streptococcus thermophilus [12,13], and by testing synthetic CRISPR constructs in Escherichia coli [14]. Similarly, resistance of Staphylococcus epidermidis to a conjugative plasmid has been shown to depend on a CRISPR spacer and the corresponding proto-spacer [15]. Expression of CRISPR loci as long transcripts processed into smaller RNA molecules has been observed in several archaea [5,[16][17][18][19] and bacteria [14,20]. A complex of Cas proteins has been shown to carry out this processing in E. coli and to be required for resistance to infection [14]; a different protein (Cas6) processes CRISPR transcripts in Pyrococcus furiosus [21]. CRISPR-derived RNAs have been shown to form RNA-protein complexes in P. furiosus [19], which leads to degradation of RNAs containing matching proto-spacers [22], whereas DNA was shown to be the target of interference by spacer-containing RNAs in S. epidermidis [15]. Although CRISPR are widely regarded as an immunological phenomenon, CRISPR and cas genes have also been implicated in spore development of Myxococcus xanthus [3,23] and in inhibition of biofilm formation and swarming of Pseudomonas aeruginosa by a lysogenic phage [24], and there has been speculation that spacers with matches to housekeeping genes represent a novel mechanism of gene regulation [25].
The Geobacteraceae, a Fe(III)-respiring family of Deltaproteobacteria, are of interest for their role in bioremediation of U(VI)-contaminated environments and their ability to donate electrons directly to graphite electrodes, producing an electrical current [26,27]. Pelobacter carbinolicus is a member of the Geobacteraceae that grows by fermentation of acetoin and 2,3-butanediol, as well as by indirect Fe(III) respiration with ethanol as the electron donor and acetate as the end product [28,29]. Unlike its relatives in the genus Geobacter, P. carbinolicus cannot reduce Fe(III) directly in the absence of sulfur or sulfide [30], or produce electricity [31]. The genome of P. carbinolicus was sequenced for the purpose of comparison to those of Geobacter species, three of which have been extensively curated: Geobacter sulfurreducens [32], Geobacter metallireducens [33] and Geobacter bemidjiensis (Aklujkar et al., submitted). This report explores how evolution of the P. carbinolicus genome may have been influenced by a spacer within the CRISPR locus that matches a proto-spacer within histidyl-tRNA synthetase (hisS), resulting in the elimination of ancestral genes containing multiple closely spaced histidines. The interfering nature of the spacer was confirmed by introducing it to a transgenic G. sulfurreducens strain containing the target gene.

Analysis of CRISPR spacers
The CRISPR locus was identified when manual curation of the P. carbinolicus genome revealed a series of suspiciously repetitive predicted genes. The nonredundant nucleotide sequence database was queried with each of the 111 CRISPR spacers of P. carbinolicus using the BLAST algorithm [34], with the minimum possible word size of 7 bp and without filtering out low-complexity regions of the queries. Alignments with five or fewer mismatches out of 32 bases were considered significant.

Phylogenetic analysis of HisS and HisZ proteins
The sequences of all predicted hisS gene products of the Geobacteraceae, together with HisS and HisZ protein sequences representative of various families of Bacteria and Archaea, were aligned by TCoffee [35] and trimmed using Mesquite (Maddison, W. P., and Maddison, D. R. 2006. Mesquite: a modular system for evolutionary analysis. Version 1.12). Phylogenetic trees were constructed using Phylip (Felsenstein, J. 2005. PHYLIP (Phylogeny Inference Package) version 3.6) with 500 bootstrap runs.
Quantitative real-time PCR of reverse-transcribed RNA P. carbinolicus strain DSM2380 was grown as previously described [36] with ethanol as the electron donor and Fe(III) as the electron acceptor. RNA was isolated from triplicate chemostat cultures as previously described [37,38]. Transgenic G. sulfurreducens strains were grown in NBAF medium [39] and RNA was isolated from actively growing triplicate batch cultures at an OD 600 of 0.20 to 0.31. The absence of DNA contamination was confirmed by PCR as previously described [36] with primer pairs specific for CRISPR spacer #1, for hisS and for hisZ (Table 1), using P. carbinolicus or G. sulfurreducens genomic DNA (isolated with the MasterPure DNA Purification Kit from EPICENTRE Biotechnologies, Madison, WI) as a control. Six to twelve clones of each genomic DNA PCR product were sequenced to verify the specificity of the primers. Reverse transcription was performed with the Enhanced Avian First Strand Synthesis Kit (Sigma-Aldrich, St. Louis, MO) as described previously [37], using each primer individually at 2 μM concentration with 400 ng of RNA in 20 μl total volume. Successful reverse transcription and the feasibility of DNA amplification in the presence of RNA were verified by PCR using 5 μl of this reaction. Quantitative real-time PCR (QRT-PCR) was performed with two to four technical replicates (9.5 μl of a tenfold dilution of cDNA, corresponding to 19 ng of RNA) for each of three biological replicates in a Taqman 7500 instrument using 2 × Power SYBR Green PCR master mix (Applied Biosystems, Foster City, CA) and primer pairs at 9 nM concentration in 25 μl total volume, for 50 cycles with an annealing temperature of 60°C and triplicate standards of spacer #1 and hisS PCR products from P. carbinolicus genomic DNA and a hisZ PCR product from G. sulfurreducens genomic DNA, encompassing four orders of magnitude.

Recombinant DNA techniques
All restriction enzymes were purchased from New England Biolabs; LA Taq polymerase was from Takara Mirus Bio; plasmids were propagated in E. coli TOP10 cells from Invitrogen; DNA purification kits for plasmids and agarose gel slices were from QIAGEN, and the MasterPure kit for genomic DNA extraction was from EPICENTRE. To construct a transgenic strain of G. sulfurreducens in which the native hisS gene was replaced with hisS from P. carbinolicus, three primer pairs (Table  1) were used to amplify the 5′-side and 3′-side flanking regions of hisS GSU1659 and the coding sequence of hisS Pcar_1041. The two flanking regions were digested with Spe I and ligated; this product and the Pcar_1041 amplicon were separately TOPO-cloned (Invitrogen) and sequenced. Digestion with BspH I (overlapping the start codon) and Spe I (overlapping the stop codon) was used to insert the Pcar_1041 gene between the flanking regions of GSU1659. As a selectable marker, a gentamicin resistance cartridge was amplified from plasmid pCM351 [40] with Xba I site-containing primers (Table  1), maintained as a TOPO clone, and ligated into the Nhe I site between the Spe I site and the 3′ flanking region of GSU1659. (A similar construct, in which only the marker was inserted without Pcar_1041, was used in unsuccessful attempts to delete GSU1659.) The entire hisS replacement construct was excised using EcoR I, purified from an agarose gel, and electroporated into the wild type G. sulfurreducens strain DL1 as previously described [39]. An isolated gentamicin-resistant colony was streaked for purity before transfer to liquid. The genotype of this strain, called MA159G, was confirmed by PCR of genomic DNA, with primers MA0334 and MA0335 (which amplify hisS Pcar_1041 but not hisS GSU1659) as well as MA0330 and MA0333 (which give a larger product for MA159G than for DL1, due to the inserted marker). The marker, which had loxP sites on either side, was removed from the chromosome of strain MA159G by introducing the Cre recombinase expression plasmid pCM158 [40] by electroporation and selecting for resistance to kanamycin (Sigma). Four colonies of the resultant strain called MA159 were streaked for purity and their genotypes were confirmed by PCR; amplicons were digested with Pst I to distinguish Pcar_1041 from GSU1659. This strain was electroporated with a plasmid called pRG6 (R. Glaven, personal communication), which is incompatible with pCM158 and selectable with spectinomycin (Sigma); it differs from pRG5 [41] only in that it carries the lacI repressor gene. The chromosomal genotype of this strain was confirmed by PCR and Pst I digestion.
A plasmid vector called pMA36, incompatible with pRG6, was constructed for isopropylthio-β-D-galactopyranoside (IPTG)-inducible expression of a chimeric CRISPR containing spacer #1 from P. carbinolicus between two repeats typical of the CRISPR2 locus of G. sulfurreducens. The lacI repressor gene and tacla-cUV5 promoter of plasmid pCD341 [42] were amplified by PCR with Pci I and Sph I site-containing primers (Table 1), TOPO-cloned and sequenced, and excised for ligation into plasmid pCM66 [43], resulting in plasmid pMA36. The chimeric CRISPR, consisting of annealed oligonucleotides MA0429 and MA0430 (Table 1), was ligated between the BamH I and EcoR I sites of pMA36. The sequence of this plasmid, called pMA35-1, was confirmed using the sequencing primer MA36R (Table 1). Serendipitously, two variants were discovered: pMA35-2 in which the chimeric CRISPR had expanded to two copies of spacer #1 with a third copy of the repeat between them, and pMA35-! in which spacer #1 was disrupted by duplication of a single G:C base pair at its exact centre (underlined in Table 1). All three chimeric CRISPR expression plasmids were electroporated into DL1 and MA159(pRG6), in parallel with pMA36 as a control. The genotypes of multiple kanamycin-resistant colonies of each transformation were confirmed by PCR of Pcar_1041 followed by Pst I digestion as well as cloning and sequencing, and by sequencing of plasmids present in the genomic DNA extracts (after transformation into E. coli to improve DNA quality). Another variant of pMA35-1 called pMA35-0 was serendipitously discovered in which spacer #1 had been deleted by recombination of the repeats on either side.

Growth conditions
Transformants of G. sulfurreducens were selected on NBAF medium [39] containing 1.5% Agar Noble (Difco), supplemented with 5 mM cysteine hydrochloride and 0.1% yeast extract in an anaerobic chamber. Growth experiments were carried out with liquid cultures in either NBAF medium supplemented with 1 mM cysteine hydrochloride or FWAFC medium [39] modified to contain 10 mM acetate and supplemented with 1 mM ferrous ammonium sulfate, in an atmosphere of N 2 and CO 2 (80%:20%) in rubber-stoppered 26 ml glass tubes at 30°C.

Bioinformatics
Codon usage was determined using the CodonFrequency algorithm of the Genetics Computer Group Wisconsin Package version 10.3 (Accelrys Inc., San Diego, CA). A script to compute the number of histidines and the distances between them for every protein sequence in a list was written in Perl.

Results
The CRISPR locus of P. carbinolicus includes a spacer matching its own histidyl-tRNA synthetase During manual curation of the P. carbinolicus genome annotation, the CRISPR locus was identified as 112 repeats of the sequence 5′-GAGTTCCCCGCA-GATGCGGGGATGAACCG-3′ (bases in bold predicted to form a hairpin), separated by spacer sequences of 32 bp ( Figure 1). This repeat sequence belongs to phylogenetic cluster 2 of the CRISPR classification system [44] and the adjacent cas genes ( Figure 1) are of the subtype "Ecoli" [3]. The nonredundant nucleotide sequence database was queried in an attempt to identify genes from which the 111 CRISPR spacers of P. carbinolicus might be derived. The only hits with five or fewer mismatched bases were hits with zero mismatches within the P. carbinolicus genome itself: spacer #1, located at the "trailer" end of the locus, farthest from the AT-rich "leader sequence" and cas genes encoding CRISPR-associated proteins (Figure 1), matched a sequence within the histidyl-tRNA synthetase (hisS) gene Pcar_1041 ( Figure 2); spacer #43 matched the adjacent spacer #44; and spacer #28 matched the nonadjacent spacer #50. Spacer #1 is likely to be the oldest spacer because new spacers are added next to the leader sequence upon exposure of streptococci to bacteriophage [12,13,[45][46][47], and closely related strains of bacteria and archaea contain identical spacers only near the trailer ends of their CRISPR [2,5,11,18,45,[48][49][50][51]. This observation led to the hypothesis that P. carbinolicus has experienced interference with the hisS gene, encoding an essential housekeeping enzyme, over a significant period of its evolutionary history.

Quantitative detection of CRISPR spacer #1 transcripts
In an attempt to determine whether spacer #1 is transcribed into RNA that could have interfered with the hisS gene at one time, and which strand of trailer end RNA is predominant in P. carbinolicus, two oligonucleotide primers were designed flanking spacer #1 ( Figure 1, Table 1): MA0326 within spacer #2 and MA0327 just outside the outermost repeat of the CRISPR. Reverse transcription of P. carbinolicus RNA into cDNA was attempted with each single primer, followed by quantitative real-time PCR amplification with both primers. The amount of spacer #1-containing RNA including the AT-rich "leader" sequence of CRISPR

QRT-PCR amplicon of spacer #1
primer MA0327 primer MA0326 noncoding "trailer" sequence  QRT-PCR amplicon predicted structure of processed spacer #1 RNA hisS gene sequence found on the sense strand of hisS, detected by primer MA0327, was not significantly different from the amount of spacer #1-containing RNA corresponding to the antisense of hisS, detected by primer MA0326 (Figure 3). Control PCR reactions without reverse transcription yielded no product, indicating that DNA contamination was negligible and only RNA of both strands was detected.
The sense strand spacer #1-containing RNA detected in this experiment may represent the 3′ end of a long transcript initiated near the leader sequence, whereas the antisense strand spacer #1-containing RNA may be produced independently from a promoter at the opposite end of the CRISPR. It is also possible that one strand is produced from the other by an unidentified RNA-directed RNA polymerase. If the sense strand spacer #1-containing RNA undergoes processing similarly to the E. coli CRISPR transcript [14], which belongs to the same phylogenetic cluster of repeat sequences as the P. carbinolicus CRISPR [44], cleavage within the stemloops of the repeats flanking spacer #1, followed by 3′ end trimming, would release a short RNA with predicted secondary structure (Figure 2b). This or the corresponding antisense strand spacer #1-containing RNA may hybridize to the proto-spacer DNA sequence within the hisS gene (Figure 2c).
Phylogenetic and experimental evidence that interference with hisS cannot be compensated It is surprising that spacer #1 is retained by P. carbinolicus if it interferes with the essential function of histidine activation for protein synthesis. Comparative genome analysis revealed that P. carbinolicus and its close relatives, the Geobacteraceae, possess two full-length hisS-like genes, whereas other bacteria have only one. Interference with Pcar_1041 by spacer #1 might have had negligible effect if Pcar_0202 also produced histidyl-tRNA synthetase activity. However, both phylogenetic and mutational studies suggest that Pcar_1041 is essential, being the only real hisS, as detailed below.
In some bacteria there is a hisS-related gene called hisZ, which produces a protein that lacks the C-terminal anticodon loop recognition domain of a true histidyl-tRNA synthetase, functioning instead as a regulatory subunit of ATP phosphoribosyltransferase, the first enzyme of histidine biosynthesis [52]. In bacteria that possess hisZ, the hisG gene encoding the catalytic domain of ATP phosphoribosyltransferase is shorter than in bacteria that do not possess hisZ [53]. A short hisG gene is present in P. carbinolicus and all other Geobacteraceae, but unlike previously described hisZ genes, both hisS-like genes contain obvious anticodon loop recognition domains. Phylogenetic analysis showed that the Pcar_1041 gene product and orthologous protein sequences of Geobacteraceae cluster among the HisS proteins of other bacteria, whereas the Pcar_0202 gene product and its orthologs belong among the HisZ proteins ( Figure 4). Furthermore, the ortholog of Pcar_0202 in G. sulfurreducens (GSU3307) could be deleted (Aklujkar and Lovley, manuscript in preparation), whereas the ortholog of Pcar_1041 (GSU1659) could only be replaced with Pcar_1041 (this study). Three electroporation attempts failed to delete GSU1659 outright. This result indicates that Pcar_0202 and its orthologs lack significant histidyl-tRNA synthetase activity, and suggests that interference with Pcar_1041 by spacer #1 would exert severe pressure on P. carbinolicus. The phylogenetic tree also demonstrates that the hisS gene Pcar_1041 was not acquired laterally; it is clearly an ancestral gene containing a proto-spacer that is not present in its closest relatives.
Spacer #1 inhibits growth of a transgenic G. sulfurreducens strain containing hisS of P. carbinolicus It is not yet possible to make mutations in P. carbinolicus. Therefore, interference of spacer #1 with hisS was tested in the more genetically tractable species G. sulfurreducens, in which the repeat sequence of the CRISPR2 locus (5′-GTGTTCCCCGCATGCGCGGG-GATGAACCG-3′) is very similar to that of the P. carbinolicus CRISPR. A plasmid called pMA35-1 was designed for IPTG-inducible expression of a chimeric CRISPR construct consisting of spacer #1 of P.
carbinolicus between two copies of the G. sulfurreducens repeat, and a transgenic strain of G. sulfurreducens called MA159 was generated in which the native hisS gene GSU1659 was replaced by Pcar_1041, the hisS gene of P. carbinolicus. The tRNA-His sequences of the two species are very different ( Figure 5)

Geobacter uraniireducens HisS
Geobacter daltonii HisS the histidyl-tRNA synthetase of P. carbinolicus might have difficulty recognizing its substrate in G. sulfurreducens. However, replacement of GSU1659 with Pcar_1041 resulted in a viable strain, which grew more slowly than the wild type ( Figure 6). Despite prior expression of the LacI repressor protein from plasmid pRG6 to prevent premature expression of the chimeric CRISPR, electroporations of MA159(pRG6) with pMA35-1 and two serendipitously obtained variants (pMA35-2 with two copies of spacer #1 and pMA35-! with spacer #1 interrupted by a single base pair insertion as shown in Table 1) were marginally successful (yielding zero to 4 colonies per attempt), whereas electroporation with an equal amount of the empty vector pMA36 produced hundreds of colonies per attempt. Electroporations of the wild type G. sulfurreducens strain DL1, carried out in parallel, yielded hundreds of colonies for all three chimeric CRISPR expression plasmids. These observations suggest that even leaky expression of the chimeric CRISPR containing spacer #1, or its  mere presence as DNA, is largely incompatible with the hisS gene of P. carbinolicus containing the matching proto-spacer, which is present in the MA159 host, but not with the native hisS gene of G. sulfurreducens in the DL1 host. Growth experiments provided further proof that spacer #1 interferes with hisS of P. carbinolicus. The G. sulfurreducens transformants were first checked by PCR, restriction digestion, and sequencing to verify that the hisS transgene and spacer #1 were intact. In one transformant, spontaneous recombination of the repeats on either side of spacer #1, eliminating it from the chimeric CRISPR expression plasmid, resulted in a strain called MA159(pMA35-0) that possessed both hisS of P. carbinolicus and the repeat sequence on the plasmid, but no spacer #1. Compared to this control that grew similarly to MA159, the presence of spacer #1 in the other MA159 transformants (i.e., with hisS of P. carbinolicus) resulted in long lag periods and somewhat reduced growth rates in NBAF medium with fumarate as the electron acceptor (Figure 6a), and very poor growth in FWAFC medium with Fe(III) citrate as the electron acceptor (Figure 6b). This effect was the same with either one copy of spacer #1 in MA159(pMA35-1) or two copies in MA159(pMA35-2), or with a single base pair insertion in spacer #1 in MA159(pMA35-!) -for the sake of clarity, only MA159(pMA35-2) is shown. The only exception was that in one experiment, triplicate cultures of MA159(pMA35-!) grew especially poorly after three transfers in NBAF (not shown). Wild type growth patterns were observed when any of the three plasmids was present in the DL1 host (i.e., with hisS of G. sulfurreducens) -for clarity, only DL1(pMA35-2) is shown ( Figure 6). Although expression of spacer #1 is expected to be low in the absence of IPTG, growth inhibition of the MA159 strains was observed, and addition of IPTG had no effect, indicating that expression of chimeric CRISPR RNA was not the limiting factor for inhibition of growth.
Spacer #1 reduces the amount of hisS RNA in transgenic G. sulfurreducens no more than it affects hisZ RNA Total RNA was isolated from NBAF-grown cultures of strains containing hisS of P. carbinolicus. The amount of hisS RNA was higher in the control MA159(pMA35-0) strain than in the growth-inhibited MA159(pMA35-2) strain with spacer #1 (Figure 7), and lowest in the MA159(pMA35-!) strain that had the most severe growth defect in a parallel growth experiment using the same inoculum. However, when the amount of hisZ RNA was compared across the same three strains as a control, a similar pattern was observed (Figure 7), suggesting that reduced expression of other housekeeping genes besides hisS occurs when growth is slowed by the incompatibility between spacer #1 and hisS.
The P. carbinolicus genome has fewer genes with numerous or closely spaced histidine codons than closely related genomes The evidence that P. carbinolicus expresses CRISPR spacer #1, and that spacer #1 inhibits growth of a G. sulfurreducens strain that is dependent on hisS of P. carbinolicus, led to the question of whether any effect of this interference during recent evolution could be discerned in the genome of P. carbinolicus. If the expected shortage of histidyl-tRNA were occasionally severe enough for ribosomes to stall during translation of genes with numerous histidine codons, one would expect these genes to be predisposed for elimination from the genome, because abortive expression wastes energy and because any selective advantage of the genes would be diminished. Missense mutations of closely spaced histidine codons would also be favoured as long as they did not interfere with an essential function.  Therefore, the number of histidine codons in every gene and the harmonic mean distance between histidine codons in every gene were computed for P. carbinolicus and its closest relative with a nearly completely sequenced genome, Desulfuromonas acetoxidans [GenBank:NZ_AAEW00000000], as well as for the completely sequenced and manually curated genome annotations of the more distantly related G. sulfurreducens [32], G. metallireducens [33] and G. bemidjiensis (Aklujkar et al., submitted). A plot of the fraction of protein sequences in each genome that have a given minimum number of histidines shows that the P. carbinolicus genome is deficient in genes with 35 or more histidine codons, and possesses none with 45 or more ( Figure  8a). To identify ancestral genes that might have been counterselected in P. carbinolicus due to close spacing of histidine codons, an index of histidine demand was computed as the number of histidine codons in a gene divided by the harmonic mean distance between them. Fewer genes with histidine demand above 5.0 are present in the P. carbinolicus genome, and none has an index above 10.0 (Figure 8b). Despite these trends, the overall frequency of histidine codons in the P. carbinolicus genome is 22.50 per thousand, very similar to D. acetoxidans (23.94 per thousand), G. sulfurreducens (20.55 per thousand), G. metallireducens (20.42 per thousand) and G. bemidjiensis (19.76 per thousand). This observation is consistent with the expected effect of an acute histidyl-tRNA shortage in the vicinity of gene transcripts with multiple or closely spaced histidine codons undergoing translation, whereas a defect in histidine biosynthesis prior to histidyl-tRNA synthetase would be expected to affect histidine codon usage in general.
The P. carbinolicus genome has lost ancestral genes with numerous or closely spaced histidines Genes of D. acetoxidans, G. sulfurreducens, G. metallireducens and G. bemidjiensis that contain 35 or more histidine codons, or have a histidine demand index above 5.0, were examined in order to identify ancestral genes that have reduced their histidine demand or have been lost specifically in the P. carbinolicus genome (Additional file 1: Table S1). Many genes found in Geobacter species are not necessarily ancestral to P. carbinolicus; they lack homologs in either the unfinished D. acetoxidans genome or the partial genome sequences of a mixture of D. acetoxidans and D. palmitatis (D. R. L. and coworkers, unpublished). Other genes that are present in D. acetoxidans, but not Geobacter species, could have been acquired after divergence from P. carbinolicus. Five gene families actually show increased histidine demand in P. carbinolicus compared to other Geobacteraceae, and in many other cases, a P. carbinolicus gene has lower histidine demand than its orthologs, but is still above the cutoff value of 5.0, or contains a similar  number of histidine codons. However, after all these considerations there remain sixteen clearly ancestral gene families with typically high histidine demand that are missing or have reduced histidine demand specifically in P. carbinolicus (Additional file 1: Table S1). The functional annotations of these gene families are briefly described below.
The nuoL-1 gene family, with a high-histidine-demand representative in every Geobacter species and D. acetoxidans, encodes the L subunit of an NADH dehydrogenase I complex that P. carbinolicus has entirely lost. The NuoL protein is thought to have a proton-pumping function [54], for which the imidazole groups of its clustered histidines are well suited.
The znuA (or zntC) gene encodes the periplasmic protein of an ATP-binding cassette transporter, within which a cluster of histidines is thought to bind zinc with high affinity [55]. The znuA gene of D. acetoxidans has eight histidines in its putative metal-binding cluster, whereas its closest relative, the P. carbinolicus znuA gene (Pcar_3026), has only four.
D. acetoxidans possesses two clearly ancestral genes related to arsenite S-adenosylmethyltransferase (Dace_2134, Dace_3081), with homologs in Geobacter lovleyi and other Deltaproteobacteria that contain numerous histidines. The closest homolog of Dace_3081 is in P. carbinolicus (Pcar_2089), with fewer histidines and a much reduced histidine demand index. P. carbinolicus lacks an ortholog of Dace_2134.
Two ancestral genes encoding polyketide synthasetype enzymes, with representatives in D. acetoxidans, G. lovleyi, Geobacter uraniireducens, Geobacter bemidjiensis, Geobacter sp. M21, Geobacter sp. M18, Geobacter daltonii, and Pelobacter propionicus that contain numerous histidines, have clearly been lost by P. carbinolicus. Both polyketide synthases are composed of unusual domain combinations. Dace_0979 and its orthologs contain just one elongation domain, one acyl carrier protein acylation domain, two to five acyl carrier protein domains, one reductase domain, and no dehydratase or thioesterase domains, suggesting that they build up a long-chain fatty acid with multiple hydroxyl groups, or possibly a storage polymer of a precursor resembling 3-hydroxybutanoate. Dace_1838 and its orthologs contain two elongation domains, one acyl carrier protein acylation domain, three to four dehydratase domains, and no acyl carrier protein or reductase or thioesterase domains, suggesting that their product may be polyhydroxylated, polyunsaturated, or cyclic. The loss of these two enzymes likely means that P. carbinolicus does not make secondary metabolites that are present in most other Geobacteraceae.
The remaining eight ancestral gene families are multiheme c-type cytochromes represented in D. acetoxidans or D. palmitatis and one or both of G. sulfurreducens and G. metallireducens. At least one gene of each family was most probably inherited by P. carbinolicus from a common ancestor of the Geobacteraceae, and then lost. Several other cytochrome families, found in the Geobacter genomes but not the incomplete D. acetoxidans and D. palmitatis genomes, may have been inherited and lost by P. carbinolicus, but the evidence is inconclusive. Nevertheless, it is notable that there are no cytochromes among the eighteen proteins of P. carbinolicus that have either more than 35 histidines or a histidine demand index above 5.0 (Additional file 1: Table S1).

Discussion
Although CRISPR spacers that match phage/plasmid genes have been shown to confer immunity against infection [12,15], and it is known that their mode of action is distinct from previously described mechanisms of phage resistance [13], the significant number of spacers that match host genes have not been investigated [25]. The activity of CRISPR spacers against genes encoding essential housekeeping enzymes has not been demonstrated before, nor have its consequences for genome-scale evolution been examined. The present study establishes that when spacer #1 of the P. carbinolicus CRISPR and its putative target, the hisS gene of the same species, are both present in a transgenic G. sulfurreducens strain, interference occurs that severely affects growth. Very few cells containing the hisS transgene were able to take up the chimeric CRISPR expression plasmids. Transformants in which spacer #1 was present grew more poorly. Attempts to induce expression of spacer #1 with IPTG had no effect, but growth was almost totally inhibited by switching from fumarate to Fe(III) as the electron acceptor, under which condition expression of Cas proteins and the chromosomal CRISPR2 is upregulated [56]. Therefore, the limiting factor for interference with hisS may be the amount of one or more Cas proteins, or the leader sequence, rather than the amount of spacer #1 transcript. Alternatively, growth by respiration of Fe(III) may require protein factors that cannot be expressed properly when histidyl-tRNA synthetase activity is low. Induction of CRISPR expression with IPTG in S. epidermidis has also shown no effect on interference with a target plasmid [15], but that CRISPR was inactive without a leader sequence in cis, whereas severe interference between spacer #1 and hisS requires neither the native leader sequence of the G. sulfurreducens CRISPR nor that of the P. carbinolicus CRISPR in cis.
The small decrease in the amount of hisS RNA in the presence of spacer #1, comparable to the decrease in the amount of transcript for a control housekeeping gene (hisZ) that accompanies the growth defect, is unlike the extent of decimation that one would expect if spacer #1-containing RNA catalyzed degradation of hisS RNA. It is even less likely that any transgenic strains could be viable if spacer #1-containing RNA targeted hisS DNA for degradation. An alternative possibility is that spacer #1 RNA hybridizes with the hisS gene without marking it for degradation (but perhaps recruiting one or more proteins), and must be displaced by RNA polymerase in order to complete transcription. In this scenario, even leaky expression of spacer #1 would saturate the available targets, and overexpression of spacer #1-containing RNA would not prevent RNA polymerase from displacing the one molecule obstructing transcription of the hisS gene. Consistent with these predictions, induction of spacer #1 expression with IPTG was not required for inhibition of growth, and did not cause stronger inhibition. It will be interesting to examine whether any putative nucleases have mutated in the few viable transformants that carry both spacer #1 and its target. None of the Cmr proteins implicated in RNA-targeted CRISPR function in P. furiosus [22] has a homolog in G. sulfurreducens or P. carbinolicus, nor does the CRISPRassociated double-stranded DNA/RNA-specific endonuclease of Sulfolobus solfataricus [57].
Sequencing confirmed that neither hisS nor spacer #1 was mutated in the viable transformants. Therefore, in contrast to earlier studies that showed absolute incompatibility between host spacers and phage proto-spacers, evaded only by mutation of the proto-spacers or protospacer-adjacent motifs [12,13], these results indicate that it is possible to establish a spacer that persistently interferes with an essential housekeeping gene. Both spacer and target can be maintained intact over numerous generations of inhibited growth. Insertion of a single base pair in the middle of the spacer did not reduce its efficacy, indicating that a small bulge in the region of complementarity between spacer and proto-spacer does not necessarily mitigate interference.
The experiments described herein with the chimeric CRISPR construct also show that a spacer from one species can be active in the context of the somewhat different repeats and cas genes of another species. The P. carbinolicus and G. sulfurreducens genomes encode homologs of all five components of the Cascade complex (cse1 CasA Pcar_0957, GSU1385; cse2 CasB Pcar_0958, GSU1386; cse4 CasC Pcar_0959, GSU1387; cas5e CasD Pcar_0960, GSU1388; and cse3 CasE Pcar_0961, GSU1389), which processes CRISPR transcripts into target-active RNAs in E. coli [14], along with the Cas3 helicase (Pcar_0956, GSU1384) that is required for their activity against targets [14], the Cas2 endoribonuclease (Pcar_0965, GSU1393) that may degrade them [58], and the Cas1 endodeoxyribonuclease (Pcar_0964, GSU1392) that may aid in the acquisition of new spacers [59], and promote DNA/RNA annealing [60]. However, the sequences of most of these Cas proteins are so divergent between the two species (e.g. 28% identity for Cas3) that it is noteworthy that the G. sulfurreducens system seemingly required no context other than its cognate repeat to recognize spacer #1 as a guide and hisS of P. carbinolicus as a target. There is evidence that proto-spacer-adjacent motifs are determinants of target recognition [13,20]. The proto-spacer within hisS is followed by a CTT motif (Figure 2), typical of sequences captured by CRISPR of phylogenetic cluster 2 [61], to which the P. carbinolicus CRISPR and G. sulfurreducens CRISPR2 loci both belong [44]. The facile reconstitution of interference between species with similar repeats and proto-spacer-adjacent motifs despite Cas protein divergence is encouraging for future development of CRISPR-based gene silencing technology with synthetic spacers.
A CRISPR transcript containing spacer #1 and flanking sequences is expressed in P. carbinolicus, and potentially processed by the Cas proteins into a hisSinterfering RNA. There are at least four possible explanations why the spacer and proto-spacer, which are incompatible in the transgenic G. sulfurreducens strain, still co-exist in P. carbinolicus. Firstly, being the spacer farthest from the leader sequence in P. carbinolicus, spacer #1 may produce comparatively few target-active RNA molecules, as processed RNAs containing spacers distal to the leader sequence are underrepresented in a clone library from P. furiosus [19]. Secondly, the repeats on either side of spacer #1 may not be good targets for processing because they deviate from the consensus repeat sequence of the P. carbinolicus CRISPR more than it has diverged from that of D. acetoxidans since the time of their last common ancestor ( Figure 9). Thirdly, pairing of the two strands of RNA derived from spacer #1 might prevent targeting of the proto-spacer. The fourth possibility is that P. carbinolicus in its natural environment experiences only growth conditions under which the incompatibility between spacer #1 and hisS is permissive, as it is for the transgenic strain growing by respiration of fumarate, and has evolved to avoid growth conditions under which the incompatibility is absolute, comparable to respiration of Fe(III) by the transgenic strain. Both strands of spacer #1 were detected at similar levels in P. carbinolicus RNA from actively growing cultures, in contrast with previous observations that both strands are present as RNA in Sulfolobus acidocaldarius only during stationary phase [5], that unequal amounts of RNA were detected with probes for the two strands in P. furiosus [19] and Thermus thermophilus [62], and that only transcripts from one strand were detected in E. coli [14], S. epidermidis [15], and Xanthomonas oryzae [20]. The presence of RNA representing both strands is consistent with CRISPR spacers on one strand of a cluster matching both the sense and antisense of proto-spacers [6], and with artificial spacers in both orientations having activity against their targets, although the efficacy may differ by several orders of magnitude [14].
Together, the evidence of spacer #1 expression and the proof that spacer #1 can inhibit growth in a hisSdependent manner indicate that interference with hisS by spacer #1 almost certainly occurred during the recent evolutionary history of P. carbinolicus. It is likely that this interference resulted in a growth-limiting shortage of histidyl-tRNA. As expected under this selective pressure, genes with numerous and/or closely spaced histidine codons have evidently been counterselected in the genome of P. carbinolicus. Although loss or mutation of ancestral genes is easiest to surmise from comparative genomics, acquisition of low-histidine-demand genes and failure to acquire new high-histidine-demand genes also contribute to this difference.
Among the ancestral gene losses from the P. carbinolicus genome attributable to interference with hisS by spacer #1, the most obvious are the multiheme c-type cytochromes typical of Geobacter species. The P. carbinolicus genome encodes only 14 c-type cytochromes [36], whereas the D. acetoxidans genome encodes at least 80 such genes (M. A., unpublished). These cytochromes are not well conserved even between closely related Geobacter species [33,63], and it has been hypothesized that they have a generic function as capacitors [64]. Therefore, in addition to the loss of eight ancestral multiheme c-type cytochrome families by P. carbinolicus, the non-evolution of new families may also be an effect of interference with hisS.
Of the other eight families of ancestral genes that exhibit loss or reduced histidine demand in P. carbinolicus, two have especially interesting metabolic and physiological implications. The nuoL-1 gene encodes a subunit of NADH dehydrogenase I with a conserved histidine cluster, implicated in proton pumping. Abortive expression of nuoL-1 due to interference with hisS by spacer #1 in an ancestor of P. carbinolicus, leading to a loss of function for the entire NADH dehydrogenase I complex, would have favoured the elimination of all fourteen structural genes of the complex from the genome, which is what has occurred. The reason why this deletion was not lethal is probably that another ancestral NADH dehydrogenase I complex, for which the nuoL-2 gene does not contain a cluster of histidine codons, has been retained by P. carbinolicus. Conceivably, loss of a major respiratory enzyme complex by P. carbinolicus caused it to rely more on laterally acquired fermentative pathways.
Interference with hisS by spacer #1 also offers an explanation for the diminutive histidine cluster in the znuA gene product of P. carbinolicus, which functions to bind zinc in the periplasm for active transport into the cell. Zinc is a cofactor of key metabolic enzymes such as carbonic anhydrase, acetate kinase and phosphotransacetylase [65][66][67]. If the mutations in the histidine cluster of znuA reduce the efficiency of zinc uptake, it could have an impact on metabolism related to the unexplained inability of P. carbinolicus to oxidize acetate [28] despite the presence of a complete set of tricarboxylic acid cycle genes in the genome.

Conclusions
This paper reports genetic and comparative genomic evidence that housekeeping genes can be targets of chronic CRISPR interference. Spacer #1 is shown to be P. carbinolicus CRISPR consensus GAGTTCCCCGCAGATGCGGGGATGAACCG repeat on leader side of spacer #1 GTGGTCCCCGCAGGTGCGGGGATGAACGG repeat on trailer side of spacer #1 GTGGTCCCCGCAGGTGCGGGGATGAACGA D. acetoxidans CRISPR consensus ATGTTCCCCGCAGATGCGGGGATGAACCG G. sulfurreducens CRISPR consensus GTGTTCCCCGCATGCGCGGGGATGAACCG Figure 9 Alignment of the repeats on either side of spacer #1 with the CRISPR consensus sequences of P. carbinolicus, D. acetoxidans and G. sulfurreducens.
transcribed into RNA with comparable amounts of both strands in P. carbinolicus, and to inhibit the growth of a transgenic strain of G. sulfurreducens that relies on hisS of P. carbinolicus, without drastically reducing the level of hisS RNA. The genome of P. carbinolicus exhibits the expected effect of a histidyl-tRNA shortage. Overall, the ancestral genes lost or mutated by the P. carbinolicus genome as part of its tendency towards fewer histidines per gene and lower histidine demand illustrate the deemphasis of the metal-respiring metabolism that is typical of other Geobacteraceae. More generally, while previous studies have approached CRISPR as a microbial immune system, another important role of the system may be to exert pressure on endogenous and essential genes, resulting in dramatic changes in the genome content and physiology of the host species.

Additional material
Additional file 1 : Table S1. Analysis of gene families of P. carbinolicus, D. acetoxidans and three Geobacter species for which one or more members have histidine content or histidine demand above the cutoffs (35 histidines or demand index 5.0).