The ancestor of the Paulinella chromatophore obtained a carboxysomal operon by horizontal gene transfer from a Nitrococcus-like γ-proteobacterium

Background Paulinella chromatophora is a freshwater filose amoeba with photosynthetic endosymbionts (chromatophores) of cyanobacterial origin that are closely related to free-living Prochlorococcus and Synechococcus species (PS-clade). Members of the PS-clade of cyanobacteria contain a proteobacterial form 1A RubisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) that was acquired by horizontal gene transfer (HGT) of a carboxysomal operon. In rDNA-phylogenies, the Paulinella chromatophore diverged basal to the PS-clade, raising the question whether the HGT occurred before or after the split of the chromatophore ancestor. Results Phylogenetic analyses of the almost complete rDNA operon with an improved taxon sampling containing most known cyanobacterial lineages recovered the Paulinella chromatophore as sister to the complete PS-clade. The sequence of the complete carboxysomal operon of Paulinella was determined. Analysis of RubisCO large subunit (rbcL) sequences revealed that Paulinella shares the proteobacterial form 1A RubisCO with the PS-clade. The γ-proteobacterium Nitrococcus mobilis was identified as sister of the Paulinella chromatophore and the PS-clade in the RubisCO phylogeny. Gene content and order in the carboxysomal operon correlates well with the RubisCO phylogeny demonstrating that the complete carboxysomal operon was acquired by the common ancestor of the Paulinella chromatophore and the PS-clade through HGT. The carboxysomal operon shows a significantly elevated AT content in Paulinella, which in the rbcL gene is confined to third codon positions. Combined phylogenies using rbcL and the rDNA-operon resulted in a nearly fully resolved tree of the PS-clade. Conclusion The HGT of the carboxysomal operon predated the divergence of the chromatophore ancestor from the PS-clade. Following HGT and divergence of the chromatophore ancestor, diversification of the PS-clade into at least three subclades occurred. 
The γ-proteobacterium Nitrococcus mobilis represents the closest known relative to the donor of the carboxysomal operon. The isolated position of the Paulinella chromatophore in molecular phylogenies as well as its elevated AT content suggests that the Paulinella chromatophore has already undergone typical steps in the reductive evolution of an endosymbiont.


Background
Paulinella chromatophora is a thecate filose amoeba of the Rhizaria that contains a photosynthetic entity of cyanobacterial origin termed chromatophore. A similar process initiated the evolution of plastids likely more than a billion years ago. It has previously been shown that neither the Paulinella host cell nor the chromatophores are related to the eukaryotic lineage containing primary plastids. Instead, the Paulinella chromatophore is affiliated with free-living Prochlorococcus and Synechococcus spp. (PS-clade), and thus represents the product of a second primary endosymbiosis leading to photoautotrophic eukaryotes [1,2]. It is currently debated whether the Paulinella chromatophore represents an organelle comparable to a primary plastid, or merely a stable intracellular symbiont [3][4][5]. However, the extent of genome reduction as well as the presence or absence of gene transfers and protein import pathways are currently unknown for Paulinella, and only three gene clusters (4.3 to 9.4 kb) on the chromatophore genome have been analyzed to date and compared to free-living cyanobacterial relatives [1,6].
Previous phylogenies based on rRNA operon sequence data resolved the Paulinella chromatophore as sister to marine Synechococcus and Prochlorococcus spp. (α-cyanobacteria) [1]. The intermediate position of Paulinella, diverging between α- and β-cyanobacteria, raises the question whether the HGT of the carboxysomal operon occurred before or after the divergence of the ancestor of the Paulinella chromatophore, i.e. whether the chromatophore evolved from an α- or β-cyanobacterium. Due to missing data, affiliation to α- or β-cyanobacteria is also unknown for the Cyanobium-clade, which, besides marine Synechococcus and Prochlorococcus clades, represents the third major lineage in the PS-clade (e.g. [13][14][15]). Analyses of one member of the Cyanobium-clade (strain WH 5701) already indicated its individual divergence separate from Paulinella and both marine PS-subclades [6].
In this study, we have determined the sequence of the complete carboxysomal operon from the chromatophore of Paulinella and a bacterioferritin gene downstream of the carboxysomal operon. Additionally, we determined several rbcL and rDNA sequences from other cyanobacteria including the Cyanobium-clade. Our data reveal that the Paulinella chromatophore as well as the Cyanobium-clade display proteobacterial α-carboxysomes, and contain form 1A RubisCO. Furthermore, phylogenies of RubisCO and comparison of gene arrangement types of the carboxysomal operon revealed the γ-proteobacterium Nitrococcus mobilis as the closest known relative of the donor in the HGT of the carboxysomal operon. The neighbouring bacterioferritin gene was co-transferred through the same HGT event. Increased AT-content over the carboxysomal operon in Paulinella may reflect genomic adaptation to an endosymbiotic lifestyle.

Results and Discussion
Phylogenetic analysis of the ribosomal RNA operon
One goal of this study was to determine the precise phylogenetic position of the Paulinella chromatophore within the cyanobacterial radiation. Cyanobacterial phylogeny is still a challenge since the standard phylogenetic marker, 16S rDNA, is not very informative in global analyses (see below). On the other hand, previous studies using extended data sets, either the complete rDNA operon [1] or several protein coding genes [6,16], suffered from limited taxon sampling. Here, we have extended the taxon sampling of nearly complete rDNA operon sequences to include all major cyanobacterial clades [17,18] with one to several representatives. Therefore, rDNA sequences from one plastid and 12 cyanobacteria were determined (taxa in bold in Figure 1). Together with data from newly released genome projects, the resulting alignment contained sequences from 36 bacteria, 42 cyanobacteria and 23 plastids.

PS
diverging in a basal position. The monophyly of the entire PS-clade to the exclusion of Paulinella, however, receives only moderate to low bootstrap support (branch 5), but is corroborated by unique synapomorphies in the 23S rRNA (Figure 2). As previously shown, both marine PS-subclades are characterized by unique compensatory base changes (CBCs in pairs 868/909 and 869/908), whereas Paulinella is plesiomorphic in both pairs [1]. Interestingly, the Cyanobium-clade is intermediate in sharing the unique CBC in position 868/909 with marine PS-subclades, but displaying the ancestral character state in positions 869/908, in congruence with the tree topology (Figure 2).
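The distinction between a compensatory base change and other changes in a helix pair can be illustrated with a minimal sketch. The function name, the set of accepted pairs (Watson-Crick plus G-U wobble) and the category labels are assumptions made for this illustration and do not reproduce the authors' procedure; the example change U-A → C-G is the one reported for pair 868/909 in Figure 2.

```python
# Pairs treated as "paired" in an rRNA helix (assumption: Watson-Crick plus G-U wobble).
PAIRING = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def classify_pair_change(ancestral, derived):
    """Classify the change between two base pairs, each given as a 2-tuple of RNA bases."""
    if ancestral == derived:
        return "unchanged"
    both_paired = ancestral in PAIRING and derived in PAIRING
    n_changed = sum(a != d for a, d in zip(ancestral, derived))
    if both_paired and n_changed == 2:
        return "CBC"               # both sides changed, pairing retained
    if both_paired and n_changed == 1:
        return "hemi-CBC"          # one side changed, pairing retained (e.g. G-C -> G-U)
    return "non-compensatory"      # pairing lost or absent in one of the two states

# Pair 868/909: U-A in the ancestral state, C-G in free-living alpha-cyanobacteria.
print(classify_pair_change(("U", "A"), ("C", "G")))
```

Applied column-pair by column-pair along an aligned helix, such a check separates pairs that changed on both strands from pairs that merely shifted to a wobble pair.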
Several cyanobacterial branches gained only moderate bootstrap support (e.g. branches 1, 2, 3, 5, 12 in Figure 1). One possible explanation is the base compositional bias among prokaryotic rDNAs (the Chi-square test gave p = 0.00) mainly caused by bacteria and plastids (cyanobacteria alone have no significant base compositional bias; p = 0.21). Therefore we performed distance analyses using the LogDet correction for unequal base composition (Additional File 1) that largely confirm results shown in Figure 1. Branch 5 (PS-clade without Paulinella) is even better supported (78%). The phylogenetic signal for branch 5 is confined to the 23S rRNA gene as shown by single-gene analyses (87-93% bootstrap in the 23S rRNA phylogeny), since this branch collapsed in 16S rDNA analyses, as also did the branches 9, 11, 12, and 14 (Additional File 1).
In conclusion, the rDNA data support monophyly of both marine PS-subclades (= α-cyanobacteria sensu [11] with proteobacterial form 1A RubisCO) to the exclusion of both the Cyanobium-clade and the Paulinella chromatophore, raising the question whether these taxa are affiliated with α- or with β-cyanobacteria (form 1B RubisCO), i.e. on which branch of the phylogenetic tree the horizontal gene transfer of form 1A RubisCO occurred. To answer this question, sequencing and phylogenetic analysis of RubisCO of the Cyanobium-clade and the Paulinella chromatophore was required.
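A Chi-square test of base-composition homogeneity across taxa, of the kind referred to above, can be sketched as a contingency-table calculation on base counts. This is a minimal illustration under stated assumptions, not the software used in the study: the function name and the toy sequences are hypothetical, and only the test statistic and degrees of freedom are computed (a p-value could be obtained from them, for example with scipy.stats.chi2.sf).

```python
from collections import Counter

def base_composition_chi2(seqs, alphabet="ACGT"):
    """Chi-square statistic and degrees of freedom for homogeneity of base
    composition across taxa. `seqs` maps taxon name -> sequence string."""
    counts = {name: Counter(s.upper()) for name, s in seqs.items()}
    row_tot = {name: sum(c[b] for b in alphabet) for name, c in counts.items()}
    col_tot = {b: sum(c[b] for c in counts.values()) for b in alphabet}
    grand = sum(row_tot.values())
    chi2 = 0.0
    for name, c in counts.items():
        for b in alphabet:
            expected = row_tot[name] * col_tot[b] / grand
            if expected > 0:  # skip bases that are absent from every taxon
                chi2 += (c[b] - expected) ** 2 / expected
    df = (len(seqs) - 1) * (len(alphabet) - 1)
    return chi2, df

# Hypothetical toy data: two taxa with identical composition give a statistic of 0.
print(base_composition_chi2({"taxon_a": "ACGT" * 5, "taxon_b": "TGCA" * 5}))
```

The statistic treats alignment sites as independent observations, which is a simplification; it is best read as a screening step that indicates whether a composition-tolerant method such as LogDet distances is worth applying.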
Figure 2. Synapomorphy support in the 23S rRNA for the sister-group relationship between Paulinella and free-living α-cyanobacteria. Shown is the alignment and secondary structure diagram of Helix 837 in the 23S rDNA, with two RNA base pairs highlighted that represent synapomorphies of α-cyanobacterial clades to the exclusion of Paulinella and other prokaryotes. Sequence data and evolutionary changes are plotted on a simplified phylogram (NJ-bootstrap consensus tree). Pair 868/909 shows a uniquely derived CBC (compensatory base change: U-A → C-G) of all free-living α-cyanobacteria; the neighbouring pair 869/908 changed in the common ancestor of the marine PS-subclades (marine Synechococcus and Prochlorococcus) whereas the Cyanobium-clade and Paulinella are plesiomorphic. Pair 869/908 shows parallel changes in a few other cyanobacteria (e.g. in Fischerella).

Horizontal gene transfer of a carboxysomal operon into the ancestor of the Paulinella chromatophore
In this study, complete rbcL (RubisCO large subunit) sequences of Paulinella and four members of the Cyanobium-clade were determined and integrated into a global alignment of form 1 RubisCO large subunit amino acid sequences. The phylogenetic analysis resolves Paulinella chromatophora and the Cyanobium-clade as monophyletic with marine Synechococcus and Prochlorococcus spp., in congruence with the rDNA phylogeny (Figure 3). The Paulinella chromatophore as well as the entire PS-clade belong to the RubisCO form 1A lineage, and consequently have to be considered as α-cyanobacteria. The position of the entire α-cyanobacterial clade (including Paulinella) in the rbcL tree is not congruent with the rDNA phylogeny: α-cyanobacteria are not monophyletic with the remaining cyanobacteria (β-cyanobacteria) but are nested within a radiation of α-, β- and γ-proteobacteria in the rbcL phylogeny (Figure 3). This incongruence reflects the horizontal gene transfer (HGT) of RubisCO form 1A from proteobacteria to α-cyanobacteria, and loss of the ancestral 'cyanobacterial' RubisCO form 1B [9].
In cyanobacteria, carboxysomes are essential for the carbon concentration mechanism (CCM; [19]). Physiological differences between form 1A and form 1B RubisCO and corresponding carboxysome types (α and β) are still not understood [19,20]. It has been suspected that the occurrence of α-carboxysomes is correlated with ecological restriction to marine open ocean habitats [19]. Clearly, this view is untenable, since the Cyanobium-clade, which predominantly contains freshwater species, as well as the Paulinella chromatophore also display α-carboxysomes.
Since α-cyanobacteria occur in a broad range of habitats, it is even more difficult to speculate about advantages of α-carboxysomes for survival in special ecological niches, in particular an endosymbiotic habitat (Paulinella). As the microenvironment of the chromatophore of Paulinella can presumably be characterized as CO2-rich due to host respiration, it may even be assumed that an efficient CCM may not be essential for the Paulinella chromatophore.
Regrettably, experimental data on the photosynthetic properties of the Paulinella chromatophore, especially the existence and effectiveness of a CCM, are not yet available.
A search for unique synapomorphies in rbcL amino acid sequences revealed 3 positions (AA 36, 59, 64 in the Paulinella sequence), which in Nitrococcus and all α-cyanobacteria share uniquely derived character states to the exclusion of all remaining proteobacterial and β-cyanobacterial sequences (Figure 4). In addition, we found two synapomorphies (positions 399, 405), which characterize the α-cyanobacterial form 1A RubisCO (Figure 4). These results highlight Nitrococcus as a key taxon for the HGT of RubisCO form 1A, being the closest known relative of the proteobacterial donor of rbcL.
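A search for unique synapomorphies of this kind can be sketched as a column-wise scan of an alignment: a column qualifies when all members of the group of interest share one state that occurs in no other sequence. The following is a minimal illustration with hypothetical names and a toy alignment; it does not use the actual rbcL data and is not the authors' procedure.

```python
def unique_synapomorphies(alignment, ingroup):
    """Return 1-based alignment columns in which all ingroup taxa share a single
    state (not a gap) that is absent from every taxon outside the ingroup.
    `alignment` maps taxon name -> aligned sequence of equal length."""
    length = len(next(iter(alignment.values())))
    hits = []
    for col in range(length):
        in_states = {seq[col] for name, seq in alignment.items() if name in ingroup}
        out_states = {seq[col] for name, seq in alignment.items() if name not in ingroup}
        if len(in_states) == 1 and "-" not in in_states and not (in_states & out_states):
            hits.append(col + 1)
    return hits

# Hypothetical toy alignment (not real rbcL sequences).
toy = {
    "ingroup_1": "MSKAY",
    "ingroup_2": "MSKAY",
    "ingroup_3": "MSRAY",
    "outgroup_1": "MTKGF",
    "outgroup_2": "MTKAF",
}
print(unique_synapomorphies(toy, {"ingroup_1", "ingroup_2", "ingroup_3"}))  # [2, 5]
```

Column numbers refer to the alignment; converting them to positions in one reference sequence (such as the Paulinella numbering used above) would require subtracting the gaps preceding each column in that sequence.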
Another HGT event is responsible for the well-known polyphyly of plastids in rbcL phylogenies [9,21,22]. Chloroplasts and cyanelles are rooted in the β-cyanobacteria in congruence with ribosomal phylogenies (Figure 1). In contrast, rhodoplasts are nested within the RubisCO form 1D clade of α-, β- and γ-proteobacteria. With high significance, the analysis reveals Nitrosospira sp., Nitrosococcus oceani and Nitrosospira multiformis as a sister branch to rhodoplasts (the latter two already described in [21]).
In the α-cyanobacteria, the genes encoding RubisCO form 1A belong to an operon that further contains genes for carboxysomal proteins, and previous studies revealed that the complete carboxysomal operon was acquired by HGT [10,11]. Among proteobacteria, gene content as well as gene order in rbcL-containing operons differs considerably [23]. Several proteobacteria have two or even three unrelated rbcL genes (see for example Nitrobacter, Hydrogenovibrio, Thiomicrospira in Figure 3; [23,24]). In contrast, cyanobacteria generally have only one rbcL gene. In the present study, we determined the sequence of the complete carboxysomal operon of Paulinella chromatophora (7.6 kb), and compared the gene arrangement among members of the RubisCO form 1A clade (Figure 5).
Figure 3. Evidence for HGT of RubisCO form 1A to the common ancestor of the Paulinella chromatophore and α-cyanobacteria. Greek letters in grey indicate α-, β-, or γ-proteobacteria. Arrowheads highlight strains for which the gene arrangement of the carboxysomal operon is shown in Figure 5.
We found four major arrangement types. To analyse the evolution of these types, we plotted operon structures against a simplified rbcL tree as shown in Figure 5 (for more details, see Additional File 2). Basal branches of the RubisCO form 1A radiation (e.g. Thiomicrospira, Nitrococcus) show an almost identical operon architecture, which likely represents the plesiomorphic state. In this type (cso-type), the following genes occur downstream of rbcL: rbcS, csoS2, csoS3, pepA, pepB, csoS1, csoS1, csoS1, and the gene for the iron storage protein bacterioferritin (bfr) [25]. The operon in the α-cyanobacteria (α-cyano-cso-type) was derived from the ancestral state by transfer of a single csoS1 gene to the 5' end of the operon, accompanied by a reduction of the number of 3' located csoS1 copies to one or zero. Interestingly, two members of the α-cyanobacteria still contain the 3' bacterioferritin gene (bfr): the Paulinella chromatophore and Synechococcus WH5701, the latter representing the Cyanobium-clade (data for the remaining Cyanobium-clade are currently missing). In the marine PS-subclades, bfr is absent (Figure 5; Additional File 2). Comparison with the ribosomal phylogeny (Figure 1) suggests that bacterioferritin was acquired by the same HGT event as the carboxysomal operon, and was secondarily lost in marine Synechococcus and Prochlorococcus species. We addressed the bfr HGT hypothesis by performing a phylogenetic analysis with α- and β-cyanobacterial ferritins and their proteobacterial relatives (Additional File 3).
As in the rbcL-phylogeny, the bacterioferritins of Paulinella and WH5701 are monophyletic with Nitrococcus mobilis as their closest relative, and were nested within proteobacteria with carboxysomal operons, clearly proving co-transfer of bacterioferritin with the carboxysomal operon. Neither nonheme-ferritins of marine Synechococcus/Prochlorococcus-species nor the ferritin genes of β-cyanobacteria show any relationship to bfr of Paulinella and WH5701 (for details, see Additional File 3). In γ-proteobacteria including Nitrococcus, and in Prochlorococcus, the next gene downstream of the carboxysomal operon is a putative pterin-4α-carbinolamine dehydratase (Additional File 2); this gene is also present in the remaining α-cyanobacteria, but in those taxa is not linked to the carboxysomal operon. BLAST searches [26] reveal their homology, suggesting that besides the carboxysomal operon and bfr even more genes may have been acquired by the same HGT event.
In parallel to the α-cyano-cso-type, another proteobacterial gene arrangement type is derived from the cso-type by the acquisition of a cbbR gene upstream of rbcL, coded by the opposite strand (Figure 5). This type is therefore here named cbbR-cso-type (e.g. Nitrosomonas eutropha, Figure 5). Though not co-transcribed with the carboxysomal genes, CbbR is linked to this operon by its specific function as a transcriptional activator [27]. Finally, the most derived gene arrangement type is nested within the cbbR-cso-type, and is named cbbRLSQO-type. The first three genes, cbbR, rbcL, and rbcS, remained unchanged whereas all carboxysomal shell proteins and bacterioferritin were lost and replaced by the genes cbbQ and cbbO, which are absent in the remaining three types of operon structures (Figure 4). cbbQ and cbbO have been shown to enhance RubisCO activity and stability [28,29].
Notably, each synapomorphic change leading to the three evolutionary derived gene arrangement types corresponds to a single branch/clade in the phylogenetic tree based on rbcL sequence data (Figure 3). This congruence provides additional credibility for the rbcL tree, including one branch without any bootstrap support that combines all taxa characterized by the cbbRLSQO-type (Figures 4, 5). However, the loss of bacterioferritin in the marine Synechococcus and Prochlorococcus clades, which can also be assumed to be a synapomorphic change, cannot be traced to a single branch in the rbcL tree due to low resolution among α-cyanobacteria. Although the four major lineages (Paulinella, Cyanobium-clade, marine Synechococcus- and Prochlorococcus-clades) are recovered, relationships between these lineages remain unresolved (Figure 3). Based on the rDNA phylogeny, it appears likely that bacterioferritin was uniquely lost in the common ancestor of marine α-cyanobacterial clades as a single synapomorphic change (branch 9 in Figure 1). In general, the branching order within the α-cyanobacteria shows no significantly supported conflict between rDNA and rbcL phylogenies, and thus, both data sets were used separately and in combination to resolve phylogenetic relationships within the α-cyanobacteria.
Figure 4. Unique synapomorphies highlighting the HGT of RubisCO from a Nitrococcus-like γ-proteobacterium to the common ancestor of the Paulinella chromatophore and free-living α-cyanobacteria.
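The four arrangement types described in the text can be expressed as simple rules on an ordered gene list. The sketch below is an illustration only: the function name and the rule order are assumptions, strand information is ignored (cbbR is encoded on the opposite strand), and the example gene lists are assembled from the verbal description above rather than copied from Figure 5.

```python
def classify_operon(genes):
    """Assign an ordered list of gene names (5' to 3') to one of the four
    arrangement types described in the text, or return "unknown"."""
    if "rbcL" not in genes:
        return "unknown"
    upstream = genes[:genes.index("rbcL")]
    has_shell = any(g.startswith("csoS") for g in genes)  # carboxysomal shell genes
    if "cbbQ" in genes and "cbbO" in genes and not has_shell:
        return "cbbRLSQO-type"
    if "cbbR" in upstream and has_shell:
        return "cbbR-cso-type"
    if "csoS1" in upstream:
        return "alpha-cyano-cso-type"
    if has_shell:
        return "cso-type"
    return "unknown"

# Illustrative gene orders (assumed, following the description in the text).
cso = ["rbcL", "rbcS", "csoS2", "csoS3", "pepA", "pepB", "csoS1", "csoS1", "csoS1", "bfr"]
alpha_cyano = ["csoS1", "rbcL", "rbcS", "csoS2", "csoS3", "pepA", "pepB", "bfr"]
print(classify_operon(cso), classify_operon(alpha_cyano))
```

Writing the types as explicit rules makes the proposed order of derivation visible: each derived type differs from the cso-type by one gain, loss or relocation that can be mapped onto a single branch of the rbcL tree.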

Phylogenetic resolution of the Synechococcus/Prochlorococcus-clade with the concatenated dataset
In Figure 6, three phylogenetic analyses of P. chromatophora and 17 taxa of Synechococcus and Prochlorococcus using three different datasets are compared: (A) rbcL nucleotide sequences, (B) complete rDNA operon sequences, and (C) concatenated rbcL and rDNA sequences. Prior to phylogenetic analyses, nucleotide frequencies of rbcL and rDNA sequence data were determined to prevent artefacts caused by base compositional bias. Whereas the base composition of rbcL codon positions one and two was homogeneous across taxa (Figure 7A), the third codon position revealed strong differences between AT-rich taxa (Paulinella and Prochlorococcus strains: ca. 50-80%) and AT-poor taxa (Synechococcus strains: ca. 10-30%) (Figure 7B). Therefore, only first and second codon positions were used for phylogenetic analyses of the rbcL gene.

Figure 5. Architecture and evolution of operons containing form 1A RubisCO from proteobacteria and α-cyanobacteria including the Paulinella chromatophore. Gene arrangements from selected taxa (see arrowheads in Figure 3) are plotted against a simplified phylogram based on RubisCO amino acid sequences. Four major types of gene arrangements can be distinguished (for details, see text). The operon of Paulinella is a member of the α-cyano-cso-type, which is derived from the ancestral cso-type present in proteobacteria, providing evidence for an HGT of the complete operon. Homologous genes share the same colour. Abbreviations: Carboxysomal shell proteins 1, 2, 3 (csoS1, 2, 3); RubisCO large and small subunit (rbcL, rbcS = cbbL, cbbS); carboxysomal peptides A, B (pepA, pepB); bacterioferritin (bfr); LysR-type transcriptional activator (cbbR); putative RubisCO activation proteins (cbbQ, cbbO); hypothetical proteins (hypo). Dotted lines indicate that no data are available.

Figure 6. Comparison of phylogenetic relationships among Paulinella and free-living α-cyanobacteria inferred by rbcL and/or rDNA nucleotide sequence data. A. Unrooted analysis of codon positions 1+2 of the rbcL gene. B. Phylogeny of the rDNA operon, using more aligned positions than in Figure 1 (4317 vs. 4126 characters). C. Phylogeny inferred from concatenated rbcL and rDNA sequences. Tree topologies resulted from ML analyses using a GTR+I+Γ model; significance values shown as in Figure 1.
Both rbcL and rDNA phylogenies recover the monophyly of three major clades: Cyanobium-clade, marine Synechococcus-clade and Prochlorococcus-clade. However, rbcL data fail to recover the Cyanobium-clade with significance, probably due to the long-branched Synechococcus rubescens (see also the amino acid analysis; Figure 3). On the other hand, Prochlorococcus displays a much longer basal branch in the rbcL analysis compared to rDNA, due to a higher number of synapomorphies, accompanied by a considerably higher bootstrap support (99-100% in rbcL compared to 73-97% in rDNA). The higher-level order of the major clades remains unresolved in the rbcL tree, whereas the rDNA analysis recovers the sister-group relationship of marine Synechococcus and Prochlorococcus clades (88-99% bootstrap; see also Figure 1). Within Prochlorococcus, relationships among strains are resolved almost congruently by rbcL and rDNA data: MIT9313 appears as the basal divergence, followed by strains NATL2A, CCMP1375 and MIT9213 (relationships among these strains are not resolved), and a derived, long-branched clade consisting of MIT9312 and CCMP1986. In the remaining clades (Cyanobium-clade and marine Synechococcus-clade), the branching pattern of most strains remains unresolved by rbcL analyses, due to their almost redundant rbcL sequences. In contrast, the higher rDNA sequence diversity allows a better resolution in these lineages. For instance, Synechococcus rubescens, a single long branch in the rbcL tree, is confidently identified as sister of PCC 7001 by rDNA data. Similarly, Synechococcus strain RCC 556, which has an almost identical rbcL sequence as three other strains (CC9605, WH7805, WH8102), is resolved as the basal divergence of the marine Synechococcus clade by rDNA analyses. 
The concatenated data set (5253 nucleotide characters) is largely congruent with the rDNA tree, albeit with even better bootstrap support for the monophyly of the major clades (especially Prochlorococcus and marine Synechococcus). However, in a few cases (e.g. position of Synechococcus rubescens, position of Synechococcus CC9311) the combined analysis showed slightly reduced support in comparison with the rDNA phylogeny (Figure 6B, 6C).
In the phylogenetic analyses performed in this study, we found no specific and robustly supported relationship between Paulinella and any of the three major clades of free-living α-cyanobacteria. RbcL data fail to resolve the monophyly of the Cyanobium-clade to the exclusion of Paulinella with significance, irrespective of whether amino acid or nucleotide data are analyzed (Figures 3, 6A), in contrast to the remaining phylogenies that receive high significance for this clade (Figures 1, 6B, 6C). Although the prokaryote-wide rDNA phylogeny (Figure 1) uses fewer positions than the unrooted rDNA phylogeny of the α-cyanobacteria (Figure 6B; 4126 vs. 4317 characters), we recognize almost complete congruence between the two phylogenies among α-cyanobacteria. In addition, our tree topologies corresponded almost completely to phylogenetic analyses of the 16S-23S rDNA spacer ('ITS'; 233 aligned positions) from 57 free-living α-cyanobacteria (Figure 4A in [30]). Probably due to the much higher number of aligned/variable (4317/540) positions in the rRNA and tRNA genes, our analyses gained high significance values for most basal and internal branches that were largely non-significant in ITS-phylogenies [30], e.g. the branch uniting all Prochlorococcus strains including the basal divergence MIT9313. Other phylogenies that included only a few α-cyanobacteria had often failed to position strain MIT9313 as monophyletic with other Prochlorococcus strains (e.g. [1,31,32]), even in multigene analyses of complete cyanobacterial genomes [31,32]. Superiority of a 233-character analysis over whole-genome phylogenies may highlight the importance of a sufficient taxon sampling.
Moreover, phylogenetic information content of protein gene data may be impaired by unequal base composition and amino acid frequencies across taxa: whereas derived Prochlorococcus strains have a high AT-content and show a preference for amino acids encoded by AT-rich codons, base composition and amino acid usage of Prochlorococcus MIT9313 are more similar to Synechococcus strains with lower AT-content [33], explaining artificial tree topologies. In the more conserved rbcL gene, the AT bias is confined to the third codon positions (Figure 7A, 7B), and thus, phylogenies using only the first and second positions are not affected by base compositional bias. In fact, our rbcL tree does not contradict analyses using rDNA data, which are known to have a more balanced base composition across taxa ([33], and our own results). Interestingly, Paulinella shows the same tendency towards high AT-content as known for Prochlorococcus, as evident from third rbcL codon positions (Figure 7B) as well as the complete carboxysome operon (Figure 7C). An elevated AT content, accompanied by genome size reduction, is a highly unusual phenomenon for free-living organisms, and in the case of Prochlorococcus, was interpreted as adaptation to oligotrophic marine environments with low nitrogen and phosphorus availability [33]. In contrast, endosymbionts or intracellular pathogens are known to tend to evolve towards AT-rich genomes (e.g. [34][35][36][37]). We suspect that the complete genome of the Paulinella chromatophore is characterized by high AT content since not only the carboxysomal operon (this study) but also two DNA fragments of 9.4 kb and 4.3 kb (see Figure 1 in [6]) support this view.
Together with its isolated position in molecular phylogenies ([1,6], and this study), the elevated AT content further indicates that the chromatophore of Paulinella has significantly diverged from its free-living ancestor, and undergone typical steps in the evolution of an intracellular symbiont such as genome reduction.
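Concatenating two alignments into a combined data set amounts to joining the aligned sequences taxon by taxon. The following minimal sketch assumes that each alignment is held as a mapping from taxon name to aligned sequence and that only taxa present in every alignment are retained; the function name and the toy data are hypothetical and do not reflect the software used in the study.

```python
def concatenate_alignments(*alignments):
    """Join alignments end to end for the taxa shared by all of them.
    Each alignment maps taxon name -> aligned sequence."""
    shared = set(alignments[0])
    for aln in alignments[1:]:
        shared &= set(aln)
    return {taxon: "".join(aln[taxon] for aln in alignments) for taxon in sorted(shared)}

# Hypothetical toy alignments standing in for rbcL (positions 1+2) and the rDNA operon.
rbcl = {"taxon_x": "ACG", "taxon_y": "ACA"}
rdna = {"taxon_x": "TT", "taxon_y": "TC", "taxon_z": "GG"}
print(concatenate_alignments(rbcl, rdna))
```

Restricting the result to shared taxa avoids padding with missing data; an alternative design would keep all taxa and fill absent partitions with gap characters.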
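The codon-position comparison discussed in this section (Figure 7A, 7B) can be illustrated by computing AT content separately for the three codon positions of a coding sequence and by extracting first and second positions for phylogenetic analysis. This is a minimal sketch with hypothetical function names and a toy sequence; it assumes an in-frame coding sequence without introns and is not the authors' pipeline.

```python
def at_content_by_codon_position(cds):
    """Return the AT fraction at codon positions 1, 2 and 3 of an in-frame
    coding sequence. Ambiguous bases and gaps are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # drop an incomplete trailing codon
    fractions = []
    for pos in range(3):
        bases = [b for b in cds[pos:usable:3] if b in "ACGT"]
        at = sum(b in "AT" for b in bases)
        fractions.append(at / len(bases) if bases else float("nan"))
    return tuple(fractions)

def first_and_second_positions(cds):
    """Return the sequence with every third codon position removed."""
    usable = len(cds) - len(cds) % 3
    return "".join(cds[i:i + 2] for i in range(0, usable, 3))

# Toy sequence: three alanine codons that differ only at the third position.
toy_cds = "GCAGCTGCA"
print(at_content_by_codon_position(toy_cds))   # (0.0, 0.0, 1.0)
print(first_and_second_positions(toy_cds))     # GCGCGC
```

In the toy sequence all variation in AT content sits at third positions, so the reduced sequence is compositionally uniform, which mirrors the rationale for restricting the rbcL nucleotide analysis to positions one and two.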

Conclusion
The basal divergence of the Paulinella chromatophore as sister to free-living α-cyanobacteria was revealed by phylogenetic analysis of the complete rDNA operon with an extended taxon sampling, especially by addition of the Cyanobium-clade. Paulinella and free-living α-cyanobacteria share a proteobacterial carboxysomal operon with a form 1A RubisCO, indicating that the HGT of the carboxysomal operon predated the divergence of the Paulinella chromatophore. The γ-proteobacterium Nitrococcus mobilis was identified as the closest known relative to the proteobacterial donor of the carboxysomal operon. The isolated position of Paulinella among α-cyanobacteria in molecular phylogenies as well as the elevated AT content of several of its genes indicates that Paulinella has already undergone typical steps in reductive genome evolution associated with an intracellular lifestyle.

[39]. Paulinella chromatophora M0880: Culture collection Melkonian, University of Cologne, Germany; Cryptomonas curvata CCAC 0006: Culture Collection of Algae at the University of Cologne, Germany [40].

DNA extraction, PCR and sequencing
Complete genomic DNA was extracted using a CTAB protocol. PCR primers for amplification of the rDNA operon, and sequencing methods were described previously [1]. For amplification of the full length rbcL gene from Paulinella chromatophora and Synechococcus strains, α-cyanobacteria-specific primers were designed using an alignment of cyanobacterial rbcL and surrounding genes. PCR primers bind in the neighbouring genes of rbcL: csoS1 and rbcS. Primer sequences: Al_csoS1_F2: (GARGCWGCWGAYGCHATGACCAAGG) and Al_rbcS_R1: (TGRTCGTADATYTCKTCCTGGGTCAT-