An intriguing example of an obligate intracellular symbiotic interaction is the cyanobacterium-diatom symbiosis found in Rhopalodia gibba. Here the symbiont (spheroid body) can fix nitrogen for its eukaryotic host, and we have hypothesised that this capacity has been a driving force for establishing the intracellular endosymbiotic relationship. The spheroid body of Rhopalodia gibba provides an opportunity to investigate changes in endosymbiont physiology and genome evolution during adaptation of a symbiont to an intracellular environment.
Previous studies have reported changes in the genomes of bacteria following the development of symbiotic relationships. In bacteria that are thought to have recently or transiently become symbiotic, changes include the occurrence of multiple transposable elements and deletions of important components of recombinational DNA repair mechanisms. In longer established symbiotic and parasitic eukaryote-bacterium interactions, significant gene losses have been observed, and these have been accompanied by a reduction of genome size and the generation of AT-rich genomes [3, 6, 24]. The changes that have occurred in the spheroid body's genome cannot be categorised as an obvious example of either the former or the latter relationship. For example, the spheroid body's genome encodes several transposase genes, all with disrupted reading frames, indicating that these are pseudogenes. This finding is consistent with stability of the diatom-spheroid body endosymbiosis and a long-term host-endosymbiont interaction, which can be traced back to the Miocene. Contrasting with the occurrence of transposase pseudogenes is evidence suggesting a functional DNA repair system in the spheroid body's genome, a finding more consistent with a relatively young endosymbiotic relationship. In nearly all intracellular bacteria studied to date, at least one of the genes encoding the DNA repair proteins RecA and RecF has been eliminated, and it is thought that this loss might be necessary to facilitate restructuring of the symbiont genome (although an exception has been reported). In the spheroid body's genome both recA and recF are present and have intact open reading frames. Thus the genome modifications that we report for the spheroid body's genome have all occurred against a background of a presumably intact DNA repair system.
These modifications suggest that selective pressure on certain genes has changed upon establishment of the interaction, and the challenge is to understand their potential relevance for necessary and redundant functions in an obligate endosymbiotic relationship. For example, gene truncations, as detected in nifU, would remove genes redundant for diazotrophic growth, and such deletion might be an early event in genome reduction of the symbiont. A subsequent or perhaps parallel step would include the inactivation of genes whose products are no longer needed for the initial symbiotic association. Several scenarios can be hypothesised for how this occurs: inactivation of genes by deleterious mutations, resulting in the accumulation of pseudogenes, or loss of genome fragments by deletion of larger DNA portions via rearrangements. Another hypothesis posits a "domino-effect" in which initial pseudogenisation triggers subsequent large-scale gene loss. In this scenario, random pseudogenisation might inactivate a pathway through mutation of a single essential factor, followed by large-scale deletion of the other genes involved in this pathway. In each case, the selective pressure would differ for genes coding for different functions, and loss would depend on whether the function could be compensated by other genes in the endosymbiont or host cell genome. In the latter case, as in highly adapted interactions, signal-dependent transport of the protein from the host cytoplasm to the endosymbiont would be necessary.
We detected several examples of the disruption of coding regions by mutations (Table 1), in which the original gene is still detectable by analysis of all three possible frames. These include pseudogenisation of fdxN (fdxN*), a gene which has been found to be non-essential for nitrogen fixation in Anabaena variabilis, and several other genes on the spheroid body's genome fragments that we have sequenced (Table 2, Figure 1). Such observations provide further evidence that pseudogenisation of genes that are non-essential for an endosymbiotic life-style is an important feature in the early reductive genome evolution of obligate intracellular cyanobacteria. Gene loss through independent DNA deletion events could also be inferred from comparative analysis of the spheroid body's genome fragment; among these is the deletion of factors conserved in diverse cyanobacterial lineages (cyl 0012, cyl 0019). Due to elimination of the immediate DNA region, these modifications have led to a localised increase in gene density. In one extreme case, deletion has produced a fusion of non-adjacent genes on the endosymbiont's genome (sbl 0010). In other cases of gene deletion, genes have been removed and replaced with non-coding sequence that is much higher in AT content than the coding regions (Figure 6). It is unclear whether this difference in composition reflects a shift in substitutional bias favouring A and T residues, and/or whether an existing bias becomes more apparent in de novo regions that are under reduced structural or functional constraint. In either event, the existence of these AT-rich non-coding regions suggests that pseudogenisation and DNA deletion are not inevitably linked events in a sequential process of degenerative genome evolution in spheroid bodies. However, non-coding regions are rare in genomes of free-living bacteria. Since DNA can be introduced in several ways into prokaryotic genomes, their compactness is maintained by the deletion of harmful DNA.
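The comparison of AT content between coding and non-coding regions can be illustrated with a minimal sketch. It is not the procedure used to produce Figure 6; the function names, the region coordinates and the toy sequence are hypothetical and serve only to show how the AT fraction of annotated regions might be compared.

```python
from typing import Dict, Tuple


def at_content(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / total


def at_content_by_region(
    genome: str, regions: Dict[str, Tuple[int, int]]
) -> Dict[str, float]:
    """AT fraction per named region; coordinates are 0-based, half-open (start, end)."""
    return {
        name: at_content(genome[start:end])
        for name, (start, end) in regions.items()
    }


# Toy sequence (hypothetical, not spheroid body data): an AT-only stretch
# followed by a GC-richer stretch.
toy_genome = "ATATATAT" + "GCGCATGC"
toy_regions = {"non_coding": (0, 8), "coding": (8, 16)}
toy_result = at_content_by_region(toy_genome, toy_regions)
```

Ambiguous bases such as N are excluded from the denominator, so that low-quality stretches in fosmid sequences do not bias the estimate in either direction.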
Given the intracellular existence of spheroid bodies, it is possible that their genome is less exposed and less susceptible than those of free-living bacteria to the introduction of foreign DNA through horizontal gene transfer and lysogenic bacteriophages. If so, processes excluding non-coding DNA and pseudogenes from the spheroid body's genome may well be less efficient than those operating in free-living bacteria. Such a hypothesis might help explain the greater extent of non-coding DNA and pseudogenised genes in the spheroid body's genome. Increased mutation rates, thought to be associated with reductive genome evolution, would contribute to the accumulation of these genome features. The genome modifications observed in the spheroid body are in some respects comparable to those of Sodalis glossinidius. A large fraction (49%) of the Sodalis genome is composed of non-coding DNA that has accompanied reductive genome evolution. Moreover, the Sodalis chromosome contains many unusual pseudogenes. The spheroid body's genome differs from that of Sodalis in its generally higher AT content.
The diverse features of reductive genome evolution in obligate intracellular symbionts (and pathogens) include a significant reduction of overall genome size in these organisms. However, the experimental determination of the spheroid body's genome size using standard molecular techniques is difficult due to the extreme stability of the host-spheroid body interaction and the limited amount of intact and purified endosymbionts that can be obtained from R. gibba. In a recent study on the dynamics of reductive evolution, exponential relationships were inferred between genome size and SSU rDNA GC content in mitochondria, free-living and obligate intracellular bacteria. Based on the model these authors propose, and using previously published 16S sequence data, we have estimated that the genome size of spheroid bodies is approximately 2.6 Mb. The genome size of free-living Cyanothece sp. CCY0110 is 5.8 Mb. Hence, if our estimate of the spheroid body's genome size is accurate, it suggests that reduction has produced a genome currently similar in size to that of Synechococcus (2.2–2.6 Mb), and may indicate that the endosymbiosis is still at an early stage of development.
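The form of such an estimate can be sketched as follows. The text states only that the relationship between genome size and SSU rDNA GC content is exponential; the specific functional form S = a · exp(b · GC), the parameter names a and b, and the function names below are assumptions made for illustration. The fitted coefficients are not given here and would have to be taken from the cited model, so no default values are supplied.

```python
import math


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


def genome_size_from_gc(gc: float, a: float, b: float) -> float:
    """Evaluate an assumed exponential model S = a * exp(b * gc).

    `a` and `b` are placeholders for fitted coefficients from the published
    model; they are NOT provided in this text and must be supplied by the caller.
    The returned size is in whatever unit `a` is expressed in.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    return a * math.exp(b * gc)
```

In use, `gc_fraction` would be applied to the 16S rDNA sequence of the spheroid body and the result passed to `genome_size_from_gc` together with the published coefficients. The resulting value depends entirely on those coefficients and on the quality of the fit, so it should be read as a rough indication rather than a measurement.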
Our comparative analyses of fosmid sequences from the spheroid body's genome indicate that the photosynthetic genes psbC and psbD have been inactivated by mutation in the endosymbiont genome. These gene products are essential factors in the photosynthetic light reaction of photosystem II. According to the "domino-effect" hypothesis, initial loss of components such as PsbC and PsbD is expected to lead to mass deletion of other genes involved in photosynthetic light reactions. Consistent with this prediction, additional photosynthetic factors that occur in closely related cyanobacteria are either absent (e.g. the cytochrome PetJ and the plastocyanin precursor PetE) or appear as a non-functional pseudogene (e.g. the flavodoxin gene fldA*) in the spheroid body's genome.
Aside from gene loss resulting from reductive genome evolution, the absence of certain genes within the analysed genome region could also be explained by gene duplications or rearrangements. Without the complete sequence of the spheroid body's genome we cannot exclude the possibility that, following duplication, pseudogenisation has affected copies of some genes within the analysed genome region while functional copies are retained elsewhere. However, the phenotypic loss of photosynthetic pigmentation indicates a complete loss of at least one essential factor of photosynthesis from the endosymbiont's genome. In addition, PCR analysis did not identify intact psbC and psbD genes anywhere else in the spheroid body's genome (Figure 7).
The diverse modifications in the analysed spheroid body's genome fragment are not equally distributed over the whole sequence but accumulate downstream of the conserved nif gene region (Figures 1 and 5). This skewed distribution of degenerative modifications possibly reflects purifying selection acting across this genome region during the molecular adaptation process. Aside from the mutation of fdxN*, whose product is not essential for nitrogen fixation, and the truncation of nifU, all genes encoding proteins for nitrogen fixation are conserved in the region without signs of degenerative genome evolution. This conservation of nif genes is consistent with the hypothesis that molecular nitrogen fixation has been an important driving force for the endosymbiotic interaction.
It can be expected that endosymbiont and host biochemistry will change with the development of the symbiotic interaction. Genes whose products become superfluous for symbiont-host coexistence are expected targets for mutation. At earlier stages in the accumulation of deleterious mutations, homologues will still be identifiable by BLAST homology searches. Table 2 lists many pseudogenes that may fit this category.