Tic62: a protein family from metabolism to protein translocation

Background: The function and structure of protein translocons at the outer and inner envelope membrane of chloroplasts (Toc and Tic complexes, respectively) are a subject of intensive research. One of the proteins that have been ascribed to the Tic complex is Tic62. This protein was proposed as a redox sensor protein and may act as a regulator during the translocation process. Tic62 is a bimodular protein that comprises an N-terminal module, responsible for binding to pyridine nucleotides, and a C-terminal module, which serves as a docking site for ferredoxin-NAD(P)-oxidoreductase (FNR). This work focuses on evolutionary analysis of the Tic62-NAD(P)-related protein family, derived from the comparison of all available sequences, and discusses the structure of Tic62.
Results: Whereas the N-terminal module of Tic62 is highly conserved among all oxyphototrophs, the C-terminal region (FNR-binding module) is only found in vascular plants. Phylogenetic analyses classify four Tic62-NAD(P)-related protein subfamilies in land plants, closely related to members from cyanobacteria and green sulphur bacteria. Although most of the Tic62-NAD(P)-related eukaryotic proteins are localized in the chloroplast, one subgroup consists of proteins without a predicted transit peptide. The N-terminal module of Tic62 contains the structurally conserved Rossmann fold and probably belongs to the extended family of short-chain dehydrogenases-reductases. Key residues involved in NADP-binding and residues that may attach the protein to the inner envelope membrane of chloroplasts or to the Tic complex are proposed.
Conclusion: The Tic62-NAD(P)-related proteins are of ancient origin since they are not only found in cyanobacteria but also in green sulphur bacteria. The FNR-binding module at the C-terminal region of the Tic62 proteins is probably a recent acquisition in vascular plants, with no sequence similarity to any other known motifs. The presence of the FNR-binding domain in vascular plants might be essential for the function of the protein as a Tic component and/or for its regulation.


Background
Chloroplasts, together with mitochondria, are the major energy producers in all eukaryotic photosynthetic organisms. The endosymbiotic theory proposes a prokaryotic origin for plastids and mitochondria. During the endosymbiotic process a host cell engulfed distinct ancestral bacteria. Part of the genomes of these endosymbiotic bacteria has been retained and, as a result, plastids and mitochondria are the only organelles in the cell containing their own genome. However, while the chloroplast genome is composed of about 120 genes, its proteome is estimated to consist of about 3000 proteins [1]. The development of highly specific organellar transport mechanisms was thus a response to the need to re-import the gene products and to guarantee optimal communication between cells and organelles.
Unlike the different protein transport systems in thylakoids, the protein import machinery in the outer/inner envelope membrane of chloroplasts does not show obvious homology to any bacterial secretion system [6]. This is hardly surprising since the bacterial systems were required for thylakoids and therefore a new transport machinery had to be developed in the host cell to maintain the specificity for chloroplast communication. However, sequence analyses indicated that certain components of the translocons in chloroplasts are of bacterial origin. Besides, there is a parallelism in the chaperone system required in some transport stages [6]. The translocation channel in the outer envelope membrane, Toc75, is related to outer membrane proteins involved in the transport or integration of proteins in Gram-negative bacteria [7,8]. Tic20, which is discussed to constitute part of the protein-conducting channel, shares sequence similarities to bacterial amino acid transporters [9]. Other subunits might have been recruited and adapted as they show homology to bacterial proteins not related to transport processes. Tic22, which is thought to mediate the interaction of the Toc and Tic complexes during import, has cyanobacterial counterparts with unknown function and is proposed to be localized in the thylakoid lumen [10]. Some cyanobacterial proteins contain cofactor-binding motifs similar to those found in Tic62, Tic55 and Tic32. Tic55 contains a Rieske iron-sulphur centre and a mononuclear iron-binding site [11], and Tic62 and Tic32 each have a NAD(P)-binding motif [12,13]. No prokaryotic counterparts have been detected by direct sequence comparison for the other subunits that compose the translocons, which may indicate that they have evolved from the proteome of the ancestral host to fulfil specific functions demanded after the development of plastids and to ensure the specificity of the transport process in the outer/inner envelope membranes of chloroplasts. 
Genome-wide analyses had shown that some subunits of the translocons (Toc75, Toc159, Toc34, Tic20) are encoded by more than one gene in Arabidopsis thaliana [14,15]. Experimental data derived from analyses of the isoforms of the Toc complex revealed that the different members associate with structurally and developmentally distinct import complexes. Four homologues compose the Toc75 family in Arabidopsis thaliana (atToc75-I, atToc75-III, atToc75-IV and atToc75-V). The gene encoding the functional orthologue of Toc75 from Pisum sativum, atToc75-III, is essential for the viability of plants from the embryonic stage. This is not the case for atToc75-IV, which could play a role during growth in the dark. It seems that atToc75-I is in fact a pseudogene [16]. The function and relation with the Toc machinery of atToc75-V is still a matter of intensive study. In the case of Toc34 and Toc159 families, two (atToc33 and atToc34) and four (atToc159, atToc132, atToc120 and atToc90) isoforms are identified in the Arabidopsis genome, respectively. Whereas atToc33 associates preferably with atToc159, atToc34 does with atToc132/atToc120 and this association is likely related to the import of photosynthetic and non-photosynthetic precursors, respectively [17,18]. Four homologues are identified for Tic20 in Arabidopsis and only two of them contain a predicted transit peptide. However, the function and subcellular localization of the two Tic20 homologues with non-predicted transit peptide are still unknown [14].
In spite of the wealth of information about the Toc complex, less is known about the phylogenetic relationships of the Tic complex subunits. Here we take a closer look at the structure, function and evolution of one component of the Tic complex, Tic62. The N-terminal module of Tic62 has a conserved NAD(P)-binding site and its C-terminal region was found to interact with ferredoxin-NAD(P)-oxidoreductase (FNR) [12]. Homology searches and phylogenetic analyses show that the N-terminal domain is highly conserved among all oxyphototrophs and green sulphur bacteria. However, the C-terminal region (FNR-binding domain) is only found in vascular plants. Phylogenetic analyses indicate that there are four groups of Tic62-NAD(P)-related proteins in land plants. The first group is orthologous to the reported Tic62 from pea [12]. The physiological roles of the Tic62-NAD(P)-related proteins in the cell remain to be shown.

Results and discussion
Tic62, a protein of 62 kDa that is part of the Tic complex, has been proposed to function as a sensor protein that regulates protein import into chloroplasts by sensing and reacting to the redox state of the organelle. So far the only Tic62 protein studied is that from Pisum sativum [12]. This protein was found to have two functional modules: the N-terminus was shown to bind pyridine nucleotides and the C-terminal region interacts with FNR. The FNR-binding module consists of a repetitive, highly conserved KPPSSP motif. One or two transmembrane helices were proposed for the pea sequence and both the N- and the C-terminus seem to face the stroma [12].
Excluding the transit peptide, psTic62 consists of 470 residues. A BLAST search against the protein databases with psTic62 [Swiss-Prot:Q8SKU2] as a template returned several sequences, of which only two correspond to the full-length form of the mature psTic62: one from Arabidopsis thaliana [GenBank:NP_188519] and another from Oryza sativa [GenBank:ABG65881]. All the other hits, which showed recognizable sequence similarity to the N-terminal NAD(P)-binding domain of Tic62, lack the C-terminal module (residues 387-534) responsible for FNR binding and represent a short version of the Tic62 protein. A search for the FNR-binding motif in dbEST revealed its presence exclusively in vascular plants (e.g., Lycopersicon esculentum, Medicago truncatula, Triticum aestivum, Glycine max, Lotus japonicus) (Figure 1).
Interestingly, all the proteins homologous to the NAD(P)-binding part of Tic62 were from photosynthetic organisms (green plants, oxyphotobacteria, and green sulphur bacteria). A multiple sequence alignment of these proteins is shown in Figure 2. A phylogenetic tree was built based on the alignment (Figure 3). Both the multiple alignment and the phylogenetic tree indicate that the Tic62-NAD(P)-related protein family is made up of four well-supported clusters (support values of 100/95, 100/68, 100/100 and 100/100, Figure 3) that have been divided into six groups. These groups are schematically represented in Figure 4 and are described below. The four plant subfamilies are classified according to the GenBank accession number of the Arabidopsis protein found within each group (the locus_tag of the Arabidopsis gene is shown in parentheses).
(i) Group I: NP_188519 (At3g18890). This subfamily contains the original Tic62 sequence from pea and makes up the Tic62 family, even though not all the members of this family have a molecular weight of 62 kDa and the association with the Tic machinery remains to be shown (see below). It is composed of proteins from chloroplast-containing organisms of land plants and red algae. So far no sequence of this subfamily was found in green algae (Ostreococcus or Chlamydomonas), in the diatom Thalassiosira or in any oxyphotobacteria. Because a final annotation of the green algae genomes is still in progress, a final confirmation of the absence of the protein of group I in green algae is pending. This group is characterized by the motif E-R-P/A-T-D-X-Ar-K/G-E-T-H (residues 350-371 in Figure 2) [...] protein and contain the FNR-binding motif at the C-terminus. A minor distinction between Tic62 from Arabidopsis and the other full-length sequences is the number of four or three repetitive modules, respectively (Figure 1). Exhaustive searches for the FNR-interacting repeat in the Physcomitrella patens genome revealed no hits to these regions. 3'RACE PCRs of the detected Physcomitrella Tic62 gene were performed to determine its C-terminal sequence. It resulted exclusively in the short form of the gene, giving a stop codon in position 259. Additionally, immunodecoration with the pea Tic62 antibody, raised against the C-terminal part of the protein (residues 412-534), showed no signal in Physcomitrella chloroplasts (data not shown). Finally, an insertion of 6-15 residues (positions 148-168 in the alignment, Figure 2) is found in vascular plants and red algae. The search of this motif in the Physcomitrella genome resulted in no hits. The overall identity of the sequences composing this subfamily is 40%. All of them contain a transit peptide for targeting the protein to chloroplasts, and they might be localized at the inner envelope membrane as it was previously reported for the pea sequence [12], though the FNR-binding domain could modulate the subcellular localization of the protein by the interaction with FNR and/or other proteins.
Figure 1. [...] [12] are highly conserved within the family. A fourth repetition found just in Arabidopsis sequence is marked in a box. The pea (PISSA), Arabidopsis thaliana (ARATH) and Oryza sativa (ORYSA) sequences were retrieved from GenBank. The sequences from tomato (LYCES), Glycine max (GLYMA), Medicago truncatula (MEDTR) and Triticum aestivum (TRIAE) were identified in dbEST and retrieved from plantGDB. The representation of the alignment is the standard from the ClustalX program [43].
Figure 3. Phylogram of representative members of the Tic62-NAD(P)-related family. The optimal unrooted phylogenetic tree obtained by MrBayes is shown for representative members of the Tic62-NAD(P)-related family. The topology predicted with Bayesian and ML methods were not different from each other and four well-supported clusters and six groups are recognized. For display purposes, the green sulphur bacteria have been used as outgroup. The Bayesian posterior probability percentage (pP%) and the bootstrap values obtained by PhyML are shown in the nodes (Bayesian/ML). The organism's name is indicated followed by the accession number of the protein in the databases. Land plants, green algae, red algae, cyanobacteria and green-sulphur bacteria are coloured in light green, dark green, red, blue and brown, respectively. Branch lengths are proportional to evolutionary distances.
(ii) Group II: NP_565789 (At2g34460). The members of this subfamily are homologous to the short version of the Tic62 protein from vascular plants. This subfamily is composed of proteins from the green algae Chlamydomonas reinhardtii, and both non-vascular and vascular plants (Physcomitrella and Arabidopsis, respectively). Surprisingly, there is no sign of the presence of members of this group in red algae genomes (Galdieria, Porphyra or Cyanidioschyzon). This group is closely related to group I and the phylogenetic tree is highly consistent in splitting up these two groups (Figure 3). The green plant sequences of this group are characterized by the motif L-V-N-G-A-A-p-G-Q-x(2)-N-P-A-Y, where p represents a polar residue (range 282-296 in Figure 2). The proteins from green plants contain an N-terminal extension, which is predicted to act as a transit peptide to target the proteins to chloroplasts. Recently, the Arabidopsis protein has been identified in a proteomic analysis of isolated plastoglobules [19].
(iii) Group III. This cyanobacterial subfamily is composed of proteins from a variety of organisms such as Synechocystis sp. PCC 6803, the small-genome cyanobacteria Prochlorococcus marinus (MIT9313, SS120 and MED4) or the heterocystous cyanobacteria Nostoc sp. This group comes together with groups I and II in a well-supported cluster (support values 100/95) and the phylogenetic trees were highly consistent in outgrouping the sequence from Gloeobacter violaceus (Figure 3), a cyanobacterial member of an early branching lineage [20]. Due to the annotation in the databases of the Synechocystis sequence from this family (NP_441422, sll1218) as ycf39 gene product, a connection between Tic62 and ycf39 was previously proposed [12]. However, it can be traced that the original ycf39 gene product is not related to sll1218 but to slr0399 in Synechocystis [GenBank:NP_441851] [12]. Both cyanobacterial proteins share 26% identity and 42% similarity.
The ycf39 gene product (slr0399) was found to act as a chaperone for quinone binding [21]. This cyanobacterial protein is similar to the NP_195251 Arabidopsis sequence, which is not a Tic62-NAD(P)-related protein. Therefore, it can be argued that a connection between Tic62 and ycf39 may be an artefact originating from an unreliable annotation in the protein database.
Figure caption: Presence of Tic62-NAD(P)-related proteins in cyanobacteria, algae, land plants and green sulphur bacteria.
(iv) Group IV. [...] Figure 2). The land plant sequences contain a predicted chloroplast transit peptide and proteomics studies have localized the protein from Arabidopsis in chloroplasts [22]. The lack of homologous sequences in other cyanobacteria such as Synechocystis or Prochlorococcus may be in accordance with a previously reported work, which showed that Nostoc proteins have higher similarity to Arabidopsis nuclear-encoded proteins than proteins from Prochlorococcus or Synechocystis [23].
(v) Group V. [...] (Os05g01970) and [PhyscoDB:contig9865] from Arabidopsis, rice and Physcomitrella, respectively. At2g37660 has been found in chloroplasts by proteomics analyses [22]. In spite of a possible difference in localization, the two subgroups are highly similar (e.g., 79% identity between NP_568098 and NP_565868 in Arabidopsis), which suggests a similar function in the cell. Only incomplete sequences were found in the algae genomes analysed (Chlamydomonas and Galdieria) and, therefore, no further conclusions can be made for these organisms.
The structure of the NP_568098 Arabidopsis protein bound to NADP has been recently solved at 1.8 Å resolution by X-ray crystallography [PDB:1XQ6]. The residues involved in the binding to the cofactor are marked with a triangle in the multiple sequence alignment (Figure 2).
(vi) Group VI. The last group to be mentioned corresponds to proteins of green sulphur bacteria (Figure 2 and Figure 4). Two subgroups are recognized, which likely originated from a gene duplication event (Figure 3). The similarity search using psTic62 as a template retrieved sequences from green sulphur bacteria with homology to the short version of Tic62. These are anoxygenic phototrophic bacteria that contain a type-I (Fe-S) reaction centre. A reverse blast search, using the green sulphur Tic62-related sequences, did not retrieve any sequences from oxyphotosynthetic organisms different from the groups mentioned above. Although very different organisms, the genome comparison between green sulphur bacteria and oxyphotosynthetic organisms showed that many components of photosynthesis and energy metabolism are highly similar. Green sulphur bacteria, cyanobacteria and eukaryotic phototrophs are the only organisms that synthesize chlorophyll a and also directly reduce pyridine nucleotides [24].
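The group-specific motifs quoted in the descriptions above can be turned into search patterns. The following is only a minimal sketch of how such a dash-separated motif might be translated into a regular expression; the residue sets chosen for "p" (polar) and "Ar" (aromatic) and the function name `motif_to_regex` are assumptions made for illustration and are not taken from the analysis reported here.

```python
import re

# Assumed residue classes for the abbreviated symbols used in the motifs.
# These groupings are illustrative choices, not definitions from the paper.
RESIDUE_CLASSES = {
    "x": ".",                # any residue
    "X": ".",                # any residue
    "p": "[STNQDEKRHYC]",    # polar residue (assumed set)
    "Ar": "[FWY]",           # aromatic residue (assumed set)
}


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as 'G-x(2)-G' into a regular expression string."""
    parts = []
    for token in motif.split("-"):
        match = re.fullmatch(r"([A-Za-z/]+)(?:\((\d+)\))?", token)
        if match is None:
            raise ValueError(f"cannot parse motif element {token!r}")
        symbol, repeat = match.group(1), match.group(2)
        if symbol in RESIDUE_CLASSES:
            pattern = RESIDUE_CLASSES[symbol]
        elif "/" in symbol:
            # Alternatives such as P/A become a character class [PA].
            pattern = "[" + symbol.replace("/", "") + "]"
        else:
            pattern = symbol
        if repeat:
            pattern += "{" + repeat + "}"
        parts.append(pattern)
    return "".join(parts)


if __name__ == "__main__":
    print(motif_to_regex("L-V-N-G-A-A-p-G-Q-x(2)-N-P-A-Y"))
    print(motif_to_regex("E-R-P/A-T-D-X-Ar-K/G-E-T-H"))
```

A compiled pattern can then be scanned over each ungapped sequence with `re.search`; a hit only indicates a candidate occurrence and does not replace inspection of the alignment.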
The presence of so many proteins in chloroplasts related to the NAD(P)-binding domain of Tic62 deserves a detailed study. The function of the N-terminal module seems important for the viability of the photosynthetic organisms since the gene has been conserved in all the genomes. All the proteins are predicted to bind pyridine nucleotides and are referred to here as the Tic62-NAD(P)-related family due to the similarity to the NAD(P)-binding domain of psTic62. The Tic62-NAD(P)-related family is of ancient origin, as proteins were not only found in ancient cyanobacteria (Gloeobacter violaceus) but also in green sulphur bacteria. This might suggest that a Tic62-NAD(P)-related protein was already present in the ancestor that gave rise to green sulphur bacteria and cyanobacteria. The presence of two genes in Nostoc punctiforme (groups III and IV) might suggest that a gene duplication event occurred prior to the evolution of cyanobacteria (Figure 3). Some cyanobacterial organisms could have lost one of the genes, which could explain its absence in Gloeobacter, Prochlorococcus and Synechocystis in group IV. Two highly supported groups (I and II) together with group III comprise a big cluster of sequences, and groups I and II are possibly derived from group III, which contains the majority of the cyanobacterial proteins. A four-cluster likelihood-mapping analysis (cluster a = group I+II, cluster b = group III, cluster c = group IV, cluster d = group V or cluster a = group I, cluster b = group II, cluster c = group IV, cluster d = group V) showed that branching order (a, b)-(c, d) was favoured in more than 90% of the 10,000 quartets analysed, and demonstrated that group V is closely related to group IV. The presence of paralogues in land plants of group V could be due to a gene duplication event within the eukaryotic organism.
Most of the Tic62-NAD(P)-related proteins in higher plants are found in chloroplasts, but only the specific localization of psTic62 (group I) at the inner envelope membrane of chloroplasts and of NP_565789 (At2g34460, group II) in plastoglobules has been shown experimentally [12,19]. It would be worth investigating the subcellular localization of the other members of the family and, especially, analysing the possible dual localization of the proteins belonging to group V. The lack of a transit peptide has also been described in two homologues that compose the Tic20 family in Arabidopsis [14]. A possible localization outside plastids could be another example of a protein of cyanobacterial origin that has been redirected to a compartment different from plastids [23]. However, the targeting information to chloroplasts could be different from the canonical transit peptide [22,25,26] and a localization of such proteins in chloroplasts cannot be excluded. The presence of members of the family at the inner envelope membrane of chloroplasts, involved in the import process, and in plastoglobules, structures that act as a functional metabolic link between the inner envelope and thylakoid membranes, points to an important role of the protein family in metabolism.
The resolution of the structure of a protein is a major step in understanding its function. Since the similarity among sequences in the Tic62-NAD(P)-related family is sufficiently high, knowledge of the structure of one member of the family allows general conclusions to be drawn about the structure of the other members. The crystallized NP_568098 protein shows the typical NADPH-Rossmann fold. Figure 2 also represents the secondary structure of the crystallized NP_568098 protein. Clearly, most of the insertions and deletions of the proteins in this family correspond to loops in the crystal structure and most of the motifs related to α- and β-conformations are highly conserved. Therefore, the NADPH-Rossmann fold is also expected for the core structure of all members of the Tic62-NAD(P)-related family, with differences mainly in the loop regions. The glycine motif in the coenzyme-binding region is fully conserved in the whole family (GxxGxxG, range 111-116 in Figure 2) and it may be related to the extended short-chain dehydrogenase-reductase superfamily [27]. The highly conserved aspartic acid residue required for stabilization of the adenine-binding pocket is found in the loop between β3 and α3, except for group VI [28]. However, large differences are expected in the regions of the β5 and β6 strands. In the crystal structure, these two β-strands form an antiparallel β-sheet, which connects a long loop (Figure 2; see below).
The differences in this region among the subfamilies could be correlated to the specific function of each subfamily. Since the protein was crystallized in the presence of NADP, the residues involved in the binding to the cofactor were identified (G11, S13, G14, R15, T16, R38, G55, D56, I57, L76, T77, S78, A79, V80, Q103, V131, G132, S133, K155, A174, G175, G176, L177, R205; for underlined residues see below). These residues are marked in Figure 2. From the multiple sequence alignment it can be concluded that many residues that bind to NADP are highly conserved within the family (9 out of 22). Specifically, the conservation of the residues (or their physicochemical properties) involved in the NADP binding is high in members of group I (14 out of 22). These residues are underlined above and they could represent the residues implicated in the NADP binding of Tic62. Mutagenesis studies are necessary to establish the role of these residues clearly.
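Counts of conserved cofactor-binding positions such as those given above can be tallied directly from alignment columns. The sketch below is illustrative only: the toy alignment, the 0.7 threshold and the function names are hypothetical, and the conservation measure (share of sequences carrying the most common residue) is an assumed simplification that ignores conservation of physicochemical properties.

```python
from collections import Counter


def column_conservation(alignment, column):
    """Fraction of all sequences that carry the most common non-gap residue."""
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(alignment)


def count_conserved(alignment, columns, threshold=0.9):
    """Number of the given columns whose conservation reaches the threshold."""
    return sum(
        1 for col in columns if column_conservation(alignment, col) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical aligned fragments, not sequences from Figure 2.
    toy_alignment = ["GSGRT", "GSGKT", "GAGRT", "G-GRT"]
    binding_columns = [0, 1, 3]  # hypothetical cofactor-contacting columns
    print(count_conserved(toy_alignment, binding_columns, threshold=0.7))
```

Restricting `alignment` to the sequences of a single group gives the within-group count, which is how a figure for group I alone would be obtained under these assumptions.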
The mode of interaction of Tic62 with the membrane/Tic complex is unknown. Previous experiments showed that hydrophobic contacts likely mediate the binding to the membrane/Tic complex, as most of the protein remains within the membrane upon alkaline and urea treatments [12]. TMHMM [29] and PredictProtein [30] algorithms do not predict any transmembrane helices in group I. Moreover, protease digestion experiments showed that psTic62 is protected in inner envelope vesicles, which, together with the hydrophilic profile of Tic62, suggests that the protein faces the stroma while attached to the membrane/Tic complex. Based on the identity (27% identity; 41% similarity) between atTic62 (NP_188519) and NP_568098 [PDB:1XQ6], a homology modelling procedure was followed to construct a model for the NADP-binding domain of the Tic62 protein (residues 78-331 in atTic62; see additional file 1: PDB coordinates for the atTic62 model). Figure 5a shows the sequence alignment of the N-terminal domain of the atTic62 protein and the template based on the multiple sequence alignment of the Tic62-NAD(P)-related family (Figure 2). The key residues involved in the pyridine ring binding are shown in red. The predicted secondary structure of atTic62 is compared with the known secondary structure of the template. As can be seen, most of the conformational elements are conserved in both sequences. Slight differences are the presence of β5 and β6 strands in the template (as mentioned above), and two small α-helices predicted between β2 and β3 strands in atTic62. A model was built based on this alignment and it was structurally evaluated with WHATCHECK. The corresponding values were good: Ramachandran plot, -2.215; backbone conformation, -3.761; chi-1/chi-2 rotamer normality, -1.150; bond lengths, 0.716; bond angles, 1.439. Only the values for the backbone conformation were poor, but this is probably due to gaps in the alignment located in loop regions of the template (Figure 5a). In fact, the structural analysis obtained by the VERIFY3D program assigns positive values all over the structure, except in the regions LQNTDEGT and FPAAILNLFWGVLC in atTic62 (minimum value of -0.16), which supports the previous proposal. The energetic parameter of the model was E = -4082.780 kJ·mol⁻¹. In Figure 5b, a view of the proposed structural model for atTic62 (blue) superimposed on the template (green) is depicted. The NADP ligand is shown in red and the residues involved in the binding are underlined in Figure 5a. It can be seen that the β5 and β6 elements connect a long loop that is missing in atTic62 (Figure 5a and 5b). Interestingly, a large number of hydrophobic residues is concentrated in this region in atTic62 (Figure 5c, marked in orange). The model presented here for atTic62 suggests that the hydrophobic region (residues 180-184 and 217-233 in atTic62 sequence in Figure 5c, which correspond to residues 247-251 and 291-310 in the alignment shown in Figure 2) might be responsible for attaching the protein to the inner envelope membrane of chloroplasts or to the Tic complex, and this region would establish differences in the localisation within cells between the two groups of proteins (template and model). In this way, Tic62 would be attached to the membrane, without spanning it, exposing the two functional modules to the stromal side. The hydrophilic profile and the large number of conserved proline residues at the C-terminal domain make it a better candidate for protein-protein interactions than for insertion into the membrane [31]. These interactions might also contribute to the binding of Tic62 to the membrane/Tic complex.
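The hydrophobic or hydrophilic character of a sequence segment can be inspected with a simple hydropathy average. The sketch below uses the Kyte-Doolittle scale, which is an assumption made for illustration; it is not the method behind the TMHMM or PredictProtein predictions cited above, and the function names are hypothetical. The two peptides in the demonstration are the atTic62 segments quoted in the text.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Using this scale is an assumption of the sketch, not the method of the paper.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle value over a peptide segment."""
    return sum(KYTE_DOOLITTLE[res] for res in segment) / len(segment)


def hydropathy_profile(sequence, window=9):
    """Sliding-window averages; one value per full window along the sequence."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    for segment in ("LQNTDEGT", "FPAAILNLFWGVLC"):
        print(segment, round(mean_hydropathy(segment), 2))
```

On this scale the first segment averages below zero and the second above zero; such averages only indicate local character and do not by themselves predict membrane association.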
Focusing on group I, one of the questions to be answered is whether or not all members of this group are Tic components. Although they might share a common dehydrogenase activity at the N-terminus, the origin of the FNR-binding module at the C-terminus in vascular plants remains unknown and different functions might be expected among the different organisms. No similar sequences to the C-terminus of psTic62 were found in the databases with significant homology, which could indicate that either the FNR-binding module was lost during evolution and only kept in vascular plants, or (more probably) the FNR-binding module was recently acquired by vascular plants. The high similarity of the NADP-binding domain of the Physcomitrella sequence in group I with psTic62 (68% identity) suggests that the short version of Tic62 in Physcomitrella, together with Tic110/Tic55/Tic40/Tic32/Tic22/Tic20 [32], might be a constituent of the Tic complex. The same might be true for Cyanidioschyzon merolae. On the other hand, it cannot be excluded that the concurrence of both N-terminal and C-terminal domains, or even the FNR-binding domain alone, is required to establish a protein as a Tic component. Further studies are necessary to establish the mode of interaction of Tic62 with the Tic complex in vascular plants and to elucidate the localization and function of members of group I in non-vascular plants.
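Identity and similarity percentages such as those quoted for pairs of family members depend on how they are counted. The sketch below shows one possible convention, assuming a pre-computed pairwise alignment, a denominator that excludes gapped positions, and a hand-picked set of similarity groups; all three are assumptions for illustration, and alignment programs may count differently. The example sequences are hypothetical.

```python
# Assumed groups of residues treated as "similar"; an illustrative choice only.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # columns with a gap are left out of the denominator
        compared += 1
        if res_a == res_b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(res_a in grp and res_b in grp for grp in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from the alignment in Figure 2.
    ident, simil = identity_and_similarity("GATGKV-L", "GSTGRIAL")
    print(f"{ident:.1f}% identity, {simil:.1f}% similarity")
```

Choosing the alignment length or the length of the shorter sequence as denominator instead would lower both percentages whenever gaps are present.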
The question of the presence and function of FNR at the inner envelope membrane of chloroplasts still remains. In chloroplasts, this protein is found either soluble in the stroma in a non-functional state or attached to the thylakoids, where the protein is involved in the last stage of the electron transport process in photosynthesis. FNR is a ubiquitous flavoenzyme whose function is not exclusively confined to photosynthesis [33] and, recently, the protein has also been found to be localized at the inner envelope membrane of chloroplasts [12]. When attached to the thylakoids, a reductase-binding protein (BP) mediates the binding to the membrane [34]. In line with this, one possibility could be that FNR is attached to the inner envelope membrane via the FNR-binding motif of Tic62. This interaction in vascular plants could be affected by the activity of Tic62, which could specifically regulate the as yet unknown functional state of FNR in the inner envelope membrane of chloroplasts. The opposite effect cannot be excluded: the binding of FNR could regulate the activity of Tic62, and therefore the transport machinery. This regulation upon binding could depend on the redox state of chloroplasts and might involve NADP(H)/NAD(H) or a low potential electron donor and another substrate not yet identified [33]. On the other hand, a possible electron transfer process between FNR and Tic62 cannot be excluded, although the capacity of Tic62 as electron acceptor/donor has not yet been proven. It is likely that the FNR-binding domain is important for some kind of metabolic regulation specific to vascular plants, which needs further study.

Conclusion
The reported results show that the N-terminal module of Tic62 (NAD(P)-binding domain) is highly conserved among all oxyphototrophs. The Tic62-NAD(P)-related sequences are of ancient origin, since the protein was not only found in cyanobacteria but also in green sulphur bacteria. This protein family probably belongs to the extended family of short-chain dehydrogenases-reductases and likely contains the structurally conserved Rossmann fold.

Methods
A sequence homology search (tblastn/blastp) was performed using the Tic62 protein sequence from pea (psTic62) as a template (e-value < 10⁻⁹). The following biological databases were considered: the non-redundant GenBank database (nr) [35]; the publicly available Physcomitrella patens EST database, PhyscoDB [36]; the genomic database containing the so far sequenced Physcomitrella patens genome (access due to collaboration with Prof. Ralf Reski, University of Freiburg); the annotated genome of the red alga Cyanidioschyzon merolae [37]; the annotated genome of the green alga Chlamydomonas reinhardtii, ChlamyDB [38]; the genome database for plants, plantGDB [39]. The following databases were also considered: the red algae Porphyra yezoensis [40] and Galdieria sulphuraria [41] databases; the chlorophyte Ostreococcus lucimarinus database [42]; the EST GenBank database (dbEST) [35]. All the retrieved sequences were aligned with the ClustalX program [43], visually inspected and manually corrected. The prediction of the subcellular localization of the proteins was performed with the TargetP [44], ChloroP [45] and Predotar [46] programmes.
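The e-value cut-off used in the homology search can be applied programmatically to search results. The sketch below assumes that hits are available in the 12-column tabular BLAST format (query, subject, percent identity, alignment length, mismatches, gap openings, query start/end, subject start/end, e-value, bit score); the use of that output format, the function name and the example lines are assumptions for illustration, not details of the search reported here.

```python
def filter_hits(lines, max_evalue=1e-9):
    """Keep (subject id, e-value) pairs with an e-value below the cut-off.

    Assumes tab-separated lines in the 12-column tabular BLAST layout,
    where the subject id is column 2 and the e-value is column 11.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated fields: {line!r}")
        evalue = float(fields[10])
        if evalue < max_evalue:
            kept.append((fields[1], evalue))
    return kept


if __name__ == "__main__":
    # Hypothetical hit lines for demonstration only.
    example = [
        "query1\thitA\t45.0\t250\t120\t5\t80\t330\t60\t310\t3e-42\t160",
        "query1\thitB\t28.0\t90\t60\t3\t100\t190\t20\t110\t0.002\t35.0",
    ]
    print(filter_hits(example))
```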
ProtTest v1.3 [47] was used to estimate the best model of amino acid evolution for phylogeny. The WAG+I+Γ model was chosen using either AIC or BIC as statistical frameworks. Phylogenetic trees were generated on the basis of the maximum-likelihood (ML) and Bayesian analysis using PhyML v2.4.4 [48] and MrBayes v3.1.2 [49] programmes. For ML analysis, four Gamma-distributed sites were considered, and the parameters were estimated from the data. Non-parametric bootstrap values were calculated for ML analyses (100 replicates) to assess the significance of the resulting tree. Bayesian analysis was performed under the same model. Four chains were run for one million generations with sampling every 100 generations. Bayesian posterior probabilities were calculated from the majority rule consensus of the tree sampled after the initial burn-in period corresponding to 2,500 generations. Four-cluster likelihood-mapping [50] implemented in Tree-puzzle v5.2 [51] was performed with 10,000 randomly chosen quartets.
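Non-parametric bootstrapping, as used above to assess the ML tree, resamples alignment columns with replacement to create pseudo-replicate alignments of the original length; a tree is then inferred from each replicate. The sketch below illustrates only the resampling step under stated assumptions: the function names, the fixed seed and the toy alignment are hypothetical, and tree inference itself (performed with PhyML in this work) is not reproduced.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    n_columns = len(alignment[0])
    chosen = [rng.randrange(n_columns) for _ in range(n_columns)]
    return ["".join(seq[col] for col in chosen) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Generate pseudo-replicate alignments; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Hypothetical toy alignment, not sequences from this study.
    toy_alignment = ["GSGRTA", "GSGKTV", "GAGRTA"]
    replicates = bootstrap_replicates(toy_alignment, n_replicates=100)
    print(len(replicates), replicates[0])
```

The bootstrap support of a branch is then the percentage of replicate trees in which that branch appears.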
A 3D model for all non-hydrogen atoms was obtained for the N-terminal domain of the mature Tic62 from Arabidopsis thaliana (atTic62; NP_188519) by homology modelling using the known 3D structure of the NP_568098 Arabidopsis protein [PDB:1XQ6] as a template. The model was built using the SWISS-MODEL automated modelling server [52] and it was evaluated using WHAT-CHECK [53], PROMODII [54] and VERIFY3D [55]. The secondary structure prediction of atTic62 was performed using the PSIPRED server [56].