The 70 kDa heat shock proteins (called DnaK in bacteria and Hsp70 in eukaryotes) form a large family of molecular chaperones that are upregulated in cells exposed to various stresses, including heat shock and heavy metals [1, 2]. In addition, these proteins play a major role during protein synthesis by binding to the nascent peptides exiting the ribosome, thereby preventing their aggregation and facilitating their folding into the optimal functional conformation. During the interaction with the partially synthesized peptides, DnaK/Hsp70 increases its ATPase activity. This chaperone has two main partners: the J-proteins [4, 5] and the nucleotide exchange factor, called GrpE in bacteria (Mge1 in mitochondria and Cge1 in chloroplasts), with Bag-1 acting as a eukaryotic functional analogue of GrpE. The nucleotide exchange factor promotes the exchange of ADP for fresh ATP in the nucleotide-binding region of DnaK/Hsp70, whereas the J-proteins stimulate the ATPase activity in order to stabilize the interaction of DnaK with unfolded proteins [5, 9, 10]. The J-proteins form a large family of structurally and functionally diverse proteins that all share the capacity to interact with DnaK/Hsp70 through their J-domain [4, 11]. Among them, DnaJ/Hsp40 proteins form the largest subfamily. They control the flux of unfolded polypeptides into and out of the substrate-binding domain of DnaK/Hsp70 [9, 11].
DnaK proteins are widespread, being encoded by a single gene in most bacterial genomes, whereas most eukaryotic genomes harbor several Hsp70 genes that may have diverse evolutionary origins [1, 13, 14]. For example, the green alga Chlamydomonas reinhardtii has five Hsp70 copies, all of them encoded in the nuclear genome despite being targeted to diverse cellular compartments: three most likely originated by duplications of an ancestral eukaryotic gene (one expressed in the cytoplasm and two in the endoplasmic reticulum); one has a mitochondrial origin and is targeted to the mitochondria; and the fifth originated from the chloroplast endosymbiosis and is targeted to the chloroplast. In contrast with DnaK, the J-proteins are encoded in multiple copies in bacterial genomes. This is also the case in eukaryotes, where they work in the different cell compartments in association with the Hsp70 proteins cited above [9, 11]. Finally, the nucleotide exchange factor GrpE is present in one copy in most bacterial genomes, whereas the eukaryotic Mge1, Cge1 and Bag-1 are encoded in the nucleus but targeted to the mitochondria, the chloroplasts, and the nucleus and cytoplasm, respectively [7, 8].
The presence of DnaK, DnaJ and GrpE has been reported in several archaeal genomes, more precisely in several Euryarchaeota but never in crenarchaeotal species. The best studied case concerns DnaK. A phylogenetic analysis by Gribaldo and coworkers suggested that several archaea acquired this protein by horizontal gene transfer (HGT) from different bacterial donors. These authors observed three different groups of archaeal DnaK sequences, each branching specifically with certain bacterial homologues. More precisely, Methanosarcina mazei (Methanosarcinales) was related to the Clostridium group of Firmicutes (low G+C Gram-positive bacteria), Halobacterium cutirubrum and Halobacterium marismortui (Halobacteriales) to the Actinobacteria (high G+C Gram-positive bacteria), whereas Methanobacterium thermautotrophicum (Methanobacteriales) and Thermoplasma acidophilum (Thermoplasmatales) branched with Thermotoga maritima (Thermotogales). More recently, Macario et al. (2006) studied the taxonomic distribution and phylogeny not only of DnaK but also of GrpE and DnaJ in various bacteria and archaea. They showed that the genes coding for these three proteins were clustered in most of the genomes examined. They also confirmed the results of Gribaldo et al. (1999), i.e. the likely existence of three HGT events from bacteria to archaea. However, they proposed a more complex scenario in which the DnaK/DnaJ/GrpE cluster was first acquired from a bacterial donor by the ancestor of the Euryarchaeota, then lost in Methanococcales and in the common ancestor of Archaeoglobales, Halobacteriales and Methanosarcinales, and finally reacquired independently by Halobacteriales and Methanosarcinales from Actinobacteria and from Firmicutes, respectively. Notably, neither of these two studies detected any of the three proteins in hyperthermophilic archaea.
In addition to these relatively well-characterized chaperones and co-chaperones, the study of a genomic fragment of an uncultured deep marine archaeon from an environmental DNA fosmid library revealed a very unusual J-protein, referred to as DnaJ-Fer, composed of a J-domain fused with a ferredoxin (Fer) domain. The phylogenetic analysis of a 16S rRNA gene also found in this genomic fragment showed that it belonged to a member of the Thaumarchaeota, more precisely to the I.1a subgroup. These archaea, formerly classified as Group I, a sublineage of Crenarchaeota [19, 20], have recently been proposed to represent a third phylum of Archaea alongside the Euryarchaeota and Crenarchaeota. Thaumarchaeota are widespread in many environments, including marine and freshwater systems, soils and sediments [22, 23]. Surprisingly, the presence of DnaJ-Fer proteins has also been reported in Viridiplantae (green algae and plants), with three homologues (CDJ3, CDJ4 and CDJ5) in C. reinhardtii. These proteins are localized in the chloroplast of this green alga, where they interact with the chloroplast Hsp70B and Cge1 proteins. However, the precise function of these DnaJ-Fer proteins in C. reinhardtii remains to be elucidated. Given the location of the protein and the nature of its partners, it would be tempting to hypothesize a cyanobacterial origin of the DnaJ-Fer protein. However, no homologue has been detected in Cyanobacteria.
Two hypotheses can explain the unexpected taxonomic distribution of the DnaJ-Fer protein in Thaumarchaeota and Viridiplantae: either two independent and convergent fusions of the two protein domains occurred in these two distantly related lineages, or a single fusion occurred in one of them, followed by an HGT to the other lineage. In this work, we have taken advantage of the recent burst of available complete archaeal genome sequences, including representatives of new major lineages such as the Thaumarchaeota, ARMAN or Nanohaloarchaeales, to decipher the evolutionary history of DnaK and its co-chaperones in Archaea, with special attention to the intriguing DnaJ-Fer protein. Our results support a complex scenario in which HGT appears to have played an important role. In addition to other cases of HGT, Thaumarchaeota most likely acquired their DnaK, co-chaperones and DnaJ-Fer proteins by independent HGTs from multiple donors, including other archaea and plants.