Modular evolution of glutathione peroxidase genes in association with different biochemical properties of their encoded proteins in invertebrate animals

Background Phospholipid hydroperoxide glutathione peroxidases (PHGPx), the most abundant isoforms of GPx families, interfere directly with hydroperoxidation of lipids. Biochemical properties of these proteins vary along with their donor organisms, which has complicated the phylogenetic classification of diverse PHGPx-like proteins. Despite efforts for comprehensive analyses, the evolutionary aspects of GPx genes in invertebrates remain largely unknown. Results We isolated GPx homologs via in silico screening of genomic and/or expressed sequence tag databases of eukaryotic organisms including protostomian species. Genes showing strong similarity to the mammalian PHGPx genes were commonly found in all genomes examined. GPx3- and GPx7-like genes were additionally detected from nematodes and platyhelminths, respectively. The overall distribution of the PHGPx-like proteins with different biochemical properties was biased across taxa; selenium- and glutathione (GSH)-dependent proteins were exclusively detected in platyhelminth and deuterostomian species, whereas selenium-independent and thioredoxin (Trx)-dependent enzymes were isolated in the other taxa. In comparison of genomic organization, the GSH-dependent PHGPx genes showed a conserved architectural pattern, while their Trx-dependent counterparts displayed complex exon-intron structures. A codon for the resolving Cys engaged in reductant binding was found to be substituted in a series of genes. Selection pressure to maintain the selenocysteine codon in GSH-dependent genes also appeared to be relaxed during their evolution. With the dichotomized fashion in genomic organizations, a highly polytomic topology of their phylogenetic trees implied that the GPx genes have multiple evolutionary intermediate forms. 
Conclusion Comparative analysis of invertebrate GPx genes provides informative evidence to support the modular pathways of GPx evolution, which have been accompanied with sporadic expansion/deletion and exon-intron remodeling. The differentiated enzymatic properties might be acquired by the evolutionary relaxation of selection pressure and/or biochemical adaptation to the acting environments. Our present study would be beneficial to get detailed insights into the complex GPx evolution, and to understand the molecular basis of the specialized physiological implications of this antioxidant system in their respective donor organisms.


Background
Reactive oxygen species (ROS) are generated through an incomplete reduction of oxygen molecules during mitochondrial respiration and/or cytosolic metabolism. Exposure to exogenous stimuli such as radiation and redox-cycling drugs might be an alternative pathway of ROS production. ROS perform physiological roles relevant to cell signaling and redox-status control [1,2], while unbalanced generation of these species induces detrimental oxidation of macromolecules including DNA, proteins, and lipids. To minimize ROS-derived damage, aerobic organisms have evolved a series of multi-layered enzymatic and non-enzymatic defense systems [3]. Distinct enzymatic activities such as catalase, glutathione peroxidase (GPx), and peroxiredoxin (PRx; also called thioredoxin peroxidase) have been well characterized from numerous taxa as the major antioxidant defense mechanisms.
Selenium-containing GPx proteins reduce H2O2 and organic hydroperoxides by employing glutathione (GSH) as an electron donor. A total of eight GPx families have been described in mammals on the basis of primary structure, specific substrate accessibility, and spatial expression [4,5]. These homotetrameric isoenzymes conserve structural/biochemical properties; however, a number of enzymes that have been classified into GPx4 (phospholipid hydroperoxide GPx; PHGPx) may function in monomeric forms and exhibit unique substrate availability. These enzymes can interfere directly with hydroperoxidized phospholipids in biomembranes. Proteins belonging to the other GPx families display substrate preference toward H2O2 and protect against lipid peroxidation via a concerted operation with phospholipase [6]. PHGPx is the basis of a principal defense system that intimately participates in the repair of disrupted biomembranes [7]. The vertebrate-specific GPx7 and GPx8 also lack the oligomerization loop, although their unique enzymatic properties are less well understood [5].
Multiple isoenzymes showing primary structures similar to those of the mammalian PHGPxs have been described in plants, along with their respective subcellular expression profiles [8,9]. Plant enzymes possess a Cys residue instead of a selenocysteine (Sec) at the catalytic site, and prefer thioredoxin (Trx) as the electron source [9-11]. A pair of PHGPx-like proteins that effectively reduce the peroxides by adapting the Trx system has also been isolated from insect, yeast, and protozoa [12-15]. Interestingly, the green alga Chlamydomonas reinhardtii appears to express both GSH-dependent (CrGPx1 and CrGPx2) and Trx-dependent (CrGPx3-5) GPxs [16]. These observations have created a controversy regarding the classification of PHGPx-like proteins [8,9]. Conventional cladistic analyses based on comparison of primary structures generally annotate these proteins as PHGPxs, prior to empirical examination of their catalytic mechanisms (for example, see [17]). It has been suggested that the Trx-dependent GPxs comprise the fifth class of the PRx family, on the basis of their biochemical properties rather than their phylogenetic affinity [8,9]. Conversely, a novel functional class of 'Trx GPx-like peroxidase (TGPx)' has been proposed to clarify the unique GPx group sharing a common evolutionary origin with the GSH-dependent GPxs [5]. The molecular basis for the differential preference has also been investigated and appears to involve a 'resolving Cys' within the α2 helix of the Trx-dependent GPxs [5,18,19].
With the accumulation of genomic databases, it has been possible to analyze homologous genes from diverse taxonomical groups. In this context, the evolutionary relationships among the eight GPx families including the complex PHGPx-like proteins were comprehensively examined [5,20]. The proteins isolated from all metazoan species were clearly separated from those of fungi/algae/prokaryotes and plants, and some of algal proteins were dispersed in a distinct group together with the Kinetoplastida GPxs [20]. These analyses demonstrate that PHGPx-like proteins are the most abundant type found in almost all aerobic organisms and considered as an ancestral form of the GPx superfamily [20]. The common ancestor appears to have diverged into GPx7/8 and GPx1/2/3/5/6 groups after being duplicated in the vertebrate lineage [5,20]. The highly polytomic relationships suggest parallel evolutionary pathways for the GPx families following duplication events in the early stage of GPx evolution. Lateral gene transfer has also been introduced as a relevant mechanism for the presence of GPx3-like genes in parasitic nematodes [5]. However, the evolutionary aspects of GPx genes isolated from invertebrates have not yet been fully investigated. The evolutionary relationship between GSH-and Trx-dependent GPxs also remains largely unclear.
Recent molecular phylogeny has positioned the Platyhelminthes into Lophotrochozoa, which forms a monophyletic clade Protostomia with the Ecdysozoa [21,22]. The phylum is composed of markedly diverse species, most of which have established a parasitic life mode within specific host organisms. The tissue-invasive parasites are continuously exposed to oxidative stresses generated by both endogenous and exogenous ROS, which are generated by host immune cells [23,24]. Therefore, parasitic nematodes and platyhelminths might provide good models for the investigation of biochemical and physiological diversification of antioxidant enzymes. Together with the Trx system, GSH-dependent proteins perform central roles in the thiol-disulfide redox homeostasis in these parasites by providing electrons to essential enzymes and by protecting against oxidative stress [25,26]. GPxs homologous to the mammalian plasma GPx (GPx3) and PHGPx have been characterized only from a few filarial nematodes and trematodes, respectively [25,27,28].
The trematode GPxs seem to play specialized roles in their unique biotic environments, in association with sexual reproduction [28,29]. In the present study, we isolated GPx genes in lower animals, especially in platyhelminths and nematodes, and investigated their evolutionary aspects. Almost all of the platyhelminth genes examined displayed a strikingly close relationship to the mammalian PHGPx genes. In addition to the previously described GPx3-like genes, vertebrate-type PHGPx genes were also identified in nematodes. Comparative analyses further indicated that the GPx7 family, which has been known to be specific to vertebrates, already diverged in the Lophotrochozoa, although the gene lineage seems to have been deleted in most of the lower animal genomes. At structural and functional levels, the GSH-dependent platyhelminth and nematode PHGPx-like genes appear to share an evolutionary progenitor with the deuterostomian homologs. The Trx-dependent GPx genes found in the other ecdysozoans may have modularly evolved from another paralogous gene duplicated in a common metazoan/plant ancestor.

Nucleotide and amino acid sequences of platyhelminth GPx genes
The majority of the platyhelminth genes isolated in this study encoded a Sec-dependent GPx, as predicted by the detection of a second in-frame TGA codon and a concurrent Sec insertion sequence (SECIS) motif within their 3'-untranslated region (UTR) [27,30]. The motifs conserved the unique secondary structure comprising two helices separated by an internal loop, a SECIS core structure, a quartet located at the base of the second helix, and an apical loop; these were equivalent to those of the mammalian homologs (Figure 2) [31]. The quartets of non-Watson-Crick base pairs were class I (ATGA_AA_GA) in these platyhelminth genes, as is the case in the mammalian selenoprotein genes. Exceptionally, SjGPx1 contained a quartet pattern of class II (GTGA_AA_GA). In contrast, PwGPx1 contained a standard codon for Cys (TGC) rather than the Sec codon and the SECIS was not detected within its nucleotide sequence.
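As a rough illustration of the two sequence-level criteria described above (an in-frame TGA codon upstream of the terminal codon, and the quartet pattern of the SECIS), the following Python sketch scans a coding sequence and labels a quartet. The function names and the toy sequence are hypothetical and are not taken from the present data set; the class labels simply follow the patterns quoted in the text, and genuine SECIS prediction would also require evaluation of the secondary structure, which is not attempted here.

```python
# Hypothetical helpers; the sequence below is a toy example, not data from this study.

def internal_tga_codons(cds):
    """Return 0-based codon indices of in-frame TGA codons other than the final codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3           # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon == "TGA"]


def quartet_class(quartet):
    """Map a SECIS quartet to the class labels quoted in the text (ATGA: I, GTGA: II)."""
    classes = {"ATGA": "I", "GTGA": "II"}
    return classes.get(quartet.upper())        # None if the quartet matches neither pattern


toy_cds = "ATGGCTTGAGGTTAA"                    # codons: ATG GCT TGA GGT TAA
print(internal_tga_codons(toy_cds))            # candidate Sec codon positions
print(quartet_class("ATGA"), quartet_class("GTGA"))
```

A gene such as PwGPx1, which carries TGC at the catalytic position, would return an empty list from the first function under these assumptions.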
Figure 1 Organisms used in this study and their phylogenetic relationships. The rectangular cladogram was constructed based on a recent animal phylogeny [21]. The databases targeted for the screening of GPx homologs are presented in parentheses together with their respective sources, following the scientific names of donor organisms.
The polypeptides encoded by the platyhelminth GPx genes showed a broad range of sequence identity either to one another or to the other eukaryotic PHGPx-like proteins (21-87%). The N-terminal amino acid (aa) extension was hydrophobic in a major fraction of the trematode and planarian proteins (underlined sequences in Additional file 1), while such sequences were not detected in SmGPx1, SjGPx1, and ShGPx1. The presence/absence of the leader sequences would be responsible for the expression of differentially targeted proteins, as was evidenced in the mitochondrial/non-mitochondrial variants of mammalian PHGPxs [32]. The active Cys, Gln, and Trp residues, which participate in the formation of the catalytic-site geometry, were well conserved at their corresponding positions, although the thiol-containing Cys was exclusively replaced by Sec in the platyhelminth and mammalian GPxs (T-1, -2, and -3 in Additional file 1) [19,33]. The Gln and Trp residues were replaced by Met/Leu and Tyr in the Paragonimus proteins. The other functional aa involved in the stabilization of the active site were also detected in their primary structures (C-1 [Asn], -2 [Cys], and -3 [Asn]). The Cys residue, which is pivotal in the interaction of Trx-dependent GPx with Trx, could not be defined in the platyhelminth proteins, as well as in the nematode and mammalian PHGPxs [5,18]. These lower animal GPxs lacked the subunit-interacting domain and PGGG motif found in the tetramer-forming GPxs.

Expression profiles and preferential electron donor
In trematode parasites, the expression of various GPx genes increased in proportion to the development of donor organisms and showed a specific locality within reproduction-related cells such as vitellocytes [27-29]. Exogenous oxidative chemicals including paraquat, juglone, and H2O2 also induced the trematode genes in a dose-dependent manner [27,28]. PwGPx genes exhibited expression patterns identical to those of the Schistosoma and Clonorchis orthologs, as was evident in RT-PCR and immunohistochemical analyses (unpublished data). The enzymatic activity was not detected in the metacestode stage of E. granulosus [34], even though an mRNA sequence (EgGPx1) was isolated from the larval stage. We could not currently trace any empirical data demonstrating expression profiles of the antioxidant enzymes in Turbellaria. Given the sexual reproduction mechanism conserved in these platyhelminths [35-37], the planarian genes might have an expression pattern similar to that of the trematode genes.
The substrate specificity of schistosome GPxs was likely to be similar to those of mammalian PHGPxs [38], and the Cys motif found in the Trx-dependent GPxs was not detected in the primary structures of platyhelminth proteins analyzed in this study (Additional file 1). The preferential affinity toward electron donors has not been determined empirically with any of the platyhelminth GPx proteins including the well characterized Schistosoma and Clonorchis enzymes. We examined the catalytic efficiency with the native GPxs partially purified from adult P. westermani (Figure 3). The enzymes effectively reduced H2O2 and cumene hydroperoxide using GSH as an electron donor. The proteins also showed considerable reactivity with Trx, although these activities were lower than those with GSH (15.4-39.4%). The specific activity of GPx proteins decreases dramatically, by a factor of several tens to several hundred, when an alternative reductant is supplied in place of their preferential electron donor (see [18] and references therein). Therefore, it is possible that the purified Paragonimus proteins might have been contaminated with PRxs, although we examined the samples with the Clonorchis sinensis PRx-specific antibodies, which were highly cross-reactive against P. westermani PRxs (Figure 3A). Regardless of this possibility, however, these observations collectively suggest that the trematode GPxs exhibit a catalytic pattern identical to that of the mammalian PHGPx proteins.

Phylogenetic analysis of GPx gene families
Figure 2 Selenocysteine insertion sequence (SECIS) motifs found in the platyhelminth GPx genes. The structures are predicted from the nucleotide sequences corresponding to the 3'-untranslated regions of the respective mRNAs. The typical secondary structure of the mammalian SECIS is also presented for comparison [31].
In addition to the platyhelminth proteins, we retrieved several hundred GPx-like proteins either from the non-redundant GenBank or organism-specific databases (sequence identity: 35-53%, E-values < 10^-16). Proteins with primary structures similar to those of mammalian PHGPxs were found in all major taxa including plants, bacteria, protozoans, fungi, algae, invertebrates, and vertebrates. GPx3-like proteins, which had been considered likely to be vertebrate-specific, were further found in nematode and tunicate species. The nematode GPx3-like proteins are proposed to have originated from vertebrate hosts by lateral gene transfer [20]. However, the presence of GPx3-like proteins in a tunicate, which has not been described previously, implies that the hypothesis of horizontal gene transfer from vertebrate(s) to nematode(s) needs to be reexamined.
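The E-value criterion quoted for the database retrieval can be expressed as a simple filter. The sketch below assumes, purely for illustration, that hits are available in the 12-column tabular layout produced by BLAST+ with the -outfmt 6 option (percent identity in column 3, E-value in column 11); the text does not specify the output format that was actually used, and the hit identifiers and values are invented placeholders.

```python
# Minimal sketch: keep hits whose E-value is below a cutoff.
# Assumes BLAST+ tabular output (-outfmt 6); this format is an assumption, not from the text.

def filter_hits(tabular_text, max_evalue=1e-16):
    """Return (subject_id, percent_identity, evalue) for hits with evalue < max_evalue."""
    kept = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                              # skip blank and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        percent_identity = float(fields[2])
        evalue = float(fields[10])
        if evalue < max_evalue:
            kept.append((subject_id, percent_identity, evalue))
    return kept


# Invented placeholder rows (query, subject, identity, ..., evalue, bitscore).
example = "\n".join([
    "\t".join(["query1", "hit_A", "48.2", "160", "80", "2", "1", "160", "5", "164", "3e-40", "150"]),
    "\t".join(["query1", "hit_B", "31.0", "90", "60", "3", "10", "99", "20", "109", "2e-05", "45"]),
])
for hit in filter_hits(example):
    print(hit)
```

Only the first placeholder row passes the cutoff; an identity window such as the 35-53% range reported above could be added as a second condition in the same loop.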
The sequences of 105 proteins were selected for cladistic analyses (Additional file 2). Intra-group divergences inferred from the Jones-Taylor-Thornton (JTT) model were not significantly different between the PHGPx- and GPx1/3-like groups (P = 0.053; Mann-Whitney-Wilcoxon test), although the values were highly variable: 0.33-2.37 in the PHGPx-like groups versus 0.31-0.55 in the GPx1/3 groups (bold-face values in Table 1). At the inter-group level, the rates of divergence between a pair of PHGPx and GPx1/3 groups were considerably higher than those between a pair of the same GPx family (Table 1). These results suggest that each of the GPx families has diverged during similar evolutionary times and that the GPx3-like proteins are endogenous in nematodes, rather than having originated by lateral gene transfer. To verify these views, we performed in-depth phylogenetic analyses.
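For readers unfamiliar with the Mann-Whitney-Wilcoxon comparison used above, the following Python sketch computes the U statistic with a two-sided P value from the normal approximation. It is a minimal illustration only: the divergence values are invented placeholders rather than the entries of Table 1, no tie or continuity correction is applied, and for groups as small as these an exact test from a statistics package would normally be preferred.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                                 # extend over a run of ties
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided P value by normal approximation)."""
    ranks = average_ranks(list(x) + list(y))
    n_x, n_y = len(x), len(y)
    u_x = sum(ranks[:n_x]) - n_x * (n_x + 1) / 2
    mean_u = n_x * n_y / 2
    sd_u = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12)   # no tie or continuity correction
    z = (u_x - mean_u) / sd_u
    return u_x, math.erfc(abs(z) / math.sqrt(2))


# Invented placeholder divergences for two groups (not the values of Table 1).
group_a = [0.40, 0.90, 1.50, 2.10]
group_b = [0.30, 0.35, 0.50]
print(mann_whitney_u(group_a, group_b))
```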
The PHGPx-like proteins exhibited a highly polytomic clustering pattern in a maximum likelihood tree constructed by the quartet puzzling method of TREE_PUZZLE (Figure 4; the NEWICK-format tree is attached as Additional file 3). The phyletic GPx3-lineage members comprised another polytomic clade. A portion of the orthologous PHGPxs were clustered into well-separated focal clades, in accordance with the taxonomical positions of their donor organisms. The GPx7 and GPx8 proteins, which have been recently detected in vertebrate species, were each positioned in a distinct clade. Among the protostomians examined, only nematode species harbored the GPx3-like proteins. The neighbor-joining and maximum parsimony trees showed topologies similar to that of the maximum likelihood tree (Additional file 4). The group-specific clustering of PHGPxs was much more substantial in the neighbor-joining tree, although the statistical significance of some branching nodes could not be supported by bootstrap analyses. The neighbor-joining tree demonstrated that the GPx7 and GPx8 proteins have diverged from each other after being duplicated in vertebrates. More interestingly, two proteins in S. mediterranea (SmedGPx1) and C. intestinalis [GenBank: XP_002128643] showed a closer relationship with the vertebrate GPx7/8 lineages. Since the planarian is believed to have maintained its free-living life mode, the lineage members might also have endogenously evolved in the protostomian stage. The members seem to have been deleted in the other protostomian species selected in this study.
Figure 3 Biochemical properties of the native PwGPx1 and PwGPx2 proteins. (A) The partially purified PwGPxs were resolved by 2-dimensional SDS-PAGE (15%, pH 3-10) and transferred onto nylon membranes. The membranes were reacted with the PwGPx-specific mouse antisera (1:2,000 dilutions). One-dimensional blots with PwGPxs (10 μg) and whole protein extracts (40 μg) of Paragonimus westermani (Pw-CE) and Clonorchis sinensis (Cs-CE) were also examined with a pooled mouse antiserum against C. sinensis PRx1 and PRx2 (1:1,000 dilutions; our unpublished data). (B) The specificity and catalytic efficiency of the proteins purified by a series of gel chromatographies were examined toward catalytic substrates (H2O2 and cumene hydroperoxide [CHP]) and electron donors (glutathione [GSH] and thioredoxin [Trx]), respectively. The concentrations of GSH and Trx added in the reactions were empirically determined to ensure that the reductants were saturating. The absorbance values of reactions without PwGPxs were subtracted from those of the experimental groups. The reactions were performed in triplicate and the specific activities in μmol/min/mg are indicated as mean ± standard deviation.
The Trx-dependent GPxs with the resolving Cys were largely scattered in plants, algae, protozoans, and insects (marked with ‡ in Figure 4 and Additional file 4), while the Cys residue was substituted by another aa in some of the alga and insect proteins. Conversely, the Sec codon and the associated SECIS motif were recognized as being largely confined to the mRNA sequences of the GSH-dependent lophotrochozoan, ascidian, and vertebrate GPxs (Figure 4). Among the ecdysozoans examined, only an arachnidal tick, Boophilus microplus, was found to express the Sec-dependent PHGPx protein [GenBank: ABA25916]. The C. reinhardtii [GenBank: AAL14348] and Hydra vulgaris [GenBank: ABC25026] proteins also contained Sec in their primary structures.
The GPx7/8-like proteins found in S. mediterranea and C. intestinalis were selenium-independent, consistent with vertebrate proteins [5]. However, the evolutionary events resulting in these biased distributions of proteins with different biochemical properties could not be clearly addressed in the phylogenetic analysis.

Exon-intron structures of PHGPx homologs
The polytomic phylogeny of PHGPxs suggested that these proteins have evolved in a modular fashion from multiple genes, while informative diagnostic substitutions could not be detected between the dichotomized GSH- and Trx-dependent proteins, except for the resolving Cys in the major Trx-dependent GPxs (Additional file 2). We examined the presence of orthologous introns in the PHGPx homologs to trace their evolutionary pathways in detail. The genomic organizations were predicted by comparing each of the mRNA sequences with its corresponding chromosomal sequence isolated either from DNA databases (Figure 1) or by PCR amplification (PwGPxs [GenBank: DQ454159, DQ454160]). The number of introns was highly variable in the diverse PHGPx genes, with a gross tendency to increase along with the taxonomical positions of donor organisms (Figures 5 and 6). The genes from free-living nematodes, vertebrates, and plants were split into three, seven, and six exons, respectively. Meanwhile, insect genes displayed highly complicated exon-intron structures with intron numbers ranging from one to three. Interestingly, the platyhelminth genes displayed an organization pattern comparable to that of mammalian orthologs. A gene isolated from a parasitic nematode, Brugia malayi, was more complex than those of free-living nematodes.
The conserved organization patterns were analyzed by comparing the position of each intron in relation to the aa alignment and their phases. The phases and positions of the respective introns were also conserved in the taxon-specific gene groups. Among the groups, however, evolutionarily shared orthologous introns could be detected according to the biochemical properties of the encoded proteins. The introns were tightly conserved between the Sec- and GSH-dependent platyhelminth and vertebrate genes, except for one intron that was specifically integrated into the vertebrate orthologs. The first intron of Caenorhabditis genes was matched to the second one identified in these common descendants. Two introns of the Brugia gene were found to be orthologous to the platyhelminth and vertebrate introns (Figure 5A). The GPx7/8-like genes of S. mediterranea and C. intestinalis, as proposed in the phylogenetic analysis, displayed a different intron conservation pattern (Figure 5B). The exon-intron architecture of the tunicate gene was identical to those of vertebrate GPx7 and GPx8 genes, and the turbellarian gene shared the last intron with them. The complex insect genes, which encode proteins exhibiting strong affinity to Trx, shared an intron with plant GPxs in a dichotomized pattern (Insect I and II in Figure 6). In addition to these clonal genes, GPx genes with unique exon-intron structures were also retrieved from the insect genomes (grouped into Insect III in Figure 6). The C. reinhardtii genome possessed genes that each had a unique number and position of intervening introns, including the gene encoding a Sec-dependent protein [GenBank: AAL14348]. The architectural conservation patterns showed weak relations to the presence of the resolving Cys in the insect and alga genes (genes containing the motif were marked with § in Figure 6).
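The comparison of intron positions and phases described above can be sketched in a few lines of Python. In this illustration an intron is described by the codon at which it falls and its phase (0 if it lies between codons, 1 or 2 if it follows the first or second base of a codon), and two introns are treated as candidate orthologs when they map to the same alignment column with the same phase. The function names, exon lengths, and aligned sequences are hypothetical and serve only to show the bookkeeping; they do not reproduce the procedure or data of the present analysis.

```python
# Hypothetical sketch of intron position/phase comparison on an aa alignment.

def intron_sites(cds_exon_lengths):
    """Return (codon_index, phase) for each intron, from coding exon lengths in nt."""
    sites, total = [], 0
    for length in cds_exon_lengths[:-1]:          # an intron follows every exon but the last
        total += length
        sites.append((total // 3, total % 3))
    return sites


def alignment_column(aligned_seq, residue_index):
    """Return the 0-based alignment column of the residue with the given ungapped index."""
    seen = -1
    for column, symbol in enumerate(aligned_seq):
        if symbol != "-":
            seen += 1
            if seen == residue_index:
                return column
    raise ValueError("residue index lies beyond the aligned sequence")


def aligned_intron_sites(aligned_seq, cds_exon_lengths):
    """Express intron sites as a set of (alignment_column, phase) pairs."""
    return {(alignment_column(aligned_seq, codon), phase)
            for codon, phase in intron_sites(cds_exon_lengths)}


# Toy genes: aligned aa sequence plus coding exon lengths (stop codon omitted).
gene_a = ("MKT-LVA", [6, 12])
gene_b = ("MKTGLVA", [6, 15])
gene_c = ("M-TGLVA", [4, 14])
print(aligned_intron_sites(*gene_a) & aligned_intron_sites(*gene_b))   # shared site
print(aligned_intron_sites(*gene_a) & aligned_intron_sites(*gene_c))   # no shared site
```

In the toy data, genes a and b share a phase-0 intron at the same column, whereas the intron of gene c falls at that column in phase 1 and is therefore not counted as shared.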
The phyletic GPx1 and GPx3 members had their respective intron-sharing patterns in the genomic structures, while those were clearly distinguished from the chromosomal organizations of PHGPx members (Additional file 5).

Discussion
In this study, we described a comparative analysis of GPx genes including platyhelminth orthologs such as the human lung fluke and freshwater planarian genes, for the purpose of predicting their evolutionary pathways. The GPx homologs displayed diversity in the copy number among the eukaryotic species, which suggests that independent multiplication events have occurred in each of the taxa. In nematodes and deuterostomians, a couple of duplicated genes have diverged further into GPx1 and GPx3 lineages, respectively, while all GPx paralogs have preserved the structural properties similar to that of the mammalian PHGPx genes in the other invertebrate clades. A GPx7/8-like gene previously shown to be vertebrate-specific was isolated from the genomes of a tunicate and turbellaria. The spatiotemporal expressions of GPx genes were found to be specific to each of the duplicated copies in plants and vertebrates, which demonstrate functional diversification among them [4,9]. However, the trematode gene inductions were concentrated in the reproduction-related vitellocytes contained within vitelline follicles and eggs. This fact might suggest that the duplication events have been forced to increase genetic redundancy and/or protein levels, rather than to increase the size of isoenzyme pools with distinct tissue expression profiles, at least in the trematode species.

Figure 4 Maximum likelihood tree of the GPx homologs.
The substrate specificity of GPx-like proteins has been debated not only because the parasitic nematode proteins are not enzymatically active toward H2O2 but also because PHGPxs vary in their activity against diverse catalytic substrates. The trematode PHGPxs reduced H2O2 much more effectively than cumene hydroperoxide, as was determined with the Paragonimus (Figure 3) and Schistosoma proteins [38], whereas plant homologs preferred the organic peroxide [9]. Interestingly, the inability of plant PHGPxs to reduce H2O2 with GSH was reversed when Trx was provided as an alternative electron donor [10]. These results suggest that the GPx family genes have functionally diverged according to the biochemistry of their action environments. The active site geometry is modified in response to the binding of GSH or Trx [33]. The vitellocyte-specific platyhelminth genes were inducible by exogenous stimuli with the soluble hydroperoxide and redox-cycling drugs, which can cross cell boundaries and trigger the conversion of oxygen into O2- [39]. The increased GPx activity against these chemicals seems to result indirectly from the accumulation of H2O2 and/or lipid peroxidation within the target cells/organs, due to O2- overproduction. Alternatively, their responsiveness to these stimuli could be a simple consequence of the coordinated regulation of phase I oxidative enzymes via the Nrf2-mediated signaling pathway [40], given the fact that the specific role of trematode GPx proteins is highly associated with sexual reproduction [28,29].
The majority of the mammalian GPxs contain Sec at their catalytic site, which is co-translationally inserted in response to a UGA codon, the stop signal in the standard genetic code. The alternative decoding of the opal codon depends on a cis-factor (SECIS) located in the 3'-UTR of selenoprotein genes [30]. In our analysis, the distribution of selenium-dependent GPxs (sGPxs) was biased toward platyhelminth and deuterostomian species, with a few exceptions: proteins isolated from an arachnid [GenBank: ABA25916], alga [GenBank: AAL14348], and cnidarian [GenBank: ABC25026]. Of the platyhelminth enzymes obtainable, PwGPx1 and SmedGPx1 (GPx7-like protein) were determined to be selenium-independent GPx (siGPx). The Sec codon was replaced by a typical Cys codon in the nematode homologs, although nematode species also contained selenoproteins such as Trx reductase and/or SelK [41]. It has been suggested that siGPx proteins have emerged independently from sGPx counterparts during evolution, in response to the increased atmospheric oxygen concentration [16,42]. Conversely, a recent comparative analysis predicted the progressive acquisition of Sec by Cys-based proteins in a series of donor organisms [5]. The presence of siGPx in protozoans, fungi, and plants, and the biased distribution of sGPx within a narrow range of taxonomical clades, seem to support the Cys-based evolutionary route of GPx proteins. The highly conserved genomic architecture among Sec-dependent genes can be taken as evidence for this suggestion (Figure 5). siGPx has been postulated either to comprise a second-line defense or to cooperate with sGPx in an as-yet-unknown manner, to cope with cellular oxidative stresses [43]. In plants and lower animals, PRx constitutes a major antioxidant system against ROS-derived damage and the platyhelminth GPx seems to be involved in a specialized physiological function(s) [9,10,27,28].
Figure 6 Genomic organizations of thioredoxin (Trx)-dependent PHGPx genes. The chromosomal structures of insect and plant genes that encode the Trx-dependent proteins were compared to each other to detect orthologous introns. The insect genes are categorized into three groups according to their conservation patterns. The symbol § marks the gene with the Cys motif in its encoded protein. More detail is presented in the legend to Figure 5.
Therefore, it is possible that siGPx carries out its role in a specific microenvironment with an optimal pH for the thiol group, along with the diversification of its donor organism. The siGPxs of filarial nematodes, whose functions are mainly confined to the cuticular matrix, may provide another example of this suggestion [44]. The induction level of PwGPx1 was higher than that of PwGPx2 during the development of P. westermani and against exogenous oxidative stresses, while their histological distributions were almost identical (data not shown). Currently, it is not clear whether this phenomenon is related to their differential reactivity or is an indication of functional and/or histological diversification within each of their respective subcellular micro-niches.
Together with the selenium dependency, the categories of GPx isoenzymes also revealed taxonomical biases along with their donor organisms; the GPx3-like proteins were detected exclusively in the deuterostomian and nematode species, whereas those with a phylogenetic linkage to PHGPx were ubiquitously isolated across taxa from plants to vertebrates. Since the complete or draft whole genome sequences with a considerable coverage (>7) have been comprehensively screened via a series of BLAST searches (Figure 1; see also [9,15,27]), the absence of genes encoding GPx3-like proteins appears to be genuine in the plant and invertebrate genomes selected in this study. The free-living and phytoparasitic nematodes [45], for which whole genome or comprehensive EST databases are available, contained both of the GPx lineages. The GPx3-like filarial proteins were identified empirically by analyzing the major cuticular glycoproteins [46]. Recently, we detected a PHGPx-like gene from the B. malayi DNA database [GenBank: XP_001897517]. Genes highly homologous to the vertebrate GPx7/8 genes were also detected exclusively in the free-living turbellarian and tunicate (Figures 4 and 5B). Therefore, it is evident that there have been a series of selection pressures to drive the mosaic distributions of the GPx3- and GPx7-lineage genes across taxa, which was similarly observed in the DNA methyltransferase gene homologs [47], although the mechanism(s) in the selection of genes for deletion/maintenance awaits further investigation.
The tetrameric GPx1/3-lineage proteins were phylogenetically related to one another in the cladistic analysis and their chromosomal genes showed clonal conservation patterns (Figure 4 and Additional file 4). On the contrary, the phylogeny and genomic structures of PHGPx genes were found to be multifaceted. The aa sequences of PHGPxs were well aligned throughout the whole polypeptides, except for the extreme N-terminal segments, and gaps resulting from codon indels were not prominent (Additional file 2). The polymorphic N-terminal regions were intimately related to the evolutionary acquisition/loss of regulatory signals within certain members (Additional file 1). Despite the tight sequence conservation, the hypervariable genomic structures of these genes exhibited discontinuous conservation patterns according to the taxonomical distributions and/or biochemical properties of the protein products (Figures 5 and 6). Whether we accept the "intron-early" or the "intron-late" hypothesis concerning the origin of spliceosomal introns, it is clear that an orthologous intron can be taken as a footprint revealing a common ancestor during evolution of the corresponding genes [48]. Therefore, the genes encoding GSH- and Trx-dependent proteins might each have evolved along separate pathways from a certain evolutionary time, although the enzymatic properties of the nematode PHGPxs have not yet been examined empirically. Compared to the tightly conserved introns among the platyhelminth and vertebrate orthologs, intron preservation was much more complicated in the Trx-dependent insect and plant genes. The third intron of plant genes was orthologous to the second intron of some insect genes (Group I) and the fourth intron was shared with the other insect genes (Group II), while a series of insect genes (Group III) with uniquely integrated introns were further recognized.
The aa sequences of plant exons 3, 4, and 5 displayed identity values of 47.8, 59.8, and 53.3%, respectively, to the corresponding regions of Group I insect genes, and 49.8, 41.2, and 64.0%, respectively, to those of Group II insect genes. These collective results suggest a probable recombination event at a region matching the end of the current exon 4, which occurred between two paralogs in a primordial genome that evolved into plants. After gaining additional introns, the gene might have undergone genic/chromosomal multiplications. Alternatively, each of the diverse insect genes might have undergone a unique exon-intron remodeling process mediated by recombination of the gene with a cDNA generated by reverse transcription of the respective mRNA [49]. Further information on the intermediate metazoan genes would help to address this intriguing issue.
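As an illustration of how per-exon identity values such as those above can be obtained from a pairwise alignment, the following minimal Python sketch counts identical residues over ungapped columns. The function name, the choice to skip gapped columns, and the toy sequences are assumptions made for illustration; the text does not state which convention was used to compute the reported values.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence carries a gap ('-') are skipped; this
    convention is an assumption, not necessarily the one used in the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments, not taken from the GPx alignment.
toy_plant = "ACDE"
toy_insect = "ACDF"
identity = percent_identity(toy_plant, toy_insect)
```

Applied exon by exon to an alignment of a plant protein against a Group I or Group II insect protein, a function of this kind would yield one identity value per exon-encoded region.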
The fully diverged mammalian genes were present as a single genomic copy, whereas the PHGPx- or GPx3-like genes had multiple copies in the respective genomes of the other metazoans and plants. Together with the polytomic relationships and differentiated genomic organizations, this fact is likely to provide additional evidence for the modular evolutionary pathways of the GPx gene family from multiple intermediates. The presence of GPx3 in the soil-borne and phytoparasitic nematodes, and of GPx7 in the free-living planarian and tunicate, would reduce the possibility of lateral gene transfer between taxa. Each of the primordial paralogs duplicated from a common ancestor seems to have been subject to independent exon-intron remodeling processes such as homologous recombination and integration of intronic sequences, along with their taxonomical lineages. A series of the intermediate genes might have diverged either to GPx3-like genes by gaining exonic nucleotides corresponding to the subunit interaction domain or to GPx7 homologs, while they underwent lineage-specific expansion or deletion in the metazoan animals. The GPx1/2 genes are likely to have evolved from the retrointegration of a GPx3-like intermediate aided by retrotransposon-encoded proteins, which was followed by acquisition of introns, during an early evolutionary period of the Vertebrata. The evolutionary pathways for the GSH- and Trx-dependent GPxs appear to be separated from the common eukaryotic ancestor. The Trx-dependent GPxs without the resolving Cys in insects and algae might have been acquired by the relaxation of selective constraints and/or by adaptation to the micro-environments in their specific action sites. Further studies on the biochemical properties of the individual insect GPxs and detection of a second Cys motif, if any, would be helpful to gain more insight into the complex evolution of PHGPx-like proteins.

Conclusion
Our study on the comparative analysis of GPx gene families suggests that these genes have modularly evolved from multiple intermediates, accompanied by structural diversification and sporadic expansion/deletion. In addition, the ubiquitous PHGPx genes with different biochemical properties are likely to have been separated in an early evolutionary time. The dichotomized evolutionary pathways were traceable by considering structural conservation patterns in the chromosomal genes. Multiple GPx genes have tissue-specific implications in mammals and plants. However, the duplication events appear to be intimately related to an increase in the genetic repertoire of the antioxidant enzyme, at least in the platyhelminth species. The modular evolution of GPx genes in association with their biochemical properties may provide a molecular basis for understanding the detailed physiological implications of this antioxidant enzyme system. The complexity in genomic structures of platyhelminth genes, which is comparable to that of mammalian genes, further suggests that these organisms comprise an informative model system for elucidating the driving forces and courses of genomic complexity during eukaryotic evolution.

In silico isolation of GPx genes from platyhelminth species
Two novel genes encoding GPx proteins (PwGPx1 and PwGPx2) were isolated from an EST dataset of P. westermani [50]. The Paragonimus genes, together with those of S. mansoni [GenBank: L37762, AY729668], were used as queries in the homology searches to retrieve their orthologous genes. Genomic and/or EST databases specific to platyhelminths were screened via stand-alone BLAST (Figure 1). The GenBank DNA/protein databases at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/) were also selected for the screening. GPx proteins isolated from human (GPx1 and GPx3 [GenBank: CAA68491, AAP50261]) and filarial nematode (Brugia pahangi [GenBank: CAA48882]) were employed as the other queries during the analyses.

Sequence analysis
The coding profiles and patterns of sequence similarity of the retrieved mRNA sequences were analyzed using the ORF Finder and BLAST programs available at the NCBI. The similarity pattern of each sequence was verified by a subsequent analysis based on the Hidden Markov Models using InterProScan (http://www.ebi.ac.uk/Tools/InterProScan/). The SECIS motif was predicted using the SECISearch program (ver. 2.19; http://genome.unl.edu/SECISearch.html). The putative hydrophobic signal peptide was analyzed by the SignalP program (http://www.cbs.dtu.dk/services/SignalP/). Chromosomal segments homologous to the PwGPx cDNAs were amplified from the P. westermani genome by long-range PCR using gene-specific primer pairs, which had been designed from the nucleotide sequences within both ends of the cDNAs (5'-GGTACCAACAGTGACGGTTTGATTTTCTAACACC-3' and 5'-GACAGGCCTGGAGGTGAATTGATGAGAGTGAACC-3' for PwGPx1; 5'-GGAACATCGAAGGTGGTTTGAAAAAGGTCAACTTC-3' and 5'-CTTTACTCACAAACTACTGTTGCAATAATAGTAACGTC-3' for PwGPx2). PCR was conducted with the LA Taq system (Takara, Shiga, Japan) and the resulting PCR products were cloned into the pGEM-T Easy vector (Promega, Madison, WI, USA) for sequencing. The nucleotide sequences were determined from both strands using a BigDye Terminator Cycle Sequencing Core Kit (Perkin Elmer, Foster City, CA, USA) and an automated ABI PRISM 377A DNA Sequencer (Applied Biosystems, Foster City, CA, USA). The mRNA sequences were aligned with their corresponding genomic sequences by considering the general rule for nucleotides tightly conserved in the exon-intron boundary [51], after which their chromosomal structures were determined [GenBank: DQ454159, DQ454160].
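The boundary check used when aligning mRNA to genomic sequence can be sketched as follows. This is a minimal Python illustration that assumes the rule cited as [51] corresponds to the canonical GT...AG dinucleotides at the 5' and 3' ends of spliceosomal introns. The function names, the 0-based half-open coordinate convention, and the toy sequence are hypothetical and are not taken from the PwGPx genes.

```python
def intron_spans(exon_spans):
    """Derive intron spans from sorted, 0-based, half-open exon spans.

    Each intron runs from the end of one exon to the start of the next.
    """
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exon_spans, exon_spans[1:])
    ]


def follows_gt_ag(genomic: str, intron) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    start, end = intron
    seq = genomic[start:end].upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical toy gene: exon 1, one intron, exon 2.
toy_genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
toy_exons = [(0, 6), (18, 24)]
toy_introns = intron_spans(toy_exons)
toy_checks = [follows_gt_ag(toy_genomic, span) for span in toy_introns]
```

In practice the exon spans would come from the cDNA-to-genome alignment, and an intron failing the check would prompt shifting the ambiguous boundary by a few nucleotides before the structure is accepted.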

Enzymatic characterization of native PwGPx proteins
The native proteins were purified from an adult worm extract using AKTA fast-performance liquid chromatography (Superdex 75), followed by DEAE-anion exchange chromatography (Amersham Pharmacia Biotech, Uppsala, Sweden), by monitoring GPx activity. The proteins were also examined by 1-dimensional (1-D) and/or 2-D Western blotting employing mouse antisera specific to the recombinant forms of the PwGPx proteins. The purified GPx proteins were dialyzed against PBS (pH 7.2) overnight at 4°C. The specific activity of the native proteins was examined by the reduction of H2O2 or cumene hydroperoxide (Sigma-Aldrich, St. Louis, MO, USA) in the presence of GSH and Escherichia coli glutathione reductase (GR), as described previously [38,52]. The 200 μl reaction mixtures contained 5 mM potassium phosphate, 1 mM GSH, 0.1 unit of GR, 0.1 mM NADPH, 1 mM EDTA, and 1 μM of PwGPx. The reagents were pre-warmed to room temperature just prior to use in the reaction. After adding the substrate, the NADPH oxidation level was monitored at A340 for 5 min with a spectrophotometer. A series of mixtures containing 10 μM E. coli Trx and 0.1 unit of thioredoxin reductase (TR) instead of GSH and GR were also assayed.
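The conversion of an A340 slope into an NADPH oxidation rate in a coupled assay of this kind can be sketched with the Beer-Lambert law. The Python sketch below uses the commonly cited molar extinction coefficient of NADPH at 340 nm (6,220 M^-1 cm^-1). The 1 cm path length, the optional blank correction, the function names, and the example slope are assumptions for illustration and are not reported in the text; only the 200 μl volume and the 1 μM enzyme concentration come from the description above.

```python
NADPH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (commonly cited value)


def nadph_oxidation_rate(delta_a340_per_min, volume_l,
                         path_cm=1.0, blank_per_min=0.0):
    """Return nmol NADPH oxidized per minute in the reaction mixture.

    delta_a340_per_min: decrease in A340 per minute (positive number).
    path_cm and blank_per_min are assumptions; adjust to the instrument
    and to the non-enzymatic background actually measured.
    """
    corrected = delta_a340_per_min - blank_per_min
    molar_per_min = corrected / (NADPH_EXT_COEFF * path_cm)  # mol/L/min
    return molar_per_min * volume_l * 1e9  # mol -> nmol


def enzyme_amount_nmol(conc_molar, volume_l):
    """Amount of enzyme in the reaction, in nmol."""
    return conc_molar * volume_l * 1e9


# Hypothetical slope of 0.0622 A340 units per minute in a 200 ul reaction.
rate = nadph_oxidation_rate(0.0622, 200e-6)
enzyme = enzyme_amount_nmol(1e-6, 200e-6)  # 1 uM PwGPx in 200 ul
turnover_per_min = rate / enzyme
```

With these illustrative numbers the rate is about 2 nmol NADPH per minute for 0.2 nmol of enzyme. Expressing the result per mg of protein would additionally require the molecular mass of the enzyme, which is not assumed here.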

Phylogenetic Analysis
A total of 105 members, which were assigned separately into the eight GPx families, were finally selected during the BLAST searches by considering both the identity values and taxonomical distributions of donor organisms. The aa sequences were aligned using ClustalX and optimized using GeneDoc. The alignment was used as an input to obtain a maximum likelihood tree using the quartet puzzling algorithm implemented in TREE-PUZZLE (ver. 5.2) [53]. The options were selected to use the JTT model for the aa substitution [54], estimation of aa frequencies from the dataset, gamma distribution model for rate heterogeneity (the parameter alpha was estimated from the input data), eight gamma rate categories, and 50,000 puzzling steps. Indels between pairs of sequences were regarded as missing data. Distances between pairs of protein sequences were calculated according to the JTT model and corrected for the gamma distribution of evolutionary rates. The standard errors were computed by the bootstrapping of 1,000 replicates. The sequence alignment was also applied in phylogenetic analyses using the neighbor-joining (PROTDIST and NEIGHBOR) and maximum parsimony (PROTPARS) programs of the PHYLIP package (ver. 3.6b). The statistical significance of each branching point in the resulting trees was evaluated using 1,000 random samplings of the input alignment by the SEQBOOT program.
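The random sampling step can be illustrated with a minimal Python sketch of nonparametric bootstrapping, in which alignment columns are resampled with replacement to build each pseudo-replicate. This illustrates the general technique only and is not the SEQBOOT implementation; the function names, the dictionary-based alignment format, the fixed seed, and the toy sequences are assumptions.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by resampling columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    The same sampled column indices are applied to every sequence so that
    columns stay intact.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have equal length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][c] for c in columns)
        for name in names
    }


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Return a list of pseudo-replicate alignments."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment, not the 105-member GPx dataset.
toy_alignment = {"seqA": "ACDE", "seqB": "ACDF", "seqC": "GCDE"}
replicates = bootstrap_replicates(toy_alignment, n_replicates=5, seed=1)
```

A tree would then be inferred from each pseudo-replicate, and the support for a branching point taken as the fraction of replicate trees in which that grouping recurs.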