Comprehensive analysis of MHC class II genes in teleost fish genomes reveals dispensability of the peptide-loading DM system in a large part of vertebrates

Background
Classical major histocompatibility complex (MHC) class II molecules play an essential role in presenting peptide antigens to CD4+ T lymphocytes in the acquired immune system. The non-classical class II DM molecule, HLA-DM in humans, has a critical function in assisting classical MHC class II molecules with proper peptide loading and is highly conserved in tetrapod species. Although the absence of DM-like genes in teleost fish has been speculated on the basis of homology searches, it has not been definitively established whether the DM system is truly specific to tetrapods. To obtain a clear answer, we comprehensively searched for class II genes in representative teleost fish genomes and analyzed those genes for the critical functional features required for the DM system.

Results
We discovered a novel ancient class II group (DE) in teleost fish and classified teleost fish class II genes into three major groups (DA, DB and DE). Based on several criteria, we investigated the classical/non-classical nature of various class II genes and showed that only one of the three groups (DA) exhibits classical-type characteristics. Analyses of predicted class II molecules revealed that the critical tryptophan residue required for a classical class II molecule in the DM system is found only in some non-classical, and not in classical-type, class II molecules of teleost fish.

Conclusions
Teleost fish, a major group of vertebrates, do not possess the DM system for classical class II peptide loading, and this sophisticated system evolved specifically in the tetrapod lineage.


Supplementary Text 3 (Text S3): Discussion of potential polymorphism of MHC class II genes in selected teleosts and comparison with previous studies
In the paragraphs below, the MHC class II sequence findings of the present study are discussed in relation to possible polymorphism and compared with previous reports on the investigated teleost species. For figures and gene nomenclature we refer to the main text (e.g. Fig. 2) and to the other supplementary files. Literature references are listed at the end of this file.

Atlantic salmon (Salmo salar)
Arguments for true, classical-type polymorphism of the highly variable DA group DAA and DAB sequences in salmonid fishes (http://www.ebi.ac.uk/ipd/mhc/fish/index.html) are (1) that only one sequence of each could be amplified per haploid genome, (2) that these sequences segregate as single DAA+DAB haplotypes in pedigree analysis, and (3) that they encode polymorphic putative peptide-binding residues shown to be under balancing selection [e.g. Shum et al. 2001; Stet et al. 2002]. Tight genomic linkage between salmon DAA and DAB loci was already known, but the present study provides the first evidence that they are neighboring genes (Additional file 2, Fig. S2, and Additional file 8, Text S2).
For the DB group we previously reported that the salmon DBA, DBB, DCA and DDA loci are non-polymorphic and thus can be classified as nonclassical [Harstad et al. 2008]. In the present study we found that the DB group locus DCB is also non-polymorphic (Additional file 8, Text S2).
The DE group DEA and DEB loci are also non-polymorphic (Additional file 8, Text S2).

Zebrafish (Danio rerio)
Although zebrafish was initially thought to have four classical DA group class II B loci, designated DAB1 to -4 [Ono et al. 1992], these were later concluded to be alleles of a single DAB locus, corresponding to D8.37B3 in main text Fig. 2 [Sültmann et al. 1994; Bingulac-Popovic et al. 1997; Graser et al. 1998; Kuroda et al. 2002]. Zebrafish DA group class II A polymorphism has not yet been resolved at the locus level. Sültmann and coworkers [Sültmann et al. 1993] reported variable II A sequences, but these sequences, two examples of which are shown as "zebrafish (4)" and "zebrafish (5)" in main text Fig. 4, Additional file 3, Fig. S3, and Additional file 10, Text S4, cannot easily be assigned to loci discernible in the Ensembl database. A reported "DAA" gene on the opposite side of SLC7A4 from DAB (D8.37B3) [Kuroda et al. 2002] is not present in the Ensembl database (Additional file 2, Fig. S2), suggesting haplotype variation ("haplotype variation" in this article is used to distinguish from "allelic variation" and refers to differences in gene copy number or order of genes between individuals of the same species). In addition to DA group DAA and DAB, previous reports also described the zebrafish DA group loci DDB (on Chr.8), DEA (Chr.8), DEB (Chr.8) and DFB (Chr.4), as well as the DB group loci DBB (Chr.18), DCA (Chr.8) and DCB (Chr.8). Transcription had hitherto not been reported in article form for any of these non-DAA/DAB loci, some of these loci are obvious pseudogenes, and at least several display haplotype variation [Sültmann et al. 1994; Bingulac-Popovic et al. 1997; Graser et al. 1998; Sültmann et al. 2000; Kuroda et al. 2002]. Main text Fig. 4, Additional file 3, Fig. S3, and Additional file 10, Text S4, show that some of these previously reported sequences are (nearly) identical to sequences in the Ensembl database, whereas others are not.
Importantly, we found indications that eleven more zebrafish class II loci are expressed than previously reported in article form (assuming that those previously discussed transcripts all mapped to the region around SLC7A4, and haplotype-dependent Table S2). Future research should clarify whether these additionally expressed loci are mono-, oligo- or polymorphic.

Stickleback (Gasterosteus aculeatus)
Stickleback was estimated to have, on average, six DA group class II B loci per haploid genome [Sato et al. 1998; Reusch et al. 2001], which agrees well with the Ensembl database (main text Fig. 2 and Additional file 2, Fig. S2). Despite many sequence reports [e.g. Sato et al. 1998; Reusch et al. 2001; Reusch and Langefors 2005] there is no conclusive evidence on stickleback class II B allelic polymorphism, in part due to haplotype variation, recent gene duplications, and interlocus recombination events [Reusch et al. 2001; 2004; 2005]. Differences in gene copy number also hamper the distinction between allelic and haplotype variation when comparing the scaffold G131 sequence (main text Fig. 2) with a region that, according to non-MHC genes, is allelic and harbors "DAA", "DAB", "DBA" and "DBB" loci [reported by Reusch and coworkers, 2004] (those class II sequences are included for comparison in main text Fig. 4, Additional file 3, Fig. S3, and Additional file 10, Text S4).
Stickleback DB group genes such as those found on GXVII and GXX (main text Fig. 2) have, to our knowledge, not been discussed before. We found the GXVII B gene to be expressed (Additional file 5, Table S2), but, as is typical for DB group genes, the matching ESTs (GenBank DW594610, DW039681, DN724325, DT950896, DN681207, DN670562) are not suggestive of polymorphism.

Medaka (Oryzias latipes)
There are few publications on medaka MHC class II, and the available information does not allow firm conclusions on polymorphism. However, sequence comparison (Additional file 3, Fig. S3, and Additional file 10, Text S4) suggests that the DA group class II B gene on scaffold 873 (M873) is an allelic variant of DBB [Nonaka et al. 2001]. DBB was mapped together with the quite similar DAB locus to chromosome 18 (sequence details of that region are not known), apart from DA group DCB, which was mapped to chromosome 3 and which is nearly identical to Ensembl M3B [Naruse et al. 2000a; 2000b; Nonaka et al. 2001; Additional file 3, Fig. S3, and Additional file 10, Text S4]. Medaka DB group genes map to Chr.16 [identified by Ohashi et al. 2010] and Chr.5 (main text Fig. 2), the latter of which, to our knowledge, have not been discussed before. A cDNA clone represented by two ESTs (GenBank BJ877464, BJ891116) fully matches M5A, and this DB group gene may be non-polymorphic.

Fugu (Takifugu rubripes)
Fugu class II genes have hardly been studied, and there are few ESTs in the database. The neoteleost-specific intron in the β2 domain coding sequence, however, has been noted [Lim and Brenner 1995]. All the detected class II genes belong to the same subfamily within the DA group (main text Fig. 4 and Additional file 3, Fig. S3), half of them representing pseudogenes (main text Fig. 2, Additional file 2, Fig. S2, and Additional file 5, Table S2). Reported cDNAs are similar, though not identical, to the intact A and B genes on scaffold F7533 and the A gene on F402 (e.g. AB453019 and CA846190 to F7533A; Additional file 7, Text S1), which is suggestive of, but not conclusive for, allelic polymorphism.

Tetraodon (Tetraodon nigroviridis)
Tetraodon class II genes have also hardly been studied, but there are quite a number of class II single-pass read cDNA sequences in the NCBI non-redundant (nr) database [Jaillon et al. 2004]. As in Fugu, all the detected class II genes belong to the same subfamily within the DA group (main text Fig. 4 and Additional file 3, Fig. S3). Comparison between the reported cDNA sequences and the Ensembl genomic sequences suggests that the genes T19A (e.g. match with cDNA in GenBank accession CR731060), T19B (e.g. GenBank CR727059), T55A (similar GenBank matches as for T19A, although less perfect, so possibly only T19A and not T55A is expressed) and T61A (e.g. GenBank CR682088) are expressed (Additional file 5, Table S2) and polymorphic (e.g. GenBank CR727059 vs. CR697461; Additional file 7, Text S1), but more research is needed.
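The kind of cDNA-versus-genome comparison described above can be illustrated with a short sketch. This is not the procedure used in the present study: it assumes that the two sequences have already been aligned to equal length (gaps written as "-"), and the function names and the toy sequences are hypothetical. It lists the alignment columns at which the sequences differ and computes percent identity over informative columns. A few differences between otherwise near-identical sequences would be compatible with allelic variation, but in single-pass reads they may equally reflect sequencing errors or a closely related paralogous locus.

```python
def mismatch_positions(aligned_a, aligned_b):
    """Return 0-based alignment columns where two pre-aligned sequences differ.

    Columns in which either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, because they carry no evidence for or against polymorphism.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    positions = []
    for i, (a, b) in enumerate(zip(aligned_a.upper(), aligned_b.upper())):
        if a in skip or b in skip:
            continue
        if a != b:
            positions.append(i)
    return positions


def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns that are informative in both sequences."""
    skip = {"-", "N"}
    informative = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a not in skip and b not in skip
    )
    if informative == 0:
        raise ValueError("no informative columns to compare")
    mismatches = len(mismatch_positions(aligned_a, aligned_b))
    return 100.0 * (informative - mismatches) / informative


# Toy, made-up sequences for illustration only (not real Tetraodon data).
genomic = "ATGGCTCTGAAC-TTGCA"
cdna = "ATGGCTCTAAACGTTGNA"
print(mismatch_positions(genomic, cdna))
print(percent_identity(genomic, cdna))
```

Skipping gap and ambiguity columns is a deliberate, conservative choice for low-quality reads; a real analysis would also need base-quality information and several independent reads per variant before interpreting a mismatch as polymorphism.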

Tilapia (Oreochromis niloticus)
For the cichlid tilapia, we found 60 A and B sequences in the Ensembl genome, of which 30 are presumably pseudogenes. For layout reasons the short scaffolds O42, O49, O50, O77, O745, O779, OA35 and OA78 (Additional file 2, Fig. S2), which we deem to provide no extra information, are not shown in main text Fig. 2. Matching cDNAs were found for the DA group loci O79A3 (e.g. GenBank GR671091), O79B2 (GenBank FF281534), O80B (GenBank GR702301) and O845A (GenBank GR696948), and the DB group loci O29A3 (e.g. GenBank GR667686) and O57B (GenBank GR627548) (Additional file 5, Table S2), and they suggest that there is some limited polymorphism in the DA group loci O79A3, O79B2, and O845A (Additional file 7, Text S1). The DB group O29A3 locus was designated "DAA" and found to be non-polymorphic; the "DAA" name may have to be reconsidered since this locus belongs to the nonclassical DB group (main text Fig. 4). Previously reported tilapia DA group sequences were distinguished as "DBA-type" and "DCA-type" by Murray and coworkers [2000], and are exemplified in main text Fig. 4 by the full-length tilapia and related cichlid A. hansbaenschi sequences DBA and DCA, respectively. The tilapia DA group "DBA-type" and "DCA-type" genes were reported to map to multiple loci linked within a single chromosome with extensive haplotype variation, also including DA group B genes [Malaga-Trillo et al. 1998; Murray et al. 2000].
Southern blot analysis showed that tilapia haploid genomes contain more than ten B loci of the DA group [Malaga-Trillo et al. 1998], consistent with the Ensembl database (main text Fig. 2 and Additional file 2, Fig. S2). Recently, while we were preparing our manuscript, an extensive MHC class II analysis of tilapia, investigating the tilapia Ensembl database as well as BAC sequences, was published by Sato and coworkers [2012], confirming haplotype variation between individuals and showing clustering of related genes within single genomic regions. They also provided the first descriptions of DB group sequences in this species other than "DAA". In their nomenclature the DB molecules encoded by the region represented by Ensembl scaffold O29 (our "S1 synteny region"; main text Fig. 2) are designated as "A-type", those encoded by O33 and O57 as "Y-type", and those encoded by O97 (our "S2 synteny region") as "Z-type". They collectively designate the DA group sequences as "IIb family" and the DB group sequences as "IIa family", which refers to the original naming of the nonclassical tilapia II A locus as "DAA". They did not extensively compare features between the DA and DB families, and they did not perform a comprehensive MHC class II gene search across teleost species.