- Research article
- Open Access
Evolution of major histocompatibility complex class I and class II genes in the brown bear
BMC Evolutionary Biology volume 12, Article number: 197 (2012)
Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae.
We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca.
Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South–north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia.
The major histocompatibility complex (MHC) is a key element of the vertebrate immune system, responsible for presentation of foreign peptides to T-cells . MHC consists of two main groups of genes, MHC class I and MHC class II, each comprising a number of genes that appear to evolve by the birth-and-death process, whereby some new genes appear via duplication and others are pseudogenised or deleted [2, 3]. MHC class I genes are expressed in all nucleated cells and present antigens derived mostly from intracellular parasites, whereas MHC class II genes are expressed in specialised antigen presenting cells, such as macrophages, and present mostly antigens of extracellular parasites. The peptide-binding groove of class I molecules is formed by α1 and α2 chains encoded by the second and third exon of the gene, whereas class II peptide binding groove is formed by α and β chains encoded by second exons of separate A and B genes .
MHC genes are the most polymorphic genes described in vertebrates, with polymorphism occurring predominantly at residues involved in peptide binding (antigen binding sites, ABS; [5–7]). The mechanisms deemed responsible for maintaining polymorphism at MHC genes include frequency-dependent selection [8, 9] and heterozygote advantage . Frequency dependence arises because the bearers of common alleles become more likely to be evaded by evolving parasites (e.g. ), whereas heterozygosity allows presentation of a wider range of pathogen-derived peptides, and thus provides better resistance to infection (e.g. ). Consistent with evolution under pressure from parasites, there is growing evidence for an association between MHC types and susceptibility to parasites (e.g. [13–22]). Additionally, several taxa have been shown to avoid mating with MHC-similar partners (e.g. [23–26]), and such MHC-disassortative mating should also help maintain MHC polymorphism . Balancing selection acting on MHC appears to be able to maintain allelic lineages for a long time, resulting in trans-species polymorphism, whereby some alleles from different species are more similar than some alleles within species .
The mechanisms of balancing selection summarised above not only maintain polymorphism, but also favour novel alleles with non-synonymous substitutions changing peptide-binding properties of MHC molecules. Indeed, the rates of non-synonymous substitutions often exceed those of synonymous substitutions at sites involved in peptide binding [29–32].
The rate of the birth-and-death process appears faster in MHC class I genes than in class II loci, and as a result it is difficult to establish orthology of class I genes among mammalian orders [31, 33]. In contrast, clusters of MHC class II genes, which originated 170–200 mya, retain orthology between orders of mammals . The reasons for this difference between class I and class II genes are not well understood. In humans, the difference in the rate of the birth-and-death process is not mirrored in the strength of positive selection, as dN/dS ratios are very similar for both MHC classes, but in mice dN/dS in MHC class I genes is considerably lower than that in MHC class II [29, 31].
Here, we characterised sequences of the 2nd exon of the MHC class I and class II genes in the Scandinavian population of the brown bear Ursus arctos. The 2nd exon encodes one of two chains forming the peptide binding groove in both MHC classes. MHC class II genes in the brown bear from Japan have been studied by Goda et al. [35, 36], who found considerable polymorphism of DRB genes, but limited polymorphism of DQA genes. MHC class I in the brown bear has not been characterised so far. Our study had the following aims: (i) to characterise the level of polymorphism of both MHC classes, (ii) to compare the strength of positive selection acting on them, based on the patterns of nucleotide substitution, and (iii) to assess the extent of gene orthology and trans-species polymorphism between the brown bear and the giant panda Ailuropoda melanoleuca. Based on the generally higher rate of evolution of class I genes among mammals, we expected the extent of trans-species polymorphism to be lower for class I genes. Owing to excellent long-term data about mating patterns, reproductive success and parasite load, the Scandinavian brown bear is an ideal system to study contemporary selection on MHC resulting from parasites and mate choice. The present study provides a basis for such work.
Samples analyzed in the present study originated from two brown bear populations sampled within the Scandinavian Brown Bear Research Project. The northern population (N) is located near Jokkmokk in Norrbotten County, Northern Sweden, and the southern population (S) consisted of samples collected in Dalarna and Gävleborg counties in Central Sweden and Hedmark County in Southeastern Norway . Details on sampling and genomic DNA (gDNA) extraction can be found in Waits et al.  and Bellemain et al. . Altogether samples from 234 individuals were used to characterize polymorphism in MHC and 6 samples from the southern population were used to characterize expression of four MHC genes. Samples were collected under permissions: C7/12 for 2012–2014 and C47/9 for 2009–2012, C59/6 for 2006–2008, C40/3 for 2003–2006 from Uppsala Ethical Committee on Animal Experiments, Uppsala, Sweden.
Comprehensive characterization of variation in highly polymorphic MHC genes requires PCR primers amplifying all alleles. To develop such primers for the second, most variable, exon of several MHC genes, we used two approaches. The first was based on the vectorette PCR and the second employed primers located in conserved portions of exons 1 and 3 or 4 to amplify the intervening fragments from cDNA.
We designed primers in conserved regions of the second exon, identified in the alignment of mammalian MHC sequences downloaded from GenBank (Additional file 1:Table S1, the list of accession numbers in Additional file 2: Supplementary information). Parts of the second exon for all four genes were amplified from several individuals, cloned and sequenced as described in Zagalska-Neubauer et al. . These partial sequences allowed the design of several primers within the second exon, which were used in vectorette PCR, performed as described in Babik et al. , to obtain sequences of 5’ and 3’ ends of the second exon from multiple individuals. In the vectorette PCR approach, total genomic DNA is digested with a restriction enzyme (RE) producing sticky ends; then double-stranded adapters (vectorettes) matching the overhangs but showing some internal mismatch (‘bubble’) are ligated. By using one primer specific to the sequence in question and the other specific to the reverse complement of one of the vectorette strands (in the region of mismatch), it is possible to directionally amplify the genomic fragment between the specific primer and the RE recognition site, i.e. outside of the region of known sequence. The consensus of these sequences and sequences obtained from cDNA (see below) allowed the design of robust primers for all studied genes.
We designed primers in conserved regions of the first and fourth (MHC class I genes, Table 1 – primer pair no. 1) or third (MHC class II genes, Table 1 – primer pairs no.: 2–4) exon using mammalian sequences downloaded from GenBank (Table 1, the list of accession numbers in Additional file 2: Supplementary information). Total RNA was extracted with PAXgene Blood RNA kit (Qiagen) from six blood samples frozen in Qiagen’s PAXgene Blood RNA tubes following the manufacturer’s protocol. Complementary DNA (cDNA) was obtained using Omniscript reverse transcriptase (Qiagen) and Oligo(dT)12-18 primer (Invitrogen). Fragments of MHC genes were amplified from cDNA in 15 μL mixes containing 7.5 μL of HotStarTaq Master Mix (Qiagen), 2 μM of each forward and reverse primers, 5.4 μL of PCR-grade water and 1.5 μL of cDNA template. The polymerase chain reaction (PCR) cycling scheme was as follows: 95°C for 15 min, 28 cycles of 95°C for 30 s, 57°C for 30 s, 72°C for 1 min, and the final elongation step at 72°C for 10 minutes. cDNA amplicons were pooled separately for each gene, cloned and 13 – 28 clones per pool were sequenced. Full second exon sequences were used in combination with sequences obtained with the vectorette PCR technique to design primers used in actual genotyping (Table 1 – primer pairs no.: 5–8). All newly designed primer pairs amplified the previously detected alleles, confirming the successful design of genotyping primers.
Investigated MHC genes exhibited varying levels of polymorphism and, consequently, several techniques were used for their genotyping. DQA and DQB genes were slightly or moderately polymorphic so to characterise most variation present in these genes it was sufficient to genotype a relatively small sample by Single Strand Conformational Polymorphism (SSCP) or cloning and sequencing. Genotyping of highly polymorphic class I and class II DRB genes was performed for large samples of individuals by 454 sequencing. For all genes, expression was assessed through genotyping cDNA and gDNA from six individuals.
SSCP and sequencing
PCR conditions for both DQA and DQB were: 95°C for 15 min, 28 cycles of 95°C for 30 s, 55°C for 30 s, 72°C for 1 min, and the final elongation step at 72°C for 10 minutes. The SSCP analysis was performed using GMA gels (Elchrom Scientific). We added 4.5 μL of PCR product to 9.5 μL of premix containing formamide and 10 mM sodium hydroxide, denatured it for 5 min at 95°C and immediately cooled it on ice. Electrophoresis was conducted in 1 x TAE buffer at 8°C, 4 V/cm, for 18 hours. Gels were stained with SYBR Gold Nucleic Acid Gel Stain (Invitrogen). Allele sequences were obtained by sequencing the bands excised from gels. Additional screening for variation in these genes was performed by cloning amplicons pooled from 20 individuals and sequencing multiple clones.
454 sequencing was used for genotyping of highly polymorphic MHC class I and MHC class II DRB genes because initial tests with the SSCP technique resulted in complex, uninterpretable patterns. PCR amplification was conducted using fusion primers, which contained the 454 Titanium adapter sequence (A in forward, B in reverse primer) at the 5’ end, followed by a 6-bp tag (barcode), which distinguished amplicons obtained in different PCR reactions, and the gene specific primer. Tag sequences differed from each other in at least three positions, which minimized the chance of misassigning sequencing reads due to errors in tag sequences. Sequences of fusion primers are given in Table 1 (primer pairs no. 9 and 10). Amplification was carried out in 15 μL, as described above. Ten individuals were amplified and sequenced twice to estimate the genotyping error. PCR products were pooled in approximately equimolar quantities, pools were purified with the MinElute PCR Purification Kit (Qiagen) and sequenced at the Functional Genomics Center Uni/ETH in Zurich. Sequencing was performed bidirectionally using the GS FLX Titanium MV emPCR kit for emulsion PCR and the GS FLX Titanium Sequencing Kit XLR70 in combination with GS FLX Titanium PicoTiterPlate Kit 70 x 75 for sequencing (Roche Applied Sciences). Extraction of reads from multifasta files, assignment of reads to individuals and generation of alignments of variants present in each amplicon were performed with jMHC software . The output from jMHC was analysed using BLAST, Excel and Bioedit .
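The requirement that tags differ from each other in at least three positions can be checked with a pairwise Hamming distance. The following is a minimal sketch only; the four 6-bp tags are hypothetical examples and are not the tags listed in Table 1.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance found among all pairs of tags."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


# Hypothetical 6-bp tags (illustration only, not taken from the study).
example_tags = ["ACGTAC", "TGCAAC", "ACACGT", "GTGTCA"]

# A tag set satisfies the design rule if every pair differs in >= 3 positions.
tags_ok = min_pairwise_distance(example_tags) >= 3
```

With a minimum distance of three, a single sequencing error in a tag cannot turn one valid tag into another, which is the rationale given above for this design.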
To minimize the occurrence of false alleles that may be the artefacts of PCR or cloning, we followed the guidance of Lenz & Becker . The artefacts that occur in 454 output may be divided into three types: i) substitutions caused by polymerase errors during PCR, ii) PCR-chimeras and iii) insertions, deletions and substitutions due to 454 sequencing errors [45–47]. The first two types are not specific to 454 sequencing and their frequency may be reduced at the PCR level . Whereas the point substitutions and insertions – deletions (indels) should be relatively rare, the chimeras may be easily produced during PCR by recombination between true alleles [44, 46]. Furthermore, some chimeras may have sequences identical to true alleles, because the latter may originate through historical recombination from other alleles. Distinguishing between the PCR chimeras and the true alleles is based on the rationale that chimeras should always occur with both parental alleles in the amplicon and that the artefacts should be less frequent, as measured by the number of reads per amplicon, than the true alleles.
True alleles were distinguished from the artefacts following the procedure described in Zagalska-Neubauer et al. and Radwan et al. . Briefly, for each sequence variant, we calculated the maximum per amplicon frequency (MPAF) in the whole dataset. Sequences were sorted according to their MPAF. Starting from arbitrary MPAF of 1.5% for MHC class I sequences and 3% for MHC class II DRB sequences, 65 and 28 sequence variants, respectively, were chosen to evaluate whether they represent true alleles or sequence artefacts (see Radwan et al.  for details). For MHC class I sequences within the 1.5-2% MPAF interval, 95% (19 of 20) variants were classified as artefacts and within 2-3%, 75% (8 of 12) were classified as artefacts. All of 33 variants with MPAF above 3% were classified as true alleles. MPAF for the least abundant true allele and most abundant artefact (1.53%-2.86%) defined the “grey zone”, which required a decision about whether a sequence was a true allele or an artefact on a case-by-case basis . For MHC class II DRB sequences, all 12 variants in the 3-10% MPAF interval were classified as artefacts. All of 16 variants with MPAF above 10% were classified as true alleles.
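The MPAF calculation can be sketched as follows. This is an illustrative simplification, not the procedure of the cited papers: the read counts are hypothetical, the function names are ours, and the case-by-case evaluation of grey-zone variants (for example, checking whether a variant is a chimera of two alleles in the same amplicon) is not modelled.

```python
def max_per_amplicon_frequency(read_counts):
    """Map each variant to its maximum per amplicon frequency (MPAF).

    read_counts: {amplicon_id: {variant_id: number_of_reads}}
    """
    mpaf = {}
    for variants in read_counts.values():
        total = sum(variants.values())
        if total == 0:
            continue  # skip amplicons without reads
        for variant, n_reads in variants.items():
            freq = n_reads / total
            mpaf[variant] = max(mpaf.get(variant, 0.0), freq)
    return mpaf


def classify(mpaf, lower, upper):
    """Label variants by MPAF; thresholds are supplied by the user.

    Above `upper`: treated as true alleles. Below `lower`: treated as
    artefacts. In between: "grey zone", to be decided case by case.
    """
    labels = {}
    for variant, freq in mpaf.items():
        if freq > upper:
            labels[variant] = "true allele"
        elif freq < lower:
            labels[variant] = "artefact"
        else:
            labels[variant] = "grey zone"
    return labels


# Hypothetical read counts for two amplicons (illustration only).
example_counts = {
    "bear1": {"v1": 60, "v2": 38, "v3": 2},
    "bear2": {"v1": 90, "v3": 1, "v4": 9},
}
example_mpaf = max_per_amplicon_frequency(example_counts)
# Thresholds loosely mirroring the class I values quoted above (1.5% and 3%).
example_labels = classify(example_mpaf, lower=0.015, upper=0.03)
```

Using the maximum rather than the mean frequency keeps a rare true allele from being discarded when it is carried by only a few individuals.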
Alleles were named according to the nomenclature proposed by Klein et al.  and were numbered in ascending order, starting from the most abundant one. We use the term “allele” for unique sequence variants for simplicity, but assigning sequence variants to loci was generally not possible in our study.
Genetic differentiation between populations was measured by calculating pairwise FST in ARLEQUIN 3.5 . Because assigning of alleles to loci was not possible, each allele was treated as a dominant locus, and binary encoded.
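The binary encoding can be illustrated with a short sketch: each allele becomes a presence/absence column, so an individual is described by a vector of 0s and 1s. The genotypes below are hypothetical, and the FST computation itself (performed in ARLEQUIN) is not reproduced here.

```python
def binary_encode(genotypes):
    """Encode allele sets as presence/absence vectors.

    genotypes: {individual_id: set of allele names}
    Returns (sorted allele list, {individual_id: list of 0/1}).
    """
    alleles = sorted({a for alleles_i in genotypes.values() for a in alleles_i})
    matrix = {
        ind: [1 if a in alleles_i else 0 for a in alleles]
        for ind, alleles_i in genotypes.items()
    }
    return alleles, matrix


# Hypothetical genotypes (illustration only).
example_genotypes = {
    "bear1": {"Urar-U*01", "Urar-U*03"},
    "bear2": {"Urar-U*01"},
}
example_alleles, example_matrix = binary_encode(example_genotypes)
```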
For all alleles, the average pairwise nucleotide distances (Kimura 2-parameter model - K2P), Poisson corrected amino acid distances, as well as the average rates of synonymous (dS) and nonsynonymous (dN) substitutions, using the Nei-Gojobori method  with the Jukes-Cantor correction for multiple substitutions, were computed in MEGA5 . Standard errors were obtained through 1000 bootstrap replicates.
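For orientation, the Kimura 2-parameter distance between two aligned sequences is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. The sketch below implements this textbook formula directly; it is not the MEGA5 implementation, and the treatment of gaps and ambiguous bases (simply skipped) is our assumption.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only columns where both sequences have an unambiguous base.
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transition: purine<->purine or pyrimidine<->pyrimidine change.
    transitions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) == (b in PURINES)
    )
    # Transversion: purine<->pyrimidine change.
    transversions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) != (b in PURINES)
    )
    p, q = transitions / n, transversions / n
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```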
We used two approaches to test whether positive selection shaped the evolution of the second exon of the investigated genes: the one-tailed Z test comparing dN to dS, and a comparison of the likelihoods of codon-based models of sequence evolution. The Z-test, as implemented in MEGA5, compared the rates of synonymous vs. nonsynonymous substitutions at all codons, ABS and non-ABS. The location of putative ABS was inferred from the structure of human HLA genes . For MHC class I, putative ABS location was conservatively inferred from the consensus of ABS common to human HLA-A, B and C genes. The comparison of models of sequence evolution for three genes (excluding MHC class II DQA) was performed in PAML 4.2 . Three models were tested: i) M0: a single ω (dN/dS) for all codons, ii) M7: nearly neutral (0 < ω ≤ 1), with ω variation approximated by a β-distribution, iii) M8: positive selection (a proportion of codons with ω > 1), with ω variation approximated by a β-distribution. The best fitting model was chosen on the basis of the lowest value of the Akaike Information Criterion (AIC, ). Positively selected codons under the M8 model were identified through the Bayes empirical Bayes procedure .
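Model choice by AIC can be sketched as below, using AIC = 2k - 2 lnL, where k is the number of free parameters and lnL the maximised log-likelihood. The log-likelihood values are hypothetical placeholders, not results of this study, and k counts only the parameters describing the ω distribution (1 for M0, 2 for M7, 4 for M8); parameters shared by all models add the same constant to each AIC and do not change the ranking.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Return the model name with the lowest AIC.

    fits: {model_name: (log_likelihood, n_params)}
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical log-likelihoods (illustration only, not from Table 5).
example_fits = {
    "M0": (-1500.0, 1),
    "M7": (-1450.0, 2),
    "M8": (-1430.0, 4),
}
example_best = best_model(example_fits)
```

In this made-up example M8 is preferred because its gain in log-likelihood outweighs the penalty for its two extra parameters relative to M7.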
Phylogenetic trees visualizing similarities among MHC alleles were constructed under the Bayesian approach with mrBayes 3.1.2 software . The general time-reversible (GTR) model of sequence evolution with the rate-variation (Γ) among sites was used; parameter values were estimated from the data. Priors were set to default values. Two independent Metropolis coupled Markov chain Monte Carlo simulations (four chains each, three of them heated, temp. 0.20) were run for five million generations and sampled every 1000 generations for DQA, DQB and DRB; for the class I longer runs of 20 million generations were necessary to reach convergence, trees were sampled every 2000 generations. The first 1000 (DQA, DQB, DRB) or 2000 (class I) trees were discarded as burn-in. To calculate the posterior probability of each bipartition, the majority-rule consensus tree was computed from the 8000 (DQA, DQB, DRB) or 16000 (class I) sampled trees.
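The numbers of trees entering the consensus follow from the run settings, as the short check below shows. The function name is ours, and the count ignores the extra tree sampled at generation 0 in each run.

```python
def retained_trees(generations: int, sample_every: int, burn_in: int, runs: int = 2) -> int:
    """Trees kept across all runs after discarding the burn-in from each run."""
    sampled_per_run = generations // sample_every
    return runs * (sampled_per_run - burn_in)


# DQA, DQB, DRB: 5 million generations, sampled every 1000, burn-in 1000 trees.
class_ii_trees = retained_trees(5_000_000, 1000, 1000)
# Class I: 20 million generations, sampled every 2000, burn-in 2000 trees.
class_i_trees = retained_trees(20_000_000, 2000, 2000)
```

Both runs contribute equally, giving 2 x 4000 = 8000 trees for the class II genes and 2 x 8000 = 16000 trees for class I.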
MHC class I
We genotyped a 228 bp fragment of the MHC class I 2nd exon in 224 bears, 100 from the northern and 124 from the southern population, and identified 37 alleles (GenBank accession numbers JX469853-89). The nucleotide sequences translated into 33 unique amino-acid sequences (Figure 1a). Two alleles had premature stop codons (PSC). Because assignment of alleles to loci was impossible, instead of true allele frequencies, frequencies of individuals possessing particular alleles are given in Table 2. Frequencies of MHC class I alleles differed significantly between the two Scandinavian brown bear populations (FST = 0.216, P < 10⁻⁵).
A comparison of genotypes obtained from cDNA and gDNA of 6 individuals confirmed the expression of 10 alleles (Urar-U*01, *03, *04, *07, *10, *11, *13, *17, *26, *27), whereas 6 alleles were found only in gDNA (Urar-U*02, *05, *06, *08, *09, *12). The non-expressed alleles form two distinct clusters, consistent with the presence of two non-expressed MHC class I loci (Figure 2). We found 5–11 alleles per individual in gDNA. The allele Urar-U*01 was present in all individuals. The 6 bears assayed for MHC class I expression had 6–9 alleles in genomic DNA, but only 3–5 were detected in cDNA (Additional file 3: Table S2). This indicates the expression of at least 3 MHC class I loci in the brown bear, and the presence of a minimum of two non-expressed loci; at least one of these numbers is certainly an underestimate, because a minimum of six loci were present in gDNA when the entire population sample is considered. There was complete concordance between genotypes of both replicates in all cases.
Alleles Urar-U*35 and *36 group with the Aime-1906 locus of the giant panda (Figure 2). As in the panda, alleles of this locus contain an amino acid substitution at position 59, where Y is replaced by F, indicating the non-classical nature of these alleles [57, 58]. A similar substitution was found in the Urar-U*02, *09 and *18 alleles, which form a separate, non-expressed cluster. Alleles Urar-U*01 (present in all individuals) and *24 group with the Aime-152 locus. This locus, monomorphic in the giant panda, does not bear the landmarks of a non-classical gene .
Ninety of 228 (39.47%) nucleotide positions and 42 of 76 (55.26%) amino-acid positions were variable. Pairwise differences between alleles varied from 0.44% to 31.74% and amino-acid translations showed between 0 and 61.72% pairwise differences. Average nucleotide and amino-acid distances are listed in Table 3. Across all sites, dN and dS were similar and consequently dN/dS did not differ significantly from 1 (Table 4). For ABS, however, dN exceeded dS by a factor of two, although the excess of non-synonymous substitutions was marginally non-significant. The model of codon evolution allowing for positive selection (M8) fitted the data better than models without positive selection (Table 5). The Bayes Empirical Bayes procedure identified eight codons under positive selection (positively selected sites, PSS; Figure 1), five of which were located at ABS, which is more than random expectation (Fisher’s exact p = 0.001).
Because in pseudogenes the signal of positive selection may erode over time, we also carried out tests for positive selection after excluding sequences of pseudogenes and non-classical MHC class I genes, but this did not change the results qualitatively (Additional file 4: Table S3).
MHC class II DRB
We assayed a 192 bp fragment of the DRB second exon. Sixteen DRB alleles were identified among 234 individuals (100 from the north and 134 from the south) (Additional file 5: DRB sequences; these sequences did not reach the minimum of 200 bp currently required for GenBank submission). Four of these were identical to alleles reported by Goda et al.  (Urar-DRB*11, *13, *16, *17). The sequences did not contain indels or PSC. The 16 nucleotide sequences translated into 15 unique amino-acid sequences (Figure 1b). Frequencies of individuals possessing particular alleles are given in Table 2. FST between the north and south was 0.304 (P < 10⁻⁵).
Seven expressed alleles were found in six bears assayed for expression (Figure 3). Two to four alleles per individual were found in gDNA, all expressed, which implies the presence of at least two expressed loci. No discrepancies between replicates were found in 10 replicated individuals (maximum genotyping error 2.6%).
DRB alleles grouped with DRB of other ursids, and separately from DQB alleles (Figure 3). The phylogenetic tree did not reveal clusters corresponding to two loci inferred above: there were 3 well supported clusters, and relationships among the remaining alleles were poorly resolved (Figure 3).
Thirty-seven of 192 (19.27%) nucleotide and 21 of 64 (32.81%) amino-acid positions were variable. Pairwise nucleotide differences ranged from 0.52% to 16.87%, and amino-acid sequence differences ranged between 0 and 33.03%. Nucleotide and amino-acid distances are reported in Table 3. dN significantly exceeded dS for all codons, and especially for ABS codons (by a factor of about 5); at non-ABS sites dN was also higher than dS (by a factor of nearly 3), but the difference was not significant (Table 4). PAML analysis showed the best fit of the positive selection model M8 (Table 5). The Bayes Empirical Bayes procedure identified 18 PSS (Figure 1), of which 12 were in ABS. The excess of PSS at ABS was significant (Fisher’s exact p < 0.0001).
MHC class II DQB
Four alleles were found in the 224 bp fragment of the DQB 2nd exon in 26 genotyped individuals (GenBank accession numbers JX469892-5), each translating into a unique amino-acid sequence (Figure 1c). The sequences did not contain indels or PSC. One to 3 alleles per individual were present and all of them appear expressed.
Urar-DQB 2nd exon sequences formed a cluster separate from Urar-DRB sequences, but grouped with giant panda DQB (Figure 3). However, brown bear sequences were more similar to each other than to any of the giant panda DQB sequences.
Twenty-five of 224 (11.16%) nucleotide and 14 of 74 (18.92%) amino-acid positions were variable. Pairwise differences ranged between 2.27% and 11.04% for nucleotide sequences, and pairwise amino-acid differences ranged between 2.74% and 20.97%. Nucleotide and amino-acid distances are presented in Table 3. Across all sites, dN was nearly equal to dS, but for ABS dN was significantly higher (by a factor of about 3; Table 4). However, PAML analysis did not provide evidence for positive selection, as the M7 model fitted the data best.
MHC class II DQA
Two alleles were found in the 202 bp fragment of DQA 2nd exon in 26 genotyped individuals (Genbank accession numbers JX469890-1). The sequences did not contain indels or PSC. Each allele translated into a unique amino-acid sequence (Figure 1d). All six individuals assayed for expression had only allele Urar-DQA*05 in gDNA and cDNA. Allele Urar-DQA*06 is more similar to one of the giant panda’s alleles than to Urar-DQA*05 (Figure 4). The two nucleotide sequences differed by 2.97%, and the difference at the amino-acid level was 5.97%. dN was not significantly different from dS (Table 4).
We characterised, for the first time, sequences coding for the peptide binding groove (2nd exon) of the MHC class I in the brown bear and report a number of new class II alleles in the Scandinavian brown bear populations. We found abundant variation in sequences coding for peptide binding groove in both MHC classes, but the three analysed class II genes differed in the level of polymorphism. We have found 37 MHC class I, 16 MHC class II DRB (12 of them new), 4 DQB and 2 DQA alleles.
MHC class II genes of the brown bear have been previously studied by Goda et al. [35, 36]. The authors characterised partial DQA 2nd exon and 2nd intron sequences, reporting 4 alleles containing only nonsynonymous substitutions, predominantly occurring at putative ABS, as inferred from crystallographic models of HLA. These 4 alleles probably represented 2 loci, one of which was not expressed. Our data also suggest the presence of one expressed DQA locus. The rarer of the two alleles we found probably belongs to the same locus, as it was not present in the six individuals from which we characterised both gDNA and cDNA sequences. Here, we also characterised, for the first time in the brown bear, DQB sequences, encoding the β chain, which forms a binding groove in a dimer with the α chain coded by DQA. The level of polymorphism at DQB was also low, with only four alleles, belonging to two loci, in a sample of 26 individuals. A higher level of polymorphism has been reported for the class II DRB 2nd exon, with 19 alleles found in 38 individuals from Japan, Alaska and Siberia . Scandinavian populations are also characterised by substantial DRB polymorphism, with 16 alleles present, but only 4 alleles were identical to those reported earlier. Goda et al.  inferred the presence of at least two DRB loci, but their expression status was not established. Our data confirm this number and confirm that both loci are expressed.
MHC class I in the brown bear, characterised for the first time in this study, consists of at least three expressed and at least two non-expressed loci. Two alleles (Urar-U*01 and *24) clustered with the monomorphic Aime-152 gene, which suggests that this cluster represents a separate locus in the brown bear. This hypothesis is further supported by the presence of at least one sequence belonging to this cluster in all individuals investigated.
The two Scandinavian brown bear populations are highly differentiated in MHC class I genes as well as in class II DRB genes. The high FST values for both genes are consistent with the findings of Taberlet et al. , based on mtDNA analysis, that Scandinavian brown bears originated from two refugia and colonized the area from two directions, from the south and the east. Analysis of 19 microsatellites detected three bear subpopulations in Scandinavia: North, Middle and South , but confirmed the earlier mtDNA results, in that there was high genetic differentiation between the south and the other two populations. Our samples were from the North and South subpopulations, as defined by Manel et al.
MHC class II DRB and DQB genes clustered with respective panda sequences, as expected based on the relative conservation of class II genes among mammals . Non-classical class I brown bear sequences also grouped with the sequence of the non-classical giant panda 1906 locus and with the dog DLA-79 locus. Another two brown bear sequences formed a distinct cluster with Aime-152 locus, which was monomorphic in panda. Thus, it seems that orthology has been maintained in MHC class I genes of ursids for over 12 million years, since the divergence of Ursus and Ailuropoda. Two distinct MHC class I clusters contained non-expressed sequences. The pseudogenisation of a polymorphic cluster probably exemplifies a birth-and-death process, which in the long run may cause the lack of orthology of MHC class I genes among taxa [2, 3].
Some examples of trans-species polymorphism were observed between brown bear and giant panda MHC class II sequences in DQA (Urar-DQA*06 and Aime-DQA1*01) and possibly also in DRB genes (Urar-DRB*04 and Aime-DRB*02). In class I, alleles Aime-128*03 and Urar-U*29 appeared to group together, but the grouping was only weakly supported. The sparse data do not allow us to establish whether the brown bear MHC class I and class II differ in the extent of trans-species polymorphism, with respect to panda sequences. Trans-species polymorphism was observed for DRB sequences within the genus Ursus, as also noted by Goda et al. . The lack of sequences for MHC class I did not allow comparison of the extent of trans-species polymorphism at this level between MHC class I and MHC class II.
MHC class II DRB genes showed the strongest signal of historical positive selection. Goda et al.  also inferred positive selection at ABS sites, but not at non-ABS sites. However, the dN/dS ratio they reported (1.96), based on a subset of sequences which we analysed, was substantially lower than our estimate (5.08). In the giant panda, there was evidence for positive selection at the DRB3 locus, with ω estimates of 9.2-10.9, but not at the DRB1 locus . The results for DQB genes showed evidence for positive selection acting only in putative ABS, where dN/dS significantly exceeded 1. No signal of positive selection was detected in DQA.
MHC class I genes also evolved under positive selection. The model with positive selection fitted the MHC class I sequences best, and dN at putative ABS sites was twice as high as dS, although the excess of non-synonymous substitutions was marginally non-significant using the Z test. The dN/dS ratio for MHC class I was much smaller than that found for the DRB locus (Table 4). This might be due to the inclusion of pseudogenes in the analysis, which might have lost the signal of positive selection, but excluding pseudogene sequences did not substantially change the estimate. A comparatively lower dN/dS ratio in MHC class I could also potentially have resulted from generally higher divergence within MHC class I loci and consequently saturation at nonsynonymous sites . Indeed, dN for MHC class I at ABS was even higher than for DRB genes, but the latter accumulated fewer synonymous substitutions. However, after excluding pseudogenes, dN at ABS for MHC class I was actually lower than for DRB. Thus, selection on DRB loci seems to be more pronounced than that on MHC class I loci, as additionally indicated by the number of positively selected sites (PSS) in DRB exceeding that at MHC class I by a factor of two. As a result, even though the proportion of PSS matching human ABS was actually similar at DRB and MHC class I, as many as 6 PSS were detected outside ABS in DRB, which was also reflected by high dN/dS ratios at non-ABS DRB sites.
Very strong positive selection on ABS in DRB was also reported for canines, with ω for positively selected sites under the M8 model equal to 12.02, a value very similar to the one we found for the brown bear. For comparison, the value is 3.99 for humans and 5.03 for bovines. Thus, very strong positive selection on DRB may be a general feature of caniform MHC. The five PSS (positions 8, 9, 10, 16 and 56) which Furlong et al. reported as canine-specific (i.e. not overlapping with PSS in primates and bovines) did not overlap with brown bear PSS, except for position 9, which is an ABS. Hence, the strong signal of positive selection detected in the brown bear does not seem to result from phylogenetic history, but rather from species-specific selection pressure. Such high selection pressure may be capable of maintaining MHC polymorphism even in heavily bottlenecked populations; indeed, an endangered canid, the island fox Urocyon littoralis, is a rare example of such a situation: variation at the DRB locus is maintained despite depletion of neutral variation. Simulations have shown, however, that selection pressure from parasites is unlikely to maintain MHC variation in bottlenecked populations, so it is tempting to speculate that it may result from mate choice for dissimilar MHC types. We are currently investigating this possibility in the brown bear.
In summary, our work revealed high polymorphism of both MHC class I and class II DRB genes, but limited polymorphism at DQ genes, in two Scandinavian populations of the brown bear. Both MHC class I and DRB genes have undergone significant positive selection during the evolutionary history of the brown bear. There were no obvious differences between the classes in the degree of putative orthology to giant panda MHC genes, although pseudogenisation of two of the MHC class I clusters indicated that gene turnover may be higher in this class. Our data provide a solid background for studying contemporary selection on MHC resulting from parasites and mate choice in the brown bear.
Klein J: The natural history of the major histocompatibility complex. 1986, New York: Wiley and Sons
Nei M, Gu X, Sitnikova T: Evolution by the birth-and-death process in multigene families of the vertebrate immune system. Proc Natl Acad Sci USA. 1997, 94 (15): 7799-7806. 10.1073/pnas.94.15.7799.
Nei M, Hughes AL: Proceedings of the 11th histocompatibility workshop and conference. Balanced polymorphism and evolution by the birth-and-death process in the MHC loci. 1992, Oxford, UK: Oxford University Press, 27-38.
Janeway C: Immunobiology: The immune system in health and disease. 2004, London: Current Biology Publications
Brown J, Jardetzky T, Saper M, Samraoui B, Bjorkman P, Wiley D: A hypothetical model of the foreign antigen-binding site of class-II histocompatibility molecules. Nature. 1988, 332 (6167): 845-850. 10.1038/332845a0.
Reche PA, Reinherz EL: Sequence variability analysis of human class I and class II MHC molecules: functional and structural correlates of amino acid polymorphisms. J Mol Biol. 2003, 331 (3): 623-641. 10.1016/S0022-2836(03)00750-2.
Brown JH, Jardetzky TS, Gorga JC, Stern LJ, Urban RG, Strominger JL, Wiley DC: 3-Dimensional structure of the human class-II histocompatibility antigen HLA-DR1. Nature. 1993, 364 (6432): 33-39. 10.1038/364033a0.
Snell GD: The H-2 locus of the mouse: observations and speculations concerning its comparative genetics and its polymorphism. Folia Biologica (Praha). 1968, 14 (5): 335-358.
Borghans JAM, Beltman JB, De Boer RJ: MHC polymorphism under host-pathogen coevolution. Immunogenetics. 2004, 55 (11): 732-739. 10.1007/s00251-003-0630-5.
Doherty PC, Zinkernagel RM: Enhanced immunological surveillance in mice heterozygous at the H-2 gene complex. Nature. 1975, 256 (5512): 50-52. 10.1038/256050a0.
Trachtenberg E, Korber B, Sollars C, Kepler TB, Hraber PT, Hayes E, Funkhouser R, Fugate M, Theiler J, Hsu YS, et al: Advantage of rare HLA supertype in HIV disease progression. Nat Med. 2003, 9 (7): 928-935. 10.1038/nm893.
Penn D, Damjanovich K, Potts W: MHC heterozygosity confers a selective advantage against multiple-strain infections. Proc Natl Acad Sci USA. 2002, 99 (17): 11260-11264. 10.1073/pnas.162006499.
Hill AVS, Allsopp CEM, Kwiatkowski D, Anstey NM, Twumasi P, Rowe PA, Bennett S, Brewster D, McMichael AJ, Greenwood BM: Common West African HLA antigens are associated with protection from severe malaria. Nature. 1991, 352 (6336): 595-600. 10.1038/352595a0.
Kloch A, Babik W, Bajer A, Sinski E, Radwan J: Effects of an MHC-DRB genotype and allele number on the load of gut parasites in the bank vole Myodes glareolus. Mol Ecol. 2010, 19: 255-265.
Thursz MR, Thomas HC, Greenwood BM, Hill AVS: Heterozygote advantage for HLA class-II type in hepatitis B virus infection. Nat Genet. 1997, 17 (1): 11-12. 10.1038/ng0997-11.
Carrington M: Recombination within the human MHC. Immunol Rev. 1999, 167: 245-256. 10.1111/j.1600-065X.1999.tb01397.x.
Kaufman J, Wallny HJ: Chicken MHC molecules, disease resistance and the evolutionary origin of birds. Immunology and Developmental Biology of the Chicken. 1996, 212: 129-141. 10.1007/978-3-642-80057-3_12.
Langefors A, Lohm J, Grahn M, Andersen O, von Schantz T: Association between major histocompatibility complex class IIB alleles and resistance to Aeromonas salmonicida in Atlantic salmon. Proc R Soc Lond B Biol Sci. 2001, 268 (1466): 479-485. 10.1098/rspb.2000.1378.
Bonneaud C, Perez-Tris J, Federici P, Chastel O, Sorci G: Major histocompatibility alleles associated with local resistance to malaria in a passerine. Evolution. 2006, 60 (2): 383-389.
Eizaguirre C, Yeates SE, Lenz TL, Kalbe M, Milinski M: MHC-based mate choice combines good genes and maintenance of MHC polymorphism. Mol Ecol. 2009, 18 (15): 3316-3329. 10.1111/j.1365-294X.2009.04243.x.
Froeschke G, Sommer S: MHC class II DRB variability and parasite load in the striped mouse (Rhabdomys pumilio) in the Southern Kalahari. Mol Biol Evol. 2005, 22 (5): 1254-1259. 10.1093/molbev/msi112.
Loiseau C, Zoorob R, Robert A, Chastel O, Julliard R, Sorci G: Plasmodium relictum infection and MHC diversity in the house sparrow (Passer domesticus). Proceedings of the Royal Society B-Biological Sciences. 2011, 278 (1709): 1264-1272. 10.1098/rspb.2010.1968.
Yamazaki K, Boyse EA, Mike V, Thaler HT, Mathieson BJ, Abbott J, Boyse J, Zayas ZA, Thomas L: Control of mating preferences in mice by genes in Major Histocompatibility Complex. J Exp Med. 1976, 144 (5): 1324-1335. 10.1084/jem.144.5.1324.
Radwan J, Tkacz A, Kloch A: MHC and preferences for male odour in the bank vole. Ethology. 2008, 114 (9): 827-833. 10.1111/j.1439-0310.2008.01528.x.
Wedekind C, Seebeck T, Bettens F, Paepke AJ: MHC-dependent mate preferences in humans. Proc R Soc Lond B Biol Sci. 1995, 260 (1359): 245-249. 10.1098/rspb.1995.0087.
Olsson M, Madsen T, Nordby J, Wapstra E, Ujvari B, Wittsell H: Major histocompatibility complex and mate choice in sand lizards. Proc R Soc Lond B Biol Sci. 2003, 270: S254-S256. 10.1098/rsbl.2003.0079.
Hedrick PW: Female choice and variation in the major histocompatibility complex. Genetics. 1992, 132 (2): 575-581.
Klein J: Origin of major histocompatibility complex polymorphism - the transspecies hypothesis. Hum Immunol. 1987, 19 (3): 155-162. 10.1016/0198-8859(87)90066-8.
Hughes AL, Nei M: Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature. 1988, 335 (6186): 167-170. 10.1038/335167a0.
Bernatchez L, Landry C: MHC studies in nonmodel vertebrates: what have we learned about natural selection in 15 years?. J Evol Biol. 2003, 16 (3): 363-377. 10.1046/j.1420-9101.2003.00531.x.
Hughes AL, Nei M: Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection. Proc Natl Acad Sci USA. 1989, 86 (3): 958-962. 10.1073/pnas.86.3.958.
Garrigan D, Hedrick PW: Perspective: Detecting adaptive molecular polymorphism, lessons from the MHC. Evolution. 2003, 57: 1707-1722.
Klein J, Figueroa F: Evolution of the major histocompatibility complex. CRC Crit Rev Immunol. 1986, 6 (4): 295-386.
Takahashi K, Rooney A, Nei M: Origins and divergence times of mammalian class II MHC gene clusters. J Hered. 2000, 91 (3): 198-204. 10.1093/jhered/91.3.198.
Goda N, Mano T, Masuda R: Genetic diversity of the MHC class-II DQA gene in brown bears (Ursus arctos) on Hokkaido, northern Japan. Zoolog Sci. 2009, 26 (8): 530-535. 10.2108/zsj.26.530.
Goda N, Mano T, Kosintsev P, Vorobiev A, Masuda R: Allelic diversity of the MHC class II DRB genes in brown bears (Ursus arctos) and a comparison of DRB sequences within the family Ursidae. Tissue Antigens. 2010, 76 (5): 404-410. 10.1111/j.1399-0039.2010.01528.x.
Bjärvall A, Sandegren F: Early experiences with the first radio-marked brown bears in Sweden. Int. Conf. Bear Res. Manage. 1987, 7: 9-12.
Waits L, Taberlet P, Swenson J, Sandegren F, Franzen R: Nuclear DNA microsatellite analysis of genetic diversity and gene flow in the Scandinavian brown bear (Ursus arctos). Mol Ecol. 2000, 9 (4): 421-431. 10.1046/j.1365-294x.2000.00892.x.
Bellemain E, Swenson J, Tallmon O, Brunberg S, Taberlet P: Estimating population size of elusive animals with DNA from hunter-collected feces: Four methods for brown bears. Conserv Biol. 2005, 19 (1): 150-161. 10.1111/j.1523-1739.2005.00549.x.
Zagalska-Neubauer M, Babik W, Stuglik M, Gustafsson L, Cichoń M, Radwan J: 454 sequencing reveals extreme complexity of the class II Major Histocompatibility Complex in the collared flycatcher. BMC Evol Biol. 2010, 10: 395. 10.1186/1471-2148-10-395.
Babik W, Pabijan M, Radwan J: Contrasting patterns of variation in MHC loci in the alpine newt. Mol Ecol. 2008, 17 (10): 2339-2355. 10.1111/j.1365-294X.2008.03757.x.
Stuglik M, Radwan J, Babik W: jMHC: software assistant for multilocus genotyping of gene families using next-generation amplicon sequencing. Mol Ecol Resour. 2011, 11 (4): 739-742. 10.1111/j.1755-0998.2011.02997.x.
Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposia Series. 1999, 41: 95-98.
Lenz TL, Becker S: Simple approach to reduce PCR artefact formation leads to reliable genotyping of MHC and other highly polymorphic loci - Implications for evolutionary analysis. Gene. 2008, 427 (1–2): 117-123.
Babik W, Taberlet P, Ejsmond MJ, Radwan J: New generation sequencers as a tool for genotyping of highly polymorphic multilocus MHC system. Mol Ecol Resour. 2009, 9 (3): 713-719. 10.1111/j.1755-0998.2009.02622.x.
Galan M, Guivier E, Caraux G, Charbonnel N, Cosson JF: A 454 multiplex sequencing method for rapid and reliable genotyping of highly polymorphic genes in large-scale studies. BMC Genomics. 2010, 11: 296. 10.1186/1471-2164-11-296.
Babik W: Methods for MHC genotyping in non-model vertebrates. Mol Ecol Resour. 2010, 10 (2): 237-251. 10.1111/j.1755-0998.2009.02788.x.
Radwan J, Zagalska-Neubauer M, Cichoń M, Sendecka J, Kulma K, Gustafsson L, Babik W: MHC diversity, malaria and lifetime reproductive success in collared flycatchers. Mol Ecol. 2012, 21 (10): 2469-2479. 10.1111/j.1365-294X.2012.05547.x.
Klein J, Bontrop RE, Dawkins RL, Erlich HA, Gyllensten UB, Heise ER, Jones PP, Parham P, Wakeland EK, Watkins DI: Nomenclature for the major histocompatibility complexes of different species: a proposal. Immunogenetics. 1990, 31 (4): 217-219.
Excoffier L, Lischer H: Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010, 10 (3): 564-567. 10.1111/j.1755-0998.2010.02847.x.
Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986, 3 (5): 418-426.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011, 28 (10): 2731-2739. 10.1093/molbev/msr121.
Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24 (8): 1586-1591. 10.1093/molbev/msm088.
Posada D, Buckley TR: Model selection and model averaging in phylogenetics: advantages of akaike information criterion and bayesian approaches over likelihood ratio tests. Syst Biol. 2004, 53 (5): 793-808. 10.1080/10635150490522304.
Zhang J, Nielsen R, Yang Z: Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005, 22 (12): 2472-2479. 10.1093/molbev/msi237.
Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19 (12): 1572-1574. 10.1093/bioinformatics/btg180.
Shum B, Rajalingam R, Magor K, Azumi K, Carr W, Dixon B, Stet R, Adkison M, Hedrick R, Parham P: A divergent non-classical class I gene conserved in salmonids. Immunogenetics. 1999, 49 (6): 479-490. 10.1007/s002510050524.
Pan H, Wan Q, Fang S: Molecular characterization of major histocompatibility complex class I genes from the giant panda (Ailuropoda melanoleuca). Immunogenetics. 2008, 60 (3–4): 185-193.
Taberlet P, Swenson J, Sandegren F, Bjarvall A: Localization of a contact zone between 2 highly divergent mitochondrial-DNA lineages of the brown bear Ursus arctos in Scandinavia. Conserv Biol. 1995, 9 (5): 1255-1261.
Manel S, Bellemain E, Swenson J, Francois O: Assumed and inferred spatial structure of populations: the Scandinavian brown bears revisited. Mol Ecol. 2004, 13 (5): 1327-1331. 10.1111/j.1365-294X.2004.02074.x.
Chen Y, Zhang Y, Zhang H, Ge Y, Wan Q, Fang S: Natural selection coupled with intragenic recombination shapes diversity patterns in the major histocompatibility complex class II genes of the giant panda. J Exp Zool B Mol Dev Evol. 2010, 314B (3): 208-223.
Richman AD, Herrera LG, Nash D: MHC class II beta sequence diversity in the deer mouse (Peromyscus maniculatus): implications for models of balancing selection. Mol Ecol. 2001, 10 (12): 2765-2773.
Furlong RF, Yang Z: Diversifying and purifying selection in the peptide binding region of DRB in mammals. J Mol Evol. 2008, 66 (4): 384-394. 10.1007/s00239-008-9092-6.
Aguilar A, Roemer G, Debenham S, Binns M, Garcelon D, Wayne R: High MHC diversity maintained by balancing selection in an otherwise genetically monomorphic mammal. Proc Natl Acad Sci USA. 2004, 101 (10): 3490-3494. 10.1073/pnas.0306582101.
Ejsmond MJ, Radwan J: MHC diversity in bottlenecked populations: a simulation model. Conserv Genet. 2011, 12 (1): 129-137. 10.1007/s10592-009-9998-6.
This work was supported by grant N N303 457338 from the Polish National Science Centre and by Foundation for Polish Science professor subsidy 9/2008 to JR. The Scandinavian Brown Bear Research Project (SBBRP) is funded by the Swedish Environmental Protection Agency, Norwegian Directorate for Nature Management, Swedish Association for Hunting and Wildlife Management, the Austrian Science Foundation, and the Research Council of Norway. This is paper no. 140 from the SBBRP.
The authors declare that they have no competing interests.
KK participated in study design, carried out the laboratory analyses, analysed the data and helped to draft the manuscript; WB helped to analyse the data and to draft the manuscript; KB and EŚ carried out some laboratory analyses; JK, PT and JES participated in study design and coordination, provided the samples and commented on the manuscript; JR participated in study design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.