Different level of population differentiation among human genes
https://doi.org/10.1186/1471-2148-11-16
© Wu and Zhang; licensee BioMed Central Ltd. 2011
Received: 3 September 2010
Accepted: 14 January 2011
Published: 14 January 2011
Abstract
Background
During the colonization of the world, after dispersal out of Africa, modern humans encountered changeable environments, and substantial phenotypic variation, involving diverse behaviors, lifestyles and cultures, arose among the different modern human populations.
Results
Here, we study the level of differentiation of human genes among different human populations. Intriguingly, genes involved in osteoblast development were identified as being enriched with higher FST SNPs, a result consistent with the proposed role of the skeletal system in accounting for variation among human populations. Genes involved in the development of hair follicles, where hair is produced, were also found to have higher levels of population differentiation, consistent with hair morphology being a distinctive trait among human populations. Other genes that showed higher levels of population differentiation include those involved in pigmentation, spermatid, nervous system and organ development, and some metabolic pathways, but few involved with the immune system. Disease-related genes show an excess of SNPs with lower levels of population differentiation, probably due to purifying selection. Surprisingly, we find that Mendelian-disease genes appear to have a significant excess of SNPs with high levels of population differentiation, possibly because the incidence of and susceptibility to these diseases differ among populations. As expected, microRNA regulated genes show lower levels of population differentiation due to purifying selection.
Conclusion
Our analysis demonstrates different levels of population differentiation among human populations for different gene groups.
Background
After dispersing from Africa and colonizing the world, humans evolved substantial phenotypic variation, including variation in skin, hair, and eye color, body mass, height, diet, drug metabolism, and susceptibility and resistance to disease. Efforts to reveal the genetic bases of this variation should provide important insight into the history of human evolution, gene function, and the mechanisms of disease [1, 2]. Indeed, with the advent of large-scale comparative genomic and human polymorphism data, a flood of studies has identified many candidate genes and genomic regions accounting for the observed phenotypic characters [2]. However, the evolutionary forces (positive selection, balancing selection, purifying selection, or neutral evolution) driving the variation of these phenotypic traits remain largely unknown.
In general, population differentiation under neutral evolution is mostly influenced by demographic history; however, adaptation to a local environment, driven by positive selection, will increase the level of population differentiation [3]. In contrast, negative and balancing selection tend to reduce population differentiation [3]. Accordingly, evaluating the level of population differentiation across the human genome should help to identify the genetic basis of the phenotypic differences observed among human populations.
Results and Discussion
Here, we evaluated the level of population differentiation for human genes on autosomal chromosomes among three populations: African, European and East Asian, based on the HapMap data (Phase II) [4], using the statistic FST according to methods described previously [3, 5]. A previous study reported a higher level of population differentiation in genic regions than in non-genic regions of the genome [6]. However, in our analysis, several chromosomes, including 5, 6, 8, 11, 13, and 20, did not show this pattern; that is, their genic regions did not have an excess of SNPs with a high FST (≥0.6) (Figure S1 in Additional file 1).
Functional significance of genes with higher levels of population differentiation
λ values of GO categories in biological processes enriched for higher FST SNPs with P-values lower than 10^-10.
An intriguing observation is that osteoblast development is significantly enriched in high FST SNPs (λ = 12.28, P = 4.92E-88 after correction for multiple testing). Osteoblasts are mononucleate cells that are responsible for bone formation. Modern humans demonstrate substantial phenotypic variation, much of which involves the skeletal system, for example in height, body mass, bone mineral density, and craniofacial features. Indeed, evidence indicates that the human skeletal system has evolved rapidly since the advent of agriculture [9], and our recent study concluded that the high levels of population differentiation of skeletal genes among human populations were driven by positive selection [10].
Another interesting category is hair follicle development, which also showed a higher level of population differentiation (GO: 0001942, λ = 4.09, P = 2.07E-08 after correction for multiple testing). Hair is produced by hair follicles. As with the skeletal system, hair morphology, including diameter and section, water swelling, shape of fiber, mechanical properties, combability and hair moisture, differs distinctively among human populations [11]. Previous studies using the long range haplotype homozygosity test have identified some genes involved in hair follicle development that have undergone recent positive selection, such as EDAR and EDA2R [12, 13]. These studies, together with our evidence of higher population differentiation in the genes involved in hair follicle development, support a hypothesis of adaptive evolution accounting for the diversification of human hair.
Consistent with previous observations [12, 14], genes involved in pigmentation, including the following GO processes: pigmentation during development, pigmentation, and melanocyte differentiation, demonstrated significantly higher population differentiation. In a similar manner, reproduction-associated processes, e.g. sperm motility, spermatid development, and gamete generation, have higher levels of population differentiation (Figure 1). Among the categories with a significant enrichment of higher FST SNPs, many are involved in the nervous system, e.g. dorsoventral neural tube patterning (GO: 0021904, λ = 15.67), hindbrain development (GO: 0030902, λ = 11.08), positive regulation of neuron differentiation (GO: 0045666, λ = 8.50), and neuron development (GO: 0048666, λ = 5.27) (Figure 1). Other categories include metabolic processes, such as the triglyceride metabolic process (GO: 0006641, λ = 6.69), glucose homeostasis (GO: 0042593, λ = 4.64), and cholesterol homeostasis (GO: 0042632, λ = 4.35), possibly reflecting variation in metabolism among human populations.
Immunity-related genes, however, which are a common target of positive selection [2, 15, 16], appear in only a small number of the categories with a higher proportion of higher FST SNPs. This observation is probably attributable to the fact that many immune system genes evolve under balancing selection in human populations through heterozygote advantage, which would reduce the level of population differentiation [17, 18].
Table S1 in Additional file 2 and Table S2 in Additional file 3 summarize the GO categories in cellular component and molecular function with an enrichment of higher FST SNPs.
The distribution of SNPs with FST ≥ 0.6 in the biological processes in Figure 1 and in genome-wide genes. (A) FST(all) values among the three populations. (B) FST(CEU-EA) values between Europeans and East Asians. (C) FST(EA-YRI) values between East Asians and Africans. (D) FST(CEU-YRI) values between Europeans and Africans.
Population differentiation under neutral evolution is mostly influenced by demographic history (that is, genetic drift and gene flow), which can generate patterns similar to those produced by biological factors such as natural selection. However, demographic history tends to influence all loci in the genome equally, whereas natural selection acts only on a single gene or a group of functionally related genes. By comparison with the proportion of higher FST SNPs in the genome-wide genes, we identify some groups of functionally related genes enriched with high FST SNPs. This enrichment is most likely driven by positive natural selection, although the confounding effect of demographic history cannot be entirely excluded.
Population differentiation in disease-related genes
Proportions of SNPs with FST ≤ 0.05 in each global MAF (minor allele frequency) bin in complex disease genes (A) and OMIM genes (B), compared with those in other genes. The black nodes indicate a significantly higher proportion in disease genes, with P < 0.01.
Proportions of SNPs with FST ≥ 0.60 in each global MAF (minor allele frequency) bin in OMIM genes and non-OMIM genes. Statistically significant P-values are presented above each bin.
Lower levels of population differentiation in microRNA targeted genes
Proportions of SNPs with FST ≤ 0.05 in each global MAF (minor allele frequency) bin for microRNA targeted genes compared with other genes. The P-value is presented above each bin.
Conclusions
In this study, we find that genes involved in osteoblast development, hair follicle development, pigmentation, spermatid, nervous system and organ development, and some metabolic pathways have higher levels of population differentiation. Surprisingly, we find that Mendelian-disease genes appear to have a significant excess of SNPs with high levels of population differentiation, possibly because the incidence of and susceptibility to these diseases differ among populations. As expected, microRNA regulated genes show lower levels of population differentiation due to purifying selection. Our analysis demonstrates different levels of population differentiation among human populations for different gene groups.
Methods
Since genes on the sex chromosomes show higher population differentiation than those on the autosomes [3], we analyzed only data from the autosomes. Allele frequency data for SNPs on autosomes were retrieved from HapMap Phase II (release 24, NCBI36) [4] for three populations: African (YRI panel including 60 Yoruban individuals from Ibadan), European (CEU panel including 60 Utah residents with ancestry from northern and western Europe) and East Asian (EA panels including 45 Han Chinese (CHB) and 45 Japanese from Tokyo (JPT)). To evaluate the degree of population differentiation, FST values of the polymorphic SNPs with minor allele frequencies ≥0.01 in at least one population were calculated as previously described [3, 5]. Since negative FST values have no biological interpretation, they were set to 0.
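As an illustration of the calculation described above, the following Python sketch computes FST for a single biallelic SNP from per-population allele frequencies. It assumes a moment estimator of the kind commonly derived from Weir and Cockerham [5] for allele-frequency data; the exact formula used in the study is not given in the text, and the function names (fst_from_allele_freqs, passes_maf_filter) and the use of sampled chromosome counts as sample sizes are assumptions of this sketch. The MAF filter and the truncation of negative estimates to 0 follow the description above.

```python
from typing import Sequence


def passes_maf_filter(freqs: Sequence[float], threshold: float = 0.01) -> bool:
    """True if the minor allele frequency reaches the threshold in at least one population."""
    return any(min(p, 1.0 - p) >= threshold for p in freqs)


def fst_from_allele_freqs(freqs: Sequence[float], sizes: Sequence[int]) -> float:
    """Moment estimate of FST for one biallelic SNP (assumed estimator, see lead-in).

    freqs: frequency of one allele in each population.
    sizes: number of sampled chromosomes in each population (assumption).
    """
    s = len(freqs)                      # number of populations
    n_total = sum(sizes)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / n_total

    # Mean square between populations and within populations.
    msp = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / (s - 1)
    msg = sum(n * p * (1.0 - p) for n, p in zip(sizes, freqs)) / (n_total - s)

    # Effective average sample size, corrected for unequal sample sizes.
    n_c = (n_total - sum(n * n for n in sizes) / n_total) / (s - 1)

    denom = msp + (n_c - 1.0) * msg
    if denom == 0:
        return 0.0                      # monomorphic in every population
    # Negative estimates have no biological interpretation and are set to 0.
    return max(0.0, (msp - msg) / denom)
```

With the panels described above, sizes would be (120, 120, 180) chromosomes for YRI, CEU and EA if every individual is counted, which is itself an assumption about how the samples were used.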
Protein coding genes on the human autosomal chromosomes, and their corresponding gene ontology (GO) terms in three categories (biological process, cellular component, and molecular function), were downloaded from Ensembl (http://www.ensembl.org, version 54) by means of BioMart [25]. Each gene was extended 500 bp upstream of its 5'-terminus and downstream of its 3'-terminus to include all of its SNPs. χ2 tests with one degree of freedom were used to test the significance of the enrichment of SNPs with higher (≥0.6) FST values relative to the empirical data for genome-wide genes, based on 2 × 2 contingency tables constructed from the numbers of SNPs. For these analyses, Bonferroni correction was used to account for multiple testing. To better quantify the enrichment, we calculated the parameter λ, the ratio of the proportion of higher FST SNPs in the analyzed category to that in the genome-wide genes. λ values significantly higher than 1 indicate higher population differentiation of the genes in the category among human populations.
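The enrichment test described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name enrichment_test and its arguments are hypothetical, the Pearson χ2 statistic is computed without a continuity correction (the text does not say whether one was applied), and the two rows of the table are taken exactly as supplied, since the text does not state whether a category's own SNPs were removed from the genome-wide reference. For one degree of freedom, the χ2 tail probability equals erfc(sqrt(χ2 / 2)), which avoids any dependency outside the standard library.

```python
import math
from typing import Tuple


def enrichment_test(cat_high: int, cat_total: int,
                    ref_high: int, ref_total: int,
                    n_tests: int = 1) -> Tuple[float, float, float]:
    """Return (lambda, chi-square, Bonferroni-adjusted P) for one gene category.

    cat_high / cat_total: SNPs with FST >= 0.6 and all SNPs in the category.
    ref_high / ref_total: the same counts for the reference (genome-wide genes).
    n_tests: number of categories tested, used for the Bonferroni correction.
    """
    a, b = cat_high, cat_total - cat_high
    c, d = ref_high, ref_total - ref_high
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the 2 x 2 table must be non-empty")

    # Pearson chi-square for a 2 x 2 table, one degree of freedom.
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / margins
    p_raw = math.erfc(math.sqrt(chi2 / 2.0))

    # Bonferroni: multiply by the number of tests and cap at 1.
    p_adj = min(1.0, p_raw * n_tests)

    # lambda: proportion of high-FST SNPs in the category over that in the reference.
    ref_prop = c / (c + d)
    lam = (a / (a + b)) / ref_prop if ref_prop > 0 else math.inf
    return lam, chi2, p_adj
```

A call such as enrichment_test(30, 100, 100, 1000, n_tests=500) would return λ close to 3 together with the corrected P-value; the counts here are made up for illustration only.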
Complex disease genes were obtained from the Genetic Association Database (GAD) [26]. Human Mendelian disease genes (OMIM) were obtained from the study by Blekhman et al. (2008) [20]. Genes targeted by microRNA were obtained from TargetScan (http://www.targetscan.org, release 5.1) [27–29]. For these genes, χ2 tests with one degree of freedom were used to test the significance of an enrichment of SNPs with higher (≥0.6) FST values and lower (≤0.05) FST values, respectively, compared with other genes, based on 2 × 2 contingency tables constructed from the numbers of SNPs.
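The comparisons for disease and microRNA targeted genes are reported per global MAF bin. A hedged sketch of how the per-bin proportion of low FST SNPs might be tabulated is given below. The bin width of 0.05 and the function name low_fst_proportion_by_maf_bin are assumptions of this sketch, as the text does not state the bin boundaries; only the FST ≤ 0.05 cutoff comes from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def low_fst_proportion_by_maf_bin(
    snps: Iterable[Tuple[float, float]],
    fst_cutoff: float = 0.05,
    bin_width: float = 0.05,   # assumed bin width, not given in the text
) -> Dict[int, Tuple[int, int, float]]:
    """Tabulate SNPs with FST <= fst_cutoff per global MAF bin.

    snps: (global_maf, fst) pairs, with global_maf in [0, 0.5].
    Returns {bin_index: (n_low, n_total, proportion)}; bin i covers
    [i * bin_width, (i + 1) * bin_width), and MAF = 0.5 falls in the last bin.
    Values lying exactly on a bin boundary are subject to floating-point rounding.
    """
    n_bins = round(0.5 / bin_width)
    counts = defaultdict(lambda: [0, 0])    # bin index -> [n_low, n_total]
    for maf, fst in snps:
        idx = min(int(maf / bin_width), n_bins - 1)
        counts[idx][1] += 1
        if fst <= fst_cutoff:
            counts[idx][0] += 1
    return {i: (low, tot, low / tot) for i, (low, tot) in sorted(counts.items())}
```

Running this once for the gene set of interest and once for the remaining genes gives, for each bin, the two rows of the 2 × 2 table used in the χ2 test described in the Methods.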
Declarations
Acknowledgements
We thank Dr. David Irwin for revising the manuscript and for valuable comments. This work was supported by grants from the National Basic Research Program of China (973 Program, 2007CB411600), the National Natural Science Foundation of China (30621092), and the Bureau of Science and Technology of Yunnan Province.
References
- Novembre J, Di Rienzo A: Spatial patterns of variation due to natural selection in humans. Nat Rev Genet. 2009, 10 (11): 745-755. 10.1038/nrg2632.
- Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage. Science. 2006, 312 (5780): 1614-1620. 10.1126/science.1124309.
- Akey JM, Zhang G, Zhang K, Jin L, Shriver MD: Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002, 12 (12): 1805-1814. 10.1101/gr.631202.
- The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449: 851-861. 10.1038/nature06258.
- Weir BS, Cockerham CC: Estimating F-statistics for the analysis of population structure. Evolution. 1984, 38: 1358-1370. 10.2307/2408641.
- Barreiro LB, Laval G, Quach H, Patin E, Quintana-Murci L: Natural selection has driven population differentiation in modern humans. Nat Genet. 2008, 40 (3): 340-345. 10.1038/ng.78.
- Hojo M, Kita A, Kageyama R, Hashimoto N: Notch-Hes signaling in pituitary development. Expert Review of Endocrinology & Metabolism. 2008, 3 (1): 91-100.
- Zhu X, Gleiberman AS, Rosenfeld MG: Molecular physiology of pituitary development: signaling and transcriptional networks. Physiol Rev. 2007, 87 (3): 933-963. 10.1152/physrev.00006.2006.
- Larsen CS: Biological changes in human populations with agriculture. Ann Rev Anthropol. 1995, 24 (1): 185-213. 10.1146/annurev.an.24.100195.001153.
- Wu DD, Zhang YP: Positive selection drives population differentiation in the skeletal genes in modern humans. Hum Mol Genet. 2010, 19: 2341-2346. 10.1093/hmg/ddq107.
- Franbourg A, Hallegot P, Baltenneck F, Toutain C, Leroy F: Current research on ethnic hair. J Am Acad Dermatol. 2003, 48 (6PB): 115-119. 10.1067/mjd.2003.277.
- Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, Xie X, Byrne EH, McCarroll SA, Gaudet R, et al: Genome-wide detection and characterization of positive selection in human populations. Nature. 2007, 449 (7164): 913-918. 10.1038/nature06250.
- Fujimoto A, Kimura R, Ohashi J, Omi K, Yuliwulandari R, Batubara L, Mustofa MS, Samakkarn U, Settheetham-Ishida W, Ishida T, et al: A scan for genetic determinants of human hair morphology: EDAR is associated with Asian hair thickness. Hum Mol Genet. 2008, 17 (6): 835-843. 10.1093/hmg/ddm355.
- Izagirre N, Garcia I, Junquera C, de la Rua C, Alonso S: A scan for signatures of positive selection in candidate loci for skin pigmentation in humans. Mol Biol Evol. 2006, 23 (9): 1697-1706. 10.1093/molbev/msl030.
- Wu DD, Zhang YP: Positive Darwinian selection in human population: A review. Chinese Science Bulletin. 2008, 53 (10): 1457-1467. 10.1007/s11434-008-0202-z.
- Vallender EJ, Lahn BT: Positive selection on the human genome. Hum Mol Genet. 2004, 13 (Review Issue 2): R245-R254. 10.1093/hmg/ddh253.
- Ferrer-Admetlla A, Bosch E, Sikora M, Marques-Bonet T, Ramirez-Soriano A, Muntasell A, Navarro A, Lazarus R, Calafell F, Bertranpetit J: Balancing selection is the main force shaping the evolution of innate immunity genes. J Immunol. 2008, 181 (2): 1315-1322.
- Fumagalli M, Cagliani R, Pozzoli U, Riva S, Comi GP, Menozzi G, Bresolin N, Sironi M: Widespread balancing selection and pathogen-driven selection at blood group antigen genes. Genome Res. 2009, 19 (2): 199-212. 10.1101/gr.082768.108.
- Cai JJ, Borenstein E, Chen R, Petrov DA: Similarly strong purifying selection acts on human disease genes of all evolutionary ages. Genome Biol Evol. 2009, 2009 (0): 131-144.
- Blekhman R, Man O, Herrmann L, Boyko AR, Indap A, Kosiol C, Bustamante CD, Teshima KM, Przeworski M: Natural selection on genes that underlie human disease susceptibility. Curr Biol. 2008, 18 (12): 883-889. 10.1016/j.cub.2008.04.074.
- Nielsen R, Hubisz MJ, Hellmann I, Torgerson D, Andres AM, Albrechtsen A, Gutenkunst R, Adams MD, Cargill M, Boyko A, et al: Darwinian and demographic forces affecting human protein coding genes. Genome Res. 2009, 19 (5): 838-849. 10.1101/gr.088336.108.
- Bartel DP: MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004, 116 (2): 281-297. 10.1016/S0092-8674(04)00045-5.
- Bartel DP: MicroRNAs: target recognition and regulatory functions. Cell. 2009, 136 (2): 215-233. 10.1016/j.cell.2009.01.002.
- Chen K, Rajewsky N: Natural selection on human microRNA binding sites inferred from SNP data. Nat Genet. 2006, 38 (12): 1452-1456. 10.1038/ng1910.
- Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, Kasprzyk A: BioMart - biological queries made easy. BMC Genomics. 2009, 10 (1): 22. 10.1186/1471-2164-10-22.
- Becker KG, Barnes KC, Bright TJ, Wang SA: The genetic association database. Nat Genet. 2004, 36 (5): 431-432. 10.1038/ng0504-431.
- Grimson A, Farh KKH, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP: MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol Cell. 2007, 27 (1): 91-105. 10.1016/j.molcel.2007.06.017.
- Lewis BP, Burge CB, Bartel DP: Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005, 120 (1): 15-20. 10.1016/j.cell.2004.12.035.
- Friedman RC, Farh KKH, Burge CB, Bartel DP: Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009, 19 (1): 92-105. 10.1101/gr.082701.108.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.




