Different level of population differentiation among human genes
© Wu and Zhang; licensee BioMed Central Ltd. 2011
Received: 3 September 2010
Accepted: 14 January 2011
Published: 14 January 2011
During the colonization of the world after their dispersal out of Africa, modern humans encountered changing environments, and substantial phenotypic variation, involving diverse behaviors, lifestyles and cultures, arose among the different modern human populations.
Here, we study the level of population differentiation of human genes among different populations. Intriguingly, genes involved in osteoblast development were identified as being enriched with higher FST SNPs, a result consistent with the proposed role of the skeletal system in accounting for variation among human populations. Genes involved in the development of hair follicles, where hair is produced, were also found to have higher levels of population differentiation, consistent with hair morphology being a distinctive trait among human populations. Other genes that showed higher levels of population differentiation include those involved in pigmentation, spermatid, nervous system and organ development, and some metabolic pathways, but few involved in the immune system. Disease-related genes show an excess of SNPs with lower levels of population differentiation, probably due to purifying selection. Surprisingly, we find that Mendelian-disease genes appear to have a significant excess of SNPs with high levels of population differentiation, possibly because the incidence of, and susceptibility to, these diseases differ among populations. As expected, microRNA-regulated genes show lower levels of population differentiation due to purifying selection.
Our analysis demonstrates different levels of population differentiation among human populations for different gene groups.
After their dispersal from Africa and during the colonization of the world, humans evolved substantial phenotypic variation, including variation in skin, hair, and eye color, body mass, height, diet, drug metabolism, and susceptibility and resistance to disease. Efforts to reveal the genetic bases of these variations should provide important insight into the history of human evolution, gene function, and the mechanisms of disease [1, 2]. Indeed, with the advent of large-scale comparative genomic and human polymorphism data, a flood of studies has identified many candidate genes and genomic regions accounting for the observed phenotypic characters. However, the evolutionary forces driving the variation of these phenotypic traits, i.e., positive selection, balancing selection, purifying selection, or neutral evolution, remain largely unknown.
In general, population differentiation under neutral evolution is mostly influenced by demographic history; however, adaptation to a local environment, driven by positive selection, will increase the level of population differentiation. In contrast, negative and balancing selection tend to reduce population differentiation. Accordingly, evaluating the level of population differentiation across the human genome should be informative for identifying the genetic basis of the phenotypic differences observed among human populations.
Results and Discussion
Here, we evaluated the level of population differentiation for human genes on autosomal chromosomes among three populations (African, European and East Asian), based on the HapMap data (Phase II), using the parameter FST according to methods described previously [3, 5]. A previous study reported a higher level of population differentiation in genic regions than in non-genic regions of the genome. However, in our analysis, several chromosomes, including 5, 6, 8, 11, 13, and 20, did not show higher population differentiation in genic than in non-genic regions; that is, their genic regions did not have an excess of SNPs with higher FST (≥0.6) (Figure S1 in Additional file 1).
Functional significance of genes with higher levels of population differentiation
An intriguing observation is that osteoblast development is significantly enriched in high FST SNPs (λ = 12.28, P = 4.92E-88 after multiple testing). Osteoblasts are mononucleate cells that are responsible for bone formation. Modern humans demonstrate substantial phenotypic variation, much of which involves the skeletal system, such as height, body mass, body mineral density, and craniofacial differences. Indeed, evidence indicates that the human skeletal system has evolved rapidly since the advent of agriculture, and our recent study concluded that the high levels of population differentiation of skeletal genes among human populations were driven by positive selection.
Another interesting category is hair follicle development, which also showed a higher level of population differentiation (GO: 0001942, λ = 4.09, P = 2.07E-08 after multiple testing). Hair is produced by hair follicles. Similar to the skeletal system, hair morphology, including water swelling diameter and section, shape of fiber, mechanical properties, combability and hair moisture, differs distinctively among human populations. Previous studies using the long range haplotype homozygosity test have identified some genes involved in hair follicle development, such as EDAR and EDA2R, that have undergone recent positive selection [12, 13]. These studies, together with our evidence of higher population differentiation in the genes involved in hair follicle development, support a hypothesis of adaptive evolution accounting for the diversification of human hair.
Consistent with previous observations [12, 14], genes involved in pigmentation, covering the GO processes pigmentation during development, pigmentation, and melanocyte differentiation, demonstrated significantly higher population differentiation. Similarly, reproduction-associated processes, e.g. sperm motility, spermatid development, and gamete generation, have higher levels of population differentiation (Figure 1). Among the categories with a significant enrichment of higher FST SNPs, many are involved in the nervous system, e.g. dorsoventral neural tube patterning (GO: 0021904, λ = 15.67), hindbrain development (GO: 0030902, λ = 11.08), positive regulation of neuron differentiation (GO: 0045666, λ = 8.50), and neuron development (GO: 0048666, λ = 5.27) (Figure 1). Other categories include metabolic processes, such as the triglyceride metabolic process (GO: 0006641, λ = 6.69), glucose homeostasis (GO: 0042593, λ = 4.64), and cholesterol homeostasis (GO: 0042632, λ = 4.35), possibly reflecting variation in metabolism among humans.
Immunity-related genes, however, which are a common target of positive selection [2, 15, 16], appear in only a small number of categories with a higher proportion of higher FST SNPs. This observation is probably attributable to the fact that many immune system genes evolve under balancing selection for a heterozygote advantage in human populations, which would reduce the level of population differentiation [17, 18].
Population differentiation under neutral evolution is mostly influenced by demographic history (that is, genetic drift and gene flow), which can generate patterns similar to those produced by biological factors such as natural selection. However, demographic history tends to influence all loci in the genome equally, whereas natural selection acts only on a single gene or a group of functionally related genes. By comparison with the proportion of higher FST SNPs in genome-wide genes, we present some groups of functionally related genes enriched with high FST SNPs, an enrichment that is mostly driven by positive natural selection, although the confounding effect of demographic history cannot be completely excluded.
Population differentiation in disease-related genes
Lower levels of population differentiation in microRNA targeted genes
In this study, we find that genes involved in osteoblast development, hair follicle development, pigmentation, spermatid, nervous system and organ development, and some metabolic pathways have higher levels of population differentiation. Surprisingly, we find that Mendelian-disease genes appear to have a significant excess of SNPs with high levels of population differentiation, possibly because the incidence of, and susceptibility to, these diseases differ among populations. As expected, microRNA-regulated genes show lower levels of population differentiation due to purifying selection. Our analysis demonstrates different levels of population differentiation among human populations for different gene groups.
Since genes on the sex chromosomes show higher population differentiation than those on the autosomal chromosomes, we only analyzed data from the autosomal chromosomes. Allele frequency data for SNPs on autosomes were retrieved from HapMap Phase II (release 24, NCBI36) for three populations: African (YRI panel including 60 Yoruban individuals from Ibadan), European (CEU panel including 60 Utah residents with ancestry from northern and western Europe) and East Asian (EA panels including 45 Han Chinese (HCB) and 45 Japanese from Tokyo (JPT)). To evaluate the degree of population differentiation, FST values of the polymorphic SNPs with minor allele frequencies ≥0.01 in at least one population were calculated as previously described [3, 5]. Because negative FST values have no biological interpretation, they were set to 0.
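As a rough illustration of the filtering and FST steps described above, the sketch below computes a simple Wright-style FST, (H_T - H_S)/H_T, from per-population allele frequencies at one biallelic SNP. This is not the estimator of refs [3, 5], which accounts for sample size; the function names, the equal weighting of populations, and the example frequencies are assumptions made purely for illustration.

```python
def passes_maf_filter(freqs, min_maf=0.01):
    """True if the minor allele frequency is >= min_maf in at least one population."""
    return any(min(p, 1.0 - p) >= min_maf for p in freqs)


def simple_fst(freqs):
    """Wright-style FST = (H_T - H_S) / H_T for one biallelic SNP.

    freqs: frequency of the same allele in each population
    (populations are weighted equally, an assumption of this sketch).
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)                           # heterozygosity of the pooled population
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)  # mean within-population heterozygosity
    if h_t == 0.0:
        return 0.0  # allele fixed or absent everywhere: nothing to differentiate
    # This simple form is non-negative up to rounding error; sample-size-corrected
    # estimators can return negative values, which the text sets to 0.
    return max(0.0, (h_t - h_s) / h_t)


# Hypothetical allele frequencies in the African, European and East Asian panels
example = [0.10, 0.80, 0.90]
if passes_maf_filter(example):
    print(round(simple_fst(example), 3))
```

Applying the filter before the FST calculation mirrors the order given in the text: a SNP is kept if it is polymorphic at the stated frequency in any one of the three populations.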
Protein-coding genes on the human autosomal chromosomes, and their corresponding gene ontology (GO) terms in three categories (biological process, cellular component, and molecular function), were downloaded from Ensembl (http://www.ensembl.org version 54) by means of BioMart. Each gene was extended 500 bp upstream of the 5'-terminus and downstream of the 3'-terminus to include all of its SNPs. χ2 tests with one degree of freedom were used to test the significance of the enrichment of SNPs with higher (≥0.6) FST values compared with the empirical data for genome-wide genes, based on 2 × 2 contingency tables constructed from the numbers of SNPs. For these analyses, Bonferroni correction was used for multiple testing. To better understand the enrichment, we calculated the parameter λ, the ratio of the proportion of higher FST SNPs in the analyzed category to that in the genome-wide genes. λ values significantly higher than 1 indicate higher population differentiation of the genes in the category among human populations.
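The enrichment statistics described above can be sketched as follows. The function computes λ, a Pearson χ2 statistic for the 2 × 2 table, and a Bonferroni-adjusted P value. The function name, the argument names, the example counts, and the treatment of the category and the background as two separate rows of the table are assumptions of this sketch, and no continuity correction is applied.

```python
import math


def enrichment(high_cat, total_cat, high_bg, total_bg, n_tests=1):
    """Return (lambda, chi2, Bonferroni-adjusted P) for one gene category.

    high_cat / total_cat: SNPs with FST >= 0.6 and all SNPs in the category.
    high_bg / total_bg:   the same counts for the background genes.
    Assumes every row and column total of the table is non-zero.
    """
    lam = (high_cat / total_cat) / (high_bg / total_bg)

    # 2 x 2 table: rows = category / background, columns = high FST / other
    a, b = high_cat, total_cat - high_cat
    c, d = high_bg, total_bg - high_bg
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return lam, chi2, min(1.0, p * n_tests)


# Hypothetical counts: 30 of 1,000 category SNPs vs 1,000 of 100,000 background SNPs,
# corrected for 500 tested categories
lam, chi2, p_adj = enrichment(30, 1000, 1000, 100000, n_tests=500)
print(lam, chi2, p_adj)
```

The Bonferroni step simply multiplies each raw P value by the number of categories tested and caps the result at 1.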
Complex disease genes were obtained from the Genetic Association Database (GAD). Human Mendelian disease genes were obtained from the study by Blekhman et al. (2008) (OMIM). Genes targeted by microRNA were obtained from TargetScan (http://www.targetscan.org, release 5.1) [27–29]. For these genes, χ2 tests with one degree of freedom were used to test the significance of an enrichment of SNPs with higher (≥0.6) FST values and lower (≤0.05) FST values, respectively, compared with other genes, based on 2 × 2 contingency tables constructed from the numbers of SNPs.
We thank Dr. David Irwin for revising the manuscript and for valuable comments. This work was supported by grants from the National Basic Research Program of China (973 Program, 2007CB411600), the National Natural Science Foundation of China (30621092), and the Bureau of Science and Technology of Yunnan Province.
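To make the two comparisons concrete, the sketch below builds the pair of 2 × 2 tables (one for the higher FST bin, one for the lower FST bin) from per-SNP FST values. The function name, the table layout, and the example values are hypothetical; only the two thresholds come from the text.

```python
HIGH_FST = 0.6   # "higher" threshold from the text
LOW_FST = 0.05   # "lower" threshold from the text


def contingency_tables(fst_in_set, fst_in_others):
    """Build 2 x 2 tables of SNP counts for the high and low FST bins.

    Each table is [[in bin, not in bin] for the gene set,
                   [in bin, not in bin] for all other genes].
    """
    def split(values, in_bin):
        hits = sum(1 for v in values if in_bin(v))
        return [hits, len(values) - hits]

    def is_high(v):
        return v >= HIGH_FST

    def is_low(v):
        return v <= LOW_FST

    return {
        "high": [split(fst_in_set, is_high), split(fst_in_others, is_high)],
        "low": [split(fst_in_set, is_low), split(fst_in_others, is_low)],
    }


# Hypothetical per-SNP FST values for a disease gene set and for all other genes
tables = contingency_tables(
    [0.01, 0.03, 0.20, 0.70],
    [0.02, 0.10, 0.30, 0.40, 0.65, 0.80],
)
print(tables["high"])
print(tables["low"])
```

Each table can then be passed to a χ2 test with one degree of freedom, once for the higher bin and once for the lower bin.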
- Novembre J, Di Rienzo A: Spatial patterns of variation due to natural selection in humans. Nat Rev Genet. 2009, 10 (11): 745-755. 10.1038/nrg2632.
- Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage. Science. 2006, 312 (5780): 1614-1620. 10.1126/science.1124309.
- Akey JM, Zhang G, Zhang K, Jin L, Shriver MD: Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002, 12 (12): 1805-1814. 10.1101/gr.631202.
- The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449: 851-861. 10.1038/nature06258.
- Weir BS, Cockerham CC: Estimating F-statistics for the analysis of population structure. Evolution. 1984, 38: 1358-1370. 10.2307/2408641.
- Barreiro LB, Laval G, Quach H, Patin E, Quintana-Murci L: Natural selection has driven population differentiation in modern humans. Nat Genet. 2008, 40 (3): 340-345. 10.1038/ng.78.
- Hojo M, Kita A, Kageyama R, Hashimoto N: Notch-Hes signaling in pituitary development. Expert Review of Endocrinology & Metabolism. 2008, 3 (1): 91-100.
- Zhu X, Gleiberman AS, Rosenfeld MG: Molecular physiology of pituitary development: signaling and transcriptional networks. Physiol Rev. 2007, 87 (3): 933-963. 10.1152/physrev.00006.2006.
- Larsen CS: Biological changes in human populations with agriculture. Ann Rev Anthropol. 1995, 24 (1): 185-213. 10.1146/annurev.an.24.100195.001153.
- Wu DD, Zhang YP: Positive selection drives population differentiation in the skeletal genes in modern humans. Hum Mol Genet. 2010, 19: 2341-2346. 10.1093/hmg/ddq107.
- Franbourg A, Hallegot P, Baltenneck F, Toutain C, Leroy F: Current research on ethnic hair. J Am Acad Dermatol. 2003, 48 (6PB): 115-119. 10.1067/mjd.2003.277.
- Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, Xie X, Byrne EH, McCarroll SA, Gaudet R, et al: Genome-wide detection and characterization of positive selection in human populations. Nature. 2007, 449 (7164): 913-918. 10.1038/nature06250.
- Fujimoto A, Kimura R, Ohashi J, Omi K, Yuliwulandari R, Batubara L, Mustofa MS, Samakkarn U, Settheetham-Ishida W, Ishida T, et al: A scan for genetic determinants of human hair morphology: EDAR is associated with Asian hair thickness. Hum Mol Genet. 2008, 17 (6): 835-843. 10.1093/hmg/ddm355.
- Izagirre N, Garcia I, Junquera C, de la Rua C, Alonso S: A scan for signatures of positive selection in candidate loci for skin pigmentation in humans. Mol Biol Evol. 2006, 23 (9): 1697-1706. 10.1093/molbev/msl030.
- Wu DD, Zhang YP: Positive Darwinian selection in human population: A review. Chinese Science Bulletin. 2008, 53 (10): 1457-1467. 10.1007/s11434-008-0202-z.
- Vallender EJ, Lahn BT: Positive selection on the human genome. Hum Mol Genet. 2004, 13 (Review Issue 2): R245-R254. 10.1093/hmg/ddh253.
- Ferrer-Admetlla A, Bosch E, Sikora M, Marques-Bonet T, Ramirez-Soriano A, Muntasell A, Navarro A, Lazarus R, Calafell F, Bertranpetit J: Balancing selection is the main force shaping the evolution of innate immunity genes. J Immunol. 2008, 181 (2): 1315-1322.
- Fumagalli M, Cagliani R, Pozzoli U, Riva S, Comi GP, Menozzi G, Bresolin N, Sironi M: Widespread balancing selection and pathogen-driven selection at blood group antigen genes. Genome Res. 2009, 19 (2): 199-212. 10.1101/gr.082768.108.
- Cai JJ, Borenstein E, Chen R, Petrov DA: Similarly strong purifying selection acts on human disease genes of all evolutionary ages. Genome Biol Evol. 2009, 2009 (0): 131-144.
- Blekhman R, Man O, Herrmann L, Boyko AR, Indap A, Kosiol C, Bustamante CD, Teshima KM, Przeworski M: Natural selection on genes that underlie human disease susceptibility. Curr Biol. 2008, 18 (12): 883-889. 10.1016/j.cub.2008.04.074.
- Nielsen R, Hubisz MJ, Hellmann I, Torgerson D, Andres AM, Albrechtsen A, Gutenkunst R, Adams MD, Cargill M, Boyko A, et al: Darwinian and demographic forces affecting human protein coding genes. Genome Res. 2009, 19 (5): 838-849. 10.1101/gr.088336.108.
- Bartel DP: MicroRNAs genomics, biogenesis, mechanism, and function. Cell. 2004, 116 (2): 281-297. 10.1016/S0092-8674(04)00045-5.
- Bartel DP: MicroRNAs: target recognition and regulatory functions. Cell. 2009, 136 (2): 215-233. 10.1016/j.cell.2009.01.002.
- Chen K, Rajewsky N: Natural selection on human microRNA binding sites inferred from SNP data. Nat Genet. 2006, 38 (12): 1452-1456. 10.1038/ng1910.
- Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, Kasprzyk A: BioMart - biological queries made easy. BMC Genomics. 2009, 10 (1): 22. 10.1186/1471-2164-10-22.
- Becker KG, Barnes KC, Bright TJ, Wang SA: The genetic association database. Nat Genet. 2004, 36 (5): 431-432. 10.1038/ng0504-431.
- Grimson A, Farh KKH, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP: MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol Cell. 2007, 27 (1): 91-105. 10.1016/j.molcel.2007.06.017.
- Lewis BP, Burge CB, Bartel DP: Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005, 120 (1): 15-20. 10.1016/j.cell.2004.12.035.
- Friedman RC, Farh KKH, Burge CB, Bartel DP: Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009, 19 (1): 92-105. 10.1101/gr.082701.108.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.