A comparative analysis of divergently-paired genes (DPGs) among Drosophila and vertebrate genomes
© Yang and Yu; licensee BioMed Central Ltd. 2009
Received: 16 June 2008
Accepted: 11 March 2009
Published: 11 March 2009
Divergently-paired genes (DPGs) are defined as two adjacent genes that are transcribed in opposite directions (from different DNA strands) and have transcription start sites (TSSs) less than 1,000 base pairs apart. DPGs are products of a common organizational feature among eukaryotic genes that has yet to be surveyed across divergent genomes over well-defined evolutionary distances. Mutations in the sequence between a pair of DPGs may result in alterations in shared promoters and thus affect the function of both genes. By sharing promoters, the gene pairs take advantage of co-regulation, albeit bearing a doubled mutational burden in maintaining their normal functions.
Drosophila melanogaster has a significant fraction (31.6% of all genes) of DPGs, which are remarkably conserved relative to its gene density as compared to other eukaryotes. Our survey and comparative analysis revealed different evolutionary patterns among DPGs between insect and vertebrate lineages. The conservation of DPGs in D. melanogaster is of significance as they are mostly housekeeping genes characterized by the absence of a TATA box in their promoter sequences. The combination of Initiator and Downstream Promoter Element may play an important role in regulating DPGs in D. melanogaster, providing an excellent niche for studying the molecular details of transcriptional regulation.
DPGs appear to have arisen independently among different evolutionary lineages, such as the insect and vertebrate lineages, and exhibit variable degrees of conservation. Such architectural organizations, including convergently-paired genes (CPGs), may be associated with transcriptional regulation and have significant functional relevance.
How genes are structurally organized and how they evolve functionally are two fundamental biological questions to be addressed across diverse evolutionary lineages. The best-known examples of structurally-coordinated and functionally-related genes are operons in prokaryotes. In eukaryotes, certain classes of genes are also non-randomly distributed, forming different structural classes including pairing and clustering. For instance, genes within the same metabolic pathways are often clustered together and have correlated expression patterns when compared against random genes [3–7].
Recently, there have been increasing numbers of genome-wide studies on divergently-paired genes or DPGs in human [8–11] and Drosophila melanogaster. DPGs are often defined as two adjacent genes that are divergently transcribed on opposite DNA strands and have transcription start sites (TSSs) less than 1,000 bp apart. The sequences between the two TSSs of DPGs are defined as divergently-shared promoters (DSPs). More than 10% of human genes are arranged in the divergent organization, and DPGs are often co-ordinately expressed, with evolutionary conservation and functional association [10, 11].
Among species as diverse as human [13–17], mouse [18–20], chicken [21, 22], fruit fly [23, 24], Saccharomyces cerevisiae [25, 26], and Aspergillus nidulans, a substantial number of individual DPGs have been reported based on experimental evidence, but few genome-wide analyses across diverse evolutionary lineages have been published. The recent availability of genome sequences of D. melanogaster (Dmel) and a constellation of closely-related species in the genus Drosophila, selected at various levels of divergence time, has made the genus an ideal model for a thorough comparative analysis of DPGs (http://rana.lbl.gov/drosophila/).
We performed a genome-wide identification of DPGs in Dmel and other selected eukaryotic genomes, including representatives from vertebrates and other sequenced Drosophila species. We also examined the conservation of divergent gene organization over different evolutionary time scales using orthologous sequence datasets based on synonymous (Ks) and nonsynonymous (Ka) substitution rates. In addition, we correlated gene expression and functional relevance among DPGs of Dmel and other eukaryotes based on Gene Ontology terms. Our results suggest that the divergent gene organization is a widespread and evolutionarily conserved feature of co-regulated transcription for functionally-related genes in Drosophila genomes, although the patterns of structural conservation vary among taxonomic groups or lineages.
Identification and characterization of DPGs in Dmel and other eukaryotes
The evolution of DPGs
Evolutionary conservation of DPGs in D. melanogaster as compared with those from other Drosophila species
Evolutionary conservation of human DPGs as compared with those of other eukaryotic genomes
DPGs are under stronger purifying selection
Ka and Ks for divergent and non-divergent D. melanogaster genes
0.087 ± 0.102    1.582 ± 1.039    0.074 ± 0.188
0.086 ± 0.103    1.467 ± 1.027    0.085 ± 0.314
0.087 ± 0.103    1.504 ± 1.032    0.081 ± 0.280
Ka and Ks for divergent and non-divergent human genes
0.098 ± 0.139    0.449 ± 0.148    0.201 ± 0.169
0.244 ± 0.277    0.776 ± 0.582    0.281 ± 0.218
0.234 ± 0.273    0.754 ± 0.569    0.276 ± 0.216
Co-expression of DPGs in D. melanogaster
We also evaluated the significance of each correlation for all datasets. We considered a correlation significant when its p-value was < 0.05, and classified it as a significant positive or a significant negative correlation according to its sign. Of the 1,770 DPGs with available microarray data, 1,031 (58.2%) and 404 (22.8%) pairs showed significant positive and negative correlations, respectively, with p < 0.05 in at least one dataset. Moreover, 67 (3.8%) pairs showed either significant positive or significant negative correlations depending on the conditions of the microarray experiments [see Additional file 6].
The fact that 84.8% of all DPGs show significantly correlated expression implies that co-regulation is the driving force for maintaining this gene organization. In addition, the intergenic distance of DPGs and the level of co-expression are not correlated, as shown previously in human.
Functional classification of DPGs
Previous studies have shown that many DNA repair genes are DPGs in the human genome [8, 10]. To examine the functional classification of DPGs in Dmel, we analyzed their annotations and included six other eukaryotic genomes, H. sapiens, M. musculus, G. gallus, C. elegans, S. cerevisiae, and A. thaliana, for comparison [see Additional file 7]. In Biological Process, the GO terms related to organization/biogenesis and metabolic/biosynthetic processes topped the list for DPGs. In Molecular Function, the GO terms involved in RNA Binding are significantly overrepresented as compared to others. In Cellular Component, we found that DPGs tend to fall in the classes Intracellular, Organelle, Cytoplasm, and Protein Complex. Almost 80% of the overrepresented GO terms in human and mouse are identical, owing to their close evolutionary distance. More than 90% of the terms in chicken are also present in human, but the total number of overrepresented GO terms is smaller than in human, partly because of the sparsity of GO annotations and the smaller number of genes in the chicken genome. Among the eukaryotes examined, Dmel, which has the highest proportion of DPGs relative to gene density, had the most overrepresented GO terms. Almost all terms found in the human genome are also present in the Dmel genome. Furthermore, every species has its own specific GO annotations, suggesting that some DPGs of different species may have evolved independently. For instance, C. elegans has distinct overrepresented GO terms in Biological Process, including Reproduction, Behaviour, Growth and Development. The terms Thylakoid, Plastid, and Triplet Codon-Amino Acid Adaptor Activity in A. thaliana represent characteristics of plants that are different from those highlighted among animals.
In Dmel, there were relatively more species-specific GO terms than in other eukaryotes. These include Transport, Cytoskeleton Organization and Biogenesis, Cell Death, and Cell Proliferation in Biological Process; Nuclear Envelope, Cytoskeleton, and Cytoplasmic Membrane-bound Vesicle in Cellular Component; and Chromatin Binding, Motor Activity, Actin Binding, Kinase Activity, Cytoskeletal Protein Binding, Enzyme Regulator Activity, and Transcription Regulator Activity in Molecular Function.
It has been reported that among prokaryotes there is a strong enrichment of DPGs in which one gene encodes a transcriptional regulator (R) and the other encodes a protein of another class (X). This suggests potential regulatory interactions among DPGs. For the Dmel DPGs, we identified 459 (21.8%), 39 (1.9%), and 1,607 (76.3%) pairs for RX, RR, and XX structures, respectively. In the control set of 50,000 random gene pairs, we found 9,350 (18.7%), 500 (1%), and 40,150 (80.3%) pairs for RX, RR, and XX structures, respectively. Since the p-values based on Fisher's Exact Test are 4.06e-3 for RX, 6.04e-4 for RR, and 0.138 for XX, RX and RR structures are more likely to be present in DPGs. Of the 459 DPGs with RX structure, 318 pairs (69.3%) are fully conserved across at least seven Drosophila clades. Furthermore, because a fraction of the DPGs classified as XX may in fact include post-transcriptional regulators, and some poorly annotated ('hypothetical') genes are classified as X, the true enrichment of the RX structure in DPGs is likely to be higher.
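As an illustration of the comparison described above, the following is a minimal Python sketch of a two-sided Fisher's exact test on a 2x2 table (pairs with a given structure versus all other pairs, in DPGs versus the random control). The function names and the table layout are assumptions made here for illustration; the text does not state how the tables were built or whether the test was one- or two-sided, so the printed value is not expected to reproduce the reported p-values.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    lo = max(0, row1 + col1 - total)
    hi = min(row1, col1)
    log_denominator = log_comb(total, row1)

    def log_prob(x: int) -> float:
        # Log hypergeometric probability of x in the top-left cell.
        return log_comb(col1, x) + log_comb(total - col1, row1 - x) - log_denominator

    log_observed = log_prob(a)
    p_value = 0.0
    for x in range(lo, hi + 1):
        lp = log_prob(x)
        # Sum every table that is no more probable than the observed one.
        if lp <= log_observed + 1e-7:
            p_value += exp(lp)
    return min(1.0, p_value)


# Assumed layout: RX pairs vs. non-RX pairs, DPGs vs. random control.
dpg_rx, dpg_total = 459, 459 + 39 + 1607
ctrl_rx, ctrl_total = 9350, 50000
p_rx = fisher_exact_two_sided(dpg_rx, dpg_total - dpg_rx,
                              ctrl_rx, ctrl_total - ctrl_rx)
print(f"two-sided p-value for RX under the assumed layout: {p_rx:.3g}")
```

Working in log space keeps the hypergeometric terms numerically stable for tables with tens of thousands of pairs.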
Analysis of the promoter sequences among DPGs
Core promoter elements in divergent and non-divergent D. melanogaster genes
Combinations of core promoter elements in divergent and non-divergent genes of D. melanogaster
The limitation in defining DPGs based on shared distance
We determined DPGs based on the criterion that the two transcriptional start sites should be found in opposite orientations and less than 1,000 bp apart, so that the gene pair has a greater possibility of being functionally correlated by sharing a common regulatory region. However, a recent study on the human CYP1A1 and CYP1A2 genes has shown that they are simultaneously controlled through bidirectional and common regulatory elements although they are separated by a 23 kb intergenic spacer region, suggesting that the number of DPGs is underestimated by the current standard. Some of the true DPGs that are distantly situated remain to be mapped experimentally in the future, perhaps with the help of next-generation sequencing technology. Fortunately, the abundance of DPGs compensates for this limitation in characteristic analyses, unless individual genes are scrutinized.
The conservation and origin of divergent gene organization
We found that the human DPGs and their corresponding orthologs are conserved only among vertebrates, especially among mammals. Similarly, the DPG orthologs of other species, such as those of insects, are also better conserved among their close relatives as compared to random genes. These observations support the idea that DPGs provide structural advantages for co-regulation, so they become conserved when functionally important genes (such as certain housekeeping genes) become divergently organized. This hypothesis is further supported by the abundance of species-specific DPGs observed. For instance, human DPGs have a strong association with CpG islands, which are specifically related to the compositional dynamics and evolution of mammalian genomes. In addition, some DPGs may be associated with species-specific functions, as DPGs in C. elegans are significantly associated with functions related to reproduction, behaviour, growth, and development. The GO terms Thylakoid, Plastid, and Triplet Codon-Amino Acid Adaptor Activity in A. thaliana represent characteristics of plants that differ from those of animals.
The conservation patterns of the divergent gene organization differ among lineages, such as between insects and vertebrates. The fraction of conserved DPGs with orthologs in insect genomes diminishes faster over evolutionary time scales than that in vertebrates. We believe that the organizational priorities of the insect and vertebrate lineages differ strategically, as insect species tend to be more diversified in order to adapt to different environments and ecological relationships, whereas vertebrates gain complexity in anatomical structures and behaviours over time.
The origin of DPGs has been of great interest since more and more individual divergent gene pairs have been identified experimentally, and several hypotheses have been proposed to explain the molecular mechanisms by which the architecture is created. The first argues for tandem duplication. Tandemly duplicated genes are two genes that reside physically adjacent to each other, often in the same orientation, and usually have similar expression patterns and similar, if not identical, functions. Tandem duplication is a common event among eukaryotic genomes and a primary mechanism for generating gene clusters. Genome analysis suggests that one copy of a duplicated gene could drift and potentially acquire a new function. For example, several genes encoding odorant-binding proteins in Drosophila are transcribed in opposite directions. Nevertheless, our analysis showed that only 4.3% of all the divergent gene pairs were tandem duplicates, suggesting that tandem duplication might not be the essential driving force.
The second hypothesis is overprinting, a process that generates new genes from pre-existing nucleotide sequences. In a divergent gene pair, one gene is often confined to a single lineage while the other is widespread. For example, the two human genes SIRT3 and PSMD13 are linked in a divergent configuration and share a regulatory mechanism. One of them, SIRT3, only has orthologs in vertebrates, and the other, PSMD13, is found in all genomes examined in this study. It is assumed that the evolution of their divergent arrangement is associated with that of a complex pathway of co-regulation related to aging in vertebrates. Thus, PSMD13 represents an old gene widespread among eukaryotes, but SIRT3 is a vertebrate invention.
The third theory is genome rearrangement. A genome rearrangement occurs when a chromosome breaks at two or more locations and is reassembled in a different orientation. This results in a DNA sequence that has essentially the same features as the original sequence, except that the order of these features has been modified. A possible scenario to explain the origin of DPGs is that this gene organization originated by chance via genome rearrangement. When a genome rearrangement brought two distant genes together and formed a divergent gene pair, there was no distinct functional relationship at the beginning. However, the promoter region shared by both genes was maintained by evolutionary pressure. A mutation in this region would be potentially hazardous, resulting in failure of normal expression for both genes. Both genes may have housekeeping roles, and either one of them could be vital to survival. In the process of evolution, the genome took advantage of such a gene organization and utilized it for transcriptional regulation. Regulation of gene expression by forming DPGs may result in more efficient control and reduce the need for more complex regulatory pathways.
The features of DSPs in eukaryotes
In general, the transcription of each gene in a eukaryotic genome is controlled independently, and operons are unusual in eukaryotes, as opposed to most prokaryotes. The organization of DSPs in DPGs does not exhibit universal structural features, because no consistent sequence motifs have been found among them. As far as we know, promoters recognized by RNA polymerase II are divided into two broad categories: TATA-containing and TATA-less promoters. Tissue-specific genes typically contain TATA boxes located ~30 bp upstream of a single TSS.
Tissue-specific transcriptional factors generally bind upstream of a TATA box and either activate or repress promoter activities. The promoters of housekeeping genes do not generally contain TATA box sequences and usually display multiple transcription start sites. Housekeeping promoters are active in most cell types and often contain binding sites for ubiquitous transcription factors. Many of the human DSPs that have been studied so far are TATA-less and associated with genes for housekeeping functions. Examples include DSPs of the genes encoding DHFR/Rep-1, TK/KF, Surf1/Surf2, GPAT/AIRC, histones H2A/H2B, and BRCA1/NBR2. Although a few DSPs have TATA boxes in both orientations, most lack TATA boxes and initiator elements in either direction and stimulate transcriptional initiation at multiple sites over broad initiation windows, consistent with the strong association between DPGs and CpG islands described in the human genome [8, 10]. The mammalian DSPs with CpG islands frequently lack TATA boxes [44–46]. Conversely, CpG islands are found less frequently in promoters that contain both TATA boxes and initiator regions. It is clear that a majority of DPGs in human are co-regulated by TATA-less promoters with CpG islands, and that the Sp1 binding site is prevalent in DSPs. Furthermore, some other transcriptional factor binding sites may also play key roles in regulating certain DPGs. Examples include: (1) the YY1 factor binding site in Surf1/Surf2 genes [48, 49], (2) CCAAT box binding sites for HSF-1 (Heat shock factor-1)/Bop1, E14/ATM, BRCA1/NBR2, and GPAT/AIRC, (3) GC boxes between the TSSs of TAP1/LMP2, DHFR/Rep-1, and GPAT/AIRC, and (4) the E2F factor binding site in TK/KF genes.
Our analysis of the Dmel genome indicated that DPGs often have TATA-less promoters, consistent with the finding in human. However, the CpG island is not an indicator for DSPs in Dmel, as DNA methylation is known to be absent in this organism. In this study, we focused on four common, experimentally identified core promoter elements: TATA box, Inr, DPE, and MTE. TATA box and Inr are well known in Drosophila and vertebrates. The core motif of DPE is located exactly from +28 to +33 bp downstream of the TSS and is recognized by two distinct TBP-associated factors (TAFs). Experimental evidence suggests that DPE appears to be as widely used as the TATA box. MTE is located at positions from +17 to +22 bp and has been experimentally verified to interact with TFIID. These core promoter elements show organism-specific patterns: in fly promoters, Inr has higher information content and DPE is found much more frequently than in mammalian promoters. The diversity of core promoters is thought to contribute to the specificity of gene regulation in a combinatorial fashion. Although the distribution of the four core promoter elements indicated that no single element is overrepresented among DPGs, a significant overrepresentation was found for the combination of Inr and DPE, which may be functionally equivalent to CpG islands in mammalian DPGs.
DPGs exist as one of the common structural features of genomes and provide advantages in transcriptional co-regulation. DPGs are abundant among eukaryotic genomes and highly conserved; the conservation is stronger within lineages than between lineages. The conservation patterns among the different organizational classes, i.e. DPGs, CPGs, and CDPGs, appear lineage-specific, as vertebrate DPGs are better conserved than those of insects. Further analyses revealed that DPGs are strongly co-regulated in expression profiles and associated with certain functional categories. DPGs are mostly housekeeping genes and therefore tend to lack a TATA box. Combinations of transcriptional factor binding sites are crucial in regulating this divergent gene organization.
We retrieved the genomic data and annotations for D. melanogaster and nine other Drosophila species (D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. mojavensis, D. virilis and D. grimshawi) from Drosophila Comparative Annotation (available at http://rana.lbl.gov/drosophila/), which hosts gene models built with GeneWise based on FlyBase Release 4.2 for Dmel. Genome data for sixteen other eukaryotes were downloaded from the NCBI Map Viewer (ftp://ftp.ncbi.nlm.nih.gov/genomes/MapView); these include Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, Canis familiaris, Gallus gallus, Apis mellifera, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Eremothecium gossypii, Kluyveromyces lactis, Magnaporthe grisea, Neurospora crassa, Arabidopsis thaliana, and Oryza sativa. The genome sizes of all analyzed species were obtained from NCBI (http://www.ncbi.nlm.nih.gov/Genomes/).
Identification of DPGs
Divergently-paired genes or DPGs are defined as divergently-arranged (bi-directional or head-to-head) gene pairs on opposite strands with transcription start sites within 1,000 bp. We performed an all-against-all BLAST search on all Dmel genes. Tandem duplicates were determined as neighbouring gene pairs with an expect value E < 1e-10. CPGs and CDPGs are defined analogously to DPGs, with TSS distances within 1 kb.
Organizationally-conserved DPGs between Dmel and other species
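The identification criterion above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the `Gene` record, its field names, and the toy coordinates are assumptions introduced here, and only the rule itself (neighbouring genes on opposite strands, transcribed away from each other, with TSSs less than 1,000 bp apart) comes from the text. Overlapping head-to-head genes are ignored in this simplified version.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

MAX_TSS_DISTANCE = 1000  # bp; threshold stated in the text


@dataclass(frozen=True)
class Gene:
    # Hypothetical record; field names are illustrative only.
    name: str
    chrom: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # The TSS is the 5' end: start on "+", end on "-".
        return self.start if self.strand == "+" else self.end


def find_dpgs(genes: List[Gene]) -> List[Tuple[str, str]]:
    """Return neighbouring head-to-head pairs with TSSs < 1,000 bp apart."""
    by_chrom: Dict[str, List[Gene]] = {}
    for gene in genes:
        by_chrom.setdefault(gene.chrom, []).append(gene)

    pairs = []
    for chrom_genes in by_chrom.values():
        ordered = sorted(chrom_genes, key=lambda g: g.start)
        for left, right in zip(ordered, ordered[1:]):
            # Divergent orientation: left gene on "-", right gene on "+".
            if left.strand == "-" and right.strand == "+":
                distance = right.tss - left.tss
                if 0 < distance < MAX_TSS_DISTANCE:
                    pairs.append((left.name, right.name))
    return pairs


toy_genes = [
    Gene("a", "2L", 1000, 2000, "-"),
    Gene("b", "2L", 2400, 3500, "+"),  # TSS 400 bp from a's TSS: divergent
    Gene("c", "2L", 5000, 6000, "+"),  # same strand as b: not divergent
    Gene("d", "2L", 9000, 9800, "-"),  # c(+) and d(-) converge: not divergent
]
dpgs = find_dpgs(toy_genes)
print(dpgs)
```

Swapping the strand test (left on "+", right on "-") and measuring the distance between 3' ends would give the corresponding convergent case.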
The orthologs among Dmel and other Drosophila clades, as well as those from other eukaryotic genomes, were extracted in a similar way as for Dmel from Drosophila Comparative Annotation and NCBI HomoloGene release 56 (ftp://ftp.ncbi.nlm.nih.gov/pub/HomoloGene). According to the degree of conservation, we classified DPGs into five categories [see Additional file 5]. (1) "Fully conserved" are DPGs that have orthologs for both genes and retain their relative directions in other species. (2) "Both orthologs" represents DPGs that have orthologs for both genes but have lost the divergent relationship in other species. (3) "Species-specific" is defined as those that have one ortholog in other species, which is paired with another gene without orthology as a counterpart to keep the relative direction. (4) "Single ortholog" means DPGs for which only one gene has an ortholog in other species and the counterpart is lost. (5) "No ortholog" refers to DPGs that do not have orthologs in any other species analyzed. We also prepared 20,000 gene pairs randomly selected separately from the Dmel and human genomes in order to show the evolutionary conservation among DPGs.
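The five categories can be expressed as a small decision function. The sketch below is one possible reading of the definitions, evaluated against a single comparison species; the data structures (`orthologs` as a one-to-one mapping and `target_dpgs` as a set of two-gene sets) and all gene names are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple


def classify_pair(
    pair: Tuple[str, str],
    orthologs: Dict[str, str],
    target_dpgs: Set[FrozenSet[str]],
) -> str:
    """Assign one DPG to a conservation category against one other species."""
    found = [orthologs[gene] for gene in pair if gene in orthologs]

    if len(found) == 2:
        # Both genes have orthologs; check whether they still form a DPG.
        if frozenset(found) in target_dpgs:
            return "fully conserved"
        return "both orthologs"

    if len(found) == 1:
        genes_with_source_ortholog = set(orthologs.values())
        for target_pair in target_dpgs:
            if found[0] in target_pair:
                (partner,) = target_pair - {found[0]}
                # The ortholog is divergently paired with a gene lacking orthology.
                if partner not in genes_with_source_ortholog:
                    return "species-specific"
        return "single ortholog"

    return "no ortholog"


# Toy data: lower-case names are source genes, upper-case are target genes.
orthologs = {"a": "A", "b": "B", "c": "C", "d": "D", "e": "E", "g": "G"}
target_dpgs = {frozenset({"A", "B"}), frozenset({"E", "X"})}

source_dpgs = [("a", "b"), ("c", "d"), ("e", "f"), ("g", "h"), ("i", "j")]
categories = {pair: classify_pair(pair, orthologs, target_dpgs) for pair in source_dpgs}
for pair, category in categories.items():
    print(pair, category)
```

In a multi-species comparison, the same function could be applied once per species and the per-species labels combined; how that combination was done is not specified in the text.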
We selected three groups of species pairs with comparable divergence times to examine differences in divergent organization between the insect and vertebrate lineages in the light of evolution. The first group is composed of D. melanogaster vs. D. simulans and human vs. chimpanzee, which diverged about 5 million years ago. The second group includes D. melanogaster vs. D. ananassae and mouse vs. rat; both have a divergence time of about 40 million years. The third group concerns D. melanogaster vs. A. mellifera and human vs. chicken; both have a relatively longer divergence time of about 300 million years.
Synonymous and nonsynonymous substitution rates of DPGs
We calculated synonymous (Ks) and nonsynonymous (Ka) substitution rates for both divergent and non-divergent orthologous gene pairs, between Dmel and D. pseudoobscura for insects and between human and mouse for vertebrates, using a maximum likelihood (ML) algorithm that corrects for reversion events, as implemented in the software package PAML. Protein identity was calculated based on multiple alignments by using CLUSTALW.
Expression correlations among DPGs
The expression data based on microarray experiments belong to three datasets: DeGregorio2001, Arbeitman2002, and Spellman2002. The DeGregorio2001 dataset is from adult flies in response to microbial infection; it was acquired from high-density oligonucleotide microarrays representing 13,172 distinct genes and 351 DPGs. The Arbeitman2002 dataset was generated from a time-course study of Dmel development; it contains 6,841 distinct genes and 381 DPGs. The Spellman2002 dataset has 13,141 distinct genes measured under over 80 experimental conditions and contains 1,761 DPGs. We defined the level of co-expression between two genes as the Pearson correlation coefficient of expression abundance, and considered a correlation significant when its p-value was < 0.05. We also selected CDPGs, CPGs, and 20,000 random gene pairs as controls for each dataset to calculate correlation coefficients.
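The co-expression measure can be illustrated with a short Python sketch. The Pearson coefficient follows the definition above; the significance step uses a permutation test, which is a stand-in chosen here because the text does not say how the p-values were obtained. All function names and the toy expression profiles are hypothetical.

```python
import math
import random
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumes non-constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x: Sequence[float], y: Sequence[float],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the observed |r| (assumed method)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def classify_coexpression(x: Sequence[float], y: Sequence[float],
                          alpha: float = 0.05) -> Tuple[float, str]:
    """Return r and a label following the p < 0.05 rule in the text."""
    r = pearson_r(x, y)
    p = permutation_p_value(x, y)
    if p >= alpha:
        label = "not significant"
    elif r > 0:
        label = "significant positive"
    else:
        label = "significant negative"
    return r, label


# Toy expression profiles across eight conditions.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
gene_b = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
gene_c = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]

r_ab, label_ab = classify_coexpression(gene_a, gene_b)
r_ac, label_ac = classify_coexpression(gene_a, gene_c)
print(label_ab, label_ac)
```

Applying the same function to each dataset separately would let a pair be counted as significant if it passes the threshold in at least one dataset.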
Gene Ontology (GO) annotation for DPGs
To determine statistically overrepresented GO terms for DPGs, we counted the number of appearances of each GO term in general annotations for DPGs and all other genes. We used generic GO terms to offer a broad overview of the ontology content without the details of specific terms. For each GO term, a p-value is calculated based on the hypergeometric test to represent the probability that the observed number of gene counts within a GO group could have resulted from a random distribution between the tested and the reference groups. GO terms are considered statistically overrepresented when the p-value is less than 0.05 after Benjamini & Hochberg False Discovery Rate (FDR) correction.
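The enrichment procedure can be sketched as follows. This minimal Python example assumes a one-sided (upper-tail) hypergeometric test, which is a common choice for overrepresentation but is not stated explicitly in the text; the term names and counts are invented for illustration.

```python
from math import comb
from typing import Dict, List, Tuple


def hypergeom_upper_tail(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: (genes with term among DPGs, genes with term overall).
total_genes, dpg_genes = 1000, 100
term_counts: Dict[str, Tuple[int, int]] = {
    "term_A": (30, 100),
    "term_B": (12, 100),
    "term_C": (3, 50),
}

terms = list(term_counts)
raw = [hypergeom_upper_tail(k, total_genes, K, dpg_genes)
       for k, K in (term_counts[t] for t in terms)]
fdr = benjamini_hochberg(raw)
overrepresented = [t for t, q in zip(terms, fdr) if q < 0.05]
print(overrepresented)
```

Exact integer binomials are used so that small tail probabilities are not lost to rounding; for genome-scale counts a log-space formulation may be faster.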
where S(C1, C2) is the set of general GO terms shared by both C1 and C2.
We also prepared 50,000 gene pairs randomly selected from the Dmel genome in order to show the functional relevance among DPGs. The difference between the distribution of DPGs and the control sets was analyzed by using Kolmogorov-Smirnov test to confirm whether DPGs are inclined to have similar function.
We identify a gene as a transcriptional regulator (R) if it is annotated with the GO term "regulation of biological process" in the general annotation, whereas all other classes of proteins (X) are treated separately. Gene pairs are classified into three classes: regulator-regulator pairs (RR), potential regulatory interactions (RX), and non-regulatory pairs (XX). The 50,000 random gene pairs mentioned above are used as a control set.
Analysis of promoter sequence
The parameters of core promoter elements (positions relative to the TSS, in bp)
TATA box: -33 to -23
Inr: -1 to +9
DPE: +27 to +36
MTE: +17 to +26
- Dmel: D. melanogaster
- TSS: transcription start site
- DPE: Downstream Promoter Element
- MTE: Motif Ten Element
The authors would like to acknowledge Drs. Venky Iyer and Dan Polland at UC Berkeley/LBNL for providing the sequences and annotation of Drosophila genomes (Drosophila Comparative Annotation). The study is supported by the National Basic Research Program of China (2006CB910404) awarded to JY.
- Lawrence JG: Shared strategies in gene organization among prokaryotes and eukaryotes. Cell. 2002, 110: 407-413.View ArticlePubMedGoogle Scholar
- Hurst LD: The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet. 2004, 5: 299-310.View ArticlePubMedGoogle Scholar
- Cohen BA, Mitra RD, Hughes JD, Church GM: A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression. Nat Genet. 2000, 26 (2): 183-186.View ArticlePubMedGoogle Scholar
- Kruglyak S, Tang H: Regulation of adjacent yeast genes. Trends Genet. 2000, 16: 109-111.View ArticlePubMedGoogle Scholar
- Spellman PT, Rubin GM: Evidence for large domains of similarly expressed genes in the Drosophila genome. J Biol. 2002, 1 (1): 5-PubMed CentralView ArticlePubMedGoogle Scholar
- Lercher MJ, Blumenthal T, Hurst LD: Coexpression of Neighboring Genes in Caenorhabditis Elegans Is Mostly Due to Operons and Duplicate Genes. Genome Res. 2003, 13: 238-243.PubMed CentralView ArticlePubMedGoogle Scholar
- Williams EJB, Bowles DJ: Coexpression of Neighboring Genes in the Genome of Arabidopsis thaliana. Genome Res. 2004, 14: 1060-1067.PubMed CentralView ArticlePubMedGoogle Scholar
- Adachi N, Lieber MR: Bidirectional gene organization: a common architectural feature of the human genome. Cell. 2002, 109 (7): 807-809.View ArticlePubMedGoogle Scholar
- Takai D, Jones PA: Origins of bidirectional promoters: computational analyses of intergenic distance in the human genome. Mol Biol Evol. 2004, 21 (3): 463-467.View ArticlePubMedGoogle Scholar
- Trinklein ND, Aldred SF, Hartman SJ, Schroeder DI, Otillar RP, Myers RM: An Abundance of Bidirectional Promoters in the Human Genome. Genome Res. 2004, 14 (1): 62-66.PubMed CentralView ArticlePubMedGoogle Scholar
- Li YY, Yu H, Guo ZM, Guo TQ, Tu K, Li YX: Systematic analysis of head-to-head gene organization: Evolutionary conservation and potential biological relevance. PLoS Comput Biol. 2006, 2 (7): e74-PubMed CentralView ArticlePubMedGoogle Scholar
- Herr DR, Harris GL: Close head-to-head juxtaposition of genes favors their coordinate regulation in Drosophila melanogaster. FEBS Letters. 2004, 572 (1): 147-153.View ArticlePubMedGoogle Scholar
- Schmidt C, Fischer G, Kadner H, Genersch E, Kuhn K, Poschl E: Differential effects of DNA-binding proteins on bidirectional transcription from the common promoter region of human collagen type IV genes COL4A1 and COL4A2. Biochim Biophys Acta. 1993, 1174 (1): 1-10.View ArticlePubMedGoogle Scholar
- Sun Z, Kitchingman GR: Bidirectional transcription from the human immunoglobulin VH6 gene promoter. Nucleic Acids Res. 1994, 22 (5): 861-868.PubMed CentralView ArticlePubMedGoogle Scholar
- Byrd PJ, Cooper PR, Stankovic T, Kullar HS, Watts GD, Robinson PJ, Taylor MR: A gene transcribed from the bidirectional ATM promoter coding for a serine rich protein: amino acid sequence, structure and expression studies. Hum Mol Genet. 1996, 5: 1785-1791.View ArticlePubMedGoogle Scholar
- Xu CF, Brown MA, Nicolai H, Chambers JA, Griffiths BL, Solomon E: Isolation and characterisation of the NBR2 gene which lies head to head with the human BRCA1 gene. Hum Mol Genet. 1997, 6: 1057-1062.View ArticlePubMedGoogle Scholar
- Hansen JJ, Bross P, Westergaard M, Nielsen M, Eiberg H, Børglum AD, Mogensen J, Kristiansen K, Bolund L, Gregersen N: Genomic structure of the human mitochondrial chaperonin genes: HSP60 and HSP10 are localised head to head on chromosome 2 separated by a bidirectional promoter. Hum Genet. 2003, 112: 71-77.View ArticlePubMedGoogle Scholar
- Crouse GF, Leys EJ, McEwan RN, Frayne EG, Kellems RE: Analysis of the Mouse dhfr Promoter Region: Existence of a Divergently Transcribed Gene. Mol Cell Biol. 1985, 5 (8): 1847-1858.
- Liao WC, Ash J, Johnson LF: Bidirectional promoter of the mouse thymidylate synthase gene. Nucleic Acids Res. 1994, 22 (20): 4044-4049.
- Schuettengruber B, Doetzlhofer A, Kroboth K, Wintersberger E, Seiser C: Alternate activation of two divergently transcribed mouse genes from a bidirectional promoter is linked to changes in histone modification. J Biol Chem. 2003, 278 (3): 1784-1793.
- Sturm RA, Dalton S, Wells JRE: Conservation of histone H2A/H2B intergene regions: a role for the H2B specific element in divergent transcription. Nucleic Acids Res. 1988, 16 (17): 8571-8586.
- Gavalas A, Dixon JE, Brayton KA, Zalkin H: Coexpression of two closely linked avian genes for purine nucleotide synthesis from a bidirectional promoter. Mol Cell Biol. 1993, 13 (8): 4784-4792.
- Salzberg A, Cohen N, Halachmi N, Kimchie Z, Lev Z: The Drosophila Ras2 and Rop gene pair: a dual homology with a yeast Ras-like gene and a suppressor of its loss-of-function phenotype. Development. 1993, 117 (4): 1309-1319.
- Crayton Mr, Ladd C, Sommer M, Hampikian G, Strausbaugh L: An organizational model of transcription factor binding sites for a histone promoter in D. melanogaster. In Silico Biol. 2004, 4 (4): 537-548.
- Johnston M, Davis RW: Sequences that regulate the divergent GAL1-GAL10 promoter in Saccharomyces cerevisiae. Mol Cell Biol. 1984, 4 (8): 1440-1448.
- Zhu Y, Xiao W: Pdr3 is required for DNA damage induction of MAG1 and DDI1 via a bi-directional promoter element. Nucleic Acids Res. 2004, 32 (17): 5066-5075.
- Punt PJ, Strauss J, Smit R, Kinghorn JR, Hondel CAvd, Scazzocchio C: The intergenic region between the divergently transcribed niiA and niaD genes of Aspergillus nidulans contains multiple NirA binding sites which act bidirectionally. Mol Cell Biol. 1995, 15 (10): 5688-5699.
- Dandekar T, Snel B, Huynen M, Bork P: Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem Sci. 1998, 23: 324-328.
- Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N: The use of gene clusters to infer functional coupling. Proc Natl Acad Sci USA. 1999, 96: 2896-2901.
- Hedges SB: The origin and evolution of model organisms. Nat Rev Genet. 2002, 3 (11): 838-849.
- Tamura K, Subramanian S, Kumar S: Temporal Patterns of Fruit Fly (Drosophila) Evolution Revealed by Mutation Clocks. Mol Biol Evol. 2004, 21 (1): 36-44.
- The Honeybee Genome Sequencing Consortium: Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006, 443 (7114): 931-949.
- Korbel JO, Jensen LJ, Mering Cv, Bork P: Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat Biotechnol. 2004, 22 (7): 911-917.
- Resnik P: Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research. 1999, 11: 95-130.
- Lim CY, Santoso B, Boulay T, Dong E, Ohler U, Kadonaga JT: The MTE, a new core promoter element for transcription by RNA polymerase II. Genes Dev. 2004, 18 (13): 1606-1617.
- Smale ST, Kadonaga JT: The RNA polymerase II core promoter. Annual Review of Biochemistry. 2003, 72 (1): 449-479.
- Ueda R, Iketaki H, Nagata K, Kimura S, Gonzalez FJ, Kusano K, Yoshimura T, Yamazoe Y: A Common Regulatory Region Functions Bidirectionally in Transcriptional Activation of the Human CYP1A1 and CYP1A2 Genes. Mol Pharmacol. 2006, 69 (6): 1924-1930.
- Hekmat-Scafe DS, Scafe CR, McKinney AJ, Tanouye MA: Genome-Wide Analysis of the Odorant-Binding Protein Gene Family in Drosophila melanogaster. Genome Res. 2002, 12 (9): 1357-1369.
- Keese PK, Gibbs A: Origins of Genes: "Big Bang" or Continuous Creation?. Proc Natl Acad Sci USA. 1992, 89 (20): 9489-9493.
- Bellizzi D, Dato S, Cavalcante P, Covello G, Di Cianni F, Passarino G, Rose G, De Benedictis G: Characterization of a bidirectional promoter shared between two human genes related to aging: SIRT3 and PSMD13. Genomics. 2007, 89 (1): 143-150.
- Wray GA, Hahn MW, Abouheif E, Balhoff JP, Pizer M, Rockman MV, Romano LA: The Evolution of Transcriptional Regulation in Eukaryotes. Mol Biol Evol. 2003, 20 (9): 1377-1419.
- Linton JP, Yen JY, Selby E, Chen Z, Chinsky JM, Liu K, Kellems RE, Crouse GF: Dual bidirectional promoters at the mouse dhfr locus: cloning and characterization of two mRNA classes of the divergently transcribed Rep-1 gene. Mol Cell Biol. 1989, 9 (7): 3058-3072.
- Lennard AC, Fried M: The bidirectional promoter of the divergently transcribed mouse Surf-1 and Surf-2 genes. Mol Cell Biol. 1991, 11 (3): 1281-1294.
- Smith ML, Mitchell PJ, Crouse GF: Analysis of the mouse Dhfr/Rep-3 major promoter region by using linker-scanning and internal deletion mutations and DNase I footprinting. Mol Cell Biol. 1990, 10 (11): 6003-6012.
- Qvist H, Sjöström H, Norén O: The TATA-less, GC-rich porcine dipeptidylpeptidase IV (DPPIV) promoter shows bidirectional activity. Biol Chem. 1998, 379 (1): 75-81.
- Seki Y, Ikeda S, Kiyohara H, Ayabe H, Seki T, Matsui H: Sequencing analysis of a putative human O-sialoglycoprotein endopeptidase gene (OSGEP) and analysis of a bidirectional promoter between the OSGEP and APEX genes. Gene. 2002, 285 (1–2): 101-108.
- Suzuki Y, Tsunoda T, Sese J, Taira H, Mizushima-Sugano J, Hata H, Ota T, Isogai T, Tanaka T, Nakamura Y, et al: Identification and Characterization of the Potential Promoter Regions of 1031 Kinds of Human Genes. Genome Res. 2001, 11 (5): 677-684.
- Gaston K, Fried M: YY1 is involved in the regulation of the bi-directional promoter of the Surf-1 and Surf-2 genes. FEBS Lett. 1994, 347 (2–3): 289-294.
- Cole EG, Gaston K: A functional YY1 binding site is necessary and sufficient to activate Surf-1 promoter activity in response to serum growth factors. Nucleic Acids Res. 1997, 25 (18): 3705-3711.
- Zhang Y, Koushik S, Dai R, Mivechi NF: Structural Organization and Promoter Analysis of Murine Heat Shock Transcription Factor-1 Gene. J Biol Chem. 1998, 273 (49): 32514-32521.
- Wright KL, White LC, Kelly A, Beck S, Trowsdale J, Ting JP: Coordinate regulation of the human TAP1 and LMP2 genes from a shared bidirectional promoter. J Exp Med. 1995, 181 (4): 1459-1471.
- Kutach AK, Kadonaga JT: The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters. Mol Cell Biol. 2000, 20 (13): 4754-4764.
- Lim CY, Santoso B, Boulay T, Dong E, Ohler U, Kadonaga JT: The MTE, a new core promoter element for transcription by RNA polymerase II. Genes Dev. 2004, 18 (13): 1606-1617.
- Ohler U: Identification of core promoter modules in Drosophila and their application in accurate transcription start site prediction. Nucleic Acids Res. 2006, 34 (20): 5943-5950.
- Smale ST: Core promoters: active contributors to combinatorial gene regulation. Genes Dev. 2001, 15 (19): 2503-2508.
- Yang Z, Nielsen R: Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol. 2000, 17: 32-43.
- Yang Z: PAML: A program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 1997, 13: 555-556.
- Higgins DG, Thompson JD, Gibson TJ: Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 1996, 266: 383-402.
- De Gregorio E, Spellman PT, Rubin GM, Lemaitre B: Genome-wide analysis of the Drosophila immune response by using oligonucleotide microarrays. Proc Natl Acad Sci USA. 2001, 98 (22): 12590-12595.
- Arbeitman MN, Furlong EE, Imam F, Johnson E, Null BH, Baker BS, Krasnow MA, Scott MP, Davis RW, White KP: Gene expression during the life cycle of Drosophila melanogaster. Science. 2002, 297 (5590): 2270-2275.
- Benjamini Y, Hochberg Y: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J Roy Stat Soc B. 1995, 57 (1): 289-300.
- Gershenzon NI, Trifonov EN, Ioshikhes IP: The features of Drosophila core promoters revealed by statistical analysis. BMC Genomics. 2006, 7: 161.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.