- Research article
- Open Access
Evidence against the energetic cost hypothesis for the short introns in highly expressed genes
© Huang and Niu; licensee BioMed Central Ltd. 2008
- Received: 10 December 2007
- Accepted: 20 May 2008
- Published: 20 May 2008
In animals, the moss Physcomitrella patens and the pollen of Arabidopsis thaliana, highly expressed genes have shorter introns than weakly expressed genes. A popular explanation for this is selection for transcription efficiency, which includes two sub-hypotheses: to minimize the energetic cost or to minimize the time cost.
In an individual human, different organs may differ by up to hundreds of times in cell number (for example, a liver versus a hypothalamus). Considered at the level of the whole individual, a gene specifically expressed in a large organ is therefore transcribed tens or hundreds of times more often than a gene with a similar expression level (a measure of mRNA abundance per cell) specifically expressed in a small organ. According to the energetic cost hypothesis, the former should have shorter introns than the latter. However, in humans and mice we have not found significant differences in intron length between large-tissue/organ-specific genes and small-tissue/organ-specific genes with similar expression levels. Qualitative estimation shows that the deleterious effect (that is, the energetic burden) of long introns in highly expressed genes is too small to be efficiently selected against in mammals.
The short introns in highly expressed genes should not be attributed to energy constraint. We evaluated evidence for the time cost hypothesis and other alternatives.
- Effective Population Size
- Energetic Cost
- Intron Length
- Daily Energy Consumption
- Similar Expression Level
In animals (including humans, mice and Caenorhabditis elegans), the moss Physcomitrella patens and the pollen of Arabidopsis thaliana, highly expressed genes have been found to have short introns and exons [1–7]. Several hypotheses have been proposed to explain the compactness of highly expressed genes. The first, based on the fact that transcription is a slow and expensive process, suggests that natural selection for transcriptional efficiency favors the compactness of highly expressed genes [1, 8, 9]. The second hypothesis, called "genome design", suggests that highly expressed genes are short because most of them are housekeeping genes whose epigenetic regulation is less complex than that of weakly expressed tissue-specific genes. In line with this hypothesis, expression level and breadth are strongly positively correlated, and human housekeeping genes are more compact than tissue-specific genes [9, 10]. However, by comparing artificially selected pairs of housekeeping and narrowly expressed genes with similar average expression levels, Li et al. recently found that housekeeping genes are no more compact than narrowly expressed genes if the expression level is controlled. This implies that expression level rather than breadth determines the compactness of genes. The third hypothesis is mutational bias, which supposes that highly expressed genes tend to localize in chromosomal regions with high deletion rates, or that there is a transcription-associated deletion bias [2, 5]. Urrutia and Hurst found that the introns of highly expressed genes are still small even if the effects of chromosomal regions are controlled. Housekeeping genes are expected to have much higher germline transcriptional frequencies, and thus, more transcription-associated deletions, than genes that are narrowly expressed in somatic tissues. However, Li et al. found that housekeeping genes are no more compact than genes that are narrowly expressed in somatic tissues with similar average expression levels.
The transcription efficiency hypothesis includes two sub-hypotheses: an energetic cost hypothesis and a time cost hypothesis. Selection for short introns and short exons may be driven either by minimizing the energetic cost of transcription or by the requirement to transcribe large amounts of mRNA molecules within limited periods. Human antisense genes that have very short response times have been found to have short introns [11, 12], which directly supports the time cost hypothesis. Furthermore, Jeffares et al.  found that the intron density in common eukaryotes is positively correlated with the duration of life cycle. However, the time cost hypothesis has been argued against or overlooked in recent studies [3, 4, 6]. Seoighe et al.  pointed out that the transcription of multiple copies of mRNA does not necessarily require a much longer period of time than required to transcribe the first copy, because multiple polymerases may be simultaneously working on one template . The present paper presents evidence against the energetic cost hypothesis and evaluates evidence for the time cost hypothesis and other alternatives.
In animals, different organs may differ up to hundreds of times in cell number and weight. For example, in an adult human, a lung weighs about 1000 g while a prostate weighs only about 20 g. Thus, humans produce tens of times more mRNA molecules for a lung-specific gene (for example, SFTPD) than for a prostate-specific gene (for example, SEMG1) with a similar expression level (considered to be a measure of mRNA abundance per cell in this paper; see Methods for the source of the expression data of these two example genes). Expression of SFTPD is thus expected to have tens of times higher energetic cost to a human body than expression of SEMG1, if these two genes have similar lengths. According to the energetic cost hypothesis, SFTPD should have much shorter introns than SEMG1. On the contrary, SFTPD has a longer average intron length and total intron length than SEMG1 (Additional File 1). The present paper surveys large-tissue/organ-specific (LTS) genes and small-tissue/organ-specific (STS) genes at a genome-wide scale and compares their compactness for a statistically convincing result.
Large-tissue/organ-specific genes and small-tissue/organ-specific genes have similar sizes
Tissue/organ samples and the number of specific genes analyzed in this study (a)
Species | Large tissue/organ (number of specific genes; tissue/organ weight) (b) | Small tissue/organ (number of specific genes; tissue/organ weight) (b)
Human | Cultured adipocytes (18; 9 kg) | Brain amygdala (22; --)
Human | Liver (79; 1.5 kg) | Hypothalamus (7; 4 g)
Human | Lung (18; 1 kg) | Pituitary (6; 5 g)
Human | Skeletal muscle (4; 27 kg) | Tonsil (1; 30–40 g)
Human | Skin (6; 5 kg) | Prostate (13; 20 g)
Human | Smooth muscle (24; --) | Thymus (11; 30–40 g)
Human |  | Thyroid (25; 18–60 g)
Human |  | Tongue (11; 70 g)
Mouse | Adipose tissue (13; --) | Amygdala (4; --)
Mouse | Liver (76; 2 g) | Hypothalamus (12; < 60 mg)
Mouse | Skeletal muscle (47; --) | Pituitary (29; 3 mg)
Mouse | Epidermis (4; --) | Trigeminal (7; --)
Mouse |  | Prostate (24; 0.11 g)
Mouse |  | Thymus (64; < 60 mg)
Mouse |  | Thyroid (21; 15 mg)
Mouse |  | Tongue epidermis (14; --)
Mouse |  | Retina (71; --)
Comparison of compactness between genes expressed at different levels (a)
Quantile | Average intron length | Total intron length
Top 30% quantile | 2768 ± 608 | 28117 ± 7347 | 8 ± 1 | 1313 ± 90 | 775 ± 107 | 5369 ± 770
Bottom 30% quantile | 10448 ± 4237 | 901046 ± 33210 | 9 ± 1 | 1764 ± 232 | 1478 ± 244 | 267 ± 14
P = 0.001; P = 0.019; P = 0.844; P = 0.273
Top 30% quantile | 2631 ± 290 | 16190 ± 1828 | 7 ± 1 | 1214 ± 65 | 779 ± 136 | 6219 ± 794
Bottom 30% quantile | 8032 ± 2706 | 37391 ± 4615 | 8 ± 1 | 1450 ± 128 | 1496 ± 190 | 365 ± 16
P = 0.001; P = 0.001; P = 0.444; P = 0.589; P = 0.001
The weight ratio of a large tissue/organ to a small tissue/organ is much larger than the ratio in mRNA abundance required to produce a significant difference in average intron length, total intron length and UTR length. However, large differences in tissue/organ weights do not produce significant differences in intron length or UTR length (Figure 1). This result is not what the energetic cost hypothesis predicts.
Qualitatively estimating the energetic burden of long introns in highly expressed genes
We also qualitatively estimated the length and number of introns in genomes that may be selected against because of their energetic cost during transcription. In a highly expressed housekeeping gene (housekeeping genes are expressed in all cells in the human body, so their cumulative energetic burden is higher), let us assume that there is an intron with the threshold length (L) to trigger natural selection. Several studies have shown that most eukaryotic genes are expressed at the level of two or three copies of mRNA per cell [25–27], so a gene that produces 30 mRNA copies in each cell can be viewed as a highly expressed gene. The median half-life of human mRNA is about 10 h, and fast-decaying mRNAs have half-lives of < 2 h. For a conservative estimation, we can assume that the gene needs to synthesize 30 mRNA copies every 2 h, that is, 360 mRNA copies per day, per cell. The expense of transcription is two ATP molecules per nucleotide. Therefore, transcription of the intron requires $360 \times 2L = 720L$ ATP molecules per day in each cell. Estimates of the number of cells in an adult human body vary from $10^{13}$ to $10^{14}$. For a conservative estimation of the energetic cost of gene transcription, we used the higher value, $10^{14}$ cells. As an adult human consumes about 200 mol of ATP per day [18, 30], the energy consumption of each human cell is $(200 \times 6.02 \times 10^{23})/10^{14} = 1.2 \times 10^{12}$ ATP molecules per day. It should be noted that this is a conservative estimation; the energy consumption involved in strenuous exercises (for example, mountain climbing) may be as much as 10 times more than that used when resting. The proportion of human daily energy consumption representing the energetic cost of the long putative intron of a highly expressed housekeeping gene (which can be considered as the coefficient of natural selection, S) is $S = 720L/(1.2 \times 10^{12}) = 6 \times 10^{-10} L$. The recent effective population size ($N_e$) of humans is $\leq 10^{4}$ [31, 32]. Taking $S = 1/(2N_e)$ as the margin above which natural selection is stronger than genetic drift, $L = 1/(2 \times 10^{4} \times 6 \times 10^{-10}) = 8.3 \times 10^{4}$ nt. In the human genome, only 0.9% of introns are longer than this threshold. In principle, this estimation is also applicable to the energetic cost of the transcription of a CDS or UTR.
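The chain of arithmetic in the human estimate can be reproduced with a short script. This is only a sketch under the assumptions stated in the text (30 copies every 2 h, 2 ATP per nucleotide, 200 mol ATP per day, 10^14 cells, Ne = 10^4); the function and variable names are illustrative and are not taken from the original analysis.

```python
AVOGADRO = 6.02e23  # molecules per mole, the value used in the text


def atp_per_cell_per_day(mol_atp_per_day, n_cells):
    """Average daily ATP budget of a single cell (molecules per day)."""
    return mol_atp_per_day * AVOGADRO / n_cells


def selection_coefficient_per_nt(copies_per_day, cell_budget, atp_per_nt=2):
    """Fraction of the daily ATP budget spent per intronic nucleotide."""
    return copies_per_day * atp_per_nt / cell_budget


def threshold_length(s_per_nt, ne):
    """Intron length L at which S = s_per_nt * L equals 1 / (2 * Ne)."""
    return 1.0 / (2 * ne * s_per_nt)


copies_per_day = 30 * (24 // 2)               # 30 copies every 2 h -> 360 per day
budget = atp_per_cell_per_day(200, 1e14)      # about 1.2e12 ATP per cell per day
s_per_nt = selection_coefficient_per_nt(copies_per_day, budget)
human_threshold = threshold_length(s_per_nt, ne=1e4)

print(f"budget = {budget:.2e} ATP/cell/day")
print(f"S per nucleotide = {s_per_nt:.1e}")
print(f"threshold intron length = {human_threshold:.2e} nt")
```

Because the script keeps unrounded intermediate values, the threshold it prints differs slightly from the 8.3 × 10^4 nt obtained in the text with the rounded budget of 1.2 × 10^12 ATP molecules.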
The major differences between humans and mice are in their body sizes, their metabolic rates and their effective population sizes. We could not find an estimation of the number of cells in a mouse body. However, we did find data on mass-specific metabolic rates [33, 34], from which we can estimate energy consumption per mouse cell by assuming that human and mouse cells do not differ greatly in mass. The mass-specific metabolic rate of mice is 0.0151 W/g and that of humans is 0.00118 W/g, so a mouse cell uses ~12.8 times more energy than a human cell. As estimated above, the energy consumption of each human cell is about $1.2 \times 10^{12}$ ATP molecules per day, so that of each mouse cell is about $1.5 \times 10^{13}$ ATP molecules per day. The proportion of mouse daily energy consumption (S) representing the energetic cost of the long putative intron of a highly expressed housekeeping gene is $S = (360 \times 2L)/(1.5 \times 10^{13}) = 4.8 \times 10^{-11} L$, where L is defined as described in the previous paragraph. Different sources of data on the effective population size of mice are not consistent [35, 36]; we retained a higher value ($N_e = 8.1 \times 10^{5}$) for a conservative estimation. Thus, in mice, the threshold length of introns to trigger natural selection is $L = 1/(2 \times 8.1 \times 10^{5} \times 4.8 \times 10^{-11}) = 1.3 \times 10^{4}$ nt. Similar to the situation in humans, only a small fraction of introns in the mouse genome (6.8%) are longer than this threshold.
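The human and mouse estimates share one structure, which can be written compactly. In the sketch below, r (mRNA copies synthesized per cell per day), c (ATP molecules per nucleotide) and E (ATP molecules consumed per cell per day) are shorthand symbols introduced here for illustration; they are not notation from the original analysis.

```latex
% General form: cost fraction of an intron of length L, and the drift threshold
S = \frac{r\,c\,L}{E},
\qquad
S = \frac{1}{2N_e}
\;\Longrightarrow\;
L = \frac{E}{2\,N_e\,r\,c}

% Mouse, with r = 360, c = 2, E = 1.5 \times 10^{13}, N_e = 8.1 \times 10^{5}
L_{\text{mouse}}
  = \frac{1.5 \times 10^{13}}{2 \times 8.1 \times 10^{5} \times 360 \times 2}
  \approx 1.3 \times 10^{4}\ \text{nt}
```

Written this way, the threshold length scales in proportion to the per-cell energy budget and in inverse proportion to the effective population size, which is why the larger mouse Ne outweighs the higher mouse metabolic rate.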
Owing to a lack of the required information (such as mRNA decay rates), it is impossible to accurately estimate the burden of long introns in other vertebrates and invertebrates. Considering that the effective population size of vertebrates is only about $10^{4}$, we suggest that long introns in highly expressed vertebrate genes are unlikely to be selected against. However, for invertebrates, with an effective population size of about $10^{6}$, it would be too bold to give a rough estimation.
Benefiting from the extensive studies on the yeast Saccharomyces cerevisiae, we also found enough data to estimate the energetic burden of a long intron in a unicellular eukaryote. A gene that produces 30 mRNA copies in each cell can also be viewed as a highly expressed gene in yeasts [25–27]. The median half-life of yeast mRNAs is about 21 min, and the 90th percentile of mRNA half-lives is 10 min. Conservatively, we assumed that such a gene would need to synthesize 30 mRNA copies every 10 min; that is, $30 \times 24 \times 60/10 = 4320$ copies of mRNA every day. To transcribe a long intron, a yeast cell consumes $4320 \times 2L = 8640L$ ATP molecules per day, where L is defined as previously. A yeast cell weighs $3.35 \times 10^{-11}$ g and the median value of yeast metabolic rates at eight different temperatures is 0.267 W/g, so the metabolic rate of a yeast cell is $8.9 \times 10^{-12}$ W, which can be converted to $1.39 \times 10^{13}$ ATP molecules per day. The proportion of yeast daily energy consumption representing the energetic cost of the putative long intron in a highly expressed gene is $S = 8640L/(1.39 \times 10^{13}) = 6.2 \times 10^{-10} L$. The effective population size of yeasts is about $10^{7}$ [37, 39]. Thus, in yeasts, the threshold length of introns to trigger natural selection is $L = 1/(2 \times 10^{7} \times 6.2 \times 10^{-10}) = 81$ nt. Unlike the situation in humans and mice, 86.5% of the introns in the genome of S. cerevisiae are longer than this threshold length. The fractional energetic cost of long introns may be overestimated here; thus the extant long introns, even in highly expressed genes, may not be under negative selection. At the least, this result helps to explain the fact that unicellular eukaryotes generally have much shorter introns than mammals, and it is consistent with a previous study, which showed that energy is a constraint on evolutionary changes in yeast gene expression. However, these estimations are at least seemingly contradictory to the observations that highly expressed genes have longer introns than weakly expressed genes in yeasts [40, 41]. To reach a conclusion, further investigations are required.
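The yeast figures can be checked in the same way. The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and the energy per ATP molecule is not an input but is back-calculated as a plausibility check on the quoted ATP budget.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# 30 copies every 10 min, 2 ATP per nucleotide (assumptions from the text)
copies_per_day = 30 * (24 * 60 // 10)        # 4320 copies per day
atp_per_nt_per_day = copies_per_day * 2      # 8640 ATP per intronic nt per day

# Metabolic power of one yeast cell
cell_mass_g = 3.35e-11
metabolic_rate_w_per_g = 0.267
cell_power_w = cell_mass_g * metabolic_rate_w_per_g   # about 8.9e-12 W

# ATP budget quoted in the text; the energy per ATP it implies is derived
# here only as a consistency check and is not a figure from the text.
atp_budget_per_day = 1.39e13
implied_kj_per_mol_atp = (
    cell_power_w * SECONDS_PER_DAY / atp_budget_per_day * 6.02e23 / 1000
)

# Cost fraction per nucleotide and the drift threshold 1 / (2 * Ne)
s_per_nt = atp_per_nt_per_day / atp_budget_per_day
ne = 1e7
yeast_threshold_nt = 1 / (2 * ne * s_per_nt)

print(f"cell power = {cell_power_w:.2e} W")
print(f"implied energy per mol ATP = {implied_kj_per_mol_atp:.1f} kJ")
print(f"threshold intron length = {yeast_threshold_nt:.1f} nt")
```

With unrounded intermediates the threshold comes out marginally below the 81 nt given in the text, which was computed from the rounded coefficient 6.2 × 10^-10.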
Considered just from the point of view of the energetic cost of transcription, loss of entire introns may be favored in yeasts, but is unlikely to be favored in mammals. On the other hand, intron gain may be selected against in yeasts, but is most likely neutral, and thus subject to genetic drift, in mammals. This idea is consistent with the paucity of introns in yeast genes and the abundance of introns in animal genes [42, 43]. Previously, the existence of different rates of intron loss in the evolution of different lineages was explained by differential retrotransposon activities [44–46]. We look forward to further evidence to determine whether selection to reduce energetic cost is a complementary explanation. In evolution, insertion of several nucleotides or various transposons into introns and deletion of short sequences from introns are much more frequent than gain and loss of entire introns. Considered just from the point of view of the energetic cost of transcription, the effects of common indels are negligible in mammals, but visible to natural selection in yeasts. This idea is similar to the theory of Lynch on the evolution of genome complexity [47, 48].
Alternate hypotheses for short introns in highly expressed genes
The first alternate hypothesis is the time cost hypothesis. RNA polymerase II can elongate only about 20–40 nt per second [1, 49]. Recent evidence indicates that elongation, instead of RNA polymerase II recruitment, may be the predominant rate-limiting event in gene activation [50, 51]. Therefore, gene length should have an important impact on the duration of gene expression. To be completely transcribed, a large gene in the human genome, such as DMD (2.3 Mb), requires 16 hours, a medium-sized gene (for example, TUBE1, 16.7 kb) requires about 7–14 minutes, and a small gene (for example, HBA2, 834 bp) requires only about 20–40 seconds. Seoighe et al. argued that the time required to transcribe multiple copies of mRNA is not a multiple of the transcription period of the first copy, because one template can be transcribed by several polymerases simultaneously. Assuming a normal elongation rate of 0.03 seconds per nucleotide, the completion of the transcription of the first copy of a gene with L nt requires 0.03L seconds. Assuming that there are k polymerases attached to the same template simultaneously, the completion of an additional copy of this transcript requires an additional 0.03L/k seconds. Thus, the completion of the transcription of n copies of an mRNA requires $T_n = 0.03L(1 + (n-1)/k)$ seconds. Evidently, if $n \ll k$, then $T_n \approx 0.03L$, and the duration of transcription is essentially independent of transcript copy number. However, in highly expressed genes, n is unlikely to be much smaller than k; thus, both gene length (L) and transcript copy number (n) contribute to the duration of transcription. To produce a large number of transcripts in a limited period of time, natural selection may decrease L or increase k. Unfortunately, no genome-wide data on the values of k are currently available in animals.
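The timing model above is simple enough to explore numerically. The following is a minimal sketch of the expression for T_n, using the 0.03 s per nucleotide rate from the text; the function name and the values of n and k in the example are hypothetical, since, as noted, no genome-wide estimates of k are available.

```python
def transcription_time_s(length_nt, n_copies, k_polymerases, sec_per_nt=0.03):
    """Time to complete n transcripts of a gene of length_nt nucleotides.

    Implements T_n = sec_per_nt * L * (1 + (n - 1) / k), where k is the number
    of polymerases working on the same template simultaneously.
    """
    if n_copies < 1 or k_polymerases < 1:
        raise ValueError("n_copies and k_polymerases must be at least 1")
    return sec_per_nt * length_nt * (1 + (n_copies - 1) / k_polymerases)


# Gene lengths quoted in the text; n and k below are illustrative only.
genes_nt = {"HBA2": 834, "TUBE1": 16_700, "DMD": 2_300_000}
for name, length in genes_nt.items():
    first_copy = transcription_time_s(length, n_copies=1, k_polymerases=1)
    thirty_copies = transcription_time_s(length, n_copies=30, k_polymerases=10)
    print(f"{name}: first copy {first_copy / 60:.1f} min, "
          f"30 copies with k = 10: {thirty_copies / 60:.1f} min")
```

For a fixed k, the time grows linearly in both L and n, so halving the length of a gene shortens the time to reach a given transcript count by the same factor as doubling the elongation rate would.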
On the other side of the same coin, the time taken to transcribe introns has long been proposed to contribute to the timing mechanisms during development [52–54]. An extension of this hypothesis is that long introns may be maintained in some genes to reduce the number of mRNA products in the otherwise too-long time during which the genes are activated.
Another alternate hypothesis is that short genes may experience lower frequencies of abortive transcription and/or erroneous splicing than long genes. Successful transcription requires the polymerase to be stably associated with the DNA template during the elongation process. However, in some cases, the RNA-DNA duplex may not be stable enough to avoid abnormal pausing and arrest of elongation . In a study of the human DMD gene, Tennyson et al.  found that 30–40% of transcription events were terminated or stopped at premature sites. Recently, Guenther et al.  found that many genes that have experienced transcription initiation do not produce complete transcripts. The short lengths of highly expressed genes may lead to a decreased possibility of a gene containing such sequences that are difficult to transcribe and cause abortion of elongation. In addition, evidence shows that long introns increase the frequency of erroneous splicing of nearby exons .
Long introns (and long UTRs) in highly expressed genes may also be selected against because of the crowding of active genes in a restricted interchromatin compartment .
A slightly more speculative and seemingly less likely hypothesis is that long introns are selected for in weakly expressed genes to avoid DNA damage resulting from transcriptional R-loops [6, 58]. The fact that mRNA lengths have a similar correlation with expression levels as intron lengths [1, 6, 9] negates this hypothesis.
In addition, there is also the possibility that highly expressed genes are compact because their epigenetic regulation is relatively simple, as suggested by the "genome design" hypothesis . Although there is some evidence against this idea, indicating that the lengths of intergenic spacers rather than those of introns are correlated with the complexity of epigenetic regulation [6, 59], there is also evidence supporting it [60–64].
In contrast to the observations that highly expressed genes have short introns in animals, P. patens and the pollen of A. thaliana, highly expressed genes were found to have longer introns than weakly expressed genes in unicellular organisms, the sporophytes of A. thaliana and Oryza sativa, and, at least, the vegetative stage of the slime mould Dictyostelium discoideum [[40, 41, 65], Y.F. Huang and D.K. Niu, unpublished results from analyzing the data from ]. To date, there has been no satisfactory explanation for this difference [4, 65]. Perhaps, the compact genomes and compact genes in large genomes have lost most of their nonfunctional sequences; thus, most of the retained intronic sequences have regulatory functions in gene expression [67–70]. Surprisingly, a weak, but significant negative correlation of mRNA length (and protein length) with expression level was found in all studied organisms [1, 2, 5, 6, 71–74], which is also generally explained by minimizing the energetic cost of gene expression. In light of this study, we suggest other potential reasons for the short introns of highly expressed genes: to minimize the duration of gene expression, or to reduce the frequencies of abortive transcription and/or erroneous splicing. However, we do not wish to completely discount the energetic cost hypothesis for mRNA compactness, because we have insufficient data on protein abundance (note that translation is also an expensive process).
If intronic sequences are assumed to be mostly junk, it is reasonable to attribute the fact that highly expressed genes have short introns to potential selection to minimize the energetic cost of gene expression. However, this hypothesis is not supported by our comparison of tissue/organ-specific genes between large tissues/organs and small tissues/organs in humans or mice. In addition, by conservatively selecting the values of a series of parameters, we quantitatively estimated the energetic burden of a long intron in highly expressed genes. In mammals, the burden seems to be too small to trigger purifying selection against long introns. Further investigations are required to establish a new theory from a series of alternate hypotheses.
The reference genomes of Homo sapiens (build 36, version 2) and Mus musculus (build 36, version 1) were downloaded from the NCBI genome database. These genomes have been reviewed by NCBI staff. Genes with obvious annotation errors were excluded from our analyses. In the case of alternative splicing variants, we used the longest mRNA for analysis (although similar results were obtained by analyzing the shortest mRNA, data not shown). UTRs shorter than 30 nt were considered unreliable annotations. In analyzing UTR length, we retained only those genes with both 5'-UTRs and 3'-UTRs of 30 nt or longer. The UTR length of a gene is the sum of the lengths of its 5'-UTR and 3'-UTR.
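The UTR rule described above amounts to a small filter. The sketch below is one way it could be expressed; the function name and the convention of returning None for excluded genes are assumptions made for illustration, not a description of the scripts actually used.

```python
MIN_UTR_NT = 30  # UTRs shorter than this are treated as unreliable annotations


def total_utr_length(utr5_nt, utr3_nt, min_len=MIN_UTR_NT):
    """Return the summed 5'-UTR and 3'-UTR length of a gene.

    Returns None when either UTR is shorter than min_len, meaning the gene
    is excluded from the UTR-length analysis.
    """
    if utr5_nt < min_len or utr3_nt < min_len:
        return None
    return utr5_nt + utr3_nt


print(total_utr_length(120, 850))   # both UTRs pass the filter
print(total_utr_length(12, 850))    # 5'-UTR too short, gene excluded
```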
The microarray gene expression datasets of H. sapiens and M. musculus were downloaded from GNF Genome Informatics Applications & Datasets [15, 76]. These are the most extensive gene expression datasets freely available online. Besides quantitative signals, the datasets contain qualitative indicators of gene expression for each Affymetrix probe set in each tissue/organ sample: P (present), M (marginal), A (absent). Several probe sets may be annotated as one gene and each probe set has two repeats. In this study, we defined a gene as being expressed in a tissue/organ sample by two criteria, one conservative and one relaxed. Under the conservative criterion, all probe sets and repeats of a gene must be marked as P in the datasets; under the relaxed criterion, both repeats of at least one probe set must be marked as P or M. These two criteria gave similar results. We present the results of analysis based on the conservative criterion in the main text of this paper, and those based on the relaxed criterion as Figure S1 and Table S1 of Additional File 3. Some probes of the probe sets annotated with a "_x" appended to the probe set name may cross-hybridize with other sequences, and so the resulting signal may partially arise from transcripts other than the one being intentionally measured (Affymetrix Technical Note, Array Design for the HGU133 set). We repeated our analysis after removing such probe sets from the gene expression datasets and obtained similar results (see Figure S2 and Table S2 of Additional File 3).
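The two expression criteria can be stated precisely in code. This is a sketch of one reading of the rules above; the data layout (a mapping from probe set name to the detection calls of its two repeats), the function name and the probe set names are hypothetical.

```python
def is_expressed(calls_by_probe_set, criterion="conservative"):
    """Decide whether a gene counts as expressed in one tissue/organ sample.

    calls_by_probe_set maps each probe set of the gene to the detection calls
    ('P', 'M' or 'A') of its two repeats.
    """
    if not calls_by_probe_set:
        return False
    if criterion == "conservative":
        # Every repeat of every probe set must be called present.
        return all(
            calls and all(call == "P" for call in calls)
            for calls in calls_by_probe_set.values()
        )
    if criterion == "relaxed":
        # Both repeats of at least one probe set must be present or marginal.
        return any(
            calls and all(call in ("P", "M") for call in calls)
            for calls in calls_by_probe_set.values()
        )
    raise ValueError(f"unknown criterion: {criterion!r}")


# Hypothetical gene with two probe sets, two repeats each
example_gene = {"probe_set_1": ("P", "P"), "probe_set_2": ("P", "M")}
print(is_expressed(example_gene, "conservative"))
print(is_expressed(example_gene, "relaxed"))
```

A gene that passes the conservative criterion always passes the relaxed one, so the relaxed gene set is a superset of the conservative set.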
where A is the expression level of an LTS gene and B is the expression level of an STS gene. As shown in Figure S4 of Additional File 3, the within-pair differences in expression levels were not biased to either LTS genes or STS genes.
We thank the anonymous referees for their comments. This study was supported by Beijing Normal University and Program for NCET-07-0094.
- Castillo-Davis CI, Mekhedov SL, Hartl DL, Koonin EV, Kondrashov FA: Selection for short introns in highly expressed genes. Nat Genet. 2002, 31 (4): 415-418.PubMedGoogle Scholar
- Comeron JM: Selective and mutational patterns associated with gene expression in humans: Influences on synonymous composition and intron presence. Genetics. 2004, 167 (3): 1293-1304. 10.1534/genetics.104.026351.PubMed CentralView ArticlePubMedGoogle Scholar
- Seoighe C, Gehring C, Hurst LD: Gametophytic selection in Arabidopsis thaliana supports the selective model of intron length reduction. PLoS Genet. 2005, 1 (2): e13-10.1371/journal.pgen.0010013.PubMed CentralView ArticlePubMedGoogle Scholar
- Stenoien HK: Compact genes are highly expressed in the moss Physcomitrella patens. J Evol Biol. 2007, 20 (3): 1223-1229. 10.1111/j.1420-9101.2007.01301.x.View ArticlePubMedGoogle Scholar
- Urrutia AO, Hurst LD: The signature of selection mediated by expression on human genes. Genome Res. 2003, 13 (10): 2260-2264. 10.1101/gr.641103.PubMed CentralView ArticlePubMedGoogle Scholar
- Li SW, Feng L, Niu DK: Selection for the miniaturization of highly expressed genes. Biochem Biophys Res Commun. 2007, 360 (3): 586-592. 10.1016/j.bbrc.2007.06.085.View ArticlePubMedGoogle Scholar
- Buckley KM, Smith LC: Extraordinary diversity among members of the large gene family, 185/333, from the purple sea urchin, Strongylocentrotus purpuratus. BMC Mol Biol. 2007, 8: 68-10.1186/1471-2199-8-68.PubMed CentralView ArticlePubMedGoogle Scholar
- Hurst LD, McVean G, Moore T: Imprinted genes have few and small introns. Nat Genet. 1996, 12 (3): 234-237. 10.1038/ng0396-234.View ArticlePubMedGoogle Scholar
- Eisenberg E, Levanon EY: Human housekeeping genes are compact. Trends Genet. 2003, 19 (7): 362-365. 10.1016/S0168-9525(03)00140-9.View ArticlePubMedGoogle Scholar
- Vinogradov AE: Compactness of human housekeeping genes: selection for economy or genomic design?. Trends Genet. 2004, 20 (5): 248-253. 10.1016/j.tig.2004.03.006.View ArticlePubMedGoogle Scholar
- Chen J, Sun M, Hurst LD, Carmichael GG, Rowley JD: Human antisense genes have unusually short introns: evidence for selection for rapid transcription. Trends Genet. 2005, 21 (4): 203-207. 10.1016/j.tig.2005.02.003.View ArticlePubMedGoogle Scholar
- Chen J, Sun M, Rowley JD, Hurst LD: The small introns of antisense genes are better explained by selection for rapid transcription than by "genomic design". Genetics. 2005, 171 (4): 2151-2155. 10.1534/genetics.105.048066.PubMed CentralView ArticlePubMedGoogle Scholar
- Jeffares DC, Mourier T, Penny D: The biology of intron gain and loss. Trends Genet. 2006, 22 (1): 16-22. 10.1016/j.tig.2005.10.006.View ArticlePubMedGoogle Scholar
- Femino AM, Fay FS, Fogarty K, Singer RH: Visualization of single RNA transcripts in situ. Science. 1998, 280 (5363): 585-590. 10.1126/science.280.5363.585.View ArticlePubMedGoogle Scholar
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB: A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA. 2004, 101 (16): 6062-6067. 10.1073/pnas.0400782101.PubMed CentralView ArticlePubMedGoogle Scholar
- de la Grandmaison GL, Clairand I, Durigon M: Organ weight in 684 adult autopsies: new tables for a Caucasoid population. Forensic Sci Int. 2001, 119 (2): 149-154. 10.1016/S0379-0738(00)00401-1.View ArticlePubMedGoogle Scholar
- Weniger G, Lange C, Irle E: Abnormal size of the amygdala predicts impaired emotional memory in major depressive disorder. J Affect Disord. 2006, 94 (1-3): 219-229. 10.1016/j.jad.2006.04.017.View ArticlePubMedGoogle Scholar
- Flindt R: Amazing Numbers in Biology. 2006, Berlin , Springer-Verlag, 295-Google Scholar
- Janssen I, Heymsfield SB, Wang ZM, Ross R: Skeletal muscle mass and distribution in 468 men and women aged 18-88 yr. J Appl Physiol. 2000, 89 (1): 81-88.PubMedGoogle Scholar
- International Commission on Radiological Protection: Reference Man: Anatomical, Physiological and Metabolic Characteristics. 1975, Elsevier, 512-Google Scholar
- Kyselova V, Peknicova J, Buckiova D, Boubelik M: Effects of p-nonylphenol and resveratrol on body and organ weight and in vivo fertility of outbred CD-1 mice. Reprod Biol Endocrinol. 2003, 1 (1): 30-10.1186/1477-7827-1-30.PubMed CentralView ArticlePubMedGoogle Scholar
- Rossier J, Rogers J, Shibasaki T, Guillemin R, Bloom FE: Opioid peptides and alpha -melanocyte-stimulating hormone in genetically obese (ob/ob) mice during development. Proc Natl Acad Sci USA. 1979, 76 (4): 2077-2080. 10.1073/pnas.76.4.2077.PubMed CentralView ArticlePubMedGoogle Scholar
- Mukherjee K, Knisely A, Jacobson L: Partial glucocorticoid agonist-like effects of imipramine on hypothalamic-pituitary-adrenocortical activity, thymus weight, and hippocampal glucocorticoid receptors in male C57BL/6 mice. Endocrinology. 2004, 145 (9): 4185-4191. 10.1210/en.2004-0147.View ArticlePubMedGoogle Scholar
- Fujimoto N, Watanabe H, Nakatani T, Roy G, Ito A: Induction of thyroid tumours in (C57BL/6NxC3H/N) F1 mice by oral administration of kojic acid. Food Chem Toxicol. 1998, 36 (8): 697-703. 10.1016/S0278-6915(98)00030-1.View ArticlePubMedGoogle Scholar
- Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, Green MR, Golub TR, Lander ES, Young RA: Dissecting the regulatory circuitry of a eukaryotic genome. Cell. 1998, 95 (5): 717-728. 10.1016/S0092-8674(00)81641-4.View ArticlePubMedGoogle Scholar
- Wang Y, Liu CL, Storey JD, Tibshirani RJ, Herschlag D, Brown PO: Precision and functional specificity in mRNA decay. Proc Natl Acad Sci USA. 2002, 99 (9): 5860-5865. 10.1073/pnas.092538799.PubMed CentralView ArticlePubMedGoogle Scholar
- Carter MG, Sharov AA, VanBuren V, Dudekula DB, Carmack CE, Nelson C, Ko MSH: Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray. Genome Biol. 2005, 6 (7): R61-10.1186/gb-2005-6-7-r61.PubMed CentralView ArticlePubMedGoogle Scholar
- Yang E, van Nimwegen E, Zavolan M, Rajewsky N, Schroeder M, Magnasco M, Darnell JE: Decay rates of human mRNAs: Correlation with functional characteristics and sequence attributes. Genome Res. 2003, 13 (8): 1863-1872.PubMed CentralPubMedGoogle Scholar
- Freitas RA: Nanomedicine, Volume I: Basic Capabilities. 1999, Georgetown, TX , Landes BioscienceGoogle Scholar
- Voet D, Voet JG: Biochemistry. 1990, New York , John Wiley & Sons, 1223-Google Scholar
- Tenesa A, Navarro P, Hayes BJ, Duffy DL, Clarke GM, Goddard ME, Visscher PM: Recent human effective population size estimated from linkage disequilibrium. Genome Res. 2007, 17 (4): 520-526. 10.1101/gr.6023607.
- Takahata N: Allelic genealogy and human evolution. Mol Biol Evol. 1993, 10 (1): 2-22.
- Savage VM, Allen AP, Brown JH, Gillooly JF, Herman AB, Woodruff WH, West GB: Scaling of number, size, and metabolic rate of cells with body size in mammals. Proc Natl Acad Sci USA. 2007, 104 (11): 4718-4723. 10.1073/pnas.0611235104.
- Savage VM, Gillooly JF, Woodruff WH, West GB, Allen AP, Enquist BJ, Brown JH: The predominance of quarter-power scaling in biology. Funct Ecol. 2004, 18 (2): 257-282. 10.1111/j.0269-8463.2004.00856.x.
- Nachman MW: Patterns of DNA variability at X-linked loci in Mus domesticus. Genetics. 1997, 147 (3): 1303-1316.
- Keightley PD, Lercher MJ, Eyre-Walker A: Evidence for widespread degradation of gene control regions in hominid genomes. PLoS Biol. 2005, 3 (2): 282-288. 10.1371/journal.pbio.0030042.
- Lynch M: The origins of eukaryotic gene structure. Mol Biol Evol. 2006, 23 (2): 450-468. 10.1093/molbev/msj050.
- Gillooly JF, Brown JH, West GB, Savage VM, Charnov EL: Effects of size and temperature on metabolic rate. Science. 2001, 293 (5538): 2248-2251. 10.1126/science.1061967.
- Wagner A: Energy constraints on the evolution of gene expression. Mol Biol Evol. 2005, 22 (6): 1365-1374. 10.1093/molbev/msi126.
- Vinogradov AE: Intron length and codon usage. J Mol Evol. 2001, 52 (1): 2-5.
- Juneau K, Miranda M, Hillenmeyer ME, Nislow C, Davis RW: Introns regulate RNA and protein abundance in yeast. Genetics. 2006, 174 (1): 511-518. 10.1534/genetics.106.058560.
- Mourier T, Jeffares DC: Eukaryotic intron loss. Science. 2003, 300 (5624): 1393. 10.1126/science.1080559.
- Niu DK, Hou WR, Li SW: mRNA-mediated intron losses: evidence from extraordinarily large exons. Mol Biol Evol. 2005, 22 (6): 1475-1481. 10.1093/molbev/msi138.
- Roy SW, Hartl DL: Very little intron loss/gain in Plasmodium: Intron loss/gain mutation rates and intron number. Genome Res. 2006, 16 (6): 750-756. 10.1101/gr.4845406.
- Roy SW, Penny D: Large-scale intron conservation and order-of-magnitude variation in intron loss/gain rates in apicomplexan evolution. Genome Res. 2006, 16 (10): 1270-1275. 10.1101/gr.5410606.
- Roy SW, Penny D: Widespread intron loss suggests retrotransposon activity in ancient apicomplexans. Mol Biol Evol. 2007, 24 (9): 1926-1933. 10.1093/molbev/msm102.
- Lynch M, Conery JS: The origins of genome complexity. Science. 2003, 302 (5649): 1401-1404. 10.1126/science.1089370.
- Lynch M: The Origins of Genome Architecture. 2007, Sunderland, Sinauer Associates, Inc., 494.
- Tennyson CN, Klamut HJ, Worton RG: The human dystrophin gene requires 16 hours to be transcribed and is cotranscriptionally spliced. Nat Genet. 1995, 9 (2): 184-190. 10.1038/ng0295-184.
- Guenther MG, Levine SS, Boyer LA, Jaenisch R, Young RA: A chromatin landmark and transcription initiation at most promoters in human cells. Cell. 2007, 130 (1): 77-88. 10.1016/j.cell.2007.05.042.
- Darzacq X, Shav-Tal Y, de Turris V, Brody Y, Shenoy SM, Phair RD, Singer RH: In vivo dynamics of RNA polymerase II transcription. Nat Struct Mol Biol. 2007, 14 (9): 796-806. 10.1038/nsmb1280.
- Gubb D: Intron-delay and the precision of expression of homoeotic gene products in Drosophila. Dev Genet. 1986, 7 (3): 119-131. 10.1002/dvg.1020070302.
- Thummel CS: Mechanisms of transcriptional timing in Drosophila. Science. 1992, 255 (5040): 39-40. 10.1126/science.1553530.
- Swinburne IA, Silver PA: Intron delays and transcriptional timing during development. Dev Cell. 2008, 14 (3): 324-330. 10.1016/j.devcel.2008.02.002.
- Palangat M, Landick R: Roles of RNA:DNA hybrid stability, RNA structure, and active site conformation in pausing by human RNA polymerase II. J Mol Biol. 2001, 311 (2): 265-282. 10.1006/jmbi.2001.4842.
- Fox-Walsh KL, Dou YM, Lam BJ, Hung SP, Baldi PF, Hertel KJ: The architecture of pre-mRNAs affects mechanisms of splice-site pairing. Proc Natl Acad Sci USA. 2005, 102 (45): 16176-16181. 10.1073/pnas.0508489102.
- Prachumwat A, DeVincentis L, Palopoli MF: Intron size correlates positively with recombination rate in Caenorhabditis elegans. Genetics. 2004, 166 (3): 1585-1590. 10.1534/genetics.166.3.1585.
- Niu DK: Protecting exons from deleterious R-loops: a potential advantage of having introns. Biol Direct. 2007, 2 (1): 11. 10.1186/1745-6150-2-11.
- Farre D, Bellora N, Mularoni L, Messeguer X, Alba MM: Housekeeping genes tend to show reduced upstream sequence conservation. Genome Biol. 2007, 8 (7): R140. 10.1186/gb-2007-8-7-r140.
- Nakaya H, Amaral P, Louro R, Lopes A, Fachel A, Moreira Y, El-Jundi T, da Silva A, Reisand E, Verjovski-Almeida S: Genome mapping and expression analyses of human intronic noncoding RNAs reveal tissue-specific patterns and enrichment in genes related to regulation of transcription. Genome Biol. 2007, 8 (3): R43. 10.1186/gb-2007-8-3-r43.
- Pozzoli U, Menozzi G, Comi GP, Cagliani R, Bresolin N, Sironi M: Intron size in mammals: complexity comes to terms with economy. Trends Genet. 2007, 23 (1): 20-24. 10.1016/j.tig.2006.10.003.
- Vinogradov AE: "Genome design" model: Evidence from conserved intronic sequence in human-mouse comparison. Genome Res. 2006, 16 (3): 347-354. 10.1101/gr.4318206.
- Petit N, Casillas S, Ruiz A, Barbadilla A: Protein polymorphism is negatively correlated with conservation of intronic sequences and complexity of expression patterns in Drosophila melanogaster. J Mol Evol. 2007, 64 (5): 511-518. 10.1007/s00239-006-0047-5.
- Haddrill P, Charlesworth B, Halligan D, Andolfatto P: Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content. Genome Biol. 2005, 6 (8): R67. 10.1186/gb-2005-6-8-r67.
- Ren XY, Vorst O, Fiers MWEJ, Stiekema WJ, Nap JP: In plants, highly expressed genes are the least compact. Trends Genet. 2006, 22 (10): 528-532. 10.1016/j.tig.2006.08.008.
- Iranfar N, Fuller D, Loomis WF: Transcriptional regulation of post-aggregation genes in Dictyostelium by a feed-forward loop involving GBF and LagC. Dev Biol. 2006, 290 (2): 460-469. 10.1016/j.ydbio.2005.11.035.
- Castillo-Davis CI: The evolution of noncoding DNA: how much junk, how much func?. Trends Genet. 2005, 21 (10): 533-536. 10.1016/j.tig.2005.08.001.
- Pleiss JA, Whitworth GB, Bergkessel M, Guthrie C: Rapid, transcript-specific changes in splicing in response to environmental stress. Mol Cell. 2007, 27 (6): 928-937. 10.1016/j.molcel.2007.07.018.
- Yu J, Yang ZY, Kibukawa M, Paddock M, Passey DA, Wong GKS: Minimal introns are not "junky". Genome Res. 2002, 12 (8): 1185-1189. 10.1101/gr.224602.
- Gazave E, Marques-Bonet T, Fernando O, Charlesworth B, Navarro A: Patterns and rates of intron divergence between humans and chimpanzees. Genome Biol. 2007, 8 (2): R21. 10.1186/gb-2007-8-2-r21.
- Coghlan A, Wolfe KH: Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast. 2000, 16 (12): 1131-1145. 10.1002/1097-0061(20000915)16:12<1131::AID-YEA609>3.0.CO;2-F.
- Jansen R, Gerstein M: Analysis of the yeast transcriptome with structural and functional categories: characterizing highly expressed proteins. Nucleic Acids Res. 2000, 28 (6): 1481-1488. 10.1093/nar/28.6.1481.
- Akashi H: Translational selection and yeast proteome evolution. Genetics. 2003, 164 (4): 1291-1303.
- Warringer J, Blomberg A: Evolutionary constraints on yeast protein size. BMC Evol Biol. 2006, 6 (1): 61. 10.1186/1471-2148-6-61.
- NCBI genome database. [ftp://ftp.ncbi.nih.gov/genomes/]
- GNF Genome Informatics Applications & Datasets. [http://wombat.gnf.org/index.html]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.