Bidirectional transcription of a novel chimeric gene mapping to mouse chromosome Yq
© Ellis et al; licensee BioMed Central Ltd. 2007
Received: 17 April 2007
Accepted: 24 September 2007
Published: 24 September 2007
The male-specific region of the mouse Y chromosome long arm (MSYq) contains three known highly multi-copy X-Y homologous gene families, Ssty1/2, Sly and Asty. Deletions on MSYq lead to teratozoospermia and subfertility or infertility, with a sex ratio skew in the offspring of subfertile MSYqdel males.
We report the highly unusual genomic structure of a novel MSYq locus, Orly, and a diverse set of spermatid-specific transcripts arising from copies of this locus. Orly is composed of partial copies of Ssty1, Asty and Sly arranged in sequence. The Ssty1- and Sly-derived segments are in antisense orientation relative to each other, leading to bi-directional transcription of Orly. Genome search and phylogenetic tree analysis are used to determine the order of events in mouse Yq evolution. We find that Orly is the most recent gene to arise on Yq, and that subsequently there was massive expansion in copy number of all Yq-linked genes.
Orly has an unprecedented chimeric structure, and generates both "forward" (Orly) and "reverse" (Orlyos) transcripts arising from the promoters at each end of the locus. The region of overlap of known Orly and Orlyos transcripts is homologous to Sly intron 2. We propose that Orly may be involved in an intragenomic conflict between mouse X and Y chromosomes, and that this process underlies the massive expansion in copy number of the genes on MSYq and their X homologues.
The mammalian Y chromosome is constitutively haploid, restricted to males, and subject to ongoing genetic deterioration due to lack of recombinational exchange with a homologous partner. Set against this, however, there is strong evolutionary drive to preserve the function of male-benefit genes on the Y chromosome, and to acquire novel male-benefit genes on the Y [1–7]. These opposing effects lead to a heterogeneous structure of Y chromosomal DNA, with functional genes (often male specific, sometimes highly amplified) set among a sea of degenerate pseudogenes, repetitive sequence, and parasitic transposable elements.
The long arm of the mouse Y chromosome is a spectacular example of this process, being highly repetitive, transcriptionally silent in the majority of cell types, and yet indispensable for normal spermatogenesis [8–14]. Deletions on mouse Yq lead to teratozoospermia and reduced fertility. The severity of the phenotypes varies according to the extent of the deletion, with large deletions (≥ 9/10 of Yq) resulting in complete infertility [8, 13], while smaller deletions (~2/3 of Yq) result in reduced fertility and a less severe sperm shape abnormality [9, 11, 14]. Intriguingly, the offspring of males with 2/3 Yq deletions show an approximately 60:40 sex ratio skew in favour of females, and this is due to reduced efficiency of Y-bearing sperm.
Recently, we have made considerable progress in defining the gene content of mouse Yq, identifying two new repeat gene families (Sly, Asty) in addition to the one family previously known (Ssty1/2). During this work, we observed novel "recombinant" transcripts arising from loci that contain exons from both Ssty1 and Asty, and termed this new transcript Asty(rec). Here, we describe the detailed genomic arrangement of this rearranged locus and show expression of a large variety of transcriptional variants arising from these rearranged loci. These variant transcripts are differentially regulated during testis development.
We were also interested to know how these rearranged loci arose, and whether there were further examples of such "exon shuffling" on mouse Yq. We therefore compared the genomic organisation of the loci encoding all known Yq genes to each other and to their X-linked homologues, in order to more clearly delineate the composition of the novel rearranged loci, the differences between each of the Yq genes and their X-linked relatives, and the sequence of events involved in the genesis and amplification of these genes.
Finally, we investigated the wider genomic context of the rearranged loci by in silico mapping of the location of all known MSYq genes within the currently-released draft Y chromosome sequence contigs. The MSYq gene copies located by the mapping project were used to construct phylogenetic trees elucidating the sequence of events in MSYq evolution.
A rearranged locus formed by chimerism between three Yq-specific genes
As previously detailed, a search of the nr database revealed two full-length cDNAs arising from Orly, both originating from the relict Ssty1 promoter. In this article we will refer to these transcripts as Orly_v1 (accession number [GenBank:AK015935]) and Orly_v2 ([GenBank:AK016790], referred to in our previous work as Asty(rec)). The splicing patterns of Orly_v1 and Orly_v2 are shown in Figure 1B.
Orly generates a wide diversity of alternative splice variants
Primers used to characterise Orly transcripts (locations shown in Figure 1C)
When using primer pairs directed at the outermost exons (S1 and N2), a single major band corresponding to Orly_v1 is observed, suggesting that this is the most abundant Orly transcript. Two faint larger bands were also detected by the S1.f2/N2.r2 primer pair (arrowed in Figure 3); however, we were unable to obtain sequence for these products. These upper bands are likely to represent transcripts including portions of Ssty1 exons 2/3 or Asty exons 2–4, as detected in the other reactions (see below). No larger bands were seen in the S1.f1/N2.r1 reaction. It is possible that the larger bands detected by the S1.f2/N2.r2 primer pair arise from copies of the Orly locus where the S1.f1 and/or N2.r1 primer binding sites are mutated.
Using primer pairs directed at the other Orly_v2 exons gave a wide variety of bands. Many of the products generated from RT-PCR on adult testis were gel purified and sequenced to confirm which regions of Orly are included in each detected transcript. Unfortunately we were unable to generate clean sequence for the products produced by the S3a.f/A3.r primer pair. This is likely due to the presence of a large number of similarly-sized transcripts which cannot be separated on a gel. The resulting partial transcriptional map for Orly is shown in Figure 1D. In most cases, the sequenced bands correspond to spliced transcripts, however, the majority of the products do not conform to the splicing pattern of Orly_v1 or Orly_v2. It appears that there is a plethora of different Orly isoforms expressed at low levels, which are only detected when specific primers are used.
We investigated whether any of the detected Orly isoforms had any significant coding potential. The Ssty1 open reading frame is encoded by exon 3 of Ssty1, which is not fully included in any Orly transcriptional variant (though two shorter forms of this exon are variably included). Orly transcripts thus do not encode the SSTY1 protein. The Sly portion of the locus is in antisense, thus Orly transcripts cannot encode any portion of SLY. Finally, Asty does not contain any open reading frame, thus the Asty-related portion of Orly also has no coding potential. Further electronic searching of the various Orly transcriptional variants revealed no significant open reading frames other than a partial degenerate retroviral pol sequence (see below).
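The kind of open reading frame screen described above can be illustrated with a minimal sketch. The tool used for the original electronic search is not stated, so the function name `longest_orf` and the simple rule it applies (longest ATG-to-stop stretch in the three forward frames) are assumptions made here for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    """Return (start, end, length_nt) of the longest ATG..stop ORF found in
    the three forward reading frames, or None if there is no complete ORF.

    Coordinates are 0-based, `end` is exclusive, and the stop codon is
    included in the reported length.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the frame one codon at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[2]:
                    best = (start, i + 3, length)
                start = None                   # look for the next ORF
    return best
```

To screen the opposite strand as well, the same function could be applied to the reverse complement of each transcript.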
Orly retains potential promoter sequence from both Ssty1 and Sly
The regions of high sequence identity between Orly and its various progenitor loci extend a further 5 kb into the upstream region of Ssty, and 3 kb into the upstream region of Sly, indicating that the rearranged locus has retained the proximal upstream promoter regions of both genes, in antisense orientation relative to each other (Additional File 1). Orly_v1 and Orly_v2 are known to be transcribed from the relict Ssty1 promoter, which is thus shown to be functional.
Turning to the relict Sly promoter region, we conducted a search for transcription factor binding sites using TFSEARCH. This showed that of the 42 predicted transcription factor binding sites located between -600 and +10 of the reference Sly sequence, 35 were present at the corresponding site in Orly, indicating retention of potentially functional promoter elements (Additional File 2). Overall sequence identity between Sly and Orly across this region is 95.7%. Significantly, the conserved elements include a GCCAAT box at position -161 of the reference Sly locus. This motif is a strong transcriptional signal, and is known to be present in other spermatid-specific TATA-less promoters such as the Pgk-2 promoter.
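As a rough illustration of the comparison above, the sketch below locates exact copies of a motif such as the GCCAAT box and computes the fraction of reference sites that are retained in a second sequence. It is not a reimplementation of TFSEARCH, which scores matrix-based predictions; the helper names `find_motif` and `retained_fraction` are hypothetical.

```python
def find_motif(seq, motif="GCCAAT"):
    """Return the 0-based start of every exact (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)   # step by one so overlaps are not missed
    return hits

def retained_fraction(reference_sites, query_sites):
    """Fraction of reference sites that are also present in the query.

    Each site can be any hashable description, for example a
    (position, factor_name) tuple.
    """
    reference_sites = set(reference_sites)
    if not reference_sites:
        raise ValueError("no reference sites supplied")
    return len(reference_sites & set(query_sites)) / len(reference_sites)
```

With the counts reported here, 35 retained sites out of 42 predicted sites corresponds to a retained fraction of about 0.83.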
Both of the promoters at opposite ends of Orly are functional
This electronic promoter analysis suggested to us that the relict Sly promoter at the 3' end of the Orly locus may have retained functionality, and be able to generate opposite-strand transcripts. We designate such opposite-strand transcripts as Orlyos. No Orlyos transcripts were present in the nr or dbEST databases. We used strand-specific RT-PCR to determine whether any of the bands shown in Figure 2 corresponded to Orlyos transcripts.
Of the other splice variants shown in Figure 2, all were confirmed by strand-specific RT-PCR to be "forward" (Orly) transcripts (data not shown). This is unsurprising as the primers were designed against the forward transcript Orly_v2. Orlyos transcripts must necessarily have different exon boundaries, which presumably do not include the majority of the primer locations included in our screen.
The terminal exons of Orly derive from a retrovirus and are in antisense to Sly and Orlyos
Both of the known Orly transcripts terminate with two novel exons with antisense homology to intron 2 of Sly (see Figure 1A, exons N1 and N2). As discussed above, this section of Orly is also transcribed in the opposite orientation, generating Orlyos transcripts. Thus there is the potential for Orly transcripts to form dsRNA either by pairing with Orlyos transcripts or with nascent Sly transcripts.
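The pairing argument rests on the fact that transcripts read from opposite strands of the same region are reverse complements of one another. A minimal sketch of that check follows; the function names and the default 20 nt threshold are illustrative assumptions, not values taken from this study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def shares_complementary_run(sense, antisense, min_len=20):
    """True if some stretch of `sense` of at least `min_len` nt is perfectly
    complementary to part of `antisense`, i.e. the two could base-pair there.
    """
    sense = sense.upper()
    # A run of `sense` that occurs in the reverse complement of `antisense`
    # is exactly a run that can pair with `antisense`.
    target = reverse_complement(antisense)
    return any(
        sense[i:i + min_len] in target
        for i in range(len(sense) - min_len + 1)
    )
```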
Exons N1 and N2 derive from a partial degenerate retrovirus belonging to the MuRVY lineage of mouse Y chromosome specific repeats, which is embedded in this intron of Sly (see below). Orly forward transcripts terminate at the transcription stop site of this MuRVY-related element. We therefore deduce that although the MuRVY element is degenerate and does not encode a functional retrovirus, its transcriptional termination site has remained functional and become co-opted to form the transcriptional termination site for Orly forward transcripts. None of the known Orly transcripts contain any large open reading frames; however, Orly_v1 contains a short ORF running from bases 117–284. This ORF has 63% identity and 75% similarity over 49 aa to a partial pol gene (data not shown), further demonstrating the retroviral origin of the terminal exons of Orly.
Tissue- and developmental stage-specific expression of Orly isoforms
Orly transcripts are under tight transcriptional control. All variant forms (both forward and reverse) are only observed after day 19 of postnatal life, and thus are deduced to be spermatid-specific. This is to be expected as both Ssty1 and Sly promoters are spermatid specific [9, 13, 16]. The age of first appearance for each band varied from day 19 to 23, indicating differential regulation of Orly isoforms in successive spermatid stages. This variation was observed both between different primer pairs (e.g. the majority of S1.f1/A4.r bands appear at 23 dpp, while the majority of S1.f1/A3.r bands appear at 21 dpp), and between different bands detected by the same primer pair (e.g. the upper, lower and middle bands in the N1.f/N2.r2 reaction appear at 19, 21 and 23 days respectively). This differential regulation may be due to spermatid stage dependent splicing of transcripts, or may represent varying subsets of transcripts arising from different copies of Orly with subtly different promoter activities. It is unfortunately not possible to use in situ or Northern blot data to confirm the detailed cellular expression patterns of these transcripts, since there is no portion of any of them which is not also part of a different Y-linked gene or retrovirus with a confounding expression pattern.
Genomic comparisons of Orly, its progenitor loci, and their X homologues
We carried out a detailed comparison of the genomic loci encoding Orly, the other MSYq genes and their X counterparts, in order to better delineate the sequence of events during MSYq evolution.
Genomic comparison of Ssty1 and Ssty2
Genomic comparison of Xmr and Xlr
There are two partial degenerate LINE elements within the Xmr locus, the first lying in the second intron of the longer isoform, and the second lying in the sixth intron (and thus also present in the fourth intron of Xlr). In addition to these degenerate LINEs, Xmr also contains a full-length LINE element from the L1MD-A2 lineage, which includes upstream monomer repeats and thus is potentially transcriptionally active. The element lies in intron 7 of Xmr but is not found in the corresponding location (intron 6) of Xlr, indicating that the LINE insertion occurred subsequent to Xmr/Xlr divergence.
Sly arose as a chimeric gene via fusion of Xmr and Xlr
The origin of the 5' end of Sly is demonstrated by exons 1–4, which match the 4 additional exons uniquely present in the longer Xmr isoform. The origin of the 3' end of Sly is demonstrated by exons 7–10, which match the final exons of Xlr including the Xlr-specific portion of exon 5. Sly lacks the L1MD-A2 element present in Xmr intron 7, further confirming the chimeric nature of this gene as an Xmr/Xlr hybrid. Exons 5–6 of Sly arose via duplication of exons 3–4 and show 88/102 nucleotide identity to these exons. There are degenerate LINE elements at the borders of this duplication event, and also at the border between the Xmr-derived and Xlr-derived segments of Sly, thus it is likely that recombination between LINE elements was responsible for the creation of Sly.
The LINE element present in intron 2 of Sly is interrupted by a stretch of DNA with distant sequence similarity to the mouse MSYq-specific retrovirus, MuRVY. This LINE element is uninterrupted in the progenitor Xmr, thus we conclude that the MuRVY insertion occurred subsequent to the creation of Sly. The MuRVY-related sequence is inserted in antisense orientation relative to Sly itself. The extent of the MuRVY-related stretch of DNA varies between Sly copies (see phylogenetic tree analysis below), but in all cases the terminal portion (including the MuRVY transcription termination site) is retained. RepeatMasker analysis of the insert shows 13.1% divergence, 13.1% deletion and 3.2% insertion relative to the consensus MuRVY LTR sequence, and 32.5% divergence for non-LTR portions of the insert. As discussed above, the MuRVY-related sequence in Sly intron 2 forms the source for the terminal Orly exons.
Recent work has shown that Xmr encodes a cytoplasmic protein, in contrast to the protein encoded by Xlr, which is nuclear. The KRKR nuclear localisation signal in Xlr, which is conserved from the autosomal progenitor gene SCP3, is located in exon 5. Xmr does not include this exon, suggesting that this is the reason for the altered protein localisation. Interestingly, this signal is mutated to KRKW in the corresponding portion of Sly.
Genomic comparison of Asty/Astx
As reported previously, Asty and Astx have an identical genomic organisation, and share ~95% sequence identity across introns and exons.
The genomic context of Orly
We used BLAST comparison to search for all copies of each Yq-linked gene (Ssty1/2, Asty, Sly, Orly) in the currently-released draft sequence contigs [Mouse Chromosome Y Mapping Project (Jessica E. Alfoldi, Helen Skaletsky, Steve Rozen, and David C. Page at the Whitehead Institute for Biomedical Research, Cambridge MA, and the Washington University Genome Sequencing Center, St. Louis MO)].
We then used this information to generate a "fingerprint" for each available Yq contig, noting the order and orientation of the various copies of each gene present in each contig (see Additional File 4). Interestingly, we found that Orly always has the same genomic context, being flanked downstream by Ssty1 and upstream by Ssty2, with both loci in the same orientation as Orly. The neighbouring copies of Ssty1 all contain a SINE insertion at position 393, and form a distinct sub-group within the phylogenetic tree (see below: bootstrap support value for this clade is 1000/1000 replicates).
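The "fingerprint" idea can be sketched as follows. The record layout, the function names and any coordinates used with them are hypothetical; only the principle (ordering the gene copies along a contig and noting their orientation) comes from the text.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    gene: str     # gene family, e.g. "Orly" or "Ssty1"
    start: int    # leftmost coordinate of the copy on the contig
    strand: str   # "+" or "-"

def fingerprint(hits):
    """Order gene copies by contig coordinate and encode each one as
    gene name plus strand, e.g. ["Ssty2+", "Orly+", "Ssty1+"]."""
    return [h.gene + h.strand for h in sorted(hits, key=lambda h: h.start)]

def flanking(hits, gene):
    """For every copy of `gene`, return its (left, right) neighbours in
    contig coordinates; None marks the end of the contig."""
    ordered = sorted(hits, key=lambda h: h.start)
    codes = fingerprint(hits)
    result = []
    for i, hit in enumerate(ordered):
        if hit.gene == gene:
            left = codes[i - 1] if i > 0 else None
            right = codes[i + 1] if i + 1 < len(codes) else None
            result.append((left, right))
    return result
```

Whether a neighbour counts as upstream or downstream of a gene copy depends on the strand of that copy, so left and right are reported in contig coordinates and left for the reader to interpret.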
Using the fingerprints as a guide, we were able to assemble a "super-contig" containing 3 copies of Ssty1, 3 copies of Ssty2, two copies of Asty, two copies of Sly and one copy of Orly. In all, 13 of the 33 Yq contigs are congruent with this super-contig ordering, and a further 4 contigs appear to be slight variants upon it. This "super-contig" indicates the presence on mouse Yq of a highly amplified repeat unit of greater than 500 kb in length, which presumably corresponds to the Huge Repeat Array reported at conferences by Alfoldi et al. Sequence identity between the various contigs contributing to this "super-contig" is very high (> 98% excluding indels), indicating substantial homogeneity between copies of the Huge Repeat.
Classes of contig identified on mouse Yq
[GenBank:NT_161868, GenBank:NT_161875, GenBank:NT_161877, GenBank:NT_161879, GenBank:NT_161885, GenBank:NT_161892, GenBank:NT_161926, GenBank:NT_161928, GenBank:NT_165790, GenBank:NT_165791, GenBank:NT_165793, GenBank:NT_165794, GenBank:NT_165796]
Similar to Huge Repeat
[GenBank:NT_161923, GenBank:NT_161924, GenBank:NT_161937, GenBank:NT_165792]
[GenBank:NT_161897, GenBank:NT_161898, GenBank:NT_161904, GenBank:NT_161919, GenBank:NT_165795, GenBank:NT_165797, GenBank:NT_165798]
[GenBank:NT_161866, GenBank:NT_161872, GenBank:NT_161913, GenBank:NT_161916, GenBank:NT_161925]
[GenBank:NT_161895, GenBank:NT_161902, GenBank:NT_161906, GenBank:NT_161911]
Dynamics of Yq gene family expansion
A key question is whether these four gene families (Ssty1/2, Asty, Sly and Orly) were amplified separately on Yq during mouse evolution, or whether there was a single period of amplification increasing the copy number of all genes simultaneously.
From this phylogenetic analysis we observe:
In all three cases, Orly sequences form a discrete clade (bootstrap support value of 1000/1000 replicates for all three trees).
Gene copies lying within the Huge Repeat contigs also form distinct clades in all three trees (bootstrap support of 960/1000 to 1000/1000 in all cases). Note that each copy of the Huge Repeat unit contains several copies of Ssty1, Ssty2 and Asty. These three genes thus give rise to several Huge Repeat-associated clades in each tree. Each of these clades contains the gene copies from matching locations within the Huge Repeat unit.
A final set of contigs forms a distinct clade in both the Ssty and Asty-related trees. This clade contains a group of Ssty1/Asty-enriched contigs, [GenBank:NT_161904, GenBank:NT_161906, GenBank:NT_161911] (bootstrap values 969/1000 to 1000/1000 in the two trees). At slightly lower confidence levels, this clade also includes [GenBank:NT_165795] (bootstrap values 902/1000 to 989/1000 in the two trees).
At the time of Orly divergence, the Ssty family was already moderately amplified on the Y, with ~8 Ssty1 lineages and ~13 Ssty2 lineages present. By contrast, at the time of Orly divergence, there were only ~4 Asty lineages and 1 Sly lineage present on the Y.
In all three cases, there was a massive amplification of gene copy number subsequent to Orly divergence. This amplification occurred predominantly in branches of the phylogenetic tree corresponding to Huge Repeat contigs; however, there was also amplification of a Ssty1/Asty-enriched clade subsequent to divergence of the Orly clade.
From these trees, we also observed that all genes within each family showed very similar degrees of divergence from the root of the tree in all cases. This is to be expected as all three trees were based on noncoding sequence. The sequence used to build the trees is thus likely to be evolving at nearly neutral rates. Given nearly neutral rates of evolution, the degree of sequence divergence forms a "molecular clock" indicating the timing of the various events on mouse Yq. We therefore also generated trees using the UPGMA algorithm, which explicitly assumes a molecular clock (Additional Files 5, 6, 7).
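For readers unfamiliar with UPGMA, the following is a minimal sketch of the clustering step, assuming a symmetric pairwise distance matrix is already available. It is not the software used in this study, and the function name `upgma` and its return format are illustrative choices.

```python
def upgma(names, dist):
    """Cluster sequences with UPGMA.

    names: list of sequence labels (at least two).
    dist:  symmetric matrix (list of lists) of pairwise distances.
    Returns (tree, root_height), where tree is a nested tuple of labels and
    root_height is half the average distance between the last two clusters.
    """
    n = len(names)
    if n < 2:
        raise ValueError("need at least two sequences")
    trees = {i: names[i] for i in range(n)}   # cluster id -> subtree
    sizes = {i: 1 for i in range(n)}          # cluster id -> number of leaves
    d = {(i, j): float(dist[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(trees) > 1:
        i, j = min(d, key=d.get)              # closest pair of clusters
        height = d[(i, j)] / 2                # ultrametric node height
        for k in trees:
            if k in (i, j):
                continue
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            # Size-weighted mean keeps this equal to the average over all
            # leaf pairs between the merged cluster and cluster k.
            d[(k, next_id)] = (sizes[i] * dik + sizes[j] * djk) / (sizes[i] + sizes[j])
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    (root,) = trees.values()
    return root, height
```

Because every leaf ends up at the same distance from the root, the node heights can be read directly as relative divergence times, which is the molecular clock assumption mentioned above.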
In this analysis, the percentage divergence of Orly from its progenitor loci (representing the date of generation of Orly) is 1.24% for Orly/Ssty1, 1.79% for Orly/Asty and 1.87% for Orly/Sly. The percentage divergence between the Orly branches of the tree (representing the date of amplification of the Huge Repeat Array) is 0.47% for the Ssty1-derived region, 0.41% for the Asty-derived region and 0.43% for the Sly-derived region. While the absolute rate of the clock cannot be determined from these data, the numbers obtained from the three trees are in good agreement with each other, strengthening our inferences of the timing of events on Yq.
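A percentage divergence of this kind can be illustrated with a simple uncorrected distance between two aligned sequences. This is a sketch under the assumption that gapped columns are skipped; the values reported here were read from the trees, and the exact distance measure behind them is not specified, so the function below should not be taken as the method used.

```python
def percent_divergence(a, b):
    """Uncorrected percentage of mismatched positions between two aligned
    sequences of equal length, ignoring columns where either has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue                  # indel columns are excluded
        compared += 1
        mismatches += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared
```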
Conclusions of the phylogenetic study
Sstx/Ssty divergence (too long ago to be addressed by nucleotide sequence analysis)
Generation of Sly by chimerism between Xmr and Xlr
Moderate amplification of Ssty1, Ssty2 and Asty
Generation of Orly by chimerism between Ssty1, Asty and Sly
Massive amplification of two families of large-scale repeat on Yq. The first repeat family contains representatives of all Yq genes including Orly and constitutes the Huge Repeat Array, while the second specifically contains Ssty1 and Asty.
At present unresolved is the question of when the MuRVY retrovirus arrived on Yq. The presence of MuRVY-related sequence within intron 2 of every copy of Sly indicates that Sly acquired its MuRVY-derived insert in intron 2 some time between stages (3) and (6), however, the origin of MuRVY itself cannot be placed in the above sequence from available evidence.
We report here on the genomic locus Orly and the wide variety of alternatively spliced transcripts arising from it. Orly has a complex and unusual genomic structure, being derived from partial copies of three other Yq-linked genes. Intriguingly, we also found Sly to be derived by combination of existing genes, in this case a fusion of the 5' region of Xmr with the 3' region of Xlr, together with an internal duplication of exons 3–4 of the Xmr-derived segment. This may indicate that chimerism and "exon shuffling" are a general feature of novel Y chromosome gene creation. Significantly, the two outermost partial gene loci contributing to Orly are in antisense orientation relative to each other, and retain their upstream promoter regions. We detected Orlyos transcripts in addition to Orly transcripts, and thus deduce that both promoters have retained their activity. In particular, exons N1, N2 and the intervening intron are transcribed in both directions. This region derives from a MuRVY retroviral insertion into intron 2 of Sly.
There is an intriguing parallel to be drawn with the Stellate system in Drosophila melanogaster, where there is a sense/antisense regulatory loop between X-encoded Stellate and Y-encoded Su(Ste) repeat genes. In the case of Stellate, the Y gene arose from the X gene by insertion of a transposon (with active promoter) in reverse orientation [26, 27]. Antisense Su(Ste) transcripts primed from the transposon promoter act to regulate both sense Su(Ste) and Stellate transcript levels via an RNAi mechanism [25, 28]. Similarly, Orly and Orlyos transcripts could potentially regulate each other and also Sly. A key avenue of future work is to determine the full length sequence of Orlyos, in particular whether it contains any Ssty1- or Asty-derived regions which may in turn regulate these genes.
The comparison to Stellate is especially interesting given the sex ratio skewing in male mice bearing partial Yq deletions. Partial deletions of the repressor Su(Ste) on the Drosophila Y chromosome lead to sex ratio skewing or infertility dependent upon the X chromosomal Stellate haplotype present. Stellate was hypothesised to be a meiotic drive gene [30, 31], although this is now disputed. In male mice, partial deletions of Yq lead to mild teratozoospermia and sex ratio skewing [9, 11, 14], with reduced effectiveness of Y-bearing sperm. Larger deletions lead to severe teratozoospermia and infertility [8, 13]. The mice with partial deletions show normal fertility and fecundity (in terms of number of successful matings and number of offspring per litter), thus the only effect of the decrease in Yq gene copy number appears to be the sex ratio skew.
It should be understood that the sex ratio skew in mice with Yq deletions does not constitute meiotic drive in the classical sense, since equal numbers of X- and Y-bearing gametes are generated at meiosis. Nevertheless, the presence of Yq-encoded genes affecting sex ratio indicates the potential for a conflict between these Yq-encoded genes and other interacting X- or autosomally-encoded factors. Given that Yq deletion also leads to a spermatid-specific derepression of X transcripts, with increasing X gene expression correlated with the extent of the deletion, we have suggested that there may indeed be an ongoing genomic conflict between the mouse X and Y chromosomes, with X-linked sex ratio distorter genes acting to favour generation of female offspring, and Yq-linked repressor genes acting to restore a normal 50:50 sex ratio. Such an intragenomic conflict is expected to lead to massive amplification of gene number on both chromosomes due to an "arms race" between the conflicting genes. Intriguingly, the hybrid sterility seen in Mus musculus musculus/Mus musculus molossinus consomic strains is X-dependent.
Whether genomic conflict is involved or not, the fact that Yq-encoded genes are necessary for normal levels of Y chromosome transmission necessarily leads to a strong and direct evolutionary pressure to maintain the function of these genes. This may be one of the factors behind the recent and highly unusual gene amplification seen on mouse Yq. Orly, being composed of portions of all the other known MSYq-linked genes, must also necessarily be the most recent known addition to MSYq gene content.
Orly is a novel chimeric locus on mouse chromosome Yq which is bidirectionally transcribed, giving rise to Orly and Orlyos transcripts. These transcripts may potentially form dsRNA in partnership with each other, or with the progenitor loci Ssty1, Asty and Sly. A phylogenetic tree analysis of Yq genes indicates that Orly arose shortly prior to a massive expansion in copy number of all the Yq-linked genes. Also, potentially significantly, copies of Orly are only found in the context of the Huge Repeat Array that distinguishes MSYq, a particular segment of around 500 kb that appears to have been amplified en bloc. Taking the above evidence together, we propose that the emergence of Orly may have been one of the triggers that led to massive amplification of Yq sequence. Further analysis of the genomic complement of MSYq, and the copy number of the corresponding X genes, in a range of different mouse subspecies should help date these events more precisely, and establish whether X-Y genomic competition is a contributing factor to the gene amplifications.
Sequence comparison and detection of copies of Yq-linked genes
Nucleotide sequence alignment was performed using BLAST and ClustalW. Copies of Yq-linked genes were located within the currently-released draft sequence contigs [Mouse Chromosome Y Mapping Project (Jessica E. Alfoldi, Helen Skaletsky, Steve Rozen, and David C. Page at the Whitehead Institute for Biomedical Research, Cambridge MA, and the Washington University Genome Sequencing Center, St. Louis MO)] by pairwise alignment of reference gene sequences to each contig. All full-length hits were recorded, thus this study does not distinguish between genes and pseudogenes in each family. For Ssty1, Ssty2, Asty and Orly, a window size of 40 nt was used, while for Sly a window size of 100 nt was used. Additional File 4 is a complete record of all loci detected in the course of this study. The reference sequences used for this search were as follows.
The reference sequences for Ssty1, Ssty2 and Sly are drawn from the Gene database of the NCBI. The reference sequence for Asty was selected as the hit with the highest percentage identity to the known partial cDNA sequence [GenBank:DQ874391]. In the case of Orly, we define the locus as extending from the transcriptional start site (TSS) of the relict Ssty1 partial sequence to the TSS of the relict Sly partial sequence. The locus chosen as a reference is that encoding the known transcript Orly_v1 ([GenBank:AK015935]). Note that both the reference genome sequence and the reference gene sequences are from the C57BL/6 strain. Dot plots of selected contigs and gene loci were generated using JDotter, with grey scale values set to highlight the appropriate homologies.
Phylogenetic tree analysis
All full-length copies of Ssty1/2, Sly, Asty and Orly identified by the contig search were used to build these trees. The reference sequences for Xmr and Astx were included in the appropriate trees in order to determine the timing of MSYq events relative to the split between X and Y homologues, however, the high degree of nucleotide sequence divergence between Sstx and Ssty precluded the inclusion of the X-linked gene for this tree.
For each gene family, a region excluding known protein-coding sequence was selected for alignment, thus nearly neutral rates of evolution can be assumed. Since Asty appears to be non-coding, the full length of all detected Asty sequences (~2.1 kb) was used for the Astx/Asty/Orly tree, together with the homologous regions of Astx and Orly. For the Ssty1/Ssty2/Orly and Xmr/Sly/Orly trees, the aligned region comprises the 3' UTR and all introns within the 3' UTR. This is ~1.5 kb for the Ssty1/Ssty2/Orly tree and ~1.4 kb for the Xmr/Sly/Orly tree.
Interestingly, the opening ATG codon was conserved in all detected copies of both families, including conservation of this codon at both ends of all copies of Orly. The significance of this observation is unclear. Alignment of the gene copies was performed using ClustalW via the EBI website, and ClustalX was used to generate each tree using the Saitou/Nei NJ algorithm. 1000 bootstrap replicates were used to assess the robustness of each tree. Additional File 8 contains the three NJ trees and the ClustalW files used to generate each tree. JalView was used to generate the figures included in this manuscript.
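The bootstrap procedure mentioned above resamples alignment columns with replacement and rebuilds the tree from each pseudo-replicate. The sketch below shows only the resampling step; it is not the ClustalX implementation, and the function name `bootstrap_alignment` is hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one bootstrap pseudo-replicate of a multiple alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng:       a random.Random instance, passed in so runs are reproducible.
    Columns are drawn with replacement, and the same columns are used for
    every sequence so that the alignment stays consistent.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    (n_columns,) = lengths
    columns = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }
```

A support value such as 969/1000 is then simply the number of replicate trees, out of 1000, in which a given clade is recovered.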
RNA samples were treated for DNA contamination using the RNase-Free DNase Set (Qiagen). RT-PCR was performed using the One Step RT-PCR kit (Qiagen). Briefly, a reverse transcription step at 50°C for 30 minutes was followed by an activation step at 94°C for 15 minutes, and then 30 cycles of PCR at 94°C/Tm/72°C for 10 s/10 s/30 s. The annealing temperature Tm varied from 53–55°C depending on primer combination. 23 Orly partial cDNA sequences detected in this work have been submitted to GenBank, accession numbers ES316436 to ES316458.
Single-band RT-PCR products were purified using the Qiagen Qiaquick kit according to the manufacturer's instructions. If multiple bands were present, these were gel purified using the Qiagen gel extraction kit. Purified RT-PCR products were sequenced from 5' and 3' ends using standard cycle sequencing methods.
This project was funded by the BBSRC.
- Fisher R: The evolution of dominance. Biol Rev. 1931, 6: 345-368. 10.1111/j.1469-185X.1931.tb01030.x.
- Charlesworth B: The evolution of chromosomal sex determination and dosage compensation. Curr Biol. 1996, 6: 149-162. 10.1016/S0960-9822(02)00448-7.
- Graves JA: The origin and function of the mammalian Y chromosome and Y-borne genes – an evolving understanding. Bioessays. 1995, 17: 311-320. 10.1002/bies.950170407.
- Lahn BT, Page DC: Functional coherence of the human Y chromosome. Science. 1997, 278: 675-680. 10.1126/science.278.5338.675.
- Lahn BT, Pearson NM, Jegalian K: The human Y chromosome, in the light of evolution. Nat Rev Genet. 2001, 2: 207-216. 10.1038/35056058.
- Vallender EJ, Lahn BT: How mammalian sex chromosomes acquired their peculiar gene content. Bioessays. 2004, 26: 159-169. 10.1002/bies.10393.
- Ellis PJ, Affara NA: Spermatogenesis and sex chromosome gene content: an evolutionary perspective. Hum Fertil (Camb). 2006, 9 (1): 1-7.
- Burgoyne PS, Mahadevaiah SK, Sutcliffe MJ, Palmer SJ: Fertility in mice requires X-Y pairing and a Y-chromosomal "spermiogenesis" gene mapping to the long arm. Cell. 1992, 71: 391-398. 10.1016/0092-8674(92)90509-B.
- Conway SJ, Mahadevaiah SK, Darling SM, Capel B, Rattigan AM, Burgoyne PS: Y353/B: a candidate multiple-copy spermiogenesis gene on the mouse Y chromosome. Mamm Genome. 1994, 5: 203-210. 10.1007/BF00360546.
- Styrna J, Imai HT, Moriwaki K: An increased level of sperm abnormalities in mice with a partial deletion of the Y chromosome. Genet Res. 1991, 57: 195-199.
- Styrna J, Klag J, Moriwaki K: Influence of partial deletion of the Y chromosome on mouse sperm phenotype. J Reprod Fertil. 1991, 92: 187-195.
- Suh DS, Styrna J, Moriwaki K: Effect of Y chromosome and H-2 complex derived from Japanese wild mouse on sperm morphology. Genet Res. 1989, 53: 17-19.
- Toure A, Szot M, Mahadevaiah SK, Rattigan A, Ojarikre OA, Burgoyne PS: A new deletion of the mouse Y chromosome long arm associated with the loss of Ssty expression, abnormal sperm development and sterility. Genetics. 2004, 166: 901-912. 10.1534/genetics.166.2.901.
- Xian M, Azuma S, Naito K, Kunieda T, Moriwaki K, Toyoda Y: Effect of a partial deletion of Y chromosome on in vitro fertilizing ability of mouse spermatozoa. Biol Reprod. 1992, 47: 549-553. 10.1095/biolreprod47.4.549.
- Ward MA, Burgoyne PS: The effects of deletions of the mouse Y chromosome long arm on sperm function – intracytoplasmic sperm injection (ICSI)-based analysis. Biol Reprod. 2006, 74 (4): 652-658. 10.1095/biolreprod.105.048090.
- Toure A, Clemente EJ, Ellis PJ, Mahadevaiah SK, Ojarikre OA, Ball PA, Reynard L, Loveland KL, Burgoyne PS, Affara NA: Identification of novel Y chromosome encoded transcripts by testis transcriptome analysis of mice with deletions of the Y chromosome long arm. Genome Biol. 2005, 6 (12): R102. 10.1186/gb-2005-6-12-r102.
- TFSEARCH: Searching Transcription Factor Binding Sites (ver 1.3). [http://www.cbrc.jp/research/db/TFSEARCH.html]
- Gebara MM, McCarrey JR: Protein-DNA interactions associated with the onset of testis-specific expression of the mammalian Pgk-2 gene. Mol Cell Biol. 1992, 12 (4): 1422-1431.
- Eicher EM, Hutchison KW, Phillips SJ, Tucker PK, Lee BK: A repeated segment on the mouse Y chromosome is composed of retroviral-related, Y-enriched and Y-specific sequences. Genetics. 1989, 122 (1): 181-192.
- Severynse DM, Hutchison CA, Edgell MH: Identification of transcriptional regulatory activity within the 5' A-type monomer sequence of the mouse LINE-1 retroposon. Mamm Genome. 1992, 2 (1): 41-50. 10.1007/BF00570439.
- RepeatMasker at NCKU Bioinformatics Center. [http://www.binfo.ncku.edu.tw/RM/RepeatMasker.php]
- Reynard LN, Turner JM, Cocquet J, Mahadevaiah SK, Toure A, Hoog C, Burgoyne PS: Expression analysis of the mouse multi-copy X-linked gene Xlr-related, meiosis-regulated (Xmr), reveals that Xmr encodes a spermatid-expressed cytoplasmic protein, SLX/XMR. Biol Reprod. 2007, 77 (2): 329-335. 10.1095/biolreprod.107.061101.
- Alfoldi JE, Skaletsky H, Graves T, Minx P, Wilson RK, Page DC: Sequence of the Mouse Y Chromosome. Conference presentation, 18th International Mouse Genome Conference, Seattle, USA. 2004.
- Saitou N, Nei M: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987, 4 (4): 406-425.
- Aravin AA, Naumova NM, Tulin AV, Vagin VV, Rozovsky YM, Gvozdev VA: Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in the D. melanogaster germline. Curr Biol. 2001, 11 (13): 1017-1027. 10.1016/S0960-9822(01)00299-8.
- Balakireva MD, Shevelyov YuYa, Nurminsky DI, Livak KJ, Gvozdev VA: Structural organization and diversification of Y-linked sequences comprising Su(Ste) genes in Drosophila melanogaster. Nucleic Acids Res. 1992, 20 (14): 3731-3736. 10.1093/nar/20.14.3731.
- Kogan GL, Epstein VN, Aravin AA, Gvozdev VA: Molecular evolution of two paralogous tandemly repeated heterochromatic gene clusters linked to the X and Y chromosomes of Drosophila melanogaster. Mol Biol Evol. 2000, 17 (5): 697-702.
- Aravin AA, Klenov MS, Vagin VV, Bantignies F, Cavalli G, Gvozdev VA: Dissection of a natural RNA silencing process in the Drosophila melanogaster germ line. Mol Cell Biol. 2004, 24 (15): 6742-6750. 10.1128/MCB.24.15.6742-6750.2004.
- Palumbo G, Bonaccorsi S, Robbins LG, Pimpinelli S: Genetic analysis of Stellate elements of Drosophila melanogaster. Genetics. 1994, 138 (4): 1181-1197.
- Hurst LD: Is Stellate a relict meiotic driver?. Genetics. 1992, 130 (1): 229-230.
- Hurst LD: Further evidence consistent with Stellate's involvement in meiotic drive. Genetics. 1996, 142 (2): 641-643.
- Belloni M, Tritto P, Bozzetti MP, Palumbo G, Robbins LG: Does Stellate cause meiotic drive in Drosophila melanogaster?. Genetics. 2002, 161 (4): 1551-1559.
- Ellis PJ, Clemente EJ, Ball P, Toure A, Ferguson L, Turner JM, Loveland KL, Affara NA, Burgoyne PS: Deletions on mouse Yq lead to upregulation of multiple X- and Y-linked transcripts in spermatids. Hum Mol Genet. 2005, 14 (18): 2705-2715. 10.1093/hmg/ddi304.
- Partridge L, Hurst LD: Sex and conflict. Science. 1998, 281: 2003-2008. 10.1126/science.281.5385.2003.
- Oka A, Mita A, Sakurai-Yamatani N, Yamamoto H, Takagi N, Takano-Shimizu T, Toshimori K, Moriwaki K, Shiroishi T: Hybrid breakdown caused by substitution of the X chromosome between two mouse subspecies. Genetics. 2004, 166 (2): 913-924. 10.1534/genetics.166.2.913.
- Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2007, 35 (Database issue): D26-31. 10.1093/nar/gkl993.
- Brodie R, Roper RL, Upton C: JDotter: a Java interface to multiple dotplots generated by dotter. Bioinformatics. 2004, 20 (2): 279-281. 10.1093/bioinformatics/btg406.
- EBI Tools: ClustalW. [http://www.ebi.ac.uk/clustalw/index.html]
- Clamp M, Cuff J, Searle SM, Barton GJ: The Jalview Java alignment editor. Bioinformatics. 2004, 20 (3): 426-427. 10.1093/bioinformatics/btg430.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.