Evolution of cichlid vision via trans-regulatory divergence
© O'Quin et al.; licensee BioMed Central Ltd. 2012
Received: 28 August 2012
Accepted: 19 December 2012
Published: 26 December 2012
Skip to main content
© O'Quin et al.; licensee BioMed Central Ltd. 2012
Received: 28 August 2012
Accepted: 19 December 2012
Published: 26 December 2012
Phenotypic evolution may occur through mutations that affect either the structure or expression of protein-coding genes. Although the evolution of color vision has historically been attributed to structural mutations within the opsin genes, recent research has shown that opsin regulatory mutations can also tune photoreceptor sensitivity and color vision. Visual sensitivity in African cichlid fishes varies as a result of the differential expression of seven opsin genes. We crossed cichlid species that express different opsin gene sets and scanned their genome for expression Quantitative Trait Loci (eQTL) responsible for these differences. Our results shed light on the role that different structural, cis-, and trans-regulatory mutations play in the evolution of color vision.
We identified 11 eQTL that contribute to the divergent expression of five opsin genes. On three linkage groups, several eQTL formed regulatory “hotspots” associated with the expression of multiple opsins. Importantly, however, the majority of the eQTL we identified (8/11 or 73%) occur on linkage groups located trans to the opsin genes, suggesting that cichlid color vision has evolved primarily via trans-regulatory divergence. By modeling the impact of just two of these trans-regulatory eQTL, we show that opsin regulatory mutations can alter cichlid photoreceptor sensitivity and color vision at least as much as opsin structural mutations can.
Combined with previous work, we demonstrate that the evolution of cichlid color vision results from the interplay of structural, cis-, and especially trans-regulatory loci. Although there are numerous examples of structural and cis-regulatory mutations that contribute to phenotypic evolution, our results suggest that trans-regulatory mutations could contribute to phenotypic divergence more commonly than previously expected, especially in systems like color vision, where compensatory changes in the expression of multiple genes are required in order to produce functional phenotypes.
Phenotypic evolution can result from mutations that affect either the structure or expression of protein-coding genes. The evolution of vertebrate color vision has primarily been attributed to structural mutations within the opsin genes , a group of G-protein-coupled receptors expressed within the light-sensitive photoreceptor cells of the retina . These structural mutations include protein-coding substitutions that alter the polarity of amino acids within the functional retinal-binding region of the opsin protein, causing the opsins and the photoreceptors they are expressed within to absorb light at longer or shorter wavelengths, thereby altering color vision (e.g., [3, 4]). But recent research has shown that photoreceptor sensitivity can also evolve through regulatory mutations that affect opsin expression [5–7], although the genetic basis of these regulatory changes is largely unknown. Since a complex network of both cis- and trans-regulatory factors controls vertebrate opsin expression [8–10], quantitative genetic studies of opsin gene regulation have the potential to clarify the role that different structural, cis-, and trans-regulatory mutations play in the evolution of color vision.
African cichlid fishes have undergone a dramatic adaptive radiation that involves many traits, including color vision . In some cases, closely related cichlids differ in the maximum sensitivity of their photoreceptors by as much as 100 nm . The primary genetic mechanism responsible for this diversity is the differential regulation of seven opsin genes, although structural mutations also play a role [13, 14]. These seven opsins are maximally sensitive to different wavelengths of light and include SWS1 (ultra-violet-sensitive), SWS2B (violet-sensitive), SWS2A (blue-sensitive), RH2B (blue-green-sensitive), RH2A-α and -β (green-sensitive), and LWS (red-sensitive) ). The expression of these genes varies qualitatively among cichlids, resulting in groups of species with distinct “visual palettes”, or sets of photoreceptors broadly sensitive to short-, middle-, or long-wavelength light [13, 16]. Within these groups, cichlids also exhibit quantitative variation in the expression of pairs of opsins, including the SWS2B/SWS2A opsin pair and the RH2A/LWS pair, where a reduction in the expression of one gene is compensated by an increase in the expression of the other. Such qualitative and quantitative changes in opsin expression may serve to fine-tune cichlid vision in response to specific prey items, changes in the ambient light environment, or conspecific signals [13, 16].
The genetic factors responsible for interspecific variation in cichlid opsin expression and photoreceptor sensitivity are currently unknown. In two previous studies [17, 18], we used (a) a scan of non-coding DNA located near the opsins and (b) a genetic cross of two Lake Malawi cichlids that express different visual palettes to determine that both cis- and trans-regulatory factors contribute to the evolution of cichlid color vision, although we could not resolve the location or relative importance of these loci. Here, we expand on our previous studies by sampling a larger number of F2 progeny from the hybrid cross used in Carleton et al. . We perform a genome scan of > 500 sequenced restriction-site associated DNA (RAD-seq) tags to map the eQTL responsible for inter-generic differences in cichlid opsin expression. Our results identify numerous overlapping eQTL which suggest that divergence in opsin trans-regulatory loci may be as important to the evolution of color vision as structural mutations within the opsins themselves.
Restriction-site associated DNA sequencing (RAD-seq) is a relatively new and cost-effective method of simultaneously identifying and genotyping hundreds of single nucleotide polymorphisms (SNPs) in non-model species . For our cichlid cross, we generated reduced-representation RAD-seq libraries for all 115 individuals and sequenced these for 100 cycles on the Illumina HiSeq2000 platform. Sequencing yielded an average of 3.14 million (± 0.17 million SEM) reads per individual among ~ 66,500 unique SbfI restriction sites (see Additional file 1). We used the program Stacks v0.996  to match orthologous RAD-tags between the P0, identify alternatively-fixed SNPs, and infer genotypes at these sites among the F2. We filtered these SNPs for genotyping completeness, adherence to Hardy-Weinburg equilibrium, coverage, and unambiguous placement within a draft assembly of the Metriaclima zebra cichlid genome (assembly available at the Cichlid Comparative Genome Browser, Bouillabase.org). We ultimately retained a conservative set of 562 SNPs for linkage mapping and QTL analysis. The average coverage of each genotyped SNP was 45x (± 2x, SEM) in the F2 and 130x in the P0, while the average genotype completeness was 66%. In addition to the RAD-seq loci, we also genotyped microsatellite and SNP markers for ten trans-regulatory factors that have been shown to influence vertebrate opsin expression , including members of the thyroid hormone and retinoic acid family of transcription factors, retinoic acid related orphan receptors, photoreceptor-specific nuclear receptors, and steroid co-activators. We also included one microsatellite marker for each tandem array of the opsin genes located on cichlid linkage group (LG) 5 (Additional file 2).
Because RAD-seq is a relatively new method, there are currently no published estimates of this method’s accuracy. To estimate the potential error-rate of our study, we validated the genotypes inferred from eleven RAD-seq markers via PCR and direct cycle sequencing. We found errors in the genotypes inferred from all eleven loci, regardless of whether or not the marker passed our conservative filtering criteria (Additional file 3). We estimate that the average genotyping error rate of these eleven markers is 11.07±1.36% (min = 3.23%, max = 20.59%). In particular, the genotypes inferred from RAD-seq underestimated the true number of heterozygotes found by cycle sequencing, suggesting that reads containing alternate alleles were either (a) disproportionally amplified and sequenced, or (b) misassembled into separate RAD-seq loci. The accuracy of RAD-seq can be influenced by numerous technical and user-related factors, and an analysis of the precise source of this genotyping error is out of the scope of our present study. However, we do present an expanded discussion of this problem in Additional file 4. Although this is the first published estimate of the accuracy of RAD-seq, a similar analysis of a second cross of cichlids from Lake Malawi presented data which suggests that 3.13% of genotypes for the P0 were incorrectly inferred to be homozygous . Although disconcerting, we present evidence that these errors do not significantly hinder our ability to detect eQTL for opsin expression (Additional file 3; see below).
We assembled the RAD-seq and candidate gene loci into a high-density linkage map of the cichlid genome, and then we refined the placement of markers in this map by aligning the 100 bp consensus sequence of each RAD-seq tag to draft assemblies of the Oreochromis niloticus and Metriaclima zebra cichlid genomes (Bouillabase.org). The resulting genetic map comprised 23 linkage groups spanning 1669 cM (Additional file 5; Additional file 6). This map is largely co-linear with a genetic map of the model cichlid O. niloticus, which has a haploid genome of 22 chromosomes and a combined length of 1.1 Gb and 1331 cM . All linkage groups in our map share the same numbering scheme as those for O. niloticus, with the sole exception of the twenty-third linkage group, LG UN.
Summary of cis - and trans- regulatory eQTL detected for cichlid opsin expression b
The five opsin genes that vary in expression between A. baenschi and T. intermedius are all found in two tandem arrays on LG 5 (Figure 2); thus, the three eQTL on this LG represent potential cis-regulatory eQTL given their proximity to the opsins (Figure 2). Two eQTL for SWS2B and SWS2A expression on LG 5 each include a microsatellite marker located < 1 cM from the SWS2B SWS2A LWS opsin gene array. A third eQTL for RH2B expression is also located on LG 5, but it does not overlap with the other two eQTL and it does not include the microsatellite marker for the RH2B-RH2Aα-RH2Aβ array (Figure 2). Although we consider these three eQTL putative cis-regulatory loci, we note that trans-regulatory factors such as microRNAs also occur in close physical proximity to the cichlid opsin genes . Additionally, the peak of the RH2B eQTL is located far (12 cM or ~ 9 Mb) from the RH2B opsin. Therefore, it is possible that one or more of the eQTL on LG 5 actually represent divergent trans-regulatory factors. The presence of these three eQTL partially confirms the preliminary mapping results of Carleton et al.  who also found evidence of eQTL for these three opsins on LG 5, although we did not find comparable evidence for an SWS2B/SWS2A eQTL on LG 13 as in .
The remaining eQTL on LGs 1, 10, 15, and 23 represent divergent trans-regulatory loci since no opsin genes are located on these four linkage groups (the remaining opsin, SWS1, is located on LG 17 ). Two of the eQTL for LWS opsin expression are found alone on LGs 1 and 15, and a third was found on LG 10, where it overlaps perfectly with the sole eQTL for RH2A expression (Figure 2; Table 1). The fourth and final eQTL for LWS expression was found on the distal arm of LG 23, where it overlaps three other eQTL for SWS2B, SWS2A, and RH2B expression (Figure 2).
To confirm the results of our eQTL mapping analysis, we re-sequenced the RAD-seq loci underlying six eQTL and used single-marker regression to confirm these associations (Additional file 3). Our cycle sequencing results confirm the presence of eQTL for RH2B, RH2A/LWS, and SWS2B/SWS2A/RH2A expression on LGs 5, 10, and 23, respectively (ANOVA, all F > 8.45, all P < 0.006; see Additional file 3), while three additional eQTL for LWS and SWS2B/SWS2A expression on LGs 1 and 5 are also confirmed by candidate microsatellite loci (Figure 2). Since we found numerous eQTL that match our expectations from previous Castle-Wright estimates  and support these associations with re-sequencing, it seems unlikely that the genotyping errors inferred from RAD-seq impaired out ability to detect QTL in this study.
Although experimental work in humans, mammals, and fish has identified many cis- and trans-regulatory elements that control opsin gene expression and photoreceptor identity (e.g., [8–10]), our study has identified those elements of the regulatory network that allow for rapid evolutionary changes in cichlid opsin gene expression and color vision, many of which have occurred in only the last two million years. In particular, the overlapping eQTL on LGs 5, 10, and 23 suggests that many opsins may be co-regulated by tightly linked or pleiotropic loci. Consistent with the hypothesis of compensatory regulation, we find that the overlapping eQTL for SWS2B/SWS2A and RH2A/LWS expression are predominantly additive, with effects that are similar in magnitude but opposite in direction (Table 1; Additional file 7). The distinct “visual palettes” of African cichlids could therefore be explained by the presence of transcription factors or regulatory “hot spots” that control the compensatory expression of two or more opsins simultaneously. Unfortunately, however, none of the 10 candidate genes used in our analysis were found within these eQTL regions (Figure 1). But by mapping the RAD-seq markers from these regions to the M. zebra genome assembly, we found that the 95% Bayesian confidence intervals for most eQTL correspond to just one or two genomic scaffolds (Figure 2). A quick survey of genes located within these regions reveals additional candidates for photoreceptor function and opsin gene expression, including GNAO1, NEUROD1, and PAX6 (LG 1), NCOA1 and RAR-γ2 (LG 5), PRPH2 and OTX-2 (LG 10), NR2E1 (LG 15), and RXR-α (LG 23) . Future work will fine-map the causative mutations within these eQTL and use additional crosses to identify loci responsible for regulatory variation in the remaining SWS1 opsin gene. These future mapping studies should reveal more about the architecture and evolution of the genetic regulatory network governing vertebrate opsin gene expression.
Recent reviews of genetic adaptation have contrasted the role that different structural and regulatory mutations play in the evolution of morphological and physiological diversity [27–30]. To be sure, structural mutations within the opsins certainly contribute to interspecific differences in vertebrate photoreceptor sensitivity and color vision [1, 3, 4, 14]. In cichlids, A. baenschi and T. intermedius exhibit amino acid substitutions within several opsins (see Additional file 3 within ), including one, SWS2B A269T, that has been shown to shift the sensitivity of this opsin by 10 nm in some fish species . But regulatory mutations can also tune color vision by altering the ratio of photoreceptors that contain paralogous opsins of different sensitivities [5–7]. Compared to the presumably functional A269T mutation within the SWS2B opsin, we find that the overlapping regulatory eQTL for SWS2B/SWS2A expression on LGs 5 and 23 shift the expression of these opsins by an average of 6.81% and 8.95% in homozygotes, respectively (see the additive effect of each locus in Table 1 and the effect plots in Additional file 7). Although small, these regulatory changes are expected to shift the average absorbance of SWS2-containing photoreceptors by 9.00 nm and 11.90 nm – a shift that is equal to that caused by the SWS2B A269T structural mutation. This shift will be even greater if considered in combination with the other overlapping eQTL for RH2B and LWS expression on LG 23 (Figure 2). Thus, the evolution of color vision could result from regulatory as well as structural mutations, especially among species that possess several paralogous opsin genes.
In addition to comparisons of structural and regulatory mutation, reviews of regulatory evolution have also contrasted the likely impact of cis- and trans-regulatory mutations on expression phenotypes [27, 30, 32]. These studies have emphasized a key role for cis-regulatory evolution and the past decade of research has uncovered a growing number of cis-regulatory mutations responsible for morphological adaptations among divergent but closely related species (e.g., [33–36]). One reason for this emphasis is undoubtedly discovery bias. By definition, cis-regulatory mutations occur in close proximity to the genes they regulate; therefore, they are better candidates for genetic mapping than trans-regulatory mutations, which could be located anywhere in the genome. But a second, more compelling reason for this emphasis is the unique action of cis-regulatory alleles. Cis-regulatory alleles are typically additive and modular in their effect. These two features increase the chance that alternate cis-regulatory alleles can reach fixation via natural selection, and they also minimize any negative consequences of pleiotropy that could result from structural mutations within the developmental regulatory genes that typically govern morphological evolution [27, 30]. Furthermore, although the potential for mutation is probably greater for trans- rather than cis-regulatory sequences , direct comparisons within and between species suggest that trans-regulatory mutations are initially more common for intraspecific comparisons, but that cis-regulatory mutations accumulate preferentially over time and dominate interspecific and intergeneric comparisons [38–43]. Thus, both theory and observation suggest that cis-regulatory mutations should preferentially contribute to interspecific divergence in gene expression and phenotypic adaptation, and examples of this have become increasingly common, while examples of phenotypic adaptation due to trans-regulatory mutations are exceptionally rare (e.g., [44, 45]).
However, here we study inter-generic divergence in cichlid opsin expression and find that trans-regulatory loci account for the majority (8/11 or 73%) of the eQTL identified. Additionally, these trans-regulatory eQTL contribute to the divergent expression of all five variable opsins (Figure 2, Table 1), whereas the potential cis-regulatory eQTL on LG 5 only contribute to the divergent expression of three opsins, and then only in combination with other trans-regulatory eQTL (Figure 2). Thus, our results suggest that trans-regulatory variation in gene expression may contribute to phenotypic evolution more commonly than expected, and that the significance of trans-regulatory divergence need not be limited to intraspecific variation. In the future, allele-specific expression can be used to better proportion interspecific variation in cichlid opsin expression among different cis- and trans-regulatory factors; however, our current eQTL analysis indicates that divergence in trans-regulatory loci disproportionately contribute to the evolution of opsin gene expression and photoreceptor sensitivity among African cichlid fishes.
Five factors likely underlie our finding that trans-regulatory eQTL disproportionately contribute to the evolution of cichlid color vision; two factors are related to our study design and three to real biological phenomena. First, by measuring opsin gene expression directly, we can easily proportion vision QTL into potential cis- and trans-regulatory loci based on their genomic position relative to the opsin genes. Studies that examine the sequence and expression of genes related to specific phenotypes, like the opsins and color vision, may be better suited to tease apart the role of different structural and regulatory mutations than studies that simply start with morphological or physiological traits and map those easiest to identify (for an example, see ). Second, the cichlids of Lake Malawi are the product of an adaptive radiation that has produced hundreds of new species in less than 1 million years . Many cichlid species and genera remain interfertile and share numerous polymorphisms, possibly making interspecific comparisons of regulatory divergence among cichlids more analogous to intraspecific comparisons among other model species . Third, the opsin genes have a limited physiological function. They encode structural proteins that are not regulatory, and their expression is generally restricted to the neural retina and skin , although opsins are also expressed within the brain and ovaries of some fish species . This limited function means that either structural, cis-, or trans-regulatory mutations could be used to tune photoreceptor sensitivity while still avoiding the negative consequences of pleiotropy that are supposed to favor cis-regulatory alleles [27, 29, 30]. Fourth, the opsin genes occur at the end of a large regulatory network that includes multiple interacting transcription factors, including the retinoic acid and thyroid hormone nuclear receptors . Since the number of transcription factors that control a gene’s expression is positively correlated with the degree of trans-regulatory divergence , the large number of trans-regulatory factors that control vertebrate opsin expression suggests that these genes could be predisposed to evolve new expression patterns in trans. Finally, the distinct “visual palettes” of African cichlids are the result of heterochronic changes that simultaneously alter the expression of multiple opsins genes. Phenotypes intermediate to these three palettes are generally not found among wild-caught species [13, 16]. Theoretically, the coordinated compensatory changes required to produce these palettes could be easily achieved by a mutation that alters the structure or expression of a single trans-regulatory factor that controls multiple downstream opsins. In contrast, separate mutations upstream of each opsin would be required to make the same changes via cis-regulatory divergence, especially for opsins that are not joined in a tandem array. Thus, trans-regulatory mutations may be better suited to the evolution of complex phenotypes that are governed by the coordinated expression of multiple genes, like those that govern many sensory systems.
Although the evolution of color vision in vertebrates is primarily attributed to structural mutations within the opsin genes, we have shown that divergence in both cis- and trans-regulatory loci can also tune vertebrate photoreceptor sensitivity. In particular, we found that divergence in trans-regulatory eQTL disproportionally contribute to the inter-generic evolution of opsin gene expression and photoreceptor sensitivity among African cichlid fishes. Although several unique properties of the opsins may explain this observation, our results suggest that trans-regulatory mutations could contribute to phenotypic divergence more commonly than previously expected, especially in cases where the coordinated and compensatory expression of multiple genes are necessary to drive the evolution of phenotypic traits.
We sampled 115 progeny from three A. baenschi x T.intermedius F2 crosses previously described in Carleton et al. . Each cross had the same T. intermedius grandfather but unique A. baenschi grandmothers. We used 72, 23, and 20 F2 from each cross, respectively. We raised all progeny to sexual maturity (>6 months post fertilization) and then euthanized them at approximately the same time of day using buffered MS-222 following University of Maryland IACUC protocol R-09-73. Once euthanized, we clipped fin tissue for DNA and dissected retinas from each fish. We then extracted total RNA from the retinas, which we reverse-transcribed to cDNA and used to measure the expression of the opsin genes via quantitative real-time PCR (grouping the genetically and functionally similar RH2A-α and -β opsins). These methods followed our previously published protocols [13, 15, 17].
Our methods for RAD-seq library construction followed the protocols of Etter et al. . We extracted genomic DNA from 10–20 mg fin tissue from each fish using a DNeasy Blood & Tissue extraction kit and 4.0 μL RNAse (Qiagen). We estimated the concentration of each DNA sample using the Quant-iT™ dsDNA High-Sensitivity Kit (Invitrogen) and a BioTek FLx800 Fluorometer. Next, we digested 1 μg of DNA from each individual in a 50 μL restriction reaction of 10 units (U) SbfI-HF (New England Biolabs [NEB]) and NEBuffer4 (NEB) following the manufacturers protocols. We then ligated a modified Solexa© adaptor to each sample (P1, 2006 Illumina, Inc., all rights reserved; top: 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTxxxxxTGCA-3′; bottom: 3′-TTACTATGCCGCTGGTGGCTCTAGATGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGAxxxxx-5′), where xxxxx denotes one of 32 5-bp barcodes used to identify the sequences of each individual within a library (Additional file 8). We then combined 3.75 μL (0.0625 μg) DNA from the reactions of 32 individuals into a common 120 μL pooled library and randomly sheared them to an average length of 500 bp using a UCD-300 Biorupter (Diagenode) on high for 5 cycles of 30 seconds ON and 30 seconds OFF. We ran the contents of each library on a separate 1.25% agarose gel and excised DNA fragments between 300 bp and 650 bp and cleaned them using a MinElute Gel Extraction kit (Qiagen). Next, we repaired the blunt ends of each DNA library with a 10X Quick Blunting enzyme kit (NEB), cleaned the reaction with a QiaQuick PCR Purification kit, then added adenine overhangs to the 3′ end of the DNA fragments using Klenow 3′- 5′ exo (NEB). We once again cleaned the reaction with a QiaQuick PCR Purification kit, then we ligated a second modified Solexa© adaptor (P2, 2006 Illumina, Inc., all rights reserved; top: 5′-Phos-GATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCAGAACAA-3′; bottom: 3′-TCTAGCCTTCTCGCCAAGTCGTCCTTACGGCTCTGGCTAGAGCATACGGCAGAAGACGAAC-5′) to the DNA. Finally, we PCR amplified 2 μL of each raw library in a 50 μL reaction using modified Solexa© amplification primers (2006, Illumina, Inc., all rights reserved; P1 forward: 5′-AATGATACGGCGACCACCGA-3′; P2 reverse: 5′-CAAGCAGAAGACGGCATACGA-3′), and 2U Phusion Taq with 5X Phusion HF Buffer. The sole exception between our method and the one described in Etter at al.  is that we PCR amplified our reduced representation RAD-seq libraries for 19 cycles instead of 18. We gel-extracted each PCR amplified library from a separate 1% agarose gel, cutting out DNA fragments between 300 bp and 800 bp. We purified each extracted library with a MinElute Gel Extraction kit, then eluted the final libraries with 20 μL Buffer EB. We quantified the concentration of each library with Quant-iT™ and the BioTek fluorometer, then diluted the final library to 10 nM and sent them to the University of Oregon for sequencing.
Following sequencing, we filtered the reads for quality (Q20 across 90% of the 100 bp read) and the presence of a complete SbfI site and 5-bp barcode (Additional file 8). We then used the program Stacks v0.996  to assemble the reads and infer genotypes for linkage and QTL mapping. The Stacks script denovo_map.pl was used to (a) assemble the reads from the P0 and F2 into unique loci and identify potential heterozygous alleles, (b) aggregate the stacks from the P0 of each family into a separate catalog of orthologous loci and identify SNPs between them, and (c) match these loci back to their respective progeny and infer genotypes at loci containing SNPs. Because we generally expect to find only one polymorphism every 1 Kb between cichlid species , we allowed for one mismatch when assembling the 100 bp reads into loci and when matching orthologous loci among the P0 and F2 progeny. Additionally, we also chose conservative parameters to infer genotypes from the RAD-seq loci. We modified the default parameters of the Stacks script genotypes to set min_hom_seqs = 20, max_het_seqs = 0.08 or 2/25, and min_het_seqs = 0.02 or 1/50, where min_hom_seqs is the minimum number of sequencing reads required to call a genotype homozygous in the F2, max_het_seqs is the minimum ratio of the depth of the minor allele to the major allele required to call a genotype heterozygous, and min_het_seqs is the maximum ratio of the depth of the minor allele to the major allele required to call a genotype homozygous. For the reads that comprise a locus in any given F2, if the ratio of the depth of the minor allele to the major allele is greater than max_het_seqs (0.08), it is called a heterozygote; if the ratio of the depth of the minor allele to the major allele is less than min_het_seqs (0.02), it is called a homozygote; and if the ratio falls between max_het_seqs and min_het_seqs, it is considered ambiguous and is called as unknown or missing. Following genotyping, we further filtered the markers by excluding those that were (a) not differentially fixed between the parents of at least one cross, (b) genotyped in less than 50% of the F2, (c) did not meet the cut-off for Hardy-Weinberg equilibrium following a chi-square test at P > 0.0495, and (d) could not be unambiguously mapped to the cichlid genome within a Burroughs Wheeler aligner (BWA) edit distance of ≤2. To fulfill this latter criterion, we mapped the 100 bp consensus sequences from each cross’s denovo_map catalog to a draft assembly of the Metriaclima zebra genome (M. zebra v0, see the comparative cichlid genome browser at Bouillabase.org) using the program BWA v0.6.1 with default parameters . Finally, since Stacks v0.996 cannot automatically filter the haplotypes of the P0 based on the same genotyping parameters specified in genotypes, we manually inspected all markers passing the above criteria in the P0. We then further filtered markers that did not meet the same min_hom_seqs, max_het_seqs, and min_het_seqs criteria in the P0, although we raised the cutoff of min_het_seqs to 0.04 or 2/50, since the coverage for each stack was generally much greater in these individuals. The composition of genotypes in our final data set was 28.5% AA, 42.7% AT, and 28.8% TT.
We identified the location of several candidate genes for vertebrate opsin regulation in the Gasterosteus aculeatus genome (Broad/gasAcu1 assembly, Feb. 2006), then used the resulting coordinates to search for sequences from a genome assembly of the cichlid Oreochromis niloticus aligned to G. aculeatus (; see the comparative cichlid genome browser at Bouillabase.org). We used the programs Perfect Microsatellite Repeat Finder (http://sgdp.iop.kcl.ac.uk/nikammar/repeatfinder.html) and Sequencher® to identify microsatellite and SNP sequences within 600 kb (< 1 cM) of each gene, then designed primers and fluorescently-labeled probes for PCR with the program Primer3Plus . We used the same method to generate PCR and sequencing primers for 11 RAD-seq loci. Following PCR, we cycle sequenced candidate and RAD-seq loci containing SNPs using the BigDye terminator v3.1 cycle sequencing kit (Applied Biosystems, ABI). We ran the products of all microsatellite and SNP reactions on an ABI 3730xl Genetic Analyzer and genotyped them using GeneMapper® (ABI) and Sequencher® (GeneCodes) software. Additional file 2 lists all candidate gene and re-sequencing primers used.
We used the program R/qtl  to assemble the RAD-seq and candidate loci into a genetic linkage map of the cichlid genome and scan for QTL associated with opsin gene expression. We assembled markers into linkage groups (LG) based on recombination fractions after specifying a minimum LOD cutoff of 4.35, which is the cutoff that yielded the ~ 22 LG expected from the genome of the model cichlid Oreochromis niloticus. We then determined the order of markers along each LG by rippling their position within a window of seven markers and chosing the order that required the smallest number of obligate crossovers. We further refined this order by manually grouping markers that mapped to the same scaffold of the Metriaclima zebra genome assembly and then estimated the final inter-marker distances using the Carter-Falconer map function . We used the comparative cichlid genome browser to BLAST the consensus sequence of each marker to the anchored assembly of the Oreochromis niloticus cichlid genome (; assemblies available at Bouillabase.org) in order to infer the orthology of each LG relative to the genetic map of O. niloticus.
After linkage mapping, we scanned the cichlid genome for eQTL associated with opsin expression using stepwiseqtl, a model-selection approach to multiple-interval-mapping implemented in R/qtl. This method is described in detail by Broman and Sen . For this analysis, we specified the use of Haley-Knott regression when calculating the LOD of eQTL. We restricted all eQTL positions to genotyped markers only – we did not impute inter-marker loci at fixed distances in the genome – but we did fill in missing genotypes at marker positions using simulations based on the genotypes of near-by markers. Finally, we also used 1000 permutations of the data among hybrid families in order to account for family structure when estimating penalties for incorporating additional eQTL into the forward and backward selection models and when estimating the 95th percentile of genome-wide LOD scores under a null model of no eQTL.
Finally, we estimated the impact of changes in opsin gene expression on the average sensitivity of SWS2-containing photoreceptors by calculating the average wavelength of maximum absorbance of these photoreceptors weighted by the maximum absorbance of the SWS2B and SWS2A opsins and their relative level of expression in T. intermedius homozygotes. We describe this method in several previous publications [7, 13, 16, 20]. In general, this method produces results that are similar to values of photoreceptor absorbance determined physiologically .
Expression Quantitative Trait Loci
Logarithm of the Odds
Short wavelength sensitive opsin
Long wavelength sensitive opsin
Restriction site-associated DNA
Standard error of the mean
Bayesian credible interval.
We thank RB Roberts, C O’Brien, CT O’Quin, TD Kocher, and two anonymous reviewers for their helpful suggestions on earlier versions of this manuscript. We also thank CM Hofmann and J Ser for their help breeding and raising the fish used in this study. This work was supported by grants to KLC from NSF (IOS-0841270), NIH (R15 EY016721-01), and University of Maryland. KEO was supported by a Wayne T. and Mary T. Hockmeyer Doctoral Fellowship and an Ann G. Wylie Dissertation Fellowship from the University of Maryland. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.