Variation at key sites among LWS opsin gene duplicates in Poeciliidae
The guppy (Poecilia reticulata) and species in its sister group Micropoecilia possess four LWS opsin genes that we have named LWS S180, LWS S180r, LWS A180, and LWS P180. The first two genes encode proteins with the five key-site haplotype SHYTA, and the second two encode proteins with the key-site haplotypes AHYTA and PHFAA, respectively. Three of these genes (LWS S180, LWS S180r, and LWS P180) were also amplified and sequenced from pygmy swordtail (Xiphophorus pygmaeus) genomic DNA. We found only the LWS S180 gene in Tomeurus gracilis. Serine (S) and alanine (A) are common residues at position 180, but proline (P) is rare. The proline residue encoded by LWS P180 might disrupt the transmembrane domain [47, 48] and compromise opsin protein function [48, 49]. However, several observations suggest that the LWS P180 protein is functional. First, LWS P180 is at least 44 million years old, as it evolved before the divergence of Poecilia and Xiphophorus, and it has no other amino acid substitutions that are expected to disrupt function. Second, the LWS P180 locus has diverged from paralogous LWS opsins in ways that are expected to enhance color vision: positions 277 and 285 have experienced tyrosine-to-phenylalanine and threonine-to-alanine substitutions, respectively. These are the same key-site substitutions involved in the evolution of an MWS opsin from an LWS opsin in humans. Third, this gene is expressed, albeit at very low levels. Finally, LWS opsins from arctic lamprey, turbot, and the spotted green pufferfish also have a proline at position 180 and no other substitutions likely to disrupt protein function.
All four LWS opsins uncovered in this study are predicted to have unique roles in color vision. Because they carry three different five key-site haplotypes among them, they are predicted to be most sensitive to three different wavelengths of light. Also, despite sharing its key-site haplotype with LWS S180, the LWS S180r opsin differs from all other LWS opsins at amino acid positions known to play a role in binding and activating transducin.
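The haplotype bookkeeping above can be illustrated with a short Python sketch. The four haplotype strings are those reported in the text. The position labels 197 and 308 are not stated in this section and are assumed here from the conventional five key-site numbering; the function names are hypothetical and are not part of any analysis pipeline used in this study.

```python
# Five key-site haplotypes of the four guppy LWS paralogs, as given in the text.
KEY_SITE_HAPLOTYPES = {
    "LWS S180": "SHYTA",
    "LWS S180r": "SHYTA",
    "LWS A180": "AHYTA",
    "LWS P180": "PHFAA",
}

# Assumption: conventional numbering of the five key sites. Only 180, 277,
# and 285 are named explicitly in this section.
KEY_SITE_POSITIONS = (180, 197, 277, 285, 308)


def group_by_haplotype(haplotypes):
    """Map each distinct haplotype to the sorted list of genes that carry it."""
    groups = {}
    for gene, hap in haplotypes.items():
        groups.setdefault(hap, []).append(gene)
    return {hap: sorted(genes) for hap, genes in groups.items()}


def differing_positions(hap_a, hap_b, positions=KEY_SITE_POSITIONS):
    """Return the key-site positions at which two haplotypes differ."""
    if not (len(hap_a) == len(hap_b) == len(positions)):
        raise ValueError("haplotypes must have one residue per key site")
    return [pos for pos, a, b in zip(positions, hap_a, hap_b) if a != b]


groups = group_by_haplotype(KEY_SITE_HAPLOTYPES)
print(len(groups), "distinct haplotypes among", len(KEY_SITE_HAPLOTYPES), "genes")
print(differing_positions("SHYTA", "PHFAA"))
```

Grouping by haplotype shows why four genes yield only three predicted spectral classes: LWS S180 and LWS S180r share SHYTA, so any functional difference between them must come from sites outside the five key positions.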
Southern blot experiments in our study revealed four bands (Fig. 1), consistent with the hypothesis that Cumaná guppies have four LWS opsin loci. Hoffman et al. produced a Southern blot with only three bands and suggested that guppies have a minimum of two LWS opsin genes. Variation in LWS opsin gene number among populations may be another trait guppies share with humans [50–52].
Two of the four LWS opsin genes described here (LWS S180 and LWS A180) were reported in Hoffman et al.'s study of guppies from the Quare and Oropuche Rivers in Trinidad. Although sequence data reported by Weadick and Chang did not include all five key sites, our phylogenetic analysis indicates that Weadick and Chang sequenced portions of all four loci from their Paria River guppy. The phylogenetic relationships among guppy LWS opsin paralogs reported by Weadick and Chang differ from those shown here. Both topologies were produced using maximum parsimony (MP), but by surveying more individuals and more species, and by obtaining longer sequences, we have produced a larger set of parsimony-informative characters. The effect is most evident in the relationships among LWS P180 and Weadick and Chang's variant 5 and variant 4. The large number of differences between variant 5 and variant 4 is apparent in Weadick and Chang's tree, where maximum likelihood branch lengths have been superimposed on the MP topology. Nonetheless, in their analysis, these two sequences form a monophyletic group. The sister relationship between variant 5 and variant 4 disappears with the addition of LWS P180 genes from other Poecilia species and Xiphophorus pygmaeus, because many of the unique nucleotides (autapomorphies) in variant 5 become synapomorphies (shared derived traits) in this larger dataset. Also, two of the three characters that had united variant 5 and variant 4 in Weadick and Chang's MP analysis (adenines at positions 126 and 213 in their alignment) appear to be homoplasious when compared to a much larger set of LWS P180 and LWS S180 sequences. The origin of these apparent homoplasies is intriguing and is discussed below.
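The effect of taxon sampling on parsimony-informative characters can be made concrete with a minimal Python sketch. It uses the standard definition (a site is parsimony-informative when at least two character states each occur in at least two sequences) and a toy alignment invented for illustration; the sequences and function names are hypothetical and do not correspond to the data analyzed here.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-?N"):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(ch.upper() for ch in column if ch.upper() not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative_sites(alignment):
    """Count parsimony-informative columns in a dict of aligned sequences."""
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    return sum(is_parsimony_informative(col) for col in zip(*seqs))


# Toy data (hypothetical): taxon_b carries a unique 'A' at the last site.
small = {"taxon_a": "ACGT", "taxon_b": "ACGA", "taxon_c": "ACGT"}
# Adding a second sequence that shares the 'A' makes that site informative.
large = dict(small, taxon_d="ACGA")

print(count_informative_sites(small))
print(count_informative_sites(large))
```

In the small alignment the final site is an autapomorphy of a single sequence and carries no grouping information; once a second sequence shares the derived state, the same site becomes a potential synapomorphy.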
Mechanisms of LWS duplication in poeciliids
The first duplication event that expanded the guppy LWS opsin repertoire produced two genes that have retained the SHYTA five key-site haplotype: LWS S180 and LWS S180r. The latter duplicate is missing introns II-V and is likely a product of retrotransposition. Partial cDNA sequences for each gene have been reported, but their intron-exon structure was unknown until now. It is not clear how LWS S180r has retained retinal expression, although it may occur in the vicinity of other LWS opsins (and their regulatory modules) in the guppy genome. Of interest, gene duplication by retrotransposition has also produced a pair of fish rhodopsin 1 (RH1) genes called errlo and single-exon rho. errlo and rho occur on different chromosomes. Therefore, the observation that rho is expressed in the retina (rod cells) demonstrates that upstream regulatory elements can be retained during retrotransposition.
In medaka (Oryzias latipes) and zebrafish (Danio rerio), duplicated LWS opsins are linked and oriented in a head-to-tail manner [54, 55]. Phylogenetic analysis shows that independent mutations produced these gene pairs. Our study also characterized an LWS opsin tandem duplication event. Duplication of the LWS S180 gene early during the evolution of poeciliids produced an LWS opsin that retained the SHYTA haplotype (LWS S180) and an LWS opsin that evolved a PHFAA haplotype (LWS P180). These two genes are linked, but in an inverted (i.e., tail-to-tail) orientation. Several models have been proposed to explain the formation of inverted duplicates. One is secondary rearrangement after duplication by unequal sister chromatid exchange. Another is intra-chromosomal replication slippage in trans. This occurs when the DNA polymerase reverses direction, using either the nascent strand (intra-molecular strand switch) or the opposite strand (inter-molecular strand switch) as a template. By running backwards, the polymerase produces a duplicate of the just-completed sequence in an inverted orientation before it switches back to the correct template. The DNA downstream of LWS S180 has strings of adenines and thymines (data not shown) that might have facilitated strand switching by the polymerase during DNA replication. The inverted arrangement of LWS opsins in the Poecilia genome might make them even more prone to additional duplication events. Therefore, variation in LWS gene number (among species or populations) would not be surprising. Xiphophorus pygmaeus also has the LWS P180 gene. Post-duplication gene transposition or the expansion of intergenic DNA are possible explanations for our failure to amplify DNA between LWS P180 and LWS S180 in this species.
The third and most recent LWS gene duplication uncovered in our study led to the production of LWS S180 and LWS A180. In the Cumaná guppy, the first five exons and four introns of LWS A180 are most similar to LWS S180, and the last intron, exon, and 3' UTR are identical to LWS P180. These data suggest that in this population LWS A180 is a hybrid gene; its formation might have been facilitated by the inverted tandem orientation of LWS S180 and LWS P180. As mentioned above, variant 5, reported by Weadick and Chang, is also a hybrid sequence, with approximately one half of the sequence identical to the LWS S180 gene and the other half identical to LWS P180. As is the case for LWS A180 reported here in the Cumaná guppy, the position of variant 5 in the tree based upon the 390 bp alignment depends upon which fragments are used in the phylogenetic analysis (i.e., it would occur in the LWS S180/A180 clade if only the first 220 bp were utilized). As there are many more phylogenetically informative characters in the second half of this sequence, variant 5 was placed in the LWS P180 clade when the entire sequence was used, despite being identical to LWS A180 sequences over the first 220 bp. The observation that variant 5 has not diverged from either of its progenitor sequences, and that it was recovered only once by Weadick and Chang, leads us to conclude that while both halves of the sequence can be found in the guppy genome, their concatenation is an artefact produced by template switching or mismatch repair during cloning.
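The reasoning about hybrid sequences can be sketched in Python. The example below scans every possible split point of a query sequence and reports the interval of splits at which the left part best matches one candidate parent and the right part best matches the other. It is a simplified illustration under stated assumptions (ungapped, equal-length sequences and a single breakpoint), not the method used in this study, and the toy sequences and function names are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def breakpoint_interval(query, left_parent, right_parent):
    """Return (first_k, last_k, best_fraction) over all split points k.

    For each k, query[:k] is compared with left_parent[:k] and query[k:]
    with right_parent[k:]. Split points that tie for the best score are
    reported as an interval, because a breakpoint can only be localized
    between the diagnostic sites that flank it.
    """
    n = len(query)
    if n == 0 or not (n == len(left_parent) == len(right_parent)):
        raise ValueError("sequences must be non-empty and of equal length")
    scores = []
    for k in range(n + 1):
        left = sum(q == p for q, p in zip(query[:k], left_parent[:k]))
        right = sum(q == p for q, p in zip(query[k:], right_parent[k:]))
        scores.append(left + right)
    best = max(scores)
    ks = [k for k, s in enumerate(scores) if s == best]
    return ks[0], ks[-1], best / n


# Toy parents (hypothetical): one difference in the first half, three in the second.
parent_s = "ACGTACGTACGT"
parent_p = "ATGTACGAAGCT"
chimera = parent_s[:6] + parent_p[6:]

print(identity(chimera[:6], parent_s[:6]))   # first half matches parent_s
print(identity(chimera, parent_s), identity(chimera, parent_p))
print(breakpoint_interval(chimera, parent_s, parent_p))
```

In this toy case the chimera is identical to one parent over its first half, yet its whole-sequence identity is higher to the other parent because most of the differences between the parents lie in the second half. This mirrors how the placement of a hybrid sequence in a tree can depend on which fragment is analyzed.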
In the larger alignment (Fig. 2), LWS S180 and LWS A180 are not partitioned into monophyletic clades. For instance, the P. bifurca sequence that clusters with the LWS S180 genes has an alanine at position 180. One explanation for this observation is that the P. bifurca LWS A180 sequence is an allele of the LWS S180 locus. A similar situation occurs in non-African humans where a common allele of the LWS opsin locus (which typically has the SHYTA haplotype) has an alanine in position 180 and thus, an AHYTA five key-site haplotype [59, 60].
The evolutionary consequences of LWS opsin duplication in guppies
The evolutionary implications of opsin gene duplication and divergence depend largely upon the expression patterns of these genes. In several species, the possession of a large opsin family allows the retina to be spectrally tuned for different environments and/or life stages. For example, eels (Anguilla anguilla) have two rhodopsins, each tuned to slightly different wavelengths. They express a green-shifted locus as juveniles in fresh water and a paralogous blue-shifted locus when they return to the ocean and mature. The lamprey (Geotria australis) also adjusts its spectral sensitivity by changing opsin gene expression as it moves between marine and riverine environments. In cichlids, opsin gene expression varies during development. Of particular note is the observation that variation in LWS opsin sequence and expression is associated with variation in water turbidity [9, 64]. This has led to the hypothesis that species- and population-level differences in opsin gene sequence and expression represent adaptations for foraging in either turbid or clear water, and that these differences in spectral sensitivity may drive and/or maintain divergence in male coloration via sexual selection. Guppies, however, do not move very far during their lifetime, and thus differential use of opsin gene duplicates in different habitats is an unlikely explanation for the evolution of LWS opsin gene diversity in this taxon.
The simultaneous expression of opsin paralogs with different sensitivities might expand the region of the spectrum where the guppy possesses high sensitivity and expand the range of detectable wavelengths. This enhancement and broadening of wavelength sensitivity can occur when individual cone cells express more than one opsin gene or when adjacent cone cells express different opsins (e.g., humans and transgenic mice; see below). Microspectrophotometry (MSP) data in guppies showing cells with a broad range of sensitivities in the long-wave region of the spectrum [18, 19] are consistent with the hypothesis that the sensitivity of some cones is a consequence of the co-expression of different LWS opsins. By providing guppies with a broad region of maximum wavelength sensitivity, LWS opsin gene duplication and divergence might make multi-colored male guppies appear brighter (more conspicuous) to other guppies, but not to predators with wavelength sensitivity limited (by LWS opsin gene copy number) to a narrower region of the spectrum.
Expression of different LWS opsins in adjacent cones not only improves overall spectral sensitivity, but is also the basis of wavelength discrimination. Observations from humans and mice are consistent with the hypothesis that opsin gene duplication and divergence can lead to better color discrimination even without any associated revisions to neuroanatomy. Among women who are heterozygous at either the LWS or MWS locus, some appear to have a pattern of X-inactivation that leads to tetrachromacy. These women see an average of 10 colors in a spectrum, whereas trichromatic women typically see seven [, but see reference ]. In mice, the hypothesis that extra opsin genes can improve wavelength discrimination was supported by data from females expressing an SWS opsin and two LWS opsins (an endogenous LWS gene on one X chromosome and a human LWS opsin on the other). These knock-in mice performed better in wavelength discrimination tests than wild-type mice with a single LWS gene.
Sexual selection in guppies favors males with more red, orange, and yellow color patches [71–73], suggesting that females use color diversity (chroma) to evaluate males. This may be because males with more chroma are more conspicuous. However, the 'extra' chroma is a consequence of the guppy visual system, and this conspicuousness may not, therefore, apply to predators. Finally, if LWS opsin gene duplication improves motion detection, as proposed by White et al., then female guppies might also be 'pre-adapted' to evaluate the well-characterized sigmoid display, a behavior that consists of the male arching its body into an S-shape and oscillating the long axis of the body both horizontally and vertically.
LWS expression in the guppy
Gene expression data will help us to test alternative hypotheses about the adaptive value of LWS opsin diversity. To expand the range of maximum sensitivity and enhance wavelength discrimination, different opsins must be expressed at the same time. All four LWS opsin gene transcripts were amplified from cDNA derived from adult eyes in our lab and by Weadick and Chang. However, our qPCR experiments on three adults (1 male, 2 females) showed that most of the LWS opsin mRNA in the Cumaná guppy retinas was derived from the LWS A180 gene. Human SWS (blue) cone cells make up only 15% of the retinal cone cell repertoire, yet play an important role in wavelength discrimination. Therefore, qPCR data showing unequal expression among LWS opsin paralogs in three adults do not rule out a role for LWS opsin gene duplication and divergence in wavelength discrimination in guppies, but they do indicate the need for further investigation. We are currently using qPCR to examine LWS expression in a larger sample of adults and in fish at different stages of development. Finally, duplicated opsins are sometimes expressed in different regions of the retina [55, 76]. In-situ hybridization experiments are underway to test the hypothesis that different LWS opsin paralogs have unique expression domains within the guppy retina, as is the case in zebrafish.
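For readers unfamiliar with how qPCR output translates into relative transcript abundance, the following Python sketch converts threshold-cycle (Ct) values into the proportion of total LWS transcript contributed by each paralog. It assumes that every primer pair amplifies with the same efficiency (perfect doubling by default), which would need to be verified empirically. The Ct values are hypothetical and are not data from this study, and the quantification method actually used here may differ.

```python
def relative_proportions(ct_values, efficiency=2.0):
    """Convert Ct values to each gene's share of the summed transcript pool.

    Assumption: all primer pairs share one amplification efficiency, so the
    starting abundance is proportional to efficiency ** (-Ct).
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    if not ct_values:
        raise ValueError("at least one Ct value is required")
    min_ct = min(ct_values.values())
    # Scale against the lowest Ct to keep the numbers well-conditioned.
    relative = {gene: efficiency ** -(ct - min_ct) for gene, ct in ct_values.items()}
    total = sum(relative.values())
    return {gene: value / total for gene, value in relative.items()}


# Hypothetical Ct values, chosen only to illustrate one dominant paralog.
example_ct = {
    "LWS A180": 20.0,
    "LWS S180": 24.0,
    "LWS S180r": 25.0,
    "LWS P180": 28.0,
}

shares = relative_proportions(example_ct)
for gene, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{gene}: {share:.3f}")
```

Because abundance scales exponentially with Ct, a difference of only a few cycles corresponds to a many-fold difference in transcript level, which is why one paralog can account for most of the LWS mRNA while the others remain detectable.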