Our experiments demonstrate that the predicted gene C05D2.4, which encodes an aromatic L-amino acid decarboxylase (AADC), corresponds to the genetically defined bas-1 gene. Serotonin immunoreactivity is restored in bas-1 mutants by DNA containing an intact C05D2.4 gene, but not by DNA mutated in C05D2.4. The adjacent AADC-homologous gene, C05D2.3, is not needed to rescue bas-1 mutants. The bas-1 gene is therefore likely to encode the serotonin- and dopamine-synthetic AADC of C. elegans. Although we did not test for rescue of dopamine expression, three lines of evidence suggest that bas-1 encodes the same AADC required for dopamine (DA) synthesis. First, mutants with point mutations in C05D2.4 (the bas-1 alleles n2948 and n3008) have previously been shown to be DA-deficient, and neither appears to contain mutations in the C05D2.3 gene. Second, AADC proteins from other animals have consistently been shown to catalyze both the 5HTP and L-dopa decarboxylation reactions. Finally, a bas-1 reporter construct is expressed in both identified serotonergic and dopaminergic cells.
The bas-1 gene is expressed in at least two alternatively spliced forms, one of which appears to be less common and contains a small additional exon of 27 nucleotides. The short segment of protein encoded by the additional exon and the surrounding region are not found in other AADC proteins, suggesting a novel function for this region of the AADC protein. In other organisms, the serotonin- and dopamine-synthetic AADC genes undergo alternative splicing that results in tissue-specific protein isoforms. Currently we have no indication that bas-1 is expressed in any cells other than serotonergic and dopaminergic neurons, and no information about the functional significance of this alternative splicing.
AADC has received somewhat less attention with respect to the regulation of serotonin and dopamine synthesis than the specific, rate-limiting synthetic enzymes tryptophan hydroxylase and tyrosine hydroxylase. This is in part due to the view that AADC activity is neither limiting nor regulated. Regulation of AADC activity by protein kinase A (PKA)-dependent phosphorylation has recently been proposed based on in vitro experiments, although its functional significance has been questioned. Our examination of the predicted BAS-1 protein revealed several potential phosphorylation sites that are highly conserved, although none fits the consensus sequence for PKA phosphorylation. Any possible regulation of C. elegans AADCs by phosphorylation remains speculative.
Possible functions of other AADC homologous genes in C. elegans
We compared the protein sequences of other predicted AADCs in C. elegans with those of other organisms in order to infer their possible functions. This is particularly relevant because all bas-1 mutants retain weak, residual serotonin immunoreactivity (C. Loer, unpublished), suggesting that other enzymes may be able to carry out the same reaction. This would not be surprising, since animal AADCs tend to have broad specificity. Based purely on sequence homology, it seems that the predicted genes K01C8.3 and ZK829.2 could act as AADCs or as histidine decarboxylases (HisDCs). In fact, the predicted gene K01C8.3 is now believed to encode a tyrosine decarboxylase and has been named tdc-1. If this is correct, then its best match in Drosophila (G30446), an uncharacterized AADC homolog, is likely to encode the fly's octopamine-synthetic tyrosine decarboxylase. It has long been known that a separate gene encodes this enzymatic activity in flies, since the activity is still detectable in Ddc deletion mutants. It will be interesting to see whether such tyrosine decarboxylases in animals have more restricted substrate specificity, like the tyrosine and tryptophan decarboxylases in plants, or are more similar to typical animal AADCs with broad specificity. Tighter substrate specificity of the tdc-1 protein could be reflected in the much slower rate of amino acid substitution seen between its C. elegans and C. briggsae orthologs than between the bas-1 orthologs, which encode more 'promiscuous' enzymes.
Whether C. elegans or other nematodes make the neurotransmitter histamine, and therefore need a HisDC enzyme, is unclear. Although histamine has reportedly been isolated from C. elegans, this observation is unique among nematodes and has not subsequently been confirmed. There is no particularly good candidate for a HisDC in C. elegans. The ZK829.2 predicted protein may be most closely related to tdc-1 in its core sequence, although its long N- and C-terminal extensions are perhaps suggestive of a new function. Unfortunately, transgenics carrying reporter fusions of this gene have so far shown no expression, so there is no expression pattern from which a function might be inferred (C. Loer, unpublished; M. Alkema, personal communication). As with tdc-1, C. elegans ZK829.2 and its C. briggsae ortholog have also evolved more slowly than the bas-1 orthologs. A recent analysis of eukaryotic AADC sequences that includes C. elegans ZK829.2 and its C. briggsae ortholog as the only nematode representatives clearly demonstrates that AADC genes can evolve at very different rates, and that a constant "molecular clock" cannot be assumed in phylogenetic analyses.
Finally, since the C09G9.4 predicted protein is so highly divergent from the typical AADC, and lacks a critical lysine residue that binds the PLP cofactor, it is unlikely to be an AADC enzyme. It has a similar level of divergence from genuine AADCs as do other group II PLP-dependent enzymes such as cysteine sulfinic acid decarboxylase, to which it has little or no similarity. Whatever the function of a C09G9.4-encoded protein, it appears to represent a new PLP-DC-related protein; sequencing of more genomes may yet reveal additional members.
Duplicate gene retention and loss in Caenorhabditis
We found that the closest relatives of C05D2.4/bas-1 in C. elegans, the genes C05D2.3 and F12A10.3, are missing from C. briggsae. Furthermore, phylogenetic analysis indicates that the two extra genes did not arise in the C. elegans line, but were present (or their common ancestor was present) in the species that gave rise to both the C. elegans and C. briggsae lines. Finally, careful examination of the cDNAs and predicted protein sequences of C05D2.3 and F12A10.3 reveals that neither is likely to be functional as an AADC: the former lacks critical amino acids and the latter can encode only a truncated protein. Both are expressed, based on the presence of cDNAs, but probably at a very low level that is not above background in microarray experiments. It is possible that the duplicate genes are functionally 'lost' in C. elegans as well.
The features of C05D2.3 and F12A10.3 raise a number of interesting questions about the fate of duplicate genes and the true nature of many 'predicted genes' in C. elegans. Taking a random sample of predicted genes and generating transgenics with reporter fusion constructs (in order to determine a pattern of expression), Mounsey and colleagues found that a much higher percentage of recently duplicated genes than of conserved or unique genes failed to show expression. Assuming that technical failures were no more likely among recently duplicated genes, this result implies that many more of them are genuinely not expressed. The numbers suggested that up to 20% of annotated, predicted genes in C. elegans may be pseudogenes. In fact, careful inspection of recently duplicated genes showed that many were actually pseudogenes, as we found to be the case for F12A10.3. Overall, close inspection of predicted genes revealed that at least 4% were pseudogenes.
So, why are C05D2.3 and F12A10.3 still present in C. elegans if they lack a function? C. briggsae and C. elegans may have diverged 80 to 110 million years ago [37, 38]. Since the bas-1-like gene or genes were likely present in the common ancestor of C. elegans and C. briggsae, there seems to have been ample time for loss in the C. elegans line. Under a simple model of gene loss following duplication, the mean time to fix a null allele of the gene duplicate would be only a few million generations. In Caenorhabditis, a million generations could be completed in 10,000 years or less. This suggests that the downstream duplicate of bas-1 (the ancestor of C05D2.3) may have continued to function for a considerable time after the duplication, perhaps maintained by gene conversion, which might have continued until the duplicate had diverged sufficiently from bas-1. Loss of the critical six amino acids occurred after the second duplication, which gave rise to the ancestor of F12A10.3, since the appropriate sequence is still present there (although frame-shifted). It is also possible that the C05D2.3 gene retains some function. The gene still encodes a respectable protein, albeit one that seems unable to function as an AADC. It has diverged considerably from bas-1, but has not accumulated the stop codons and frameshifts expected for a pseudogene. Walsh has proposed that fixation of an allele with an advantageous new function, rather than decay into a pseudogene, may be the fate of many duplicate genes even when such mutations are rare, given a sufficiently large population.
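The time-scale argument above can be made concrete with a short calculation. This is only an illustrative sketch: the rate of 100 generations per year is an assumption back-calculated from the statement that a million generations fit into 10,000 years, and the function names are hypothetical.

```python
# Assumed rate implied by "a million generations in 10,000 years";
# the text says "or less", so this is a lower bound on the rate.
ASSUMED_GENERATIONS_PER_YEAR = 100.0


def generations_elapsed(years, generations_per_year=ASSUMED_GENERATIONS_PER_YEAR):
    """Number of generations completed in the given number of years."""
    return years * generations_per_year


def years_for_generations(generations, generations_per_year=ASSUMED_GENERATIONS_PER_YEAR):
    """Number of years needed to complete the given number of generations."""
    return generations / generations_per_year


# Divergence window quoted in the text: 80 to 110 million years.
low = generations_elapsed(80e6)
high = generations_elapsed(110e6)
print(f"Generations since divergence: {low:.1e} to {high:.1e}")
print(f"Years per million generations: {years_for_generations(1e6):.0f}")
```

Under this assumed rate, the divergence window corresponds to billions of generations, roughly three orders of magnitude more than the few million generations expected for fixation of a null allele.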
C05D2.3 and F12A10.3 seem to have been retained longer than expected. Lynch and Force proposed that the unexpectedly high rate of gene duplicate retention in eukaryotic genomes is due to 'subfunctionalization', the retention of a portion of the original single gene's function by each of the duplicates, which then complement one another. Although this was suggested to occur primarily through regulatory mutations that partition expression of the genes spatially, other forms of subfunctionalization could also occur. Another possible reason for retaining such genes is that transcription and splicing of these sub-functional transcripts serve a non-coding regulatory role that affects the transcription of other nearby genes, although a bas-1::GFP construct is expressed well without such sequences in cis.
Our analysis of synonymous vs. non-synonymous substitutions indicates that the bas-1-like genes C05D2.3 and F12A10.3 are under relaxed selection relative to bas-1 and other AADCs. It should be noted that precise quantitative comparisons cannot be made with the results presented in the C. briggsae whole genome analysis, since we used a different method of calculating KA/KS; however, our calculations indicate that bas-1 and the other AADCs, like most genes in the Caenorhabditis genomes, are under strong purifying selection. Even if both C05D2.3 and F12A10.3 are now pseudogenes, a significant period during which they functioned and were under purifying selection could obscure this fact in an analysis of KA/KS. Likewise, even if C05D2.3 has acquired a new, adaptive function, that function might result from changes at only a few sites in the protein, and so could again be obscured by a majority of sites under purifying selection. With the sequencing of three related Caenorhabditis species, it will be interesting to learn the fates of bas-1 and the bas-1-like genes in other lines.
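As a rough illustration of the distinction between synonymous and non-synonymous changes, the sketch below tallies codon differences between two aligned coding sequences. It is not the method used in our analysis and does not compute KA/KS itself, which additionally normalizes by the number of synonymous and non-synonymous sites and corrects for multiple substitutions. The function name and the example sequences are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, non_synonymous) counts of differing codons.

    Both inputs must be in-frame, aligned coding sequences of equal
    length. Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")
    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base; not counted
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Made-up example: GCT -> GCC keeps alanine; AAA -> AGA changes Lys to Arg.
print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))
```

A gene under strong purifying selection would be expected to show mostly synonymous differences in such a tally, whereas a gene under relaxed selection would accumulate proportionally more non-synonymous ones.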