Novel sequencing technologies have dramatically increased the number of available genome sequences over the last few years. However, the biological value of a new genome sequence is limited when little is known about homologous sequences in other organisms. The absence of homology to any known sequence, as is the case for a large fraction of P. pacificus genes [9, 10], exemplifies our lack of knowledge about the genome content of this model organism. Therefore, we have started to complement the available P. pacificus genome, transcriptome and proteome data with next-generation sequencing of related species [3, 9]. This approach has facilitated a mechanistic understanding of some of the HGT events that occurred in the evolutionary lineage giving rise to P. pacificus. For cellulase genes acquired from microbes, P. pacificus and related Pristionchus nematodes show functional assimilation, high gene turnover and rapid sequence diversification associated with positive selection.
In P. pacificus, the scarab beetle-associated ecology might provide a number of potential donors for HGT. The decaying beetle is an ecosystem consisting of bacteria, fungi, nematodes and, presumably, a large number of unicellular eukaryotes. The previously described examples of cellulase and Diapausin genes clearly indicate that microbes and insects, at least, must be considered as potential HGT donors into the P. pacificus genome. One inroad into the identification of HGT events is a computational archaeology approach as originally described for E. coli.
In this study, we hypothesized that a substantial fraction of the P. pacificus orphans might have been introduced into the genome by means of HGT. Here, we define an orphan gene as a gene with no similarity to any other nematode sequence. Under the assumption that some horizontally transferred genes may exhibit a codon usage bias that is more similar to the donor genome than to the acceptor genome [1, 13], we showed that a fraction of P. pacificus orphans exhibits an atypical codon usage relative to the rest of the genome. The fact that the majority of orphan genes show a codon usage typical for nematodes might be due to two circumstances. First, HGT events most likely occurred repeatedly, with more recent HGT events preferentially showing a codon usage bias. Second, with multiple potential donors, no common patterns of atypical codon usage are expected. For example, nematodes, insects and fungi show closely related codon usage patterns, whereas protozoans and other microbes, all of which are potential donors for HGT, exhibit very different codon usages. Our analysis confirmed the similarity in codon usage among insects, nematodes and fungi: GC-normalized RSCU distances of P. pacificus genes to the genome-wide profiles of P. pacificus, Drosophila melanogaster, and Aspergillus nidulans showed strong correlations (r > 0.87, Pearson). This similarity highlights the need for a careful investigation of potential HGT events. We consider the work presented in this study a novel computational entry point towards the identification of HGT patterns in P. pacificus.
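To make the codon usage comparison concrete, the following is a minimal sketch of how relative synonymous codon usage (RSCU) and a distance between two RSCU profiles could be computed. It is an illustration under stated assumptions, not the pipeline used in this study: the function names rscu and rscu_distance are hypothetical, the distance is a plain Euclidean distance, amino acids that are absent from a sequence are assigned the neutral value 1.0, and the GC normalization mentioned above is omitted.

```python
from collections import Counter, defaultdict
from itertools import product
import math

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous codon families; stop codons and single-codon amino acids (M, W)
# carry no information about synonymous usage and are skipped.
_families = defaultdict(list)
for _codon, _aa in CODON_TABLE.items():
    if _aa != "*":
        _families[_aa].append(_codon)
FAMILIES = {aa: codons for aa, codons in _families.items() if len(codons) > 1}


def rscu(seq):
    """Return a dict codon -> RSCU for an in-frame coding sequence.

    RSCU = observed codon count / mean count over its synonymous family.
    Assumption of this sketch: families with no observations get 1.0.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            values[c] = counts[c] * len(codons) / total if total else 1.0
    return values


def rscu_distance(profile_a, profile_b):
    """Euclidean distance between two RSCU profiles (an assumed metric)."""
    return math.sqrt(sum((profile_a[c] - profile_b[c]) ** 2 for c in profile_a))
```

In such a setup, a genome-wide profile would be obtained by applying rscu to the concatenated coding sequences of a species, and each gene would then be scored by its distance to the profile of each candidate donor group.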
In addition to the strong association of orphan genes with atypical codon usage, we characterized this codon usage pattern by comparison to genome-wide profiles for 71 species corresponding to six taxonomic groups. The extent to which codon usage profiles can predict species and taxonomic groups is still limited. However, comparisons of subsets of genes against all genes may help uncover the domain or phylum from which these genes entered the P. pacificus genome. The most significant enrichment was detected for insect-like codon usage (P < 10^-54).
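An enrichment of this kind can be assessed by asking how likely it is that a gene subset contains at least the observed number of genes with, for example, insect-like codon usage when drawn at random from all genes. The sketch below uses a one-sided hypergeometric test for this purpose. The choice of test is an assumption made for illustration, since the statistic is not restated here, and the function name and argument names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric draw without replacement.

    population: total number of genes
    successes:  genes in the population with the property of interest
    draws:      size of the gene subset being tested
    observed:   genes in the subset that have the property
    """
    total = comb(population, draws)
    upper = min(draws, successes)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    # Exact integer arithmetic up to this point keeps very small
    # probabilities accurate before the final conversion to float.
    return tail / total
```

Because the sum is evaluated with exact integers, very small tail probabilities can be represented without the underflow that a naive floating-point product would risk.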
It is important to note that atypical patterns of codon usage may also arise from other sources such as translational efficiency or secondary structures (see  for review). Thus, analysis of codon usage alone may not be sufficient to support the proposed HGT events. We therefore complemented this analysis with cross-species comparisons to identify genes that show the greatest similarity to homologs in insects.
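Such a cross-species comparison can be sketched as a best-hit screen: for each gene, the strongest hit in insects is compared with the strongest hit in nematodes. The example below is a minimal illustration only. The input format, the use of bit scores as the similarity measure, the function name and the gene identifiers in the usage note are all assumptions and do not describe the exact criteria applied in this study.

```python
from collections import defaultdict


def insect_like_candidates(hits):
    """Return the sorted IDs of genes whose best insect hit beats the best nematode hit.

    hits: iterable of (gene_id, group, bitscore) tuples, where group is
    "nematode" or "insect". Hits from any other group are ignored.
    A gene with no nematode hit is treated as having a nematode score of 0.
    """
    best = defaultdict(lambda: {"nematode": 0.0, "insect": 0.0})
    for gene_id, group, bitscore in hits:
        scores = best[gene_id]
        if group in scores:
            scores[group] = max(scores[group], bitscore)
    return sorted(
        gene_id
        for gene_id, scores in best.items()
        if scores["insect"] > scores["nematode"]
    )
```

For a toy input in which a hypothetical geneA scores 210 against insects and 95 against nematodes, geneB scores 120 and 300, and geneC has only an insect hit, the function returns geneA and geneC. A real screen would likely add thresholds on alignment length or E-value, which are left out here.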
We identified 509 HGT candidates by running homology searches against a combined nematode and insect protein database and scanning for genes that bear greater resemblance to insect genes than to their closest homologs within the nematode phylum. These HGT candidates showed a significantly higher similarity to insect-like codon usage profiles. Further investigations revealed that, in addition to the previously identified Diapausins (Table 2), many of these genes encode endonuclease and reverse transcriptase proteins. Since 70 of the 159 P. pacificus reverse transcriptase sequences show a higher degree of similarity to those of insects, we speculate that reintroduction of these elements from insects represents one mechanism by which P. pacificus has acquired genes. Phylogenetic analysis of all HGT candidates identified by cross-species homology could provide more detailed information and further support for the proposed HGT events. Although P. pacificus is not an insect-parasitic nematode, dauer larvae of P. pacificus are in constant physical contact with beetles. After the death of the beetle, nematodes resume development and feed on microorganisms growing on the carcass, presumably for several generations. Close physical contact between donor and recipient has been proposed as one criterion for HGT, making beetles plausible HGT donors. While our data suggest that a substantial fraction of P. pacificus orphans originates from insect genomes, it is possible that HGT involves vectors as intermediate carriers. Many viruses are known to coexist with insects, often in species-specific interactions, so viruses are obvious candidates for mediating HGT into P. pacificus. This hypothesis is supported by the finding that parts of the Diapausin genes found in leaf beetles and P. pacificus have also been observed in iridoviruses. We therefore hypothesize that viruses are potential intermediate carriers that promote HGT events from insects into P. pacificus.
Our data, however, strongly support a second scenario. We identified a large number of non-LTR retrotransposon sequences in the P. pacificus genome that have their highest sequence similarity to insects. In addition to permanence, integration into the host's biology is one necessary feature of HGT. The non-LTR retrotransposons are unlikely to have a beneficial effect on the biology of P. pacificus. Thus, the strong enrichment of retrotransposon-associated genes among HGT candidates defined by cross-species homology seems counterintuitive. One explanation for this observation is that retrotransposons might have served as carriers of foreign genetic material into the P. pacificus genome. This hypothesis is supported by the fact that we detected a tendency for orphan genes and other HGT candidates to be colocalized with retrotransposon genes. It could provide one possible explanation for the presence of foreign retrotransposons and could serve as a model for how other genes might have integrated into the P. pacificus genome. An open question is the permanence of the transferred genetic material. Comparison with other Pristionchus species indicates that a substantial fraction of HGT candidates dates back to ancestral Pristionchus sequences. However, more data from wild isolates will be needed to robustly measure the amount of selection acting on these genes.
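The colocalization of HGT candidates with retrotransposon genes can be quantified, for example, as the fraction of genes that have a retrotransposon within a fixed distance on the same contig. The sketch below illustrates one way to compute this fraction. The function name, the representation of features as (contig, position) pairs and the window size are assumptions made for illustration, not a description of the analysis performed here.

```python
from bisect import bisect_left
from collections import defaultdict


def fraction_near(genes, elements, window):
    """Fraction of genes with at least one element within `window` bp.

    genes, elements: lists of (contig, position) pairs, with position
    given as a single coordinate such as the feature midpoint.
    """
    by_contig = defaultdict(list)
    for contig, pos in elements:
        by_contig[contig].append(pos)
    for positions in by_contig.values():
        positions.sort()

    near = 0
    for contig, pos in genes:
        positions = by_contig.get(contig, [])
        # First element at or beyond the left edge of the window.
        i = bisect_left(positions, pos - window)
        if i < len(positions) and positions[i] <= pos + window:
            near += 1
    return near / len(genes) if genes else 0.0
```

Comparing the value obtained for HGT candidates with the value for all remaining genes, or with values from randomly sampled gene sets of the same size, would indicate whether the candidates lie closer to retrotransposons than expected by chance.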