The present, systematic analysis of indels in proteins of nematode origin has provided useful information to make inferences about protein evolution and function and could have important implications for the identification of novel genetic, biochemical or physiological markers and targets for drug design.
It is known that indels are distinct from point substitutions, and that their evolution is affected by distinct influences. The present results support the proposal that insertions and deletions relate to distinct evolutionary processes. The number of deletions is significantly higher than that of insertions, and deletions are larger than insertions. A previous study also found more deletions than insertions in mammalian coding sequences. Mutational bias appears to be one cause of this excess of deletions. Analyses of non-coding genomic sequences have revealed a mutational process biased toward deletions [43, 44], and the present results indicate that similar mutational mechanisms might apply to protein-coding sequences. However, the larger size of the observed deletions cannot be explained by mutational bias, since deletion and insertion mutations have similar sizes. The role that natural selection and adaptation have played here is therefore unclear. From a structural perspective, proteins tolerate insertions better than deletions, although single-residue deletions can be tolerated. Therefore, deletions, particularly larger ones, are not expected to be maintained, which contrasts with the current observations. It is likely that function-related selection has played a role. Such selection has been identified in a primate sperm ion channel protein. Both the increased number of deletions at the terminal nodes (Figure 4) and the increased number of common deletions (Table 1) among parasitic nematodes further suggest that this function-related selection is associated with recent species adaptation. Nonetheless, the larger number of deletions and their larger sizes (compared with insertions) indicate a decrease in the size of nematode transcriptomes during evolution. This size reduction seems to be consistent with the tendency for nematode genomes to be smaller than those of some other metazoa, such as flatworms and birds.
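The claim that deletions outnumber insertions can be checked with a simple test. The sketch below is not the test used in this study; it is a minimal illustration that assumes indel events are independent and equally likely to be insertions or deletions under the null hypothesis. The function name `two_sided_binomial_p` and the counts are hypothetical placeholders, not values from Table 1.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for observing k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null success probability p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so that ties with the observed outcome are included.
    cutoff = probs[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in probs if q <= cutoff))


# Hypothetical counts for illustration only (not data from this study).
deletions, insertions = 60, 40
p_value = two_sided_binomial_p(deletions, deletions + insertions)
print(f"two-sided p-value for a deletion excess: {p_value:.4f}")
```

A small p-value would indicate that the imbalance between deletions and insertions is unlikely under equal rates; it would not by itself distinguish mutational bias from selection.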
The biased distribution of indels across functional pathways (Table 3) further demonstrates the selective forces acting on them, and suggests an association between increased indels and functional adaptation. Proteins involved in genetic information processing are believed to be subject to stringent selective constraints, and are under strong "purifying selection" [51, 52]. Accordingly, these proteins have the lowest numbers of insertions and deletions per protein (Additional file 4). Depending on their function, some proteins are under positive selection and accumulate more insertions and deletions. The present study showed that nematode proteins involved in cellular processes (Table 3, Additional file 2), including endocrine signaling pathways and the immune system (such as Toll-like receptors and antigen processing), had 50% more deletions and insertions than those involved in genetic information processing. This finding agrees with those of previous studies [53, 54]. Rapidly evolving genes are also considered to be frequently associated with the immune and endocrine systems in other organisms [54, 55]. These systems are considered to be key to specific molecular interactions with the environment, where rapid adaptation to a food source or host may be required. Overall, the present results suggest that proteins bearing sizable nematode-specific indels are functionally grouped. Furthermore, adaptation can lead to an increase in protein sequence changes, including substitutions, insertions and deletions [55, 57]. The high rates of insertion and deletion events in proteins involved in multiple pathways might also be viewed as evidence of functional adaptation, as suggested by recent research. Although relaxed selective pressure can also lead to such high rates, this is unlikely here. The selective pressure on these proteins is assumed to be greater because they are involved in multiple pathways and potentially interact with more proteins. Their substitution rates tend to be lower. Thus, the increase in indels in proteins involved in multiple pathways is suggested to reflect a positive role in the adaptation of nematodes to their host and environment.
More direct evidence for roles of nematode-specific indels in adaptation comes from a comparison of their distributions in different groups of nematodes. The number of indels common to plant-parasitic nematodes (PPN), compared with other nematodes, is higher for proteins involved in electron transport and lipid/fatty acid/steroid metabolism (Table 2). The higher number could be related to the adaptation of PPN to their specific lifestyle, which includes different stages capable of surviving in aerobic and anaerobic environments. Adaptation of these functional classes related to energy metabolism is important for parasites. Direct biochemical evidence of adaptation in these two classes has been observed [60–62]. The common indels detected here could be one reason for the observed biochemical changes, and suggest that PPN are likely to use indels, as well as lateral gene transfer, as a strategy for adaptation. Nonetheless, given the nature and potential bias of our data, the upcoming parasitic nematode genome data will enable more comprehensive studies, which should lead to firmer conclusions.
There is a very limited number of indels occurring on internal nodes (Figure 4), although these internal nodes do not represent short evolutionary times. For example, the internal branch leading to the split of Clades V, IV and III represents more than 100 million years (the branch leading to Clade III represents about 350 million years [29, 30]). If all indels were retained and their rates were constant, the numbers of indels on these internal branches would be expected to be of the same order of magnitude as those on the terminal branches. The extremely low numbers of insertions and deletions observed on internal nodes could reflect that nematodes have high indel rates, such that the number of newly generated indels is higher than that of older indels. As a result, sequence comparisons detect fewer shared indels. However, it is also possible that an indel burst occurred in the evolution of the Nematoda during a recent adaptation to their life niches.
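The constant-rate expectation above can be made concrete with a small calculation. This is a sketch under stated assumptions: all indels are retained, the rate is the same on every branch, and the expected count is simply rate multiplied by branch duration. The rate and the terminal branch duration are hypothetical placeholders; only the 100-million-year lower bound for the internal branch is taken from the text.

```python
import math


def expected_indels(rate_per_my: float, branch_my: float) -> float:
    """Expected number of retained indels on a branch under a constant rate."""
    return rate_per_my * branch_my


def same_order_of_magnitude(a: float, b: float) -> bool:
    """True when a and b differ by less than a factor of ten."""
    return abs(math.log10(a / b)) < 1.0


RATE_PER_MY = 0.5    # hypothetical indels per million years
INTERNAL_MY = 100.0  # lower bound for the internal branch quoted in the text
TERMINAL_MY = 50.0   # hypothetical terminal branch duration

internal = expected_indels(RATE_PER_MY, INTERNAL_MY)
terminal = expected_indels(RATE_PER_MY, TERMINAL_MY)
print(internal, terminal, same_order_of_magnitude(internal, terminal))
```

Because the rate cancels in the ratio, the expected counts on two branches differ only by the ratio of their durations. Internal counts that fall far below terminal counts therefore point to a violation of one of the assumptions, such as loss of older indels or a recent increase in the rate.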
In addition to providing insights into fundamental aspects of indel events in proteins, including their evolution and roles in adaptation, this study might assist in the design of intervention strategies against nematodes. Indels common to all parasitic nematodes could provide drug targets for broad control. If they are restricted to nematodes, specific targeting will not affect the regular functionality of the homologous proteins in the hosts. Importantly, these indels might also be crucial for the survival of parasites (otherwise they would not be shared by all parasites, since parasitic adaptations are proposed to have occurred several times independently). The use of indels to design effective drugs has been successfully explored by Nandan et al., who designed a compound that targeted a 12-residue deletion in the EF-1α protein of the protozoan parasite Leishmania donovani. The compound attacks the parasite by blocking this protein without affecting the human homologue. It is likely that an indel used for drug design should be located on the surface of a protein or should relate to a unique structural component of a protein. Sizable deletions uniquely shared by all parasites could possess similar characteristics and become targets for such approaches as well. One such example is the deletion of an entire helix in a functionally important mitochondrial carrier protein (Figure 5). Hence, the conserved nematode-specific molecular signatures identified here have possible applications for advancing our understanding of the nematodes.