Evolution of the vertebrate insulin receptor substrate (Irs) gene family
© The Author(s). 2017
Received: 3 March 2017
Accepted: 7 June 2017
Published: 23 June 2017
Insulin receptor substrate (Irs) proteins are essential for insulin signaling as they allow downstream effectors to dock with, and be activated by, the insulin receptor. A family of four Irs proteins has been identified in mice; however, the gene for one of these, IRS3, has been pseudogenized in humans. While it is known that the Irs gene family originated in vertebrates, it is not known when it originated or which members are most closely related to each other. A better understanding of the evolution of Irs genes and proteins should provide insight into the regulation of metabolism by insulin.
Multiple genes for Irs proteins were identified in a wide variety of vertebrate species. Phylogenetic and genomic neighborhood analyses indicate that this gene family originated very early in vertebrate evolution. Most Irs genes were duplicated and retained in fish after the fish-specific genome duplication. Irs genes have been lost on various lineages, including Irs3 in primates and birds and Irs1 in most fish. Irs3 and Irs4 experienced an episode of more rapid protein sequence evolution on the ancestral mammalian lineage. Comparisons of the conservation of the protein sequences among Irs paralogs show that domains involved in binding to the plasma membrane and insulin receptors are most strongly conserved, while divergence has occurred in sequences involved in interacting with downstream effector proteins.
The Irs gene family originated very early in vertebrate evolution, likely through genome duplications, and in parallel with duplications of other components of the insulin signaling pathway, including insulin and the insulin receptor. While the N-terminal sequences of these proteins are conserved among the paralogs, changes in the C-terminal sequences likely allowed changes in biological function.
Keywords: Insulin receptor substrate, Gene duplication, Protein evolution, Episodic evolution, Phylogeny, Vertebrate, Pseudogene
The intracellular actions of insulin are initiated by the binding of the hormone to its specific cell surface receptor, the insulin receptor [1, 2]. The insulin receptor is a heterotetrameric protein consisting of two extracellular alpha subunits and two transmembrane beta subunits that are connected by disulfide bridges [3, 4]. The binding of insulin to the extracellular alpha subunits of the receptor induces a conformational change that activates the intracellular tyrosine kinase domain found in the beta subunits [5, 6]. Once the tyrosine kinase activity is triggered, the insulin receptor autophosphorylates key tyrosine residues (Tyr-1158, Tyr-1162, and Tyr-1163, in the human sequence) in the intracellular portion of the beta subunit. Phosphorylation of these sites then allows interactions with docking proteins, which are also subsequently tyrosine phosphorylated by the insulin receptor tyrosine kinase activity, and downstream signaling via SH2 domain-containing proteins to yield physiological responses. Insulin can initiate several different signaling pathways that regulate metabolic responses, cell survival, growth, and differentiation [1, 2, 9].
Docking proteins are key molecules as they allow the aggregation of components of signaling cascades. The first insulin receptor docking protein identified in mammalian cells was Insulin receptor substrate 1 (Irs1), with three additional docking proteins (Irs2, Irs3, and Irs4) subsequently characterized and found to share similarity in their sequences with Irs1 [11–13]. The four characterized members of the Irs protein family share similar protein architectures, with fairly well conserved pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains located near their N-termini and relatively long C-terminal extensions [14–18]. The C-terminal extensions, which show lower levels of similarity than the N-terminal region, contain multiple tyrosine phosphorylation motifs (as well as serine/threonine phosphorylation motifs) that interact with multiple signaling proteins [14–18]. The PH and PTB domains aid in targeting Irs proteins to the plasma membrane and insulin receptor, respectively [19, 20], while differences in the tyrosine phosphorylation motifs in the C-terminal sequences of the Irs proteins allow interactions with distinct downstream signaling pathways [15, 18]. Only three of the four Irs proteins found in the mouse are functional in humans, as the IRS3 gene sequence has been pseudogenized. Intriguingly, Irs3, at only 494 amino acids in length, is less than half the size of the other three characterized Irs proteins, which are about 1200–1300 amino acids in length [10–13]. Compared to the other Irs proteins, Irs3 has a shorter C-terminal domain but retains similar-sized PH and PTB domains [12, 18]. Additional proteins containing both the PH and PTB domains have been identified (i.e., Dok4 and Dok5) that interact with the insulin receptor; however, these proteins lack a C-terminal extension with multiple phosphotyrosine motifs.
While Irs proteins were initially identified due to their interaction with the insulin receptor, they also interact, as docking proteins, with receptors for other growth factors, such as the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR), that also contain intracellular tyrosine kinase domains [23, 24].
Irs proteins exert their unique functions through a combination of tissue-specific expression and differential binding of downstream signaling proteins [14–18, 25]. Irs1 is found in many classical targets of insulin action and is important for insulin sensitivity and embryonic and post-natal body growth. Irs2 is found in an overlapping set of tissues with Irs1, but appears to have a more important role in mediating the neuronal effects of insulin and the growth and survival of pancreatic beta-cells. On the other hand, the function of Irs4 has been difficult to identify as genetic knockouts of this gene have little physiological effect. However, when these knockouts are combined with a brain-specific Irs2 knockout, unique changes in energy regulation and glucose homeostasis are observed. Irs3 is not essential for growth or glucose metabolism and its expression is restricted to white adipose tissue in mice [12, 32] (and is absent in humans), suggesting a possible, but non-essential, role for this protein in adipose tissue in rodents. In contrast to other Irs proteins, the PH domain of Irs3 has an additional role in targeting Irs3 to the nucleus, as well as to the plasma membrane, a localization necessary for Irs3-induced glucose uptake. Loss of the Irs3 gene on the human lineage indicates that the function of this gene is not essential in some mammals, and raises questions about the necessity of multiple Irs proteins.
A single Irs-like protein, named Chico, has been found in Drosophila melanogaster that also interacts with the Drosophila insulin receptor. Like the mammalian Irs proteins, Chico is a large protein of about 1000 amino acids in length that contains PH and PTB domains near its N-terminus and multiple phosphotyrosine motifs in its C-terminal region. Only a few studies have examined the origin and evolution of the vertebrate Irs gene family. These studies concluded that the genes diverged on the vertebrate lineage, but reached differing conclusions concerning the relationships among the 4 Irs proteins [17, 35–37]. A number of questions remain to be answered. While it appears that the Irs genes duplicated and diverged from each other on the vertebrate lineage, before the mouse-human divergence, how early in vertebrate evolution this occurred is currently unknown. Did the duplications occur very early in vertebrate evolution in parallel with the duplications of other members of the insulin signaling pathway such as insulin and the insulin receptor [39, 40]? Irs3 was lost on the human (primate) lineage. Was this loss a unique event, or has this gene been lost on other lineages? Have other Irs genes been lost on other vertebrate lineages? Which gene(s) are best conserved (i.e., potentially most essential), both in terms of retention in genomes and in conservation of their sequences within vertebrates? Why is the Irs3 protein sequence much shorter than those of other Irs proteins? When did the protein become smaller? Here we show that the Irs genes duplicated very early in vertebrate evolution, likely at a similar time as the origin of the insulin and insulin receptor gene families [38–40] and as a consequence of the two rounds of genome duplications that occurred in the vertebrate ancestor [41, 42].
Our analyses also show that Irs3 has been lost on multiple independent lineages, and that the genes for other Irs proteins, including Irs1 and Irs2, have occasionally been lost. The length of the Irs3 protein was reduced on the early tetrapod lineage, after divergence from fish, and was followed by a period of rapid sequence evolution in an early mammalian ancestor. Intriguingly, Irs4 also experienced an episode of rapid evolution, in parallel with Irs3, early in mammalian evolution.
Number of insulin receptor substrate (Irs) genes in vertebrate genomes
Numbers of Irs-like genes found in diverse vertebrates in the genome and coding sequence databases
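Each cell gives the count in the genome database (Ensembl) / the count in the coding sequence database (NCBI); the last row gives the totals.
Species | Irs1 | Irs2 | Irs3 | Irs4
43 / 81 | 42 / 81 (73) | 39 / 73 (28) | 31 / 38 (29) | 42 / 78 (51)
5 / 54 | 5 / 51 (4) | 5 / 32 (5) | 0 / 0 (0) | 5 / 54 (4)
2 / 5 | 2 / 5 (4) | 2 / 5 (1) | 2 / 5 (4) | 2 / 5 (4)
1 / 2 | 1 / 2 (1) | 1 / 1 (1) | 1 / 1 (1) | 1 / 2 (2)
1 / 1 | 1 / 1 (1) | 1 / 1 (1) | 1 / 1 (1) | 1 / 1 (0)
11 / 25 | 3 / 5 (5) | 21 / 44 (43) | 21 / 49 (46) | 21 / 47 (29)
1 / 1 | 1 / 1 (1) | 1 / 1 (1) | 0 / 0 (0) | 1 / 1 (1)
1 / 0 | 0 / 0 (0) | 1 / 0 (0) | 0 / 0 (0) | 0 / 0 (0)
65 / 167 | 55 / 146 (89) | 71 / 157 (80) | 56 / 94 (81) | 73 / 188 (91)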
Many of the Irs genes identified in our searches of the Ensembl database were incomplete (i.e., did not predict complete open reading frames). Some of the incomplete coding sequences contained unsequenced gaps (Ns) in the genome assemblies, while others could have been due to sequencing errors or pseudogenization. To complement the sequences identified from the Ensembl database, a BLAST search was conducted of the NCBI database to identify Irs coding sequences (Table 1 and Additional file 2: Table S2). Searches of the NCBI database identified a larger number (167) of vertebrate species with Irs coding sequences than the Ensembl database, but many of these are species that do not have near-complete genome sequences (e.g., Xenopus laevis), thus the full complement of Irs genes in these species might not have been found. A second limitation of our NCBI searches was that we only identified Irs-like sequences that had been annotated as coding sequences (i.e., if the gene was not annotated or was a pseudogene it would not be found) (see Additional file 1: Table S1 and Additional file 2: Table S2). The total number of vertebrate species with identified Irs-like genes was 172 (59 in both Ensembl and NCBI, 1 in both the Elephant Shark Genome Project and NCBI, 107 only in NCBI, and 5 only in Ensembl, see Table 1 and Additional file 1: Table S1 and Additional file 2: Table S2). The distribution of the Irs-like gene paralogs among vertebrate classes identified in the NCBI database was similar to that seen with the Ensembl database (Table 1 and Additional file 2: Table S2).
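The species counts quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: it takes the four non-overlapping categories reported in the text and recovers both the overall total and the number of species represented in NCBI. The variable names are illustrative only.

```python
# Non-overlapping species categories as reported in the text.
both_ensembl_and_ncbi = 59
both_elephant_shark_project_and_ncbi = 1
only_ncbi = 107
only_ensembl = 5

# Every species falls into exactly one category, so the total is a plain sum.
total_species = (
    both_ensembl_and_ncbi
    + both_elephant_shark_project_and_ncbi
    + only_ncbi
    + only_ensembl
)

# Species with at least one Irs coding sequence in NCBI.
ncbi_species = (
    both_ensembl_and_ncbi
    + both_elephant_shark_project_and_ncbi
    + only_ncbi
)

print(total_species, ncbi_species)
```

The two values agree with the 172 species in total and the 167 species found in the NCBI searches.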
Phylogeny of vertebrate insulin receptor substrate (Irs) genes
To better establish the orthology-paralogy relationships among the identified Irs genes, and determine when duplications of the Irs genes occurred, phylogenetic relationships of the sequences were established using maximum likelihood [48, 49] and Bayesian approaches [50, 51]. A total of 341 full-length, or near-full-length (those missing only short portions of sequence at the N- or C-termini of their predicted proteins), Irs-like coding sequences from 172 vertebrate species, including 89 Irs1, 80 Irs2, 81 Irs3, and 91 Irs4 sequences (Table 1 and Additional file 1: Table S1, Additional file 2: Table S2 and Additional file 3: Figure S1), were used in this analysis. Maximum likelihood phylogenetic analysis of putative Irs orthologs yielded topologies consistent with the expected species topologies (Additional file 4: Figure S2, Additional file 5: Figure S3, Additional file 6: Figure S4 and Additional file 7: Figure S5; similar results were obtained using Bayesian methods, results not shown), suggesting that the analyzed genes were orthologous.
Origin of vertebrate insulin receptor substrate (Irs) genes
Duplication of Irs genes in bony fish
Duplicate copies of Irs2, Irs3, and Irs4 were found in most species of bony fish examined (Table 1 and Additional file 1: Table S1 and Additional file 2: Table S2). Bony fish experienced an additional genome duplication not shared by other vertebrates [53, 54], thus duplicated Irs genes would be expected. Duplicated Irs genes were not found in the genome of the spotted gar, a species that diverged from other bony fish prior to the fish-specific genome duplication. Phylogenetic analysis of the Irs2, Irs3, and Irs4 sequences (Additional file 5: Figure S3, Additional file 6: Figure S4 and Additional file 7: Figure S5) demonstrated that the duplications of these genes occurred early in bony fish evolution, consistent with the fish-specific genome duplication. When the genomic neighborhoods surrounding the zebrafish Irs genes were examined, only one of the fish duplicates (Irs1, Irs2b, Irs3b, and Irs4b) was located in a genomic neighborhood orthologous to those seen in mice (Fig. 2b), while the second paralogous gene (Irs2a, Irs3a, and Irs4a) resided in genomic regions with no similarity in gene composition to the genomic region found in mice.
Loss of the Irs3 gene on the primate lineage
While mice have 4 Irs genes, only 3 functional Irs genes are found in humans, as Irs3 contains mutations that introduce a stop codon and delete part of the coding sequence. Genomic sequences similar to Irs3 were identified in a number of primate genomes in the Ensembl database; however, intact coding sequences could only be predicted for the tree shrew and the mouse lemur (Additional file 1: Table S1). Similarly, searches of the NCBI database for coding sequences similar to Irs3 only identified potentially functional Irs3 coding sequences in three primate species, the mouse lemur, Coquerel’s sifaka, and Sunda flying lemur (Additional file 2: Table S2). Complete coding sequences could be predicted for the mouse lemur and Coquerel’s sifaka, but the sequences from the other two primates contained unsequenced gaps. Importantly, all four of these species with potentially intact Irs3 gene sequences are early branching lineages within primates. Alignment of the Irs3 genomic sequences from diverse primates (see Additional file 9: Figure S7) using MultiPipMaker [56, 57] demonstrated that the sequences were not well conserved, as a large number of frameshift mutations were identified along with large deletions, including those previously identified in the human IRS3 pseudogene sequence. These results suggest that Irs3 was inactivated early in primate evolution, but after divergence of the mouse lemur and Coquerel’s sifaka. When MultiPipMaker alignments were generated using the human sequence as the master sequence (results not shown), an Alu repetitive element that disrupts the human IRS3 coding region was found to be shared by Irs3 sequences from primates that lack an intact coding sequence, suggesting that the insertion of this element into the gene occurred at about the same time as the pseudogenization of the gene.
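The distinction drawn above between intact coding sequences and those disrupted by stop codons, frameshifts, or unsequenced gaps can be illustrated with a simple screen. This is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `orf_problems` is hypothetical, the standard stop codons are assumed, and a length that is not a multiple of three is used only as a crude proxy for a frameshift.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_problems(cds):
    """Return a list of reasons why a predicted coding sequence is not an intact ORF."""
    cds = cds.upper()
    problems = []
    if "N" in cds:
        problems.append("unsequenced gap (N)")
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    if not cds.startswith("ATG"):
        problems.append("missing start codon")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("missing terminal stop codon")
    return problems


# Toy sequences, invented for illustration only.
print(orf_problems("ATGGCCAAATAA"))
print(orf_problems("ATGTAAGCCTAA"))
```

An empty list indicates that none of these simple checks failed; it does not show that the gene is functional.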
Loss of the Irs3 gene in birds
In addition to the absence of Irs1 in most bony fish and Irs3 in most primates, another notable group of animals that lack a specific Irs gene is birds, where no Irs3 coding or gene sequences were identified (Table 1 and Additional file 1: Table S1 and Additional file 2: Table S2). In contrast to primates, where genomic sequences similar to Irs3 were found containing mutations that disrupt the coding sequences (see above), genomic sequences similar to Irs3 were not found in any of the bird genomes examined (Additional file 1: Table S1). To exclude the possibility that the avian Irs3 sequences had rapidly evolved, and thus were not detectable in the BLAST searches, we attempted to use genomic neighborhoods to identify these genes. However, searches for the genes that flank the mammalian Irs3 gene (i.e., Lrch4 and Agfg2, see Fig. 2) also failed to find orthologs of these genes (results not shown). These results suggest that the Irs3 genomic region, including adjacent genes, has been deleted from the genomes of birds.
Episodic evolution of vertebrate insulin receptor substrate (Irs) genes
Visual inspection of the phylogenies generated from the Irs coding sequences, using both single gene (Additional file 4: Figure S2, Additional file 5: Figure S3, Additional file 6: Figure S4 and Additional file 7: Figure S5) and gene family (Fig. 1 and Additional file 8: Figure S6) phylogenies, suggested accelerated evolution on the mammalian ancestral lineages for Irs3 and Irs4. Branch lengths displayed in our phylogenetic analysis are proportional to the number of inferred nucleotide substitutions. For both Irs3 and Irs4, mammals have accumulated more changes than sequences from the other vertebrate classes, suggesting that these genes experienced accelerated evolution early in mammalian evolution. To determine whether the longer branches are due to increased numbers of amino acid substitutions in the Irs3 and Irs4 protein sequences, we conducted relative rate tests with protein sequences encoded by Irs genes from four different mammalian species (if available) and 6 non-mammalian species (Additional file 10: Table S3). For all relative rate comparisons, the mammalian Irs3 and Irs4 protein sequences accumulated significantly higher numbers of amino acid substitutions compared to protein sequences from a diverse array of non-mammalian species. In contrast, only a small number of the comparisons with Irs1 displayed significantly higher numbers of amino acid substitutions on the mammalian lineage, with none being significantly higher on the mammalian lineage for Irs2, although there were a few cases of significantly higher numbers on the non-mammalian lineage for this protein (Additional file 10: Table S3). These results show that the proteins encoded by Irs3 and Irs4, but not Irs1 or Irs2, have accumulated increased numbers of amino acid substitutions on the mammalian lineage.
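The logic of a relative rate test can be sketched as follows. The text does not state which variant was used, so this example assumes Tajima's three-sequence test: sites where only the first ingroup sequence differs from the outgroup (m1) are compared with sites where only the second differs (m2), and the statistic (m1 - m2)^2 / (m1 + m2) is compared to a chi-square distribution with one degree of freedom. The function name and the toy sequences are hypothetical.

```python
import math


def tajima_relative_rate(seq1, seq2, outgroup, skip_chars="-X?"):
    """Tajima-style relative rate test on three aligned sequences.

    Returns (m1, m2, chi_square, p_value), where m1 counts sites at which
    only seq1 differs from the outgroup and m2 counts sites at which only
    seq2 differs from the outgroup.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(seq1, seq2, outgroup):
        if a in skip_chars or b in skip_chars or o in skip_chars:
            continue  # ignore gapped or ambiguous sites
        if b == o and a != o:
            m1 += 1
        elif a == o and b != o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi_square = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return m1, m2, chi_square, p_value


# Toy alignment, invented for illustration: the first sequence carries
# ten lineage-specific changes and the second carries none.
outgroup = "MKTAYIAKQRLLDEG"
fast_seq = "MRSGFVGRNKLLDEG"
slow_seq = "MKTAYIAKQRLLDEG"
print(tajima_relative_rate(fast_seq, slow_seq, outgroup))
```

A small p-value indicates that the two ingroup lineages have accumulated unequal numbers of substitutions since their divergence.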
Changes in the lengths of vertebrate insulin receptor substrate (Irs) proteins
Lengths of Irs proteins from representative vertebrate species
Conservation of Irs protein sequences
Tyrosine phosphorylation of Irs protein sequences
Tyrosine phosphorylation of Irs proteins
Prairie deer mouse
Origin of the Irs gene family
While multiple Irs-like genes have been previously characterized in several mammalian species [10–13, 17, 36], only a few non-mammalian Irs-like genes have been identified, which limited the ability to resolve when this gene family originated and how the different genes are related to each other [17, 35–37]. Here, our searches have identified a large number of Irs-like genes from a diverse array of vertebrate classes, which should allow better estimation of the time when this gene family originated and how the different genes are related to each other. Searches of vertebrate genomes identified multiple Irs-like sequences in the genomes of representative species for all vertebrate classes except Agnatha (jawless fish) (Table 1 and Additional file 1: Table S1 and Additional file 2: Table S2). However, given the low coverage of the sea lamprey somatic genome and the loss of DNA in this species due to genomic remodeling in somatic tissue, Irs-like sequences may have been missed in this jawless fish. These observations suggest that the Irs gene family originated early in vertebrate evolution, and possibly before the earliest divergence of extant vertebrate species.
Phylogenetic analyses of the sequences (Fig. 1 and Additional file 8: Figure S6) strengthened this conclusion, demonstrating that the multiple genes originated early in vertebrate evolution and were not due to parallel duplications on diverse lineages. Analysis of genomic neighborhoods is a powerful tool for identifying orthologs, especially in gene families where multiple sequences have similar levels of similarity to a putative ortholog, as only the true orthologs share genomic neighborhoods. In this context, we used genomic neighborhoods to confirm the orthology of many of the diverse Irs genes found in vertebrates. When the genomic locations of Irs-like genes were examined (Fig. 2), three of the 4 Irs genes were found to be in genomic neighborhoods that shared similar gene contents. The sharing of paralogous genes among genomic neighborhoods is consistent with these genes originating through genome duplications, which suggests that at least 3 of the 4 Irs genes originated via the two rounds of genome duplication that occurred in the common ancestral vertebrate lineage [41, 42]. Interestingly, both the insulin and the insulin receptor [39, 40] gene families originated very early in vertebrate evolution, and potentially via the same genome duplications. Irs proteins not only interact with the insulin receptor, but also with other receptors, including the insulin-like growth factor 1 (IGF-1) receptor and the insulin receptor-related receptor (Irr) [23, 24]. These observations suggest that duplications of the genes for the ligands, receptors, and docking proteins could lead to increased specialization in these signaling pathways, and the possibility to evolve new functions.
Change in number of Irs genes
While the Irs genes originated very early in vertebrate evolution, the number of Irs genes varies between species. Similar variations in the numbers of genes within gene families involved in insulin signaling in vertebrates have previously been reported [38, 66, 67]. Early studies demonstrated that the Irs3 gene was lost on the human lineage, and our analysis indicates that it was possibly inactivated by the insertion of a repetitive DNA element early in primate evolution (results not shown). Irs3 genes were also lost on the lineage leading to birds. A number of genes involved in insulin-regulated metabolism have been lost in the chicken, some of which have been shown to be missing in a wide variety of birds (e.g., Resistin), suggesting that the loss of Irs3 might have been part of an adaptation by birds to their new locomotive style. Teleost fish experienced a genome duplication; however, rapid loss of many of the duplicates occurred. Here we found duplicated copies of Irs2, Irs3, and Irs4 in most teleost fish genomes, but most of these species have lost both copies of Irs1 (see Table 1 and Additional file 1: Table S1 and Additional file 2: Table S2). The presence of multiple Irs genes, and the overlap in the functions of the Irs proteins [14–18], suggests a degree of redundancy among these genes, allowing species to adapt to the loss of one (or more) of these genes.
Evolution of Irs proteins
Duplication of genes should allow the specialization of distinct proteins to unique biological roles [69, 70], thus duplication of the Irs genes might have allowed the evolution of novel regulatory roles for the insulin signaling pathway. While all Irs proteins are involved in insulin signaling, they each appear to have unique, but to some extent overlapping, biological roles [14–18]. Changes in the numbers of Irs genes also show that the genes have retained a degree of redundancy and have not completely sub-functionalized since their origin. Despite the overlap in function, differences in evolutionary patterns can be seen among the Irs genes. Irs3 and Irs4 both experienced episodes of more rapid protein sequence evolution on the common ancestral lineage leading to mammals (Additional file 10: Table S3), which suggests either a temporary relaxation of evolutionary constraints on these sequences on this lineage or that the rapid evolution was driven by positive selection. Both patterns of evolution could have resulted in changed biological functions for these proteins, and might explain why Irs3 and Irs4 might have functions that are less essential than those of Irs1 or Irs2. Irs3 is non-essential as loss of this gene in humans is tolerated, and our data show that a number of primates, birds and potentially other vertebrates can survive without this gene. Knockout of Irs4 has little physiological effect, while Irs1 or Irs2 knockout mice have much more pronounced physiological defects [26, 27, 71, 72].
Further evidence for the diversification of the function of the Irs proteins is derived from the conservation plots. When each Irs protein is individually examined, areas of strong sequence conservation are seen across the entire protein sequence, although to a lower extent for Irs3, which might be due to the rapid evolution on the early mammalian lineage (Fig. 3a-d). However, when conservation is examined across the family of Irs proteins (Fig. 3e), most of the conservation is concentrated in the regions encoding the PH and PTB domains, sequences that are important for localizing these proteins to the plasma membrane and insulin receptors, respectively. The plasma membrane localization and insulin receptor interactions of these proteins have been conserved, but the C-terminal extensions, which allow interaction with downstream signaling partners [15, 18], show greater levels of divergence to account for changes in downstream functions. However, there are a few areas of the C-terminal extension that are strongly conserved among all Irs proteins, including two putative tyrosine phosphorylation sites that have been shown to be important in Irs1 and Irs2 for interactions with phosphatidylinositol 3-kinase (PI3K) [73–76], a key downstream signaling protein of insulin receptors. Thus, interaction with PI3K appears to be conserved among all Irs proteins, but changes in interactions with other signaling proteins might explain the differences in biological function of the different Irs proteins.
Here we have shown that the Irs gene family originated early in vertebrate evolution, with at least three of the genes likely generated during the two rounds of genome duplication that occurred in the vertebrate ancestor. Most groups of vertebrates have retained all 4 Irs genes, although some groups have lost genes, including primates and birds that have lost Irs3 and most fish that have lost Irs1. Duplication of Irs genes is only seen in fish that have experienced the fish-specific genome duplication, leading to duplicated Irs2, Irs3, and Irs4 genes. This suggests that while there are redundancies in the function of Irs genes, such that the loss of a gene can be tolerated, gain of Irs genes is likely harmful, except when other genes in the insulin signaling pathway are duplicated. This conclusion is in agreement with the finding of an increased number of retained duplicated genes involved in signal transduction pathways found in fish after the fish-specific genome duplication. The protein sequences of Irs1 and Irs2 are strongly conserved across vertebrates while Irs3 and Irs4 show lower levels of conservation. In addition to lower sequence conservation, the length of Irs3 progressively shortened along the lineage leading to mammals. Comparisons among the paralogous Irs sequences show that most of the sequence is well conserved within a paralog, but only the PH and PTB domains, those responsible for binding to plasma membranes and the insulin receptor, are conserved between paralogs. Only a few regions within the C-terminal extensions of these proteins are conserved among Irs paralogs, suggesting that divergence in these sequences has allowed divergence in function.
Molecular sequence databases maintained by Ensembl and the National Center for Biotechnology Information (NCBI) were searched in January 2016 for insulin receptor substrate (Irs1, Irs2, Irs3, and Irs4)-like coding sequences. We initially searched the databases with the tBLASTn algorithm, using previously characterized mouse Irs1, Irs2, Irs3, and Irs4 protein sequences as queries. Putative Irs-like protein sequences identified were then used in subsequent tBLASTn searches. We also investigated the elephant shark (the sole representative of cartilaginous fish with a near-complete genome sequence) genome generated by the Elephant Shark Genome Project [45, 78]. All sequences that had E-values below 0.01 were examined. Sequences identified by BLAST were used in reciprocal BLASTx searches of the mouse proteome to ensure that their best matches were Irs-like sequences.
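The filtering and reciprocal-search logic described above can be sketched as follows. This is an illustration rather than the pipeline used in this study: it assumes BLAST results saved in the standard 12-column tabular format, in which the E-value and bit score are the 11th and 12th fields, and all function names and sequence identifiers are hypothetical.

```python
def parse_blast_tabular(lines):
    """Parse 12-column tabular BLAST output into a list of hit dictionaries."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append({
            "query": fields[0],
            "subject": fields[1],
            "evalue": float(fields[10]),
            "bitscore": float(fields[11]),
        })
    return hits


def significant_hits(hits, max_evalue=0.01):
    """Keep hits with an E-value below the threshold used in the text."""
    return [hit for hit in hits if hit["evalue"] < max_evalue]


def best_hit_per_query(hits):
    """Return the highest-scoring hit for each query sequence."""
    best = {}
    for hit in hits:
        current = best.get(hit["query"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["query"]] = hit
    return best


def passes_reciprocal_check(candidate_id, reciprocal_hits, irs_ids):
    """True if the candidate's best hit in the mouse proteome is an Irs protein."""
    best = best_hit_per_query(reciprocal_hits).get(candidate_id)
    return best is not None and best["subject"] in irs_ids


# Invented example: forward search results and reciprocal search results.
forward = parse_blast_tabular([
    "mouse_Irs1\tcandidate_A\t62.0\t900\t300\t10\t1\t900\t1\t2700\t1e-150\t520",
    "mouse_Irs1\tcandidate_B\t28.0\t60\t40\t2\t5\t64\t10\t190\t0.5\t31",
])
reciprocal = parse_blast_tabular([
    "candidate_A\tmouse_Irs1\t62.0\t900\t300\t10\t1\t2700\t1\t900\t1e-150\t520",
    "candidate_A\tmouse_Dok4\t30.0\t200\t120\t6\t1\t600\t1\t200\t1e-12\t70",
])
kept = [hit["subject"] for hit in significant_hits(forward)]
print(kept, passes_reciprocal_check("candidate_A", reciprocal, {"mouse_Irs1"}))
```

The reciprocal step guards against retaining sequences, such as other PH and PTB domain-containing proteins, whose best match in the mouse proteome is not an Irs protein.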
To examine genomic neighborhoods near Irs-like genes, genomic comparisons were conducted using PipMaker and MultiPipMaker [56, 57]. Genes neighboring the Irs-like genes were identified from the genome assemblies in Ensembl and the Elephant Shark Genome Project. The organization of genes adjacent to the Irs-like genes was used to determine whether the genes of interest reside in conserved genomic neighborhoods.
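One simple way to express whether two genes reside in conserved genomic neighborhoods is to compare the sets of genes that flank them. The sketch below is an illustrative assumption rather than the method used here, where neighborhoods were compared from genome annotations and PipMaker alignments. Lrch4 and Agfg2 are the flanking genes named in the text; the remaining gene names and the function names are hypothetical placeholders.

```python
def shared_neighbors(flank_a, flank_b):
    """Genes present in both flanking regions."""
    return set(flank_a) & set(flank_b)


def neighborhood_similarity(flank_a, flank_b):
    """Jaccard index of two flanking gene sets (0 = no overlap, 1 = identical)."""
    a, b = set(flank_a), set(flank_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Flanking genes of an Irs3 locus in two species. Lrch4 and Agfg2 are named
# in the text; GeneX and GeneY are placeholders.
species_1 = ["Lrch4", "Agfg2", "GeneX"]
species_2 = ["Lrch4", "Agfg2", "GeneY"]
unrelated = ["GeneP", "GeneQ", "GeneR"]

print(sorted(shared_neighbors(species_1, species_2)))
print(neighborhood_similarity(species_1, species_2))
print(neighborhood_similarity(species_1, unrelated))
```

A high overlap supports orthology of the genes of interest, whereas the absence of any shared flanking genes, as seen for one copy of each duplicated zebrafish gene, does not.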
Phylogenies of vertebrate Irs-like gene coding sequences were generated using full-length, or near-full-length (i.e., missing a short part of their N- or C-termini), Irs1, Irs2, Irs3, and Irs4 coding sequences from diverse vertebrates and outgroups (see Additional file 1: Table S1 and Additional file 2: Table S2). Irs-like coding sequences were aligned using MAFFT as implemented at the Guidance web site [80, 81], using default parameters. Similar results were obtained if Clustal Omega was used as the alignment program. DNA sequence alignments were based on codons to retain protein alignments. The reliability of the alignments was examined using Guidance [80, 81], and trimmed alignments were generated from sites that had values above the default cut-off of 0.93.
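Two of the steps described above, basing the DNA alignment on codons and trimming sites by a reliability score, can be sketched in a few lines. This is a minimal illustration and not the Guidance implementation: the function names are hypothetical, the per-column scores are invented, and only the 0.93 cut-off is taken from the text.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Place the codons of an unaligned coding sequence under its aligned protein."""
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is too short for the protein")
    codons, pos = [], 0
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)  # a protein gap becomes a three-base gap
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def trim_columns(alignment, column_scores, cutoff=0.93):
    """Keep only the alignment columns whose score is above the cutoff."""
    lengths = {len(seq) for seq in alignment.values()}
    if lengths != {len(column_scores)}:
        raise ValueError("alignment rows and column scores differ in length")
    keep = [i for i, score in enumerate(column_scores) if score > cutoff]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy protein alignment with invented column scores.
protein_alignment = {"seq_a": "MK-T", "seq_b": "MRAT"}
scores = [1.0, 0.95, 0.40, 0.93]

print(thread_codons("MK-T", "ATGAAAACC"))
print(trim_columns(protein_alignment, scores))
```

Threading codons onto the protein alignment keeps every gap a multiple of three bases, so the reading frame of each coding sequence is preserved in the DNA alignment.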
Phylogenetic trees of the sequences were generated using Bayesian methods with MrBayes 3.2 [50, 51, 83], maximum likelihood with IQ-tree [49, 84], and neighbor-joining distance approaches with MEGA6.06. Bayesian trees were generated from coding sequences with MrBayes 3.2 using parameters selected by ModelFinder, whose results are presented in Additional file 13: Figure S9. MrBayes was run for 2,000,000 generations with four simultaneous Metropolis-coupled Markov chain Monte Carlo chains sampled every 100 generations. The average standard deviation of split frequencies dropped to less than 0.02 for all analyses. The first 25% of the trees were discarded as burn-in, with the remaining samples used to generate the consensus trees. Trace files generated by MrBayes were examined with Tracer to verify that the runs had converged. Maximum likelihood trees, constructed with 1000 replications by the ultrafast approximation, were generated with IQ-tree on the IQ-tree webserver using parameters for the substitution model suggested by ModelFinder. The maximum likelihood search was initiated from a tree generated by BIONJ and the best tree was identified after heuristic searches using the nearest neighbor interchange (NNI) algorithm. MEGA6.06 was used to construct bootstrapped (1000 replications) neighbor-joining distance trees, using either Maximum Composite Likelihood distances for the DNA sequences or JTT distances for the protein sequences. Similar results were obtained, but with lower confidence values (bootstrap or posterior probabilities), if alternative outgroups were used (results not shown).
With respect to orthology-paralogy issues, choice of outgroup, alignment method (MAFFT or Clustal), or the use of full-length or trimmed (based on Guidance scores) alignments had little influence on the key findings of these analyses. Methods that relied on shorter sequences (i.e., trimmed alignments or protein sequences) or simpler models of sequence evolution (i.e., neighbor-joining or parsimony) tended to yield weaker support for the earlier diverging lineages, but none of our analyses were in significant conflict with the key inferences of the phylogeny presented in Fig. 2 or Additional file 11: Figure S8.
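The Bayesian sampling scheme described above implies a fixed number of retained tree samples, which the following short calculation makes explicit. It is a sketch only: the function name is hypothetical, and the count ignores whether the starting state of the chain is also written to the sample file.

```python
def mcmc_sample_counts(generations, sample_every, burnin_fraction):
    """Return (total, discarded, retained) sample counts for one MCMC run."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# Settings reported in the text: 2,000,000 generations, sampled every
# 100 generations, with the first 25% of trees discarded as burn-in.
total, discarded, retained = mcmc_sample_counts(2_000_000, 100, 0.25)
print(total, discarded, retained)
```

Under these settings each run contributes 15,000 post-burn-in samples to the consensus tree.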
Analysis of protein sequence conservation
Conservation of protein sequences was assessed using Jensen-Shannon (JS) divergence scores on the JS Divergence web server, using a window size of 3 and the BLOSUM62 matrix as background. Putative tyrosine phosphorylation sites in the protein sequences were predicted using NetPhos [63, 90].
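The idea behind a JS divergence conservation score can be sketched in Python as follows. This is a simplified illustration, not the web server's code: it assumes a uniform background distribution (rather than one derived from BLOSUM62), applies no sequence weighting or gap penalty, and smooths each column with the mean of its neighbors using an assumed mixing weight of 0.5. All function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Amino acid frequencies in one alignment column (gaps are skipped)."""
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[aa] + pseudocount) / total for aa in AMINO_ACIDS]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def conservation_scores(alignment, background=None, window=3, mix=0.5):
    """Return (raw, smoothed) per-column JS divergence scores.

    alignment: list of equal-length aligned protein sequences.
    window: number of neighboring columns on each side used for smoothing.
    """
    if background is None:
        # Assumption: uniform background instead of a BLOSUM62-derived one.
        background = [1 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    columns = list(zip(*alignment))
    raw = [js_divergence(column_distribution(col), background) for col in columns]
    smoothed = []
    for i, score in enumerate(raw):
        lo, hi = max(0, i - window), min(len(raw), i + window + 1)
        neighbors = [raw[j] for j in range(lo, hi) if j != i]
        if neighbors:
            score = (1 - mix) * score + mix * sum(neighbors) / len(neighbors)
        smoothed.append(score)
    return raw, smoothed


# Toy alignment: the first two columns are invariant, the third is variable.
raw, smoothed = conservation_scores(["ACD", "ACE", "ACF"])
print([round(s, 3) for s in raw])
```

In this sketch an invariant column scores higher than a variable one, because its residue distribution diverges more from the background.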
This work has been supported by a grant from the Canadian Institutes of Health Research CCI-109605 (to DMI). The funding body did not have any role in the design, analysis, or interpretation of data or in the writing of the manuscript and the decision to submit the manuscript for publication.
Availability of data and materials
The data set supporting the results of this article is included within the article’s additional files (see Additional file 3: Figure S1).
AA and DMI designed the research and outlined the manuscript, obtained and analyzed the data, and drafted the manuscript. The authors have read, edited, and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Pirola L, Johnston AM, Van Obberghen E. Modulation of insulin action. Diabetologia. 2004;47:170–84.
- Myers MG Jr, White MF. Insulin signal transduction and the IRS proteins. Annu Rev Pharmacol Toxicol. 1996;36:615–58.
- De Meyts P, Whittaker J. Structural biology of insulin and IGF1 receptors: implications for drug design. Nat Rev Drug Discov. 2002;1:769–83.
- Hubbard SR. The insulin receptor: both a prototypical and atypical receptor tyrosine kinase. Cold Spring Harb Perspect Biol. 2013;5:a008946.
- Ward CW, Lawrence MC. Ligand-induced activation of the insulin receptor: a multi-step process involving structural changes in both the ligand and the receptor. BioEssays. 2009;31:422–34.
- Du Y, Wei T. Inputs and outputs of insulin receptor. Protein Cell. 2014;5:203–13.
- Wei L, Hubbard SR, Hendrickson WA, Ellis L. Expression, characterization, and crystallization of the catalytic core of the human insulin receptor protein-tyrosine kinase domain. J Biol Chem. 1995;270:8122–30.
- Brummer T, Schmitz-Peiffer C, Daly RJ. Docking proteins. FEBS J. 2010;277:4356–69.
- Jensen M, De Meyts P. Molecular mechanisms of differential intracellular signaling from the insulin receptor. Vitam Horm. 2009;80:51–75.
- Sun XJ, Rothenberg P, Kahn CR, Backer JM, Araki E, Wilden PA, et al. Structure of the insulin receptor substrate IRS-1 defines a unique signal transduction protein. Nature. 1991;352:73–7.
- Sun XJ, Wang LM, Zhang Y, Yenush L, Myers MG Jr, Glasheen E, et al. Role of IRS-2 in insulin and cytokine signalling. Nature. 1995;377:173–7.
- Lavan BE, Lane WS, Lienhard GE. The 60-kDa phosphotyrosine protein in insulin-treated adipocytes is a new member of the insulin receptor substrate family. J Biol Chem. 1997;272:11439–43.
- Lavan BE, Fantin VR, Chang ET, Lane WS, Keller SR, Lienhard GE. A novel 160-kDa phosphotyrosine protein in insulin-treated embryonic kidney cells is a new member of the insulin receptor substrate family. J Biol Chem. 1997;272:21403–7.
- White MF. The IRS-signalling system: a network of docking proteins that mediate insulin action. Mol Cell Biochem. 1998;182:3–11.
- Giovannone B, Scaldaferri ML, Federici M, Porzio O, Lauro D, Fusco A, et al. Insulin receptor substrate (IRS) transduction system: distinct and overlapping signaling potential. Diabetes Metab Res Rev. 2000;16:434–41.
- Withers DJ. Insulin receptor substrate proteins and neuroendocrine function. Biochem Soc Trans. 2001;29:525–9.
- White MF. IRS proteins and the common path to diabetes. Am J Physiol Endocrinol Metab. 2002;283:E413–22.
- Thirone AC, Huang C, Klip A. Tissue-specific roles of IRS proteins in insulin signaling and glucose transport. Trends Endocrinol Metab. 2006;17:72–8.
- Jacobs AR, LeRoith D, Taylor J. Insulin receptor substrate-1 pleckstrin homology and phosphotyrosine-binding domains are both involved in plasma membrane targeting. J Biol Chem. 2001;276:40795–802.
- Wolf G, Trüb T, Ottinger E, Groninga L, Lynch A, White MF, et al. PTB domains of IRS-1 and Shc have distinct but overlapping binding specificities. J Biol Chem. 1995;270:27407–10.
- Björnholm M, He AR, Attersand A, Lake S, Liu SC, Lienhard GE, et al. Absence of functional insulin receptor substrate-3 (IRS-3) gene in humans. Diabetologia. 2002;45:1697–702.
- Cai D, Dhe-Paganon S, Melendez PA, Lee J, Shoelson SE. Two new substrates in insulin signaling, IRS5/DOK4 and IRS6/DOK5. J Biol Chem. 2003;278:25323–30.
- De Meyts P. Insulin and its receptor: structure, function and evolution. BioEssays. 2004;26:1351–62.
- Marino-Buslje C, Martin-Martinez M, Mizuguchi K, Siddle K, Blundell TL. The insulin receptor: from protein sequence to structure. Biochem Soc Trans. 1999;27:715–26.
- Lavin DP, White MF, Brazil DP. IRS proteins and diabetic complications. Diabetologia. 2016;59:2280–91.
- Araki E, Lipes MA, Patti ME, Brüning JC, Haag B 3rd, Johnson RS, et al. Alternative pathway of insulin signalling in mice with targeted disruption of the IRS-1 gene. Nature. 1994;372:186–90.
- Schubert M, Brazil DP, Burks DJ, Kushner JA, Ye J, Flint CL, et al. Insulin receptor substrate-2 deficiency impairs brain growth and promotes tau phosphorylation. J Neurosci. 2003;23:7084–92.
- Withers DJ, Burks DJ, Towery HH, Altamuro SL, Flint CL, White MF. Irs-2 coordinates Igf-1 receptor-mediated beta-cell development and peripheral insulin signalling. Nature Genet. 1999;23:32–40.
- Fantin VR, Wang Q, Lienhard GE, Keller SR. Mice lacking insulin receptor substrate 4 exhibit mild defects in growth, reproduction, and glucose homeostasis. Am J Physiol Endocrinol Metab. 2000;278:E127–33.
- Sadagurski M, Dong XC, Myers MG Jr, White MF. Irs2 and Irs4 synergize in non-LepRb neurons to control energy balance and glucose homeostasis. Mol Metab. 2013;3:55–63.
- Liu SC, Wang Q, Lienhard GE, Keller SR. Insulin receptor substrate 3 is not essential for growth or glucose homeostasis. J Biol Chem. 1999;274:18093–9.
- Sciacchitano S, Taylor SI. Cloning, tissue expression, and chromosomal localization of the mouse IRS-3 gene. Endocrinology. 1997;138:4931–40.
- Maffucci T, Razzini G, Ingrosso A, Chen H, Iacobelli S, Sciacchitano S, et al. Role of pleckstrin homology domain in regulating membrane targeting and metabolic function of insulin receptor substrate 3. Mol Endocrinol. 2003;17:1568–79.
- Böhni R, Riesgo-Escovar J, Oldham S, Brogiolo W, Stocker H, Andruss BF, et al. Autonomous control of cell and organ size by CHICO, a Drosophila homolog of vertebrate IRS1-4. Cell. 1999;97:865–75.
- Uhlik MT, Temple B, Bencharit S, Kimple AJ, Siderovski DP, Johnson GL. Structural and evolutionary division of phosphotyrosine binding (PTB) domains. J Mol Biol. 2005;345:1–20.
- Chakraborty C, Agoramoorthy G, Hsu MJ. Exploring the evolutionary relationship of insulin receptor substrate family using computational biology. PLoS One. 2011;6:e16580.
- McGaugh SE, Bronikowski AM, Kuo CH, Reding DM, Addis EA, Flagel LE, et al. Rapid molecular evolution across amniotes of the IIS/TOR network. Proc Natl Acad Sci U S A. 2015;112:7055–60.
- Olinski RP, Lundin LG, Hallböök F. Genome duplication-driven evolution of gene families: insights from the formation of the insulin family. Ann N Y Acad Sci. 2005;1040:426–8.
- Hernández-Sánchez C, Mansilla A, de Pablo F, Zardoya R. Evolution of the insulin receptor family and receptor isoform expression in vertebrates. Mol Biol Evol. 2008;25:1043–53.
- Rentería ME, Gandhi NS, Vinuesa P, Helmerhorst E, Mancera RL. A comparative structural bioinformatics analysis of the insulin receptor family ectodomain based on phylogenetic information. PLoS One. 2008;3:e3667.
- Huminiecki L, Heldin CH. 2R and remodeling of vertebrate signal transduction engine. BMC Biol. 2010;8:146.
- Hokamp K, McLysaght A, Wolfe KH. The 2R hypothesis and the human genome sequence. J Struct Funct Genom. 2003;3:95–110.
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.
- Ensembl Genome Browser [http://www.ensembl.org/index.html].
- Venkatesh B, Lee AP, Ravi V, Maurya AK, Lian MM, Swann JB, et al. Elephant shark genome provides unique insights into gnathostome evolution. Nature. 2014;505:174–9.
- Braasch I, Gehrke AR, Smith JJ, Kawasaki K, Manousaki T, Pasquier J, et al. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nat Genet. 2016;48:427–37.
- National Center for Biotechnology Information [http://www.ncbi.nlm.nih.gov/].
- Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981;17:368–76.
- Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. Mol Biol Evol. 2015;32:268–74.
- Huelsenbeck JP, Ronquist F, Nielsen R, Bollback JP. Bayesian inference of phylogeny and its impact on evolutionary biology. Science. 2001;294:2310–4.
- Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Höhna S, et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61:539–42.
- Cañestro C, Albalat R, Irimia M, Garcia-Fernàndez J. Impact of gene gains, losses and duplication modes on the origin and diversification of vertebrates. Semin Cell Dev Biol. 2013;24:83–94.
- Glasauer SM, Neuhauss SC. Whole-genome duplication in teleost fishes and its evolutionary consequences. Mol Gen Genomics. 2014;289:1045–60.
- Inoue J, Sato Y, Sinclair R, Tsukamoto K, Nishida M. Rapid genome reshaping by multiple-gene loss after whole-genome duplication in teleost fish suggested by mathematical modeling. Proc Natl Acad Sci U S A. 2015;112:14918–23.
- Perelman P, Johnson WE, Roos C, Seuánez HN, Horvath JE, Moreira MA, et al. A molecular phylogeny of living primates. PLoS Genet. 2011;7:e1001342.
- Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, et al. PipMaker--a web server for aligning two genomic DNA sequences. Genome Res. 2000;10:577–86.
- Schwartz S, Elnitski L, Li M, Weirauch M, Riemer C, Smit A, et al. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Res. 2003;31:3518–24.
- Tajima F. Simple methods for testing molecular clock hypothesis. Genetics. 1993;135:599–607.
- Capra JA, Singh M. Predicting functionally important residues from sequence conservation. Bioinformatics. 2007;23:1875–82.
- Blom N, Gammeltoft S, Brunak S. Sequence- and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 1999;294:1351–62.
- Smith JJ, Kuraku S, Holt C, Sauka-Spengler T, Jiang N, Campbell MS, et al. Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution. Nat Genet. 2013;45:415–21.
- Smith JJ, Antonacci F, Eichler EE, Amemiya CT. Programmed loss of millions of base pairs from a vertebrate genome. Proc Natl Acad Sci U S A. 2009;106:11212–7.
- Kurokawa T, Uji S, Suzuki T. Identification of cDNA coding for a homologue to mammalian leptin from pufferfish, Takifugu rubripes. Peptides. 2005;26:745–50.
- Hu Q, Tan H, Irwin DM. Evolution of the vertebrate Resistin Gene family. PLoS One. 2015;10:e0130188.
- Kuraku S, Meyer A. Detection and phylogenetic assessment of conserved synteny derived from whole genome duplications. Methods Mol Biol. 2012;855:385–95.
- Arroyo JI, Hoffmann FG, Opazo JC. Gene turnover and differential retention in the relaxin/insulin-like gene family in primates. Mol Phylogenet Evol. 2012;63:768–76.
- Hoffmann FG, Opazo JC. Evolution of the relaxin/insulin-like gene family in placental mammals: implications for its early evolution. J Mol Evol. 2011;72:72–9.
- Daković N, Térézol M, Pitel F, Maillard V, Elis S, Leroux S, et al. The loss of adipokine genes in the chicken genome and implications for insulin metabolism. Mol Biol Evol. 2014;31:2637–46.
- Massingham T, Davies LJ, Liò P. Analysing gene function after duplication. Bioessays. 2001;23:873–6.
- Freeling M, Scanlon MJ, Fowler JE. Fractionation and subfunctionalization following genome duplications: mechanisms that drive gene content and their consequences. Curr Opin Genet Dev. 2015;35:110–8.
- Tamemoto H, Kadowaki T, Tobe K, Yagi T, Sakura H, Hayakawa T, et al. Insulin resistance and growth retardation in mice lacking insulin receptor substrate-1. Nature. 1994;372:182–6.
- Withers DJ, Gutierrez JS, Towery H, Burks DJ, Ren JM, Previs S, et al. Disruption of IRS-2 causes type 2 diabetes in mice. Nature. 1998;391:900–3.
- Sun XJ, Crimmins DL, Myers MG Jr, Miralpeix M, White MF. Pleiotropic insulin signals are engaged by multisite phosphorylation of IRS-1. Mol Cell Biol. 1993;13:7418–28.
- Esposito DL, Li Y, Vanni C, Mammarella S, Veschi S, Della Loggia F, et al. A novel T608R missense mutation in insulin receptor substrate-1 identified in a subject with type 2 diabetes impairs metabolic insulin signaling. J Clin Endocrinol Metab. 2003;88:1468–75.
- Landis J, Shaw LM. Insulin receptor substrate 2-mediated phosphatidylinositol 3-kinase signaling selectively inhibits glycogen synthase kinase 3β to regulate aerobic glycolysis. J Biol Chem. 2014;289:18603–13.
- Asano T, Fujishiro M, Kushiyama A, Nakatsu Y, Yoneda M, Kamata H, et al. Role of phosphatidylinositol 3-kinase activation on insulin action and its alteration in diabetic conditions. Biol Pharm Bull. 2007;30:1610–6.
- Sato Y, Hashiguchi Y, Nishida M. Temporal pattern of loss/persistence of duplicate genes involved in signal transduction and metabolic pathways after teleost-specific genome duplication. BMC Evol Biol. 2009;9:127.
- Elephant Shark Genome Project [http://esharkgenome.imcb.a-star.edu.sg/].
- Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–66.
- Penn O, Privman E, Ashkenazy H, Landan G, Graur D, Pupko T. GUIDANCE: a web server for assessing alignment confidence scores. Nucleic Acids Res. 2010;38:W23–8.
- Guidance2 Web Server [http://guidance.tau.ac.il/ver2/].
- Clustal Omega Web Server [http://www.ebi.ac.uk/Tools/msa/clustalo/].
- MrBayes 3.2.2 Web Site [http://mrbayes.sourceforge.net/].
- IQ-tree Web Server [http://iqtree.cibiv.univie.ac.at/].
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.
- Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017; in press.
- Tracer v1.6 Web Site [http://tree.bio.ed.ac.uk/software/tracer/].
- Minh BQ, Nguyen MAT, von Haeseler A. Ultrafast approximation for phylogenetic bootstrap. Mol Biol Evol. 2013;30:1188–95.
- JS Distance Web Server [http://compbio.cs.princeton.edu/conservation/].
- NetPhos 2.0 Web Server [http://www.cbs.dtu.dk/services/NetPhos/].