Signatures of positive selection in Toll-like receptor (TLR) genes in mammals

Background: Toll-like receptors (TLRs) are a major class of pattern recognition receptors (PRRs) expressed on the cell surface or in membrane compartments of immune and non-immune cells. TLRs are encoded by a multigene family and represent the first line of defense against pathogens by detecting foreign microbial molecular motifs, the pathogen-associated molecular patterns (PAMPs). TLRs are also important for triggering adaptive immunity in vertebrates. They are characterized by the presence of leucine-rich repeats (LRRs) in the ectodomain, which are associated with PAMP recognition. The direct recognition of different pathogens by TLRs might result in different evolutionary adaptations that are important for understanding the dynamics of the host-pathogen interplay. Ten mammalian TLR genes, viral (TLR3, 7, 8, 9) and non-viral (TLR1, 2, 4, 5, 6, 10), were selected to identify signatures of positive selection that might have been imposed by interacting pathogens and to clarify whether viral and non-viral TLRs display different patterns of molecular evolution.

Results: By using Maximum Likelihood approaches, evidence of positive selection was found in all the TLRs studied. The number of positively selected codons (PSC) ranged between 2 and 26 (0.25%-2.65%), with the non-viral TLR4 as the receptor with the highest percentage of positively selected codons (2.65%), followed by the viral TLR8 (2.50%). The results indicated that viral and non-viral TLRs are similarly under positive selection. Almost all TLRs have at least one PSC located in the LRR ectodomain, which underlines the importance of pathogen recognition by this region.

Conclusions: Our results are not in line with previous studies on primates and birds that identified more codons under positive selection in non-viral TLRs. This might be explained by the fact that both primates and birds are homogeneous groups, probably affected by only a restricted number of related viruses with equivalent motifs to be recognized. The analyses performed in this work encompassed a large number of species covering some of the most representative mammalian groups - Artiodactyla, Rodents, Carnivores, Lagomorphs and Primates - that are affected by different families of viruses. This might explain the role of adaptive evolution in shaping viral TLR genes.


Background
Toll-like receptors (TLRs) are a major class of pattern recognition receptors (PRRs) in Drosophila and in mammals. Mammalian TLRs have essential roles in recognizing infectious agents and initiating intracellular signal transduction pathways that trigger the expression of genes leading to both innate and adaptive immune responses [1,2]. TLRs belong to the type I transmembrane glycoprotein receptor family and can be expressed either on the cell surface or in membrane compartments of immune and non-immune cells (e.g. epithelial cells) [3,4].
The TLR family is structurally characterized by the presence of an ectodomain, a signal transmembrane segment and a highly conserved cytoplasmic domain homologous to the human interleukin-1 receptor (IL1R) and human IL-18 receptor (IL-18R) and designated the TIR domain [11,12]. Crystallographic studies of the ectodomain revealed a solenoid horseshoe-like structure constituted by a high but variable number (16-28) of leucine-rich repeats (LRRs) responsible for binding the "pathogen-associated molecular patterns" (PAMPs). PAMPs are present in pathogens and not in host components, thus allowing the innate immune system to distinguish between self and non-self [13]. Nevertheless, TLRs not only sense microbial components but can also target endogenous molecules that are released from dying host cells and that can activate the inflammatory response [14,15]. PAMPs are characteristic molecular signatures of pathogens that have been highly conserved during evolution because they are involved in critical functions and are essential for pathogen survival. Indeed, mutation or loss of these patterns can be lethal to the pathogens, and the patterns are therefore highly conserved [13]. This means that a limited number of PRRs is needed to detect the presence of an infection [13]. The main recognition molecules of the TLRs, the LRRs, are capped at the amino and carboxy termini by LRR-NT and LRR-CT molecules, respectively [16,17], which stabilize the protein structure by protecting the hydrophobic core from exposure to solvent [18]. The delimitation of these domains is, however, not consistent among the different software programs used for their determination [19,20]. For some TLRs, crystallographic models in which these domains are well defined have been made available (e.g. the TLR1/TLR2 and TLR2/TLR6 heterodimers, TLR3 and TLR4 of Homo sapiens and of Mus musculus [16,17,21-23]). However, this is not the case for all species, which further complicates the correct delimitation of each domain.
TLRs are evolutionarily conserved proteins, and the characterization of TLRs and of their ligands has contributed to the understanding of TLR function and of the host defense processes against infections [31]. They are candidate molecules to examine how natural selection molds innate immunity receptors. Several studies have been performed and purifying selection was suggested as the major force driving TLR evolution, at least in humans [32,33]. However, other studies on Primate species revealed different degrees of positive selection acting on their evolutionary history. Evidence of positive selection was found in TLR1 [34] and TLR4 [35]. In a broader study in a primate group, Wlasiuk and Nachman (2010) showed evidence of positive selection in six TLR genes, TLR1, TLR4, TLR6, TLR7, TLR8 and TLR9, with the non-viral TLR4 having the highest number of positively selected codons (PSC). A study performed by Alcaide and Edwards (2011) in birds showed evidence of positive selection in TLR4. The most recent analysis of the TLR1 subfamily showed evidence of positive selection in TLR1, 2 and 6 in mammals and TLR2A/B in birds [36]. Overall, these studies also showed that non-viral TLRs tend to be more prone to positive selection than viral TLRs. From an evolutionary point of view, proteins involved in direct recognition of pathogens might have been shaped by these interactions. Here, we have studied ten mammalian TLR genes in order to look for evidence of positive selection and to further clarify whether viral and non-viral TLRs display different patterns of molecular evolution due to the different nature of the PAMPs they recognize.

Signatures of positive selection
Genes of the immune system, in particular those involved in the recognition of pathogens, and genes involved in the host-pathogen interaction have been shown to be highly prone to adaptive selection (e.g. [37,38]). By using Maximum-Likelihood (ML) approaches, evidence of positive selection was detected in all the TLRs studied (Table 1). For seven of the TLRs (TLR1-6, TLR10), analyses included species belonging to some of the most representative mammalian groups, i.e. Artiodactyla, Rodents, Carnivores, Lagomorphs and Primates, while for the remaining three TLRs, the Lagomorph group was not included due to the lack of data (Additional files 1-10, Tables S1-S10).
The number of positively selected codons observed for each TLR studied ranged between 2 and 26 (0.25%-2.65% of the codons; Additional files 11-20, Tables S11-S20). Previous studies argued that viral TLRs are under stronger purifying selection than non-viral TLRs [33,39,40] since viral TLRs recognize viral nucleic acids but also target self components [1,2,41]. Therefore, these TLRs have the dual role of maintaining their function and avoiding autoimmunity, and so they are not expected to accumulate non-synonymous substitutions, as this might affect their functional integrity. On the other hand, non-viral TLRs that exist on the cell surface have a more flexible evolution and easily tolerate non-synonymous mutations which, in some circumstances, can be subject to positive selection and become fixed in some populations [33]. This higher tolerance arises because the function of non-viral TLRs is more redundant than that of viral TLRs. Indeed, several surface TLRs are able to recognize the same bacterial and fungal components, so one microorganism can be recognized by different TLRs. Therefore, a non-synonymous mutation in one TLR does not necessarily abolish the function and does not compromise immunity [33].
The viral TLR8 has never been identified as a candidate for being under positive selection; however, our results indicate a similar level of positive selection acting on TLR8 as on the non-viral TLR4. This might be the result of the inclusion of a larger group of species that might be affected by different pathogens, with implications for PAMP recognition. Indeed, the groups previously analyzed by others are homogeneous and are probably affected by only a restricted number of related viruses, which accounts for their conservation. In addition, the presence of a positive selection signature may not reflect a recent event but could result from ancient functional adaptations in each species that led to the current taxon specificities [42]. Furthermore, as the recognition of viral RNA is essential for host defense, mutations that could affect the function should have been removed by purifying selection; only the polymorphisms that are advantageous, i.e. that confer resistance to the pathogen, might then have become fixed and are now reflected in the differences between species [43].
The high number of PSC observed in the non-viral TLR4 is in line with results previously reported in primates and birds [39,40]. TLR8 and TLR4 recognize very different ligands. For PAMP recognition, TLR8 forms a homodimer that is associated with the response to ssRNA viruses, while the TLR4-MD-2 heterodimer mostly recognizes LPS, which is present in the outer membrane of Gram-negative bacteria [13,44]. In addition, TLR4 also targets components of yeast, Trypanosoma and even viruses [45]. Despite this, the reason why these two TLRs show such a remarkable picture of adaptive evolution is not yet clearly understood. Some recent studies showed that, in different species, the same TLR molecule recognizes specific ligands or that the same ligand triggers responses with different intensities (reviewed in [46]). For example, in rodent species, TLR8 does not respond to synthetic ligands such as imiquimod (R837), resiquimod (R848), and some guanine nucleotide analogs, as it does in non-rodent species [47]. This is probably caused by the variation in the surface charge and the existence of different secondary structures in different species. In addition, for TLR4, differences in ligand recognition between humans, bovines, equines and murines have also been described (reviewed in [46]). Although more studies are required to fully assess the specificity of ligand recognition and the responses that are triggered in each species, this might explain the similar patterns of evolution observed for these TLRs. The reason for the difference observed between the percentage of PSC in TLR7 (0.67%) and TLR8 (2.50%) is also unclear since both recognize ssRNA. A similar degree of positive selection acting on both receptors would be expected. Functional or structural differences in ligand recognition or in ligand specificity should be at the basis of the observed differences. Structural differences do exist in ligand recognition, but more studies are required. Indeed, TLR8 and TLR9 exist as preformed dimers while TLR7, along with the other TLRs, exists as a monomer and only forms the dimer after ligand binding [48]. Differences in tissue expression have also been observed, which might account for the different pattern. Although both TLR7 and TLR8 are expressed in the lung, TLR7 is also expressed in the placenta and spleen while TLR8 is also expressed in peripheral blood leukocytes [49].
TLR1, TLR6 and TLR9 were the receptors with the lowest percentage of codons under positive selection (Table 1). TLR1 has been shown to be mostly under purifying selection, but it has previously been shown to have also been subject to positive selection in chicken, contrary to the remaining avian TLRs [50], and more recently four PSC were found in vertebrates [36]. The study of Huang and co-workers also showed evidence of one PSC in vertebrate TLR6, which was the first report of adaptive selection acting on this receptor and is now further supported by the present study. TLR9 also has a low proportion of PSC (0.39%), but positive selection has been previously found in Primates [40] and Teleosts [51].
Seven codons were found to be under positive selection in TLR10. The study of Huang and co-workers revealed no positive selection on TLR10 [36]. Their analyses of TLR10 encompassed sixteen species, including a marsupial, a monotreme and an amphibian, which were not included in our study. In addition, their analyses focused only on PAML (CODEML) results, whereas ours also included the different models implemented in the Data Monkey Web Server, which might explain the difference observed.
TLR10 interacts with TLR2 to recognize triacyl lipoproteins [26,27]. In turn, TLR2 also forms heterodimers with TLR1 or TLR6 to recognize the largest variety of ligands of all the TLRs (e.g. peptidoglycan, bacterial lipoproteins, zymosan, a phenol soluble factor from Staphylococcus epidermidis) [52]. In TLR2, six codons were found under positive selection in mammals, in line with the observation of Huang and co-workers (2011). Signatures of positive selection in this gene were also found in bovines [42], primates [53], rodents [54] and birds [55]. The wide range of ligands as well as the need for heterodimerization (TLR1-TLR2, TLR2-TLR6, TLR2-TLR10) make TLR2 prone to contrasting evolutionary patterns: conservation of its function, including the capacity for heterodimerization, and adaptive evolution to the environment and to the pathogens specific to each species [6,36,42].

Location and characterization of the PSC in the TLR domains
The LRRfinder software [20] was used to delimit the functional domains of each TLR gene in order to assess the functional significance of the putatively selected sites. Human TLR sequences were used as a reference (Table 2 and Additional files 21-30, Tables S21-S30). The characterization of the charge and polarity of each amino acid possibility in sites under selection is also available in Additional files 31-40, Tables S31-S40.
The TLRs are composed of an extracellular domain that binds the PAMPs, a signal transmembrane domain and an intracellular domain, designated the TIR domain that binds adapter molecules and that triggers the intracellular cascades leading to the innate immune response. The convex surface of the extracellular LRR domains, by being involved in the recognition of the PAMPs, is highly variable. At variance, the TIR domain is highly conserved as it is involved in the signaling cascades [56]. This suggests that the different domains of the TLR molecules are under different evolutionary pressures.
Most positively selected sites were located in the extracellular LRRs. A few instances of positive selection were detected in the remaining domains. All TLRs, with the exception of TLR6, have at least one codon under positive selection located in the LRR ectodomain, and most of the PSC found within each TLR are located in this domain (Table 2). The LRR ectodomain is the main point of interaction with PAMPs [11,57], which are conserved motifs [13]. Therefore, some functional constraint is expected in order to preserve the ability of the TLRs to identify pathogens. However, as pathogens are constantly evolving to evade host recognition, it is likely that TLRs co-evolve with them. Our results suggest that the constantly evolving nature of pathogens is matched by the TLRs. Some species-specific substitutions may be reflected in the high number of PSC found in this domain. This could be related to the PAMPs they recognize.
In general, the LRRs are composed of a concave surface, more conserved, formed by a leucine-rich sequence, XLXXLXLXX, and of a convex surface, more exposed and more variable, XΦXXΦX4FXXLX, where X represents any amino acid and Φ a hydrophobic amino acid. Of the seventy-one codons identified under positive selection in the LRRs of all TLRs, forty-eight were localized in the variable segment, which supports our hypothesis of co-evolution between host and pathogen (Table 2 and Additional files 21-30, Tables S21-S30). Interestingly, the three TLRs with the highest proportion of PSC in the variable segment of the LRRs are the viral TLR7, 8 and 9, with 71.43%, 84.0% and 75.0%, respectively. This high proportion of PSC in the variable segment of the LRRs found in three viral TLRs may be indicative of receptor evolution shaped by the viral nucleic acids (ssRNA and CpG DNA) characteristic of each species. As nucleic acids are supposed to interact directly with this convex surface, variation or evidence of positive selection in it may be the result of host adaptation to viral evolution.
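As a rough illustration of how the conserved segment described above could be located in a protein sequence, the following minimal sketch scans for the literal pattern XLXXLXLXX with a regular expression. It is not how LRRfinder works; the function name and the toy sequence are hypothetical, and real LRRs often carry other hydrophobic residues at the leucine positions, which this literal reading would miss.

```python
import re

# Literal reading of the conserved segment XLXXLXLXX (X = any amino acid).
# A lookahead is used so that overlapping matches are also reported.
CONSERVED_SEGMENT = re.compile(r"(?=([A-Z]L[A-Z]{2}L[A-Z]L[A-Z]{2}))")


def find_conserved_segments(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 9-mer) for every XLXXLXLXX hit."""
    return [
        (match.start(1), match.group(1))
        for match in CONSERVED_SEGMENT.finditer(protein.upper())
    ]


# Hypothetical toy sequence, not taken from any TLR.
toy_sequence = "MKALTNLDLSGNAA"
hits = find_conserved_segments(toy_sequence)
```

Positions are reported as 0-based offsets into the sequence, so they would need to be shifted by one to compare with the residue numbering used in the tables.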
The two receptors with the most signatures of natural selection, TLR8 and TLR4, showed a large number of PSC in the LRR domain, even though ligand recognition occurs differently [25]. Along with TLR7 and 9, TLR8 has a longer amino acid sequence in its ectodomain than other TLRs and contains an irregular segment of 26 to 31 amino acids between LRR14 and 15 [18]. The ectodomain is cleaved in the endolysosome to enable ligand recognition [18,58]. Thus, the functional ectodomain of human TLR8 comprises LRR15-25 and the C-terminal LRR. Following ligand binding, TLR8 recruits the TIR adaptor proteins and initiates signaling [58]. Our results show that fifteen of the PSC are located in the region comprising LRR-NT-LRR14 that is cleaved. Four amino acids are located in the irregular LRR insertion before LRR15 (Table 3); however, the functional importance of this region has not yet been clarified, and it is regarded as a new N-terminal LRR of the truncated structure [58]. Recently, this region has also been described as crucial for TLR8 activation; in particular, alanine substitutions in this region can affect the activation of this receptor [47]. The alanine at amino acid position 481, which was found to be under selective pressure, may be interesting to study in greater detail. Furthermore, Govindaraj et al. (2011) proposed that this undefined region is responsible for the species-specificity in ligand recognition that is found at least between non-rodents and rodents (rodents lack the undefined region 438-442) [47]. The surface charge variation among species is crucial for species-specific pathogen recognition even though this region is not directly involved in ligand interaction [47]. In all four positions identified under selection in this irregular insertion, the amino acid possibilities may result in charge variation, which might suggest a role in the specificity of ligand recognition (Additional file 38, Table S38). Of the other eight residues identified in this molecule, only one is in LRR15. LRR15 has been described for its importance in ligand recognition together with LRR17 and 18 (Table 3) [58].
TLR4 forms a dimer with MD-2. The LPS interacts with a large hydrophobic pocket in MD-2 and directly bridges the m-shaped receptor dimer, which is composed of two copies arranged symmetrically [23]. Nineteen of the twenty-two PSC in TLR4 are located in the LRR domain. At least six of these codons have been previously identified as sites under positive selection in primates [40] and some have functional importance in PAMP recognition (Table 3) [23]. From the three-dimensional structure of the TLR4 heterodimer (Figure 1 Table S34). Nevertheless, the result of these variations in the function and structure of the molecule remains to be assessed. In addition, some of the identified PSC are located in close contact in the TLR4 homodimers (e.g. 370, 394, 468, 471, 487, 542) and might have implications for dimerization.
In TLR2, three PSC lie in LRR5, 6 and 10. The LRR10 PSC is in the variable segment, and this LRR has been previously described as under positive selection in bovines [42]. Despite that, these LRRs have not been recognized as sites of direct interaction with PAMPs nor as involved in heterodimerization, so this result may not necessarily reflect any present functional importance but rather the result of ancient selective events [42]. In the heterodimers TLR1-TLR2 (Figure 2) and TLR2-TLR6 (Figure 3), only one of the identified amino acids (302 in LRR10 of TLR2) is located in a region significant for ligand binding (LRR9-LRR12) and one (318 in TLR1) is in close proximity to sites involved in heterodimerization (LRR11-LRR14) (Table 3) [16]. It is interesting to note that in TLR2, the regions identified as important for dimerization (LRR11-13) have not been subject to positive selection, which is a trace of functional conservation, particularly important as TLR2 dimerizes with three other TLRs to recognize different PAMPs [16,42].
In the viral TLR3, nine PSC were identified, three of which are located in the LRR domain (Figure 4). Residue 79 belongs to a dsRNA-TLR3 interaction site in the LRR-NT to LRR3 region (Table 3) [22]. At this site, five amino acid possibilities were detected among the species studied, which could be a species adaptation to the recognition of specific dsRNA viruses and reflect co-evolution.
For TLR5, three PSC were detected in the LRR domain. Residues 207 and 400 are located within the 228 amino acid region identified by Andersen-Nissen et al. (2007) as important for flagellin recognition [59] and were previously reported as being under positive selection in primates [60].
The TIR domain of the TLRs is highly conserved across multiple species of animals, plants [61] and microbes [62] due to its significance as a signaling domain [63]. Three Box regions of the TIR domain (Boxes 1, 2 and 3), which are important in signal transduction, are highly conserved in the TLR genes and should be under strong purifying selection [63,64]. As expected given these functional constraints, none of the nine sites identified as under positive selection in the TIR domain were located in these boxes.
The TLR with the most PSC in the TIR domain was TLR5, where three codons under selection were located within the highly conserved TIR domain. Although amino acid alterations at codon 674 are conservative, alterations at codons 721 and 742 might induce differences in the charge and polarity of the protein, respectively (Additional file 35, Table S35). Despite the expected functional constraints specific to this domain, it seems that the TLR5 protein may present some flexibility with regard to amino acid composition in this domain. In humans, TLR5 has been suggested to be functionally redundant. Indeed, TLR5 392STOP is a non-functional allele that may reach considerable frequencies in some human populations (up to 23%; [65]) despite increasing the susceptibility to Legionnaires' disease. Positive selection has been proposed as a mechanism for favoring gene loss in human evolution [66,67]. This suggests that other proteins exist, yet to be determined, that might be able to compensate for a loss of function of this TLR.
In the remaining domains, a few PSC were identified that may nevertheless have functional or evolutionary importance. In the signal domain, we identified five PSC. This domain mostly mediates or regulates the transport of secretory proteins to their destination compartment in the cell [68]. Given the role of this domain, it is likely that these five amino acids might affect the correct localization of the secretory proteins in the different cell compartments. In the LRR-NT domain we identified one PSC in TLR8. This domain is known for its importance in stabilizing the LRR structure by protecting the hydrophobic core [18]. However, in the TLR8 molecule this domain is cleaved along with LRR1-14 to enable ligand recognition [18,58], thus no particular importance can be attributed to this site. At the C-terminal end of the LRR structure lies the LRR-CT, where two PSC were found. The role of this domain is similar to that of LRR-NT, so these residues might be important for the stabilization of the molecule, in particular for the formation of the dimers TLR4-MD2 and TLR10-TLR2 [18,69]. In the PSC identified in the LRR-CT, amino acid variation between species does not alter the charge (Additional file 34, Table S34; Additional file 40, Table S40). The same holds for all five positions under positive selection in the transmembrane domain of TLR2, 3, 4 and 6 (Additional file 32, Table S32; Additional file 33, Table S33; Additional file 34, Table S34; Additional file 36, Table S36). The transmembrane segment is responsible for the junction between the TLR and the plasma membrane. In some cases, it is also associated with the localization of the TLR in intracellular compartments and its interaction with accessory molecules [70,71]. This domain is expected to be highly conserved and only a few mutations have been described [72,73]. Nevertheless, we found five codons under positive selection in this domain.
In summary, pathogens usually develop strategies to evade recognition by the host immune system. Therefore, motifs in the pathogen that are involved in recognition tend to evolve faster to avoid it. If the pathogen is evolving, the receptor that recognizes the pathogen should also evolve to keep pace with the changes that occur in the pathogen. This arms race is responsible for a continuum of alterations in both the pathogen and the receptor, which can be detected as signatures of positive selection. For example, positive selection in RNA viruses has been shown to occur as the result of this arms race [74,75]. Thus, changes in the RNA or DNA sequence of the pathogen will cause the receptor not to recognize them. This will force alterations in the receptor which might change the geometry of the interaction. For this reason, if different ligands (i.e. pathogens) are recognized by one receptor, it is likely that more changes are observed than for receptors that recognize only one ligand. In line with this, our results may reflect the co-evolution between each host and its pathogens and commensals, especially in viral TLRs (due to the wide variety of viruses they might recognize) and in the non-viral TLR4. In addition, the fact that the mutation rates of RNA and DNA viruses tend to be generally higher than those of bacteria and yeast [76] also correlates with our results. Viral TLRs will thus encounter ligands that evolve faster than those encountered by non-viral TLRs. TLR4 might appear as an exception due to the wide variety of ligands that it recognizes, which include viruses.

Conclusions
Evidence of positive selection was found for all the mammalian TLRs studied. Adaptive selection has clearly played a role in shaping the diversity of both viral and non-viral TLRs. The location of some of the positively selected codons indicates that pathogens exert most of the selective pressures that lead to the changes observed, mostly in the LRR ectodomain and especially in its variable segment, which is responsible for direct interaction with PAMPs. This suggests that these changes are the result of co-evolution. Further studies are important to clarify the ligand for each TLR in each species, as this could give new clues for the interpretation of these results. Also, crystallographic studies would be helpful for assessing the functional relevance of the PSC detected.

Sequences
The sequences of the mammalian TLRs used in the analyses were retrieved from GenBank (http://www.ncbi.nlm.nih.gov/genbank/) and Ensembl (http://www.ensembl.org/index.html). For each TLR, a subset of 13-23 species was used that included species from some of the most representative mammalian groups, Artiodactyla, Rodents, Carnivores, Lagomorphs and Primates. Lagomorphs were not included in the TLR7, 8 and 9 analyses due to the lack of data. The identification of the species used for each TLR and the accession numbers are presented in Additional files 1-10, Tables S1-S10.

Codon-based analyses of positive selection
Under neutrality, coding sequences are expected to present a ratio of non-synonymous substitutions (dN) over synonymous substitutions (dS) that does not significantly deviate from 1 (ω = dN/dS = 1), while significant deviations may be interpreted as the result of either positive selection (ω > 1) or negative selection (ω < 1).
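The interpretation of ω can be sketched as follows. This is only an illustration of the ratio and its three regimes; the function name and the `tolerance` band around 1 are hypothetical conveniences, and a real analysis decides whether a deviation is significant with a statistical test rather than a fixed band.

```python
def classify_omega(dn: float, ds: float, tolerance: float = 0.05) -> tuple[float, str]:
    """Compute omega = dN/dS and label the selective regime.

    `tolerance` is an illustrative band around 1, not a significance test.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the dN/dS ratio")
    omega = dn / ds
    if omega > 1 + tolerance:
        return omega, "positive selection"
    if omega < 1 - tolerance:
        return omega, "negative selection"
    return omega, "neutral"


# Hypothetical per-site rate estimates.
regime_high = classify_omega(0.3, 0.1)[1]
regime_low = classify_omega(0.02, 0.1)[1]
regime_one = classify_omega(0.1, 0.1)[1]
```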
To test for positive selection in individual codons of mammalian TLR sequences, the dN to dS ratios were compared using two maximum likelihood (ML) frameworks, the HyPhy package implemented in the Data Monkey Web Server (http://www.datamonkey.org) [77] and CODEML (PAML version 4) [78], as proposed by Wlasiuk and Nachman (2010).
In the Data Monkey Web Server, the best fitting nucleotide substitution model was searched for through the automatic model selection tool available on the server. All sequences of each TLR were analyzed under three distinct models: single likelihood ancestor counting (SLAC), fixed-effect likelihood (FEL) and random effect likelihood (REL). The SLAC model is based on the reconstruction of the ancestral sequences and the counts of dS and dN at each codon position of the phylogeny. The FEL model estimates the ratio of dN/dS on a site-by-site basis, without assuming an a priori distribution across sites. The REL model first fits a distribution of rates across sites and then infers the substitution rate for individual sites. The criteria to identify codons under positive selection were the same as those used by Wlasiuk and Nachman (2010). Sites with P values <0.1 for SLAC and FEL, and Bayes Factor >50 for REL, were considered as candidates to be under positive selection.
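The candidate criteria above can be expressed as a small filter. This is a minimal sketch, assuming the per-site statistics have already been exported from the server into dictionaries keyed by codon position and restricted to sites where dN exceeds dS; the function name, argument names and example values are hypothetical and do not reflect the server's actual output format.

```python
def candidate_sites(
    slac_p: dict[int, float],
    fel_p: dict[int, float],
    rel_bf: dict[int, float],
    p_cutoff: float = 0.1,
    bf_cutoff: float = 50.0,
) -> dict[str, set[int]]:
    """Apply the thresholds from the text: P < 0.1 (SLAC, FEL), Bayes Factor > 50 (REL)."""
    return {
        "SLAC": {site for site, p in slac_p.items() if p < p_cutoff},
        "FEL": {site for site, p in fel_p.items() if p < p_cutoff},
        "REL": {site for site, bf in rel_bf.items() if bf > bf_cutoff},
    }


# Hypothetical statistics for a handful of codons.
example = candidate_sites(
    slac_p={10: 0.05, 20: 0.30},
    fel_p={10: 0.20, 30: 0.01},
    rel_bf={10: 120.0, 40: 12.0},
)
```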
In CODEML, two alternative models, M7 and M8, were implemented. M7 only allows codons to evolve neutrally or under purifying selection, while M8 adds a class of sites under positive selection. These two nested models were compared using the likelihood ratio test (LRT) with 2 degrees of freedom [79]. Amino acids under selection for M8 were identified using the Bayes Empirical Bayes (BEB) approach with posterior probability >90%. For each gene, a neighbor-joining tree was used as the working topology; it was constructed using MEGA 5 [80] with p-distance as the substitution model and complete deletion of gaps and missing data.
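The likelihood ratio test between the nested models can be sketched in a few lines. The statistic is twice the difference in log-likelihood, compared against a chi-square distribution with 2 degrees of freedom; for exactly 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. The function name and the log-likelihood values are hypothetical, not results from this study.

```python
import math


def lrt_m7_vs_m8(lnl_m7: float, lnl_m8: float) -> tuple[float, float]:
    """Return (test statistic, p-value) for the M7 vs M8 comparison.

    Uses the chi-square survival function for 2 degrees of freedom,
    which is exactly exp(-x / 2).
    """
    # Numerical noise can make the nested model look marginally better; clamp at 0.
    statistic = max(2.0 * (lnl_m8 - lnl_m7), 0.0)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods.
statistic, p_value = lrt_m7_vs_m8(lnl_m7=-1000.0, lnl_m8=-990.0)
```

A small p-value here would favor M8, that is, the model that allows a class of sites under positive selection.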
In accordance with the methodology adopted in Wlasiuk and Nachman (2010), only sites identified as under positive selection by more than one ML method were considered. The amino acid possibilities were characterized with regard to polarity, charge and location in the protein, and are described in Additional files 31-40, Tables S31-S40.
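The consensus rule (a site is retained only when more than one method supports it) can be sketched as a simple count over per-method site sets. The function name, method labels and codon positions below are hypothetical placeholders.

```python
from collections import Counter


def consensus_sites(sites_by_method: dict[str, set[int]], min_methods: int = 2) -> list[int]:
    """Return the sorted codon positions supported by at least `min_methods` methods."""
    support = Counter(
        site for sites in sites_by_method.values() for site in set(sites)
    )
    return sorted(site for site, count in support.items() if count >= min_methods)


# Hypothetical candidate sets from four methods.
retained = consensus_sites(
    {
        "SLAC": {10, 25},
        "FEL": {25, 40},
        "REL": {40, 77},
        "M8_BEB": {25},
    }
)
```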

Identification of domains
To determine the delimitation of each of the domains of the TLR molecules, the LRRfinder software was used [20] (http://www.lrrfinder.com/). The human delimitations were used as a reference for the remaining species (Additional files 21-30, Tables S21-S30).