Figure 1 | BMC Evolutionary Biology

Figure 1

From: A non-tree-based comprehensive study of metazoan Hox and ParaHox genes prompts new insights into their origin and evolution

HoxPred classification approach. A. Generalised profile construction. A multiple alignment is built from a set of non-redundant homeodomain sequences that belong to a given homology group (PG9 in this illustration). This alignment then serves as input to a program from the pftools suite [62], which generates the corresponding generalised profile. The profile is a scoring matrix that assigns a score to a sequence based on its similarity to the profile. Unlike simpler pattern-search techniques, a profile can provide scores for residues that were not originally observed at a given position of the motif. These scores are residue-specific and are extrapolated using a substitution matrix when the profile is built. B. HoxPred classification principle. The sequence to classify is scored by an optimal combination of profiles. The resulting vector of scores then serves as input to a discriminant function that has previously been trained to assign such a vector to a specific class (e.g. PG4). C. Linear discriminant classifier training. The training phase generates the discriminant function. The training dataset comprises sequences of known class: HOX, RANDOM or HOMEO sequences (see Materials and methods). All sequences are scored by the profiles, so that each sequence is represented by a vector of scores. The classifier is then trained to assign each vector of scores to its associated class (specified on the right). CTL is the control class (see Materials and methods).
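
The three panels can be illustrated with a minimal Python sketch. It is not the HoxPred implementation: pftools generalised profiles and their substitution-matrix extrapolation are replaced here by a simple log-odds position weight matrix with pseudocounts, and the discriminant is a textbook linear discriminant with pooled covariance, restricted to two profiles for brevity. The group names `PG_A` and `PG_B`, the six-residue toy sequences, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Panel A (simplified): log-odds scoring matrix from aligned sequences.
    Pseudocounts stand in for the substitution-matrix extrapolation, so
    residues never seen at a position still receive a score."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(len(alignment[0])):
        counts = Counter(seq[pos] for seq in alignment)
        profile.append({aa: math.log((counts[aa] + pseudocount) / total / background)
                        for aa in AMINO_ACIDS})
    return profile

def score_vector(profiles, seq):
    """Panel B: score one sequence against every profile."""
    return [sum(col[aa] for col, aa in zip(p, seq)) for p in profiles]

def train_lda(vectors, labels):
    """Panel C: class means, priors and inverse pooled covariance (2 features only)."""
    n, classes = len(vectors), sorted(set(labels))
    means, priors = {}, {}
    for c in classes:
        rows = [v for v, l in zip(vectors, labels) if l == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
        priors[c] = len(rows) / n
    sxx = sxy = syy = 0.0
    for v, l in zip(vectors, labels):
        dx, dy = v[0] - means[l][0], v[1] - means[l][1]
        sxx, sxy, syy = sxx + dx * dx, sxy + dx * dy, syy + dy * dy
    dof = n - len(classes)
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy  # assumed non-zero for this toy data
    inv = [[syy / det, -sxy / det], [-sxy / det, sxx / det]]
    return means, priors, inv

def classify(model, x):
    """Return the class with the largest linear discriminant value."""
    means, priors, inv = model
    def discriminant(c):
        m = means[c]
        w = [inv[0][0] * m[0] + inv[0][1] * m[1],
             inv[1][0] * m[0] + inv[1][1] * m[1]]
        return (x[0] * w[0] + x[1] * w[1]
                - 0.5 * (m[0] * w[0] + m[1] * w[1]) + math.log(priors[c]))
    return max(means, key=discriminant)

# Hypothetical toy training data: two homology groups plus a control class.
training = {
    "PG_A": ["RKRGRT", "RKRGRQ", "RRRGRT", "RKKGRT"],
    "PG_B": ["TYSLEA", "TYTLEA", "SYSLEA", "TYSLDA"],
    "CTL":  ["ACDEFG", "MNPQVW", "GHIWFC"],
}
profiles = [build_profile(training["PG_A"]), build_profile(training["PG_B"])]
vectors = [score_vector(profiles, s) for seqs in training.values() for s in seqs]
labels = [name for name, seqs in training.items() for _ in seqs]
model = train_lda(vectors, labels)

def predict(seq):
    return classify(model, score_vector(profiles, seq))

print(predict("RKRGRS"))  # PG_A
```

In this sketch the control class has no profile of its own: control sequences are recognised only because they score poorly against both group profiles. A real application would need many more sequences, full-length homeodomains, and a general matrix inverse to handle more than two profiles.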
