Figure 4
From: FlowerPower: clustering proteins into domain architecture classes for phylogenomic inference of protein function

The FlowerPower algorithm. "Q" indicates the query (or seed) sequence. Sequences sharing the same domain structure are indicated as blue stars; all other sequences are indicated as brown triangles. SCI-PHY subfamilies are indicated by black ovals.

1. Identify a set of potential homologs S using PSI-BLAST; filter to remove much longer or much shorter sequences.
2. Select a core set for initial alignment.
3. Identify subfamilies using SCI-PHY and construct subfamily HMMs (SHMMs).
4. Score S with the SHMMs, and identify those sequences receiving scores with E-values below cutoff. Align each sequence to its closest SHMM. Evaluate the alignment with user-specified criteria; remove sequences that do not meet these criteria.
5. Run SCI-PHY on the new alignment to identify subfamilies and construct SHMMs.
6. Repeat steps 1–5 until convergence.
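
The control flow of the steps in this caption can be illustrated with a minimal Python sketch. It is not the FlowerPower implementation: the function names `length_filter` and `expand_until_convergence` are hypothetical, the length ratio of 1.5 is an arbitrary illustrative value, and the PSI-BLAST search, SCI-PHY subfamily identification, and SHMM scoring are stood in for by caller-supplied functions (`build_models`, `accepts`). The sketch shows only the length filter and the iterate-until-nothing-new-is-accepted loop.

```python
from typing import Callable, Dict, Hashable, Iterable, Set


def length_filter(query_len: int, lengths: Dict[str, int],
                  max_ratio: float = 1.5) -> Set[str]:
    """Drop candidates much longer or much shorter than the query.

    max_ratio is an illustrative assumption, not a value from the source.
    """
    low, high = query_len / max_ratio, query_len * max_ratio
    return {seq_id for seq_id, n in lengths.items() if low <= n <= high}


def expand_until_convergence(
    candidates: Iterable[Hashable],
    core: Iterable[Hashable],
    build_models: Callable[[Set[Hashable]], object],
    accepts: Callable[[Hashable, object], bool],
    max_rounds: int = 50,
) -> Set[Hashable]:
    """Grow the accepted set until a round adds no new sequence.

    build_models stands in for subfamily identification and SHMM
    construction; accepts stands in for scoring a candidate against
    the models and checking the alignment criteria.
    """
    pool = set(candidates)
    accepted = set(core)
    for _ in range(max_rounds):
        models = build_models(accepted)  # rebuild models from current members
        newly = {s for s in pool - accepted if accepts(s, models)}
        if not newly:  # convergence: nothing new passed the criteria
            break
        accepted |= newly
    return accepted


# Toy stand-ins: "sequences" are integers, the "model" is the accepted set,
# and a candidate is accepted if it lies within distance 1 of any member.
def toy_build(members: Set[int]) -> Set[int]:
    return set(members)


def toy_accepts(candidate: int, models: Set[int]) -> bool:
    return any(abs(candidate - m) <= 1 for m in models)


family = expand_until_convergence({3, 4, 5, 6, 9}, {5}, toy_build, toy_accepts)
```

In the toy run, the seed `5` first recruits `4` and `6`, the rebuilt model then recruits `3`, and `9` is never accepted, so the loop stops on the third round. Rebuilding the models each round is what lets later rounds reach sequences the initial core could not.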