Table 1 Validation of our methodology on 10 deep phylogeny problems. Organism abbreviations are shown in Table 3, and the accepted clades are shown with parentheses. The column labeled "# Clades" gives the number of accepted clades to be found. The column labeled "# Genes" gives the number of genes used. The Trees column gives the number of gene trees that find all the accepted clades; results for representative proteins are on the left, and results for randomly picked ubiquitous proteins are on the right. For each gene, the most conserved 300-residue sequence was used, and randomly picked proteins were matched to the representative proteins in overall conservation level. Consensus gives the number of accepted clades found over all gene trees; an asterisk indicates that the consensus tree (computed using CONSENSE from the PHYLIP package [52]) finds all the accepted clades. Concatenation gives the number of clades found in 100 bootstraps from a concatenated alignment of all genes; an asterisk here indicates the success of the consensus over bootstrap trees. In problem 6, for example, there are 5 accepted clades, 8 single-gene trees, and 100 bootstrap trees, so a perfect "Consensus" score would be 40, and a perfect "Concatenation" score would be 500.

From: Automatic selection of representative proteins for bacterial phylogeny

| Organisms | # Clades | # Genes | Trees (representative) | Trees (random) | Consensus (representative) | Consensus (random) | Concatenation (representative) | Concatenation (random) |
|---|---|---|---|---|---|---|---|---|
| 1. (Borr, Trep) (Chlor, Bac) (Campy, Bruc) | 3 | 8 | 8* | 2 | 24* | 12 | 299 | 112 |
| 2. (Neiss, Rals) (Xyl, Haem) (Rick, Meso) | 3 | 8 | 5* | 3 | 21* | 19 | 247 | 207 |
| 3. (Clost, Lacto) (Mycob, Bifid) (Campy, Rick) | 3 | 8 | 6* | 4* | 18* | 18* | 294 | 283 |
| 4. (Buch, Rick) (Mycob, Bifid) (Staph, Mycop) | 3 | 8 | 2 | 1* | 13 | 15* | 235 | 297 |
| 5. (Urea, Mycop) (Strep, Lacto) (Staph, List) | 3 | 8 | 8* | 5* | 24* | 21* | 300 | 300 |
| 6. (Syn, Pro) (Rick, Buch) (Chlor, Bac) (Staph, Strep) (Borr, Trep) | 5 | 8 | 7* | 2* | 37* | 26* | 481 | 472 |
| 7. ((Rick, Bruc) ((Vib, Esch, Haem), Neiss) (Heli, Campy)) (Syn, Pro) (Clost, Staph) (Borr, Trep) | 8 | 17 | 3* | 3 | 129* | 108 | 762 | 741 |
| 8. ((Caul, Meso), Esch) (Chlor, Bac) (Pro, Nos) | 4 | 8 | 7* | 3* | 30* | 27* | 400 | 398 |
| 9. ((Geo, Desulf), (Wol, Campy), (Caul, Rick)) (Borr, Lep) (Chlor, Bac) | 6 | 8 | 1 | 2 | 31* | 32 | 554 | 512 |
| 10. (Chlor, Bac) (Mycop, Strep, Clost) (Mycob, Bifid) | 3 | 8 | 1* | 2* | 15* | 13 | 255 | 245 |
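
The caption's arithmetic for perfect scores can be made explicit with a short sketch. The function names below are hypothetical and are not taken from the paper; the sketch assumes only what the caption states: a perfect "Consensus" score is the number of accepted clades times the number of gene trees, and a perfect "Concatenation" score is the number of accepted clades times the number of bootstrap trees (100 here). The observed values are those listed for problem 6.

```python
def perfect_consensus_score(n_clades: int, n_gene_trees: int) -> int:
    """Highest possible Consensus score: every gene tree finds every accepted clade."""
    return n_clades * n_gene_trees


def perfect_concatenation_score(n_clades: int, n_bootstraps: int = 100) -> int:
    """Highest possible Concatenation score: every bootstrap tree finds every accepted clade."""
    return n_clades * n_bootstraps


def fraction_recovered(observed: int, perfect: int) -> float:
    """Share of the perfect score that was actually achieved."""
    return observed / perfect


# Problem 6: 5 accepted clades, 8 single-gene trees, 100 bootstrap trees.
n_clades, n_genes = 5, 8
best_consensus = perfect_consensus_score(n_clades, n_genes)   # 40, as in the caption
best_concatenation = perfect_concatenation_score(n_clades)    # 500, as in the caption

# Observed scores for problem 6: first value of each pair, then second value.
for label, consensus, concatenation in [("first", 37, 481), ("second", 26, 472)]:
    print(
        f"{label} value: "
        f"consensus {fraction_recovered(consensus, best_consensus):.3f}, "
        f"concatenation {fraction_recovered(concatenation, best_concatenation):.3f}"
    )
```

Normalizing by the perfect score is only a reading aid for comparing problems with different numbers of clades and genes (problem 7, for instance, uses 17 genes); the table itself reports raw counts.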