Table 2. Effects of hit fraction threshold on cluster assembly. Bold indicates the threshold chosen for the current study.

From: Inferring angiosperm phylogeny from EST data with widespread gene duplication

| Hit fraction (a) | Clusters (b) | Singletons (c) | Phylogenetically informative clusters (d) | Max size (e) | TCs in phylogenetically informative clusters (f) |
|---|---|---|---|---|---|
| 0.0 | 39924 | 26782 | 4423 | 6565 | 54051 |
| 0.1 | 47798 | 32824 | 4079 | 1947 | 42406 |
| 0.2 | 57229 | 41327 | 3324 | 1362 | 29403 |
| 0.3 | 64691 | 48864 | 2561 | 330 | 21504 |
| 0.4 | 71333 | 56383 | 1876 | 117 | 15457 |
| 0.5 | 77564 | 63890 | 1340 | 98 | 10721 |
| 0.6 | 83435 | 71539 | 897 | 95 | 7105 |
| 0.7 | 88864 | 79122 | 577 | 94 | 4536 |
| 0.8 | 94296 | 87186 | 324 | 92 | 2529 |
| 0.9 | 99843 | 95975 | 103 | 89 | 872 |
| 1.0 | 105144 | 104860 | 1 | 6 | 6 |
  a. Minimum proportion of sequence similarity, based on BLAST pairwise comparisons, required to link two sequences. Linked pairs are placed in the same cluster, so the hit fraction affects both the level of heterogeneity within clusters and the number of assembled clusters. The original number of sequences is 105,453 tentative consensus sequences (TCs).
  b. Total number of assembled clusters.
  c. Number of single-sequence clusters.
  d. Phylogenetically informative clusters, for this study, are those that include at least three species and at least four sequences.
  e. Number of TCs in the largest phylogenetically informative cluster.
  f. Total number of TCs in all phylogenetically informative clusters.
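
The footnotes describe a linking rule (two sequences join the same cluster when their hit fraction meets the threshold) and a definition of phylogenetically informative clusters (at least three species and at least four sequences). The following Python sketch shows one way such a table row could be computed. It is an illustration under stated assumptions, not the study's actual pipeline: it treats clustering as single-linkage (connected components), assumes the comparison is "greater than or equal to", and uses hypothetical names (`cluster_by_hit_fraction`, `summarize`, `hits`, `species_of`) and toy data that do not come from the paper.

```python
from collections import defaultdict


def cluster_by_hit_fraction(sequences, hits, threshold):
    """Group sequences into connected components (single-linkage assumption).

    `hits` maps a pair of sequence IDs to its hit fraction. Only pairs that
    have a BLAST hit appear in `hits`, so a threshold of 0.0 links every
    pair with a hit, not every pair of sequences.
    """
    parent = {s: s for s in sequences}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), fraction in hits.items():
        if fraction >= threshold:  # assumption: inclusive comparison
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    clusters = defaultdict(list)
    for s in sequences:
        clusters[find(s)].append(s)
    return list(clusters.values())


def summarize(clusters, species_of):
    """Compute the table's columns (b) through (f) for one threshold."""
    informative = [
        c for c in clusters
        if len(c) >= 4 and len({species_of[s] for s in c}) >= 3
    ]
    return {
        "clusters": len(clusters),
        "singletons": sum(1 for c in clusters if len(c) == 1),
        "informative": len(informative),
        "max_size": max((len(c) for c in informative), default=0),
        "tcs_in_informative": sum(len(c) for c in informative),
    }


# Toy data (hypothetical): six sequences from five species.
sequences = ["A1", "A2", "B1", "C1", "D1", "E1"]
species_of = {"A1": "A", "A2": "A", "B1": "B", "C1": "C", "D1": "D", "E1": "E"}
hits = {
    ("A1", "A2"): 0.9,
    ("A2", "B1"): 0.6,
    ("B1", "C1"): 0.5,
    ("D1", "E1"): 0.2,
}

for threshold in (0.0, 0.5, 0.8):
    row = summarize(cluster_by_hit_fraction(sequences, hits, threshold), species_of)
    print(threshold, row)
```

On the toy data, raising the threshold breaks links, so the number of clusters and singletons rises while the number of informative clusters falls, which is the same direction of change seen in the table.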