From: ChromaClade: combined visualisation of phylogenetic and sequence data

ChromaClade example applications. a-c A dataset of 1331 HIV-1 group M capsid sequences, containing representatives from all subtypes, was downloaded from the Los Alamos HIV-1 sequence database [4] and aligned manually. The phylogeny was estimated from the nucleotide sequences using RAxML 8 [5] with the GTR + Gamma substitution model and rooted using HIV-1 group O sequences as an outgroup (not shown). ChromaClade was used to annotate taxon labels with the residues found at capsid protein sites. a Site 1: proline is entirely conserved. b Site 92: alanine is mostly conserved in subtypes B, C and D, while proline is mostly conserved in the remaining subtypes. c Site 110: the wildtype threonine is found in most sequences, while the asparagine escape mutant has arisen multiple times independently. Prominent subtypes are indicated on the right. d-f A phylogeny was estimated as above for an aligned set of avian and pandemic human influenza virus PB2 gene sequences downloaded from the Influenza Virus Resource [6] and mid-point rooted; the sampling years of the human pandemic sequences are shown on the right. Black circles indicate clades found in at least 700 of 1000 bootstrap replicates (70% support). ChromaClade was used to colour-annotate the taxon labels and branches according to the residues found at sites 627 (d), 591 (e) and 271 (f); branches where the ancestral state is unclear are coloured grey. These annotated trees were visualised using FigTree [1].
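The label annotation described in this caption can be illustrated with a minimal Python sketch. It is not ChromaClade's implementation. The function name `annotate_labels`, the palette, the toy alignment and the `taxon_residue` label layout are all hypothetical. The `[&!color=#rrggbb]` suffix is assumed here to be the taxon colour annotation that FigTree reads from a NEXUS file.

```python
# Hypothetical residue-to-colour palette; the colours ChromaClade actually
# uses are not given in the caption.
PALETTE = {"P": "#1f77b4", "A": "#d62728", "T": "#2ca02c", "N": "#ff7f0e"}
DEFAULT_COLOUR = "#808080"  # grey fallback for residues missing from the palette


def annotate_labels(alignment, site):
    """Map each taxon to a label annotated with its residue at a 1-based site.

    `alignment` is a dict of taxon name -> aligned protein sequence.
    The colour suffix follows the assumed FigTree annotation syntax.
    """
    annotated = {}
    for taxon, sequence in alignment.items():
        residue = sequence[site - 1]  # convert 1-based site to 0-based index
        colour = PALETTE.get(residue, DEFAULT_COLOUR)
        annotated[taxon] = f"{taxon}_{residue}[&!color={colour}]"
    return annotated


# Toy three-site alignment (made-up data, not from the HIV-1 or PB2 datasets).
toy_alignment = {"seqA": "PAT", "seqB": "PPN", "seqC": "PAT"}
labels = annotate_labels(toy_alignment, site=2)
for taxon, label in labels.items():
    print(taxon, "->", label)
```

Taxa that share a residue at the chosen site receive the same colour, so in a tree viewer a conserved site appears as a single colour while a polymorphic site shows up as colour blocks along the clades.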

