Resolving ambiguity in the phylogenetic relationship of genotypes A, B, and C of hepatitis B virus
- Yueming Jiang†1, 2, 3,
- Minxian Wang†4, 5,
- Hongxiang Zheng1, 2,
- Wei R Wang4, 5,
- Li Jin1, 2, 4, 5 (corresponding author) and
- Yungang He4, 5 (corresponding author)
© Jiang et al.; licensee BioMed Central Ltd. 2013
Received: 15 January 2013
Accepted: 5 June 2013
Published: 11 June 2013
Hepatitis B virus (HBV) is an important infectious agent that causes widespread concern because billions of people are infected by at least 8 different HBV genotypes worldwide. However, reconstruction of the phylogenetic relationship between HBV genotypes is difficult. Specifically, the phylogenetic relationships among genotypes A, B, and C are not clear from previous studies because of the confounding effects of genotype recombination. In order to clarify the evolutionary relationships, a rigorous approach is required that can effectively explore genetic sequences with recombination.
In the present study, the phylogenetic relationship of the HBV genotypes was reconstructed using a consensus phylogeny of phylogenetic trees of HBV genome segments. The reliability of the reconstructed phylogeny was extensively evaluated by its agreement with the local phylogenies of genome segments.
The reconstructed phylogenetic tree revealed that HBV genotypes B and C had a closer phylogenetic relationship than genotypes A and B or A and C. Evaluations showed that the consensus method was capable of reconstructing reliable phylogenetic relationships in the presence of recombinants.
The consensus method implemented in this study provides an alternative approach for reconstructing reliable phylogenetic relationships for viruses with possible genetic recombination. Our approach revealed the phylogenetic relationships of genotypes A, B, and C of HBV.
Keywords: Phylogeny, Hepatitis B virus, Recombination, Consensus tree
Hepatitis B virus (HBV), a serious global public health problem, is the 10th leading cause of death worldwide. Approximately 2 billion people worldwide are infected with this virus and about 350 million live with chronic infection. An estimated 600,000 people die each year due to acute or chronic consequences of hepatitis B.
There are eight well-recognized HBV genotypes, labeled A through H, each pair of which differs by at least 8% of the complete genome sequence. The distribution of the genotypes varies across geographic regions with population migration [2, 3]. Type A is located mostly in Europe, South Africa, and North America; types B and C are prevalent in East Asia, Southeast Asia, and Oceania; type D is common in South Asia, the Mediterranean area, and the Middle East; type E is predominant in sub-Saharan Africa; types F, G, and H are common in the New World and are also found in some European countries, such as France and Germany. Within the 8 genotypes, HBV can be further divided into different subtypes that differ by 4% to 8% of the genome. Besides the 8 well-known genotypes, there are two more putative genotypes that could not be classified into the groups above, genotypes I and J [4, 5].
Several studies have reported controversial phylogenetic relationships among HBV genotypes, especially genotypes A, B, and C. Three reports suggest that genotypes A and C have a closer phylogenetic relationship than genotype B with A or C [4, 6, 7]. This phylogenetic relationship has been brought into question, however, by the results of other studies demonstrating that genotypes B and C have a closer phylogenetic relationship than genotype A with B or C [8–10]. One study also reported that the phylogenetic relationship between genotypes A and B is much closer than that of genotype C with A or B. Further, three other studies were unable to elucidate the relationship of the genotypes in detail and suggested that the three genotypes were on the same phylogenetic clade [11–13]. The ambiguity of the phylogenetic relationship of the HBV genotypes is thought to be due in part to historical recombination in the HBV genome [8, 9, 14]. Recent efforts have been made to detect HBV recombinants and have provided a comprehensive picture of the distribution of recombination in the HBV genome [14–16].
In order to reduce the confounding effects of recombination in the process of phylogeny reconstruction, Fares and Holmes (2001) used the non-overlapping gene regions of the HBV genome to reconstruct the phylogeny, but the phylogeny reconstructed in their study was not consistent with the geographic prevalence of the genotypes; i.e., genotypes B and C were geographically closer while they were more distant in the reconstructed phylogenetic relationship [3, 6]. Therefore, it might be necessary to incorporate the whole-genome information of HBV, and it is highly unlikely that an approach that does not consider recombination will resolve the ambiguity of the phylogenetic relationship of HBV genotypes. Resolving this ambiguity offered us an opportunity to propose and validate effective phylogenetic methods for exploring genetic sequences with recombination.
Here, we reconstructed the phylogenetic relationship of HBV genotypes using a consensus-tree approach to integrate whole-genome information. The overall phylogeny indicated that HBV genotypes B and C have a closer phylogenetic relationship than genotype A with B or C. Multi-level evaluations indicated that the reconstructed phylogenetic tree of HBV genotypes was reliable from many perspectives. We consider this report not solely a clarification of HBV phylogenies but rather a communication of the implemented methods. The methods implemented in this study could be an alternative choice for phylogeny reconstruction in the presence of recombinants.
Consensus relationship of local phylogenies
Tree-like phylogeny of HBV
Reliability of the consensus phylogenetic relationship
A good consensus phylogenetic tree should represent the majority of phylogenetic relationships of different segments of the HBV genome for all involved sequences. To gain a thorough understanding of the reliability of our results, we evaluated the constructed consensus phylogenetic trees at both the tree and branch levels.
Further demonstration of the advantage of the consensus method
The maximum likelihood (ML) method is the most popular and comprehensive approach in studies of genetic phylogeny, as well as in studies of HBV evolution [6–8, 12, 20]. The ML method builds inference on robust statistical models and searches tree space for the solution with the maximum likelihood value. Therefore, from many perspectives, the ML method performs excellently in phylogeny reconstruction. To demonstrate the advantage of our consensus method in the presence of recombination, we applied both our method and the ML method to HBV sequences mixed with simulated genotype A/C recombinants (see Methods for details). Using datasets with a moderate recombinant frequency (f = 0.14), the ML method reconstructed an incorrect phylogenetic relationship in which genotypes A and C were wrongly clustered together (Additional file 1: Figure S3). By contrast, using the same synthetic datasets, our consensus method reconstructed a phylogenetic relationship with the correct topological pattern (Additional file 1: Figure S4). It is worth mentioning that both methods produced correct phylogenies if recombinants were rare in the simulated datasets. Furthermore, both methods failed to reconstruct the correct phylogeny when the frequency of recombinants was very high, for example f = 0.60.
Phylogenetic trees are efficient representations of the genetic relationship of biologic sequences, although a phylogenetic network is more informative in applications involving reticulate relationships, such as those due to recombinant sequences. Unfortunately, the currently available methods for reconstructing phylogenetic networks from genetic data containing recombinant sequences have very high error rates in identifying the correct phylogeny. In contrast, many tree-building methods have a high probability of reconstructing the correct phylogeny for sequences without recombination. Phylogenies of aligned short pieces of sequences are rarely affected by recombination when recombination is not extremely frequent. A consensus of the local phylogenies of short sequence fragments, therefore, can be used to represent the phylogenetic relationship of the majority of the involved HBV sequences.
Inter- and intra-genotype recombination is widely recognized as a critical factor in HBV evolution. Recombinants in the sequence pool could lead to inconsistencies among local phylogenies of different fragments of the aligned sequences. Recombination has thus posed a challenge to phylogenetic studies of HBV. In addition, uncertainty regarding the molecular clock also interferes with the reconstructed local phylogenies because, for short sequence fragments, mutation accumulation follows a Poisson distribution with great variance. Therefore, HBV sequence fragments of an extremely small size, for example 250 bp, did not help to distinguish genotypes B and C from genotype A in this study. Both recombination and this uncertainty contribute to the inconsistency between local phylogenies. For the same reason, it is difficult to fully identify all or most recombination events or completely eliminate their impact in phylogenetic studies based on the comparison of local tree topology. In this study, the phylogenetic relationship was reconstructed without explicitly identifying instances of recombination events, and the reconstructed relationship was appropriately supported by local phylogenies at both the tree and branch levels. A similar approach may facilitate the reconstruction of reliable tree-like phylogenetic relationships of viruses in future studies.
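The effect of window size on the Poisson noise mentioned above can be illustrated with a small calculation. This is only a sketch: it assumes that the number of differences between two sequences in a window is Poisson with mean equal to divergence times window length, and it uses the 8% inter-genotype threshold quoted earlier purely as an illustrative divergence. The function names are hypothetical.

```python
from math import sqrt

def expected_differences(divergence, window):
    """Expected number of differing sites in a window (Poisson mean)."""
    return divergence * window

def poisson_cv(divergence, window):
    """Coefficient of variation (sd / mean) of a Poisson count.

    For a Poisson variable the variance equals the mean, so sd / mean = 1 / sqrt(mean).
    """
    return 1 / sqrt(expected_differences(divergence, window))

# Relative noise for the window sizes used in this study, at an assumed 8% divergence.
for window in (250, 500, 750, 1000, 1250, 1500):
    print(window, round(poisson_cv(0.08, window), 3))
```

Under these assumptions the relative noise shrinks with the square root of the window size, which is consistent with very short windows giving less stable local phylogenies.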
Classic phylogenetic trees often present phylogenetic relationships of aligned full-length sequences. The consensus phylogenetic relationship in this report, however, is different. This consensus phylogenetic relationship extracts information from the majority of the sequences. A small part of the sequence fragments was automatically ignored during the phylogeny reconstruction, and the useful fragments may be located at different positions for different sequences. Excluded fragments of the same sequence may have the same or different genetic origins, but the origins make only minor genetic contributions to the sequences. In this way, minor ancestors of a sequence are ignored by the consensus phylogenetic tree. This method provides a natural way to extract important phylogenetic information from sequences containing recombination.
The reliability of the consensus phylogeny was evaluated by comparing the consensus phylogeny with local phylogenies of sequence segments in this study. The phylogenies were split into rooted triplets to compare the consistency of the triplets during the process. In this novel approach, more consistency indicated smaller topological differences between the phylogenies and better reliability of the consensus phylogeny. This approach overcomes an obvious limitation of the classical consensus measure. The classical measure of majority rule consensus actually showed a split consensus for all taxa without considering the number of taxa. In the classical method, even a small difference in one or two branches was treated as having the same importance as a large difference between phylogenies. The evaluations in this report implemented an alternative approach in which a minor difference is distinguished from large differences. These findings provide another view of the reliability of the consensus phylogenetic tree.
The phylogenetic relationships of HBV genotypes A, B, and C that were reconstructed in this study elucidated the geographic prevalence of the HBV genotypes and their phylogenetic relationship. In China and other East Asian countries, HBV carriers often have HBV genotype B or C, while most Japanese carriers have HBV genotype C. Genotype A is rare in East Asia and is found mostly in Western Europe, America, India, and Africa. The global prevalence of HBV suggests that genotypes B and C have a close phylogenetic relationship. Therefore, based on the present findings, the map indicating the origin and historical dispersion of the HBV genotypes that identifies genotype A as being more closely related to genotype B or C appears to be incorrect. In fact, the controversial results about the phylogenetic relationships among these genotypes reported in previous publications [3–13] have caused confusion. Our study sheds light on the origin and historical dispersion of HBV by using a comprehensive approach to confirm that genotypes B and C are closer relatives.
The effects of recombination were eliminated in our analysis to make the result robust. Our simulation suggested that the consensus method was superior to the regular ML method in the presence of recombination. The simulation also offered a possible explanation for the difference between our consensus phylogenetic relationship and Shi et al.’s ML tree of HBV genotypes. However, it is a limitation of our current study that this approach is not capable of identifying historical recombination events in the HBV genome. Fortunately, several publications have reported some progress in this field [14–16, 27–31]. The evolutionary history of HBV genome recombination may be clarified in detail in the future, although rigorous improvements of analysis tools are necessary.
Phylogenetic relationships can be reconstructed from the majority of the phylogenetic information in sequence segments without explicitly identifying historical recombination events. The series of phylogenetic methods proposed and employed in this study provides an effective approach for reconstructing reliable phylogenetic relationships for viruses with possible genetic recombination. With this approach, HBV genotypes B and C had a closer phylogenetic relationship than genotypes A and B or A and C.
We retrieved 3281 complete sequences of human HBV and one full-length sequence of woolly monkey HBV from the GenBank of the National Center for Biotechnology Information as available in April 2011. The full sequence set comprised 320 genotype A, 387 genotype B, 836 genotype C, 383 genotype D, 221 genotype E, 72 genotype F, 15 genotype G, 19 genotype H, and 1043 unknown or uncertain genotype sequences. The genotypes assigned to the different sequences were obtained either directly from the GenBank records or from the associated publications.
All the sequences were screened to exclude entries that were related to patents, artificial mutants, and identical sequences. Further, sequences with an unknown or uncertain genotype or with documented recombination information were removed. The remaining sequences were aligned using the MUSCLE software with default parameters. Results of the alignments were checked manually for further validation. Gaps (insertions/deletions) and all nonstandard nucleotide bases (all characters except A, C, G, T, and –) were considered as missing values in further analysis. After that, sequences with more than 20% gaps or missing data were removed. Positions of sites were identified by their positions relative to the traditional hypothetical EcoRI site in the full-genome alignments.
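The missing-data screen described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study; the function names and the dictionary layout (sequence name mapped to aligned sequence) are assumptions made for the example.

```python
VALID_BASES = set("ACGT")

def missing_fraction(seq):
    """Fraction of aligned positions that are gaps or nonstandard bases."""
    if not seq:
        return 1.0  # treat an empty sequence as entirely missing
    missing = sum(1 for base in seq.upper() if base not in VALID_BASES)
    return missing / len(seq)

def filter_by_missing(seqs, max_missing=0.20):
    """Keep sequences whose missing fraction does not exceed max_missing."""
    return {name: s for name, s in seqs.items()
            if missing_fraction(s) <= max_missing}

# Toy aligned sequences: "b" has 40% missing data and is dropped.
toy = {"a": "ACGTACGTAC", "b": "AC--NNGTAC", "c": "ACGTACGT-N"}
print(sorted(filter_by_missing(toy)))
```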
To achieve a fair and representative presentation for all the genotypes, we applied a multi-step procedure to remove extra sequences from the initial sequence set. In the first step, we sequentially removed sequences with high similarity to any others until all remaining sequences had a pairwise difference larger than or equal to 2.5%. After the initial cleaning, the sequence pool had 379 full-length HBV sequences (including 38 genotype A, 82 genotype B, 138 genotype C, 77 genotype D, 32 genotype E, 9 genotype F, 2 genotype G, and 3 genotype H).
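One way to picture the similarity thinning is a greedy pass over the aligned sequences. The sketch below is an assumption-laden illustration: the order in which sequences are visited, the use of a simple proportion of differing sites, and the function names are not taken from the study.

```python
def p_distance(a, b):
    """Proportion of differing sites, skipping positions missing in either sequence."""
    valid = set("ACGT")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    # With no comparable sites the pair is treated as indistinguishable.
    return diffs / compared if compared else 0.0

def thin_sequences(seqs, min_diff=0.025):
    """Greedily keep a sequence only if it differs from every kept one by >= min_diff."""
    kept = {}
    for name, seq in seqs.items():
        if all(p_distance(seq, other) >= min_diff for other in kept.values()):
            kept[name] = seq
    return kept

toy = {
    "s1": "A" * 40,
    "s2": "A" * 40,               # identical to s1, removed
    "s3": "CC" + "A" * 38,        # 5% different from s1, kept
}
print(sorted(thin_sequences(toy)))
```

A different visiting order can retain a different subset, so the result of such a greedy pass is not unique.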
From the filtered sequences, 30 sequences were randomly drawn for each of genotypes A, B, C, D, and E. Genotypes F, G, and H were not included in further analysis because the purpose of the present study was to elucidate the phylogenetic relationship of genotypes A, B, and C. Furthermore, including the limited number of sequences of genotypes F, G, and H (9 genotype F, 2 genotype G, and 3 genotype H) in the analysis may produce problematic results due to the unequal number of sequences for each genotype. The full-length HBV sequence of woolly monkey was considered as an ancestral reference (outgroup) in this study. This woolly monkey HBV sequence and the randomly selected human HBV sequences were combined and aligned by MUSCLE with default parameter settings. To improve the data quality of the aligned sequences, Gblocks was used to remove aligned columns with more than half gaps or with low data quality [35, 36]. In total, 105 columns (3.2%) were removed in the process. The working dataset therefore included 151 full-length sequences of HBV for further phylogenetic investigation.
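The balanced draw of 30 sequences per genotype can be sketched as below. The record layout (name, genotype), the seed, and the function name are hypothetical and serve only to make the example reproducible.

```python
import random

def sample_per_genotype(records, genotypes, n=30, seed=0):
    """Draw n sequence names at random, without replacement, for each genotype.

    records: list of (name, genotype) pairs.
    """
    rng = random.Random(seed)
    drawn = {}
    for g in genotypes:
        names = sorted(name for name, genotype in records if genotype == g)
        if len(names) < n:
            raise ValueError(f"genotype {g} has only {len(names)} sequences")
        drawn[g] = rng.sample(names, n)
    return drawn

# Toy pool sized like the filtered genotype A and B sets.
records = [(f"A{i}", "A") for i in range(38)] + [(f"B{i}", "B") for i in range(82)]
picked = sample_per_genotype(records, ["A", "B"])
print({g: len(v) for g, v in picked.items()})
```

Genotypes with fewer than n sequences raise an error here, which mirrors why the small F, G, and H sets could not be sampled at the same depth.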
Constructing a consensus phylogenetic relationship
A sliding window approach was used in which an analyzing window moves along the aligned HBV sequences with the same step length (10 bp) but a different window size in different runs. The sliding window procedure is similar to that of a previous publication on recombination detection. Analysis of the results from runs with different window sizes (250 bp, 500 bp, 750 bp, 1000 bp, 1250 bp, or 1500 bp) could show how window size affects phylogeny reconstruction. At each stop of the window, local phylogenetic trees of the aligned sequence fragments were reconstructed with the Ninja software using the neighbor-joining method and the Kimura 2-parameter model. With the given outgroup, all the local phylogenetic trees were further split into primary rooted triplets. From each local phylogenetic tree, 551,300 primary rooted triplets were obtained ($\binom{150}{3} = 551{,}300$, the number of combinations of any 3 sequences from the given set of 150 HBV sequences). Because the HBV genome is circular, the start of each HBV sequence was concatenated to the end of the original sequence so that each base had equal coverage by the sliding window.
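The windowing and distance steps can be sketched in a few lines. This is an illustration under stated assumptions, not the code used with Ninja: the amount of sequence appended for the wrap-around (one window length) is an assumption, the function names are hypothetical, and the Kimura 2-parameter distance is written from its standard formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), with P and Q the proportions of transitions and transversions.

```python
from math import comb, log

def circular_windows(seq, window, step=10):
    """Return windows of a circular sequence, wrapping past the end."""
    if window > len(seq):
        raise ValueError("window longer than the sequence")
    extended = seq + seq[:window]  # append the genome start to the end
    return [extended[s:s + window] for s in range(0, len(seq), step)]

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(a, b):
    """Kimura 2-parameter distance, ignoring sites missing in either sequence."""
    compared = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x != y:
            if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
                transitions += 1
            else:
                transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / compared, transversions / compared
    # Undefined (log of a non-positive number) for saturated sequence pairs.
    return -0.5 * log(1 - 2 * p - q) - 0.25 * log(1 - 2 * q)

print(circular_windows("ABCDEFGHIJ", window=4, step=2))
print(comb(150, 3))  # rooted triplets per local tree for 150 ingroup sequences
```

In the toy call every position is covered by exactly two windows, which is the equal-coverage property the wrap-around is meant to provide.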
The primary rooted phylogenetic triplets of each window in each run were filtered to remove the minor triplets that presented two different minor phylogenetic relationships. It is worth noting here that, for every combination of 3 human HBV sequences and the root, there were three possible topologies for each window in each run, and the three topologies were not compatible with each other. We took only one of the possible topologies, i.e. the major triplet, for further analysis. The removed triplets were less common and inconsistent with the major phylogenetic relationship presented in the same analyzing window (see Results for further details, Figure 1). The remaining rooted triplets from all the analyzing windows in the same run were then pooled together to reconstruct a consensus tree using the rooted triplet consensus method. Ewing et al. (2008) reported that the consensus method based on rooted triplets outperformed the extended majority rule consensus strategy. We constructed consensus phylogenetic relationships of HBV genotypes in different runs separately using different window sizes.
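The idea of keeping only the major topology for each set of three sequences can be sketched as a simple tally. The exact filtering rule is described only briefly here, so the sketch assumes that observed topologies are pooled over a collection of observations and that the most frequent one is kept; the data layout and the function name are hypothetical.

```python
from collections import Counter, defaultdict

def major_triplets(observations):
    """Return the most frequent topology for every triple of taxa.

    observations: iterable of (triple, sister_pair), where sister_pair holds the
    two taxa grouped together in the rooted triplet (ab|c is written as (a, b)).
    """
    tallies = defaultdict(Counter)
    for triple, pair in observations:
        tallies[frozenset(triple)][frozenset(pair)] += 1
    # Ties are broken by first occurrence, which is an arbitrary choice.
    return {triple: counts.most_common(1)[0][0]
            for triple, counts in tallies.items()}

triple = ("A1", "B1", "C1")
obs = [(triple, ("B1", "C1")), (triple, ("B1", "C1")), (triple, ("A1", "C1"))]
print(major_triplets(obs))
```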
Evaluating the reliability of the reconstructed phylogenetic relationship
The reliability of the reconstructed phylogenetic relationship of HBV sequences can be evaluated by comparing the consensus phylogenetic relationship with phylogenetic trees of genome segments (local phylogenetic trees). Good consistency between them would indicate good reliability of the consensus phylogeny. In this study, multiple comparisons were conducted to achieve a thorough understanding of the reliability.
First, the consistency of the reconstructed consensus phylogeny and local phylogenetic trees was investigated on a genome-segment level. For each genome segment, local neighbor-joining trees (involving all 151 taxa) were built using the Ninja software with the aforementioned substitution model. We then dissected the local neighbor-joining trees and our consensus tree-like phylogenetic relationship into rooted triplets. For phylogenies with n taxa (including an outgroup), the proportion of compatible triplets between the local tree and the consensus tree could be obtained by $k / \binom{n-1}{3}$, where k is the total number of compatible triplets and $\binom{n-1}{3}$ is the total number of rooted triplets (n = 151 in this case). The proportions were calculated for all genome segments and then used as a measure of the agreement between the reconstructed consensus phylogeny and the local phylogenetic trees.
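The tree-level comparison can be sketched for small trees as follows. The nested-tuple tree representation and the function names are assumptions made for the example, and the brute-force enumeration is meant for toy inputs rather than for 151 taxa.

```python
from itertools import combinations

def leaf_sets(tree):
    """Collect the leaf set below every internal node of a nested-tuple tree."""
    found = []
    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves
    walk(tree)
    return found

def rooted_triplets(tree, outgroup):
    """Map each ingroup triple to its sister pair (None if unresolved)."""
    sets = leaf_sets(tree)
    taxa = sorted(max(sets, key=len) - {outgroup})
    result = {}
    for triple in combinations(taxa, 3):
        pair = None
        for x, y in combinations(triple, 2):
            (z,) = set(triple) - {x, y}
            # x and y are sisters if some clade holds both but not z.
            if any(x in s and y in s and z not in s for s in sets):
                pair = frozenset((x, y))
                break
        result[frozenset(triple)] = pair
    return result

def triplet_agreement(local_tree, consensus_tree, outgroup):
    """Proportion of rooted triplets resolved identically in both trees."""
    local = rooted_triplets(local_tree, outgroup)
    cons = rooted_triplets(consensus_tree, outgroup)
    k = sum(1 for t, pair in cons.items()
            if pair is not None and local.get(t) == pair)
    return k / len(cons)  # len(cons) equals C(n - 1, 3)

consensus = ("W", ((("A", "B"), "C"), "D"))
local = ("W", ((("A", "C"), "B"), "D"))
print(triplet_agreement(local, consensus, "W"))
```

In the toy case only the triplet on A, B, and C is resolved differently, so three of the four triplets agree.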
Second, the consistency of internal branches (nontrivial splits) of the consensus phylogenetic tree and local phylogenetic trees was evaluated by checking how often the nontrivial splits of the consensus tree were supported by nontrivial splits of local phylogenetic trees. For any given internal branch (with m children) of an n-taxa consensus tree (including an outgroup), the phylogenetic relationship was dissected into rooted triplets, $\binom{m}{2}(n-m-1)$ in total, to form a consensus rooted triplet pool. The probability that a given rooted triplet from the consensus rooted triplet pool was supported by dissected rooted triplets of the local phylogenetic trees could be estimated by $y / \left[ j \binom{m}{2}(n-m-1) \right]$, where y was the number of dissected rooted triplets of the local phylogenetic trees that shared the same phylogenetic relationships with their corresponding triplets of the consensus tree, and j was the total number of local neighbor-joining trees determined by the size of the sliding window and the length of the moving step. The 95% CI of the estimate was obtained by a bootstrapping method in which local phylogenetic trees were randomly sampled with replacement to generate an artificial rooted triplet pool for the aforementioned evaluation.
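The branch-level estimate and its bootstrap interval can be sketched as below. Several points are assumptions rather than details confirmed by the text: the support is taken to be the pooled fraction y / (j × T), with T the size of the consensus rooted triplet pool for the branch; the interval is a percentile bootstrap; and the function names, replicate count, and seed are hypothetical.

```python
import random

def branch_support(per_tree_hits, pool_size):
    """Pooled fraction of consensus triplets recovered across local trees.

    per_tree_hits: for each local tree, the number of triplets in the
    consensus pool that the tree resolves in the same way.
    """
    j = len(per_tree_hits)
    return sum(per_tree_hits) / (j * pool_size)

def bootstrap_ci(per_tree_hits, pool_size, replicates=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval obtained by resampling local trees."""
    rng = random.Random(seed)
    j = len(per_tree_hits)
    estimates = sorted(
        branch_support(rng.choices(per_tree_hits, k=j), pool_size)
        for _ in range(replicates)
    )
    lower = estimates[int((alpha / 2) * replicates)]
    upper = estimates[int((1 - alpha / 2) * replicates) - 1]
    return lower, upper

hits = [8, 9, 7, 10, 6, 9, 8, 10, 5, 9]  # toy counts for 10 local trees
print(branch_support(hits, pool_size=10), bootstrap_ci(hits, pool_size=10))
```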
Performance demonstration in the presence of recombination
Synthetic data were generated by introducing simulated genotype A/C recombinants into the raw data set used for the aforementioned investigation of HBV phylogeny. For a pair of sequences, one from each of the two genotypes, we set the recombination probability to p. The expected frequency of recombinants in the sequence pool of genotype A, genotype C, and A/C recombinants could be estimated as $f = 1 - (1 - p)^{30}$ because 30 sequences of each genotype were included in the raw data set. We considered all possible pairs of the sequences of genotypes A and C to simulate the occurrence of recombination between the two genotypes. When a recombination occurred between a pair of sequences with probability p, the location of the recombinant fragment was chosen randomly on the HBV genome, and the length of the recombinant fragment was drawn from the empirical length distribution of recombinants in the study by Yang et al. Because the HBV genome is a circular molecule, we allowed the recombinant fragment to span the junction between the end and the start of the sequence.
The phylogenetic relationship of the synthetic data was reconstructed using the ML method. Before the reconstruction, jModelTest2 was executed to choose the best-fit model from the 88 candidate models. Since the GTR + I + G model was selected as the best-fit model, an ML tree was built using the ML method implemented in the PALM package. The same synthetic data were also analyzed by our consensus method to produce a consensus tree. For different values of the recombination probability p, we performed the data simulation and phylogeny reconstruction multiple times to achieve a thorough evaluation.
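The recombinant simulation can be sketched as follows, using the relation f = 1 - (1 - p)^30 between the pairwise probability and the expected recombinant frequency. The choice of the genotype A sequence as the backbone, the list of fragment lengths standing in for the empirical distribution, the seed, and all function names are assumptions made for the example.

```python
import random

def recombinant_frequency(p, n=30):
    """Expected recombinant frequency f = 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n

def pair_probability(f, n=30):
    """Pairwise probability p that yields recombinant frequency f."""
    return 1 - (1 - f) ** (1 / n)

def make_recombinant(recipient, donor, start, length):
    """Copy `length` aligned sites from donor into recipient, wrapping past the end."""
    size = len(recipient)
    out = list(recipient)
    for offset in range(length):
        i = (start + offset) % size  # circular genome: continue at position 0
        out[i] = donor[i]
    return "".join(out)

def simulate(group_a, group_c, p, fragment_lengths, seed=0):
    """Visit every A/C pair and create a recombinant with probability p."""
    rng = random.Random(seed)
    recombinants = []
    for a in group_a:
        for c in group_c:
            if rng.random() < p:
                start = rng.randrange(len(a))
                length = rng.choice(fragment_lengths)  # stand-in for empirical lengths
                recombinants.append(make_recombinant(a, c, start, length))
    return recombinants

print(round(pair_probability(0.14), 4))
print(make_recombinant("AAAAAAAA", "CCCCCCCC", start=6, length=4))
```

The second call shows a fragment that starts near the end of the sequence and continues at its beginning, as allowed for a circular genome.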
We thank the anonymous reviewers for comments that improved the study and the manuscript. This work was supported by grants from the National Natural Science Foundation of China (81100997 and 31171279 to Y.H.; 30890034 and 30625016 to L.J.). L.J. was also supported by the Shanghai Leading Academic Discipline Project (B111) and the Center for Evolutionary Biology at Fudan University. Y.H. gratefully acknowledges the support of the SA-SIBS scholarship program and the Youth Innovation Promotion Association of the Chinese Academy of Sciences.
- Lavanchy D: Hepatitis B virus epidemiology, disease burden, treatment, and current and emerging prevention and control measures. J Viral Hepat. 2004, 11: 97-107. 10.1046/j.1365-2893.2003.00487.x.
- Jazayeri SM, Alavian SM, Carman WF: Hepatitis B virus: origin and evolution. J Viral Hepat. 2010, 17: 229-235. 10.1111/j.1365-2893.2009.01193.x.
- Kurbanov F, Tanaka Y, Mizokami M: Geographical and genetic diversity of the human hepatitis B virus. Hepatol Res. 2010, 40: 14-30. 10.1111/j.1872-034X.2009.00601.x.
- Yu H, Yuan Q, Ge S-X, Wang H-Y, Zhang Y-L, Chen Q-R, Zhang J, Chen P-J, Xia N-S: Molecular and phylogenetic analyses suggest an additional hepatitis B virus genotype “I”. PLoS One. 2010, 5: e9297. 10.1371/journal.pone.0009297.
- Tatematsu K, Tanaka Y, Kurbanov F, Sugauchi F, Mano S, Maeshiro T, Nakayoshi T, Wakuta M, Miyakawa Y, Mizokami M: A genetic variant of hepatitis B virus divergent from known human and ape genotypes isolated from a Japanese patient and provisionally assigned to new genotype J. J Virol. 2009, 83: 10538-10547. 10.1128/JVI.00462-09.
- Fares MA, Holmes EC: A revised evolutionary history of hepatitis B virus (HBV). J Mol Evol. 2002, 54: 807-814. 10.1007/s00239-001-0084-z.
- Bollyky PL, Holmes EC: Reconstructing the complex evolutionary history of hepatitis B virus. J Mol Evol. 1999, 49: 130-141. 10.1007/PL00006526.
- Bollyky PL, Rambaut A, Harvey PH, Holmes EC: Recombination between sequences of hepatitis B virus from different genotypes. J Mol Evol. 1996, 42: 97-102. 10.1007/BF02198834.
- Morozov V, Pisareva M, Groudinin M: Homologous recombination between different genotypes of hepatitis B virus. Gene. 2000, 260: 55-65. 10.1016/S0378-1119(00)00424-8.
- Takahashi K, Brotman B, Usuda S, Mishiro S, Prince AM: Full-genome sequence analyses of hepatitis B virus (HBV) strains recovered from chimpanzees infected in the wild: implications for an origin of HBV. Virology. 2000, 267: 58-64. 10.1006/viro.1999.0102.
- Vieth S, Manegold C, Drosten C, Nippraschk T, Günther S: Sequence and phylogenetic analysis of hepatitis B virus genotype G isolated in Germany. Virus Genes. 2002, 24: 153-156. 10.1023/A:1014572600432.
- Kidd-Ljunggren K, Miyakawa Y, Kidd AH: Genetic variability in hepatitis B viruses. J Gen Virol. 2002, 83: 1267-1280.
- Alestig E, Hannoun C, Horal P, Lindh M: Phylogenetic origin of hepatitis B virus strains with precore C-1858 variant. J Clin Microbiol. 2001, 39: 3200-3203. 10.1128/JCM.39.9.3200-3203.2001.
- Simmonds P, Midgley S: Recombination in the genesis and evolution of hepatitis B virus genotypes. J Virol. 2005, 79: 15467-15476. 10.1128/JVI.79.24.15467-15476.2005.
- Yang J, Xing K, Deng R, Wang J, Wang X: Identification of Hepatitis B virus putative intergenotype recombinants by using fragment typing. J Gen Virol. 2006, 87: 2203. 10.1099/vir.0.81752-0.
- Shi W, Carr MJ, Dunford L, Zhu C, Hall WW, Higgins DG: Identification of novel inter-genotypic recombinants of human hepatitis B viruses by large-scale phylogenetic analysis. Virology. 2012, 427: 51-59. 10.1016/j.virol.2012.01.030.
- Posada D, Crandall KA: Intraspecific gene genealogies: trees grafting into networks. Trends Ecol Evol (Amst.). 2001, 16: 37-45. 10.1016/S0169-5347(00)02026-7.
- Makarenkov V, Legendre P: From a phylogenetic tree to a reticulated network. J Comput Biol. 2004, 11: 195-212. 10.1089/106652704773416966.
- Whelan S, Liò P, Goldman N: Molecular phylogenetics: state-of-the-art methods for looking into the past. Trends Genet. 2001, 17: 262-272. 10.1016/S0168-9525(01)02272-7.
- Yang Z, Lauder IJ, Lin HJ: Molecular evolution of the hepatitis B virus genome. J Mol Evol. 1995, 41: 587-596.
- Huson DH, Bryant D: Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006, 23: 254-267.
- Woolley SM, Posada D, Crandall KA: A comparison of phylogenetic network methods using computer simulation. PLoS One. 2008, 3: e1913. 10.1371/journal.pone.0001913.
- Mihaescu R, Levy D, Pachter L: Why neighbor-joining works. Algorithmica. 2009, 54: 1-24. 10.1007/s00453-007-9116-4.
- Martin DP, Williamson C, Posada D: RDP2: recombination detection and analysis from sequence alignments. Bioinformatics. 2005, 21: 260-262. 10.1093/bioinformatics/bth490.
- Duffy S, Shackelton LA, Holmes EC: Rates of evolutionary change in viruses: patterns and determinants. Nat Rev Genet. 2008, 9: 267-276.
- Bryant D: A classification of consensus methods for phylogenetics. DIMACS series in discrete mathematics and theoretical computer science. 2003, 61: 163-184.
- Lyons S, Sharp C, LeBreton M, Djoko CF, Kiyang JA, Lankester F, Bibila TG, Tamoufé U, Fair J, Wolfe ND, Simmonds P: Species association of hepatitis B virus (HBV) in non-human apes; evidence for recombination between gorilla and chimpanzee variants. PLoS One. 2012, 7: e33430. 10.1371/journal.pone.0033430.
- Trinks J, Cuestas ML, Tanaka Y, Mathet VL, Minassian ML, Rivero CW, Benetucci JA, Gímenez ED, Segura M, Bobillo MC, Corach D, Ghiringhelli PD, Sánchez DO, Avila MM, Peralta LAM, Kurbanov F, Weissenbacher MC, Simmonds P, Mizokami M, Oubiña JR: Two simultaneous hepatitis B virus epidemics among injecting drug users and men who have sex with men in Buenos Aires, Argentina: characterization of the first D/A recombinant from the American continent. J Viral Hepat. 2008, 15: 827-838.
- Fang Z-L, Hué S, Sabin CA, Li G-J, Yang J-Y, Chen Q-Y, Fang K-X, Huang J, Wang X-Y, Harrison TJ: A complex hepatitis B virus (X/C) recombinant is common in Long An county, Guangxi and may have originated in southern China. J Gen Virol. 2011, 92: 402-411. 10.1099/vir.0.026666-0.
- Zhou B, Xiao L, Wang Z, Chang ET, Chen J, Hou J: Geographical and ethnic distribution of the HBV C/D recombinant on the Qinghai-Tibet Plateau. PLoS One. 2011, 6: e18708. 10.1371/journal.pone.0018708.
- Zhou B, Wang Z, Yang J, Sun J, Li H, Tanaka Y, Mizokami M, Hou J: Novel evidence of HBV recombination in family cluster infections in western China. PLoS One. 2012, 7: e38241. 10.1371/journal.pone.0038241.
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW: GenBank. Nucleic Acids Res. 2011, 39: D32-D37. 10.1093/nar/gkq1079.
- Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinforma. 2004, 5: 113. 10.1186/1471-2105-5-113.
- Arauz-Ruiz P, Norder H, Robertson BH, Magnius LO: Genotype H: a new Amerindian genotype of hepatitis B virus revealed in Central America. J Gen Virol. 2002, 83: 2059-2073.
- Castresana J: Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000, 17: 540-552. 10.1093/oxfordjournals.molbev.a026334.
- Talavera G, Castresana J: Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007, 56: 564-577. 10.1080/10635150701472164.
- Wheeler T: Large-scale neighbor-joining with ninja. Proceedings of the 9th International Workshop on Algorithms in Bioinformatics: 12-13 September 2009; Philadelphia. Edited by: Salzberg SL, Warnow T. 2009, Berlin Heidelberg: Springer, 375-389.
- Ewing GB, Ebersberger I, Schmidt HA, Von Haeseler A: Rooted triple consensus and anomalous gene trees. BMC Evol Biol. 2008, 8: 118. 10.1186/1471-2148-8-118.
- Darriba D, Taboada GL, Doallo R, Posada D: jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012, 9: 772.
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology. 2003, 52: 696-704. 10.1080/10635150390235520.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.