Maternal lineages were determined by a two-step procedure. The entire HVS-I (positions 16012-16400), the intermediate region (positions 16401-72) and part of HVS-II (positions 73-263) of the mtDNA control region were sequenced following the protocol described previously. A first alignment with the rCRS was made using BioEdit v.126.96.36.199 to determine a preliminary haplogroup. Twenty-seven informative single nucleotide polymorphisms (SNPs) in the coding region (positions 10873, 13276, 13789, 13470, 10810, 12950, 3594, 13710, 10400, 9818, 11899, 12049, 14088, 13485, 14034, 13803, 15939, 4158, 3693, 13958, 10086, 8618, 14905, 4218, 2352, 750, 7851) were then typed by SNaPshot® minisequencing (PE, Applied Biosystems). All data were obtained on an ABI PRISM 3730 sequencer (PE, Applied Biosystems). The final haplogroup assignment was based on the most recent mtDNA phylogeny.
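The comparison of a control-region read with the rCRS can be illustrated with a minimal sketch. It assumes the two sequences are already aligned (for example, exported from an alignment editor) and reports substitutions only; the function name, the toy sequences and the start coordinate are hypothetical and are not taken from the study.

```python
def call_substitutions(reference, sample, start):
    """List substitutions of an aligned sample relative to a reference.

    reference, sample: aligned strings of equal length ('-' marks a gap).
    start: reference coordinate of the first aligned reference base.
    Indels and ambiguous calls (N) are ignored in this simplified sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    pos = start
    for ref_base, sample_base in zip(reference.upper(), sample.upper()):
        if ref_base == "-":
            # Insertion relative to the reference: no reference position consumed.
            continue
        if sample_base not in "-N" and sample_base != ref_base:
            variants.append(f"{pos}{sample_base}")
        pos += 1
    return variants


if __name__ == "__main__":
    # Hypothetical four-base window starting at position 16221.
    print(call_substitutions("ACGT", "ACTT", 16221))
```

The resulting list of variants is the kind of motif that is compared against a phylogeny to propose a preliminary haplogroup, which the coding-region SNPs then confirm or refine.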
NRY lineages were determined using two types of markers. Seventeen short tandem repeats (STRs) were typed using the AmpFLSTR Yfiler® kit (PE, Applied Biosystems). Twenty-four unique event polymorphisms (UEPs; SRY 10831, M213, M9, M70, M22, Tat, 92R7, M173, P25, M96, M35, M78, M81, M123, M34, M17, M18, M73, M37, M63, M126, M153, M160, SRY 2627) were typed following the protocol published previously. An additional set of nine UEPs (M33, U174, M191, M75, U209, M2, P2, M91, M60) was designed to refine the haplogroup assignment. All data were read on an ABI PRISM 3730 sequencer and analysed with GeneMapper v.4.0 (PE, Applied Biosystems). The YAP analysis was performed following Hammer and Horai. The haplogroup assignment follows the most recently updated NRY phylogeny.
The overall HTLV-1 seroprevalence in the Noir Marron population in French Guiana is about 6% in women and 4% in men [13, 22–25]. The HTLV-1 genomic subtype was determined in the env and LTR regions, which are used as markers of the migration of infected populations, for 23 samples from HTLV-1 infected Noir Marron individuals representative of the main Noir Marron communities: Saramaka (n = 6), Ndjuka (n = 6), Aluku (n = 5) and Paramaka (n = 6) (GenBank accession numbers: GU725032-GU725054). The sequences were compared with the corresponding sequences available in GenBank.
High molecular-weight DNA was extracted from peripheral blood buffy-coats using the QIAamp DNA Blood Mini Kit (Qiagen GmbH, Hilden, Germany). All samples were first confirmed to contain amplifiable DNA by PCR amplification of the human β-globin gene. Five hundred nanograms of each DNA sample, quantified by spectrophotometry, were then subjected to two series of PCR to obtain the complete long terminal repeat (LTR) (755 bp) and a 522-bp region of the env gene, as previously described [26, 27]. To prevent false-positive reactions, all pre- and post-PCR operations were performed in separate facilities. The complete LTR was obtained for eight individuals, while the 522-bp env fragment was obtained for all 23 samples tested. The amplified products of the appropriate size were cloned, sequenced and phylogenetically analysed as described [28, 29].
All summary statistics for mtDNA and NRY haplotype variation, together with Tajima's D and Fu's Fs tests, were calculated using the ARLEQUIN 3.11 software package. Four databases were compiled from published studies and updated with data from the African populations of Benin and Ivory Coast analysed in the present study. The African mtDNA database was composed of 170 populations representing 8727 HVS-I and 3500 HVS-II haplotypes associated with their corresponding haplogroup assignment. The African NRY database was composed of 145 populations representing 8909 individuals typed for UEPs informative for haplogroup assignment and 1200 Y-STR profiles. As Y-SNP haplogroup information is lacking in some African regions, it was statistically inferred from the available Y-STR data, as previously explained. For some analyses, the African databases were divided into nine groups according to the historical regions of slavery described by H.S. Klein, their genetic coherence, and published genetic studies [2, 31]: North Africa (Algeria, Canary Islands, Egypt, Mauritania, Morocco), Windward Coast, Senegambia and Sierra Leone (Cabo Verde, Guinea-Bissau, Mali, Senegal, Sierra Leone), Gold Coast and Bight of Benin (Benin, Burkina Faso, Ivory Coast), Bight of Biafra (Cameroon, Central African Republic, Chad, Equatorial Guinea, Gabon, Nigeria, Niger, Sao Tome), South West Africa (Angola, Cabinda, Democratic Republic of Congo), South Africa (Botswana, Malawi, Namibia, Zambia, Zimbabwe), South East Africa (Mozambique), East Africa (Ethiopia, Kenya, Rwanda, Somalia, Uganda, Sudan, Tanzania), and Pygmies. The mtDNA and NRY databases of African American and urban hybridised populations were composed of 95 and 90 populations, respectively. Because the resolution of haplogroup assignment was heterogeneous among these populations, which made a haplogroup-level database unreliable, only the admixture rates of continental origin (European, Amerindian and African) given in the published studies were considered.
Despite this bias, as African ancestry is relatively common in all African Americans [1, 32], their discrimination is mostly due to the differential contribution of non-African gene pools. All populations considered in the present study are located in Figure 2, and complete references are available in Additional file 1.
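Tajima's D, one of the neutrality statistics mentioned above, contrasts the mean number of pairwise differences with the number of segregating sites. The following is a minimal sketch of the standard formula (Tajima 1989) for a set of aligned sequences; it is given for illustration only and is not the ARLEQUIN implementation. The function name and the toy sequences are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for aligned sequences of equal length (no gap handling)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if seg_sites == 0:
        return float("nan")  # D is undefined without polymorphism

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    total_diff = sum(
        sum(1 for x, y in zip(a, b) if x != y) for a, b in combinations(sequences, 2)
    )
    pi = total_diff / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical sample in which every variant is a singleton.
    print(tajimas_d(["AAA", "TAA", "ATA", "AAT"]))
```

An excess of rare variants, as in the toy sample, gives a negative D, which is the pattern usually interpreted as a signal of population expansion; an excess of intermediate-frequency variants gives a positive D.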
Haplotype networks were generated for mtDNA haplogroups L2a* and L1c*, and for the NRY haplogroup E1b1a*, via the median-joining algorithm of Network v.188.8.131.52 (http://www.fluxus-engineering.com) from the Noir Marron data and all comparable African and African American data. To obtain the most parsimonious networks, the reticulation permissivity was set to zero. Data were pre-processed using the star contraction option in Network v.184.108.40.206. For the mtDNA data, hypermutable sites were identified by post-processing using the Steiner (MP) algorithm within Network 220.127.116.11 and removed from the analysis. Because of the high level of reticulation in the E1b1a* sample, Y-STR loci were subdivided into three mutation rate classes based on observed STR allelic variance and weighted as follows: 4 (low) for DYS391, DYS392; 2 (intermediate) for DYS389I, DYS389II, DYS19, DYS393, DYS390; or 1 (high) for DYS385a/b.
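The effect of the locus weights can be illustrated with a weighted stepwise distance between two Y-STR haplotypes. This is a sketch only: it assumes that DYS385a/b is coded as two separate loci and that one repeat difference counts as one step, and it does not reproduce the internal calculations of Network. The function name and the allele values are hypothetical.

```python
# Weights from the text: slowly mutating loci weigh more, fast loci weigh less.
LOCUS_WEIGHTS = {
    "DYS391": 4, "DYS392": 4,                      # low mutation rate
    "DYS389I": 2, "DYS389II": 2, "DYS19": 2,
    "DYS393": 2, "DYS390": 2,                      # intermediate mutation rate
    "DYS385a": 1, "DYS385b": 1,                    # high mutation rate (assumed split)
}


def weighted_str_distance(hap_a, hap_b, weights=LOCUS_WEIGHTS):
    """Sum over loci of weight x absolute difference in repeat number."""
    return sum(w * abs(hap_a[locus] - hap_b[locus]) for locus, w in weights.items())


if __name__ == "__main__":
    # Hypothetical haplotypes (repeat counts per locus).
    base = {"DYS391": 10, "DYS392": 11, "DYS389I": 13, "DYS389II": 30,
            "DYS19": 15, "DYS393": 13, "DYS390": 21, "DYS385a": 16, "DYS385b": 17}
    slow_change = dict(base, DYS391=11)   # one step at a slowly mutating locus
    fast_change = dict(base, DYS385a=17)  # one step at a fast mutating locus
    print(weighted_str_distance(base, slow_change))
    print(weighted_str_distance(base, fast_change))
```

With these weights, a single-step difference at a slowly mutating locus separates two haplotypes four times as much as the same difference at DYS385a/b, which reduces the reticulation caused by recurrent mutation at fast loci.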
Cross-population comparisons of maternal and paternal lineages, based on the frequency of haplogroups or rates of continental ancestry common to all samples in the database, were performed using ARLEQUIN 3.11. Fst values were considered significant for p-values below a threshold of 0.05. All results obtained for the comparison between the Noir Marron and each population of the database were plotted on a map using Surfer v.8.0, using the location of each population given in the corresponding study. Factorial Correspondence Analyses (FCA) based on mtDNA and Y-chromosome haplogroup frequencies were performed using XLstat v.7.5.2. Analyses of molecular variance (AMOVA) were performed with ARLEQUIN 3.11. Admixture estimates were calculated by two different methods. The first, based on haplotypic homology (up to 99% of homology), was calculated as the percentage of shared lineages (LS) between the Noir Marron and each compared group. Haplotype comparisons were performed on HVS-I mtDNA sequences (16030-16360) and NRY core haplotypes (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a/b) to obtain the most relevant results from the compiled databases. The second estimator, mY, was calculated with the ADMIX2.0 program. Both mtDNA-based and NRY-based estimates were calculated from haplogroup frequencies without taking into account molecular distances between haplogroups. The parental populations were chosen among the groups that presented an Fst value, obtained by AMOVA for the comparison with the Noir Marron, below a threshold fixed at 0.1.
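The two simplest steps of this procedure, the percentage of shared lineages and the Fst-based choice of parental groups, can be sketched as follows. The sketch assumes exact haplotype matching and counts individuals of the focal sample; the precise definition of LS is the one given in the cited reference and may differ. Function names, haplotype labels and Fst values are hypothetical.

```python
def shared_lineage_percentage(focal_haplotypes, comparison_haplotypes):
    """Percentage of focal individuals whose haplotype occurs in the comparison group.

    Assumption: exact identity between haplotypes (no partial homology).
    """
    if not focal_haplotypes:
        raise ValueError("the focal sample is empty")
    comparison_set = set(comparison_haplotypes)
    shared = sum(1 for hap in focal_haplotypes if hap in comparison_set)
    return 100.0 * shared / len(focal_haplotypes)


def select_parental_groups(fst_by_group, threshold=0.1):
    """Keep the groups whose Fst with the focal population is below the threshold."""
    return sorted(group for group, fst in fst_by_group.items() if fst < threshold)


if __name__ == "__main__":
    # Hypothetical haplotype labels and Fst values.
    focal = ["h1", "h1", "h2", "h3"]
    comparison = ["h1", "h4"]
    print(shared_lineage_percentage(focal, comparison))
    print(select_parental_groups({"Group A": 0.04, "Group B": 0.25, "Group C": 0.09}))
```

In the toy data, two of the four focal individuals carry a haplotype found in the comparison group, and only the groups with Fst below 0.1 are retained as candidate parental populations.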
For the HTLV-1 env and LTR phylogenetic analyses, trees were generated using the Neighbor-Joining method implemented in the PAUP v.4.0b10 program, using representative HTLV-1 sequences available in GenBank, including four env sequences and one LTR sequence typed in Noir Marron individuals of French Guiana. The strains were aligned with the DAMBE v.4.2.13 program, and the final alignment was submitted to the Modeltest v.3.6 program to select, according to the Akaike Information Criterion (AIC), the best model to apply to the phylogenetic analyses. Confidence levels were estimated with the distance NJBOOT program (1,000 replicates).
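Model selection by AIC can be summarised in a few lines: each candidate substitution model receives the score AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximised log-likelihood, and the model with the lowest score is retained. The sketch below only illustrates this rule; the model names, parameter counts and log-likelihoods are hypothetical and are not results of the study or output of Modeltest.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model_by_aic(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates: dict mapping model name -> (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical fits of three nested substitution models.
    fits = {
        "JC69": (-1050.0, 0),
        "HKY85": (-1020.0, 4),
        "GTR+G": (-1018.5, 9),
    }
    print(best_model_by_aic(fits))
```

In this toy comparison the richest model has the highest likelihood, but the gain does not offset its additional parameters, so the intermediate model is selected.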