### F_{ST}

The F_{ST} statistic partitions the total genetic variance into within- and between-population components, thereby quantifying the extent of population differentiation [26]. An elevated F_{ST} at a given locus relative to the rest of the genome indicates a high degree of differentiation among populations, which may reflect positive selection, while a low F_{ST} is consistent with balancing or purifying selection, for example in highly conserved regions of the genome. Because individual SNP-based estimates of F_{ST} are highly variable and may therefore be unreliable indicators of differentiation at a genomic locus [27, 28], we averaged the F_{ST} values of all SNPs contained in windows of varying size, from 50 Kb to 1.6 Mb ("zooming out" by a factor of two at each step), each centered on the HIV-1-associated SNP. Given that we are interested in relatively recent selection events, a window-based estimate of F_{ST} allows us to detect the hitchhiking of loci near the putatively selected variants. A window-based approach is also preferable because an associated variant detected by GWAS may not be the causal variant itself, but rather a proxy for a nearby variant. To account for variation in recombination patterns along the genome, we also performed the F_{ST} analyses using windows defined by genetic distance (cM) instead of physical distance (Kb). We used the hg18/build 36 genetic map, retrieved from http://hgdp.uchicago.edu/Browser_tracks/Genetic_Maps/.
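The window averaging described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names and the use of `None` for SNPs without an F_{ST} value are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def doubling_window_sizes(smallest=50_000, largest=1_600_000):
    """Window sizes that double from `smallest` up to `largest` (in bp)."""
    sizes = []
    size = smallest
    while size <= largest:
        sizes.append(size)
        size *= 2
    return sizes


def mean_fst_in_window(positions, fst_values, center, window):
    """Mean per-SNP F_ST of all SNPs within window/2 of `center`.

    `positions` must be sorted ascending and fst_values[i] belongs to
    positions[i]. Entries equal to None (hypothetical marker for SNPs
    without an estimate) are skipped. Returns None for an empty window.
    """
    half = window / 2
    lo = bisect_left(positions, center - half)
    hi = bisect_right(positions, center + half)
    usable = [f for f in fst_values[lo:hi] if f is not None]
    return sum(usable) / len(usable) if usable else None
```

The same function applies to the cM-based analysis if `positions`, `center`, and `window` are all expressed in genetic-map units rather than base pairs.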

To calculate F_{ST}, we used the method of Weir and Cockerham [29]. We calculated a single global estimate of F_{ST} (based on all 7 population groups), as well as all 21 pairwise estimates of F_{ST}. F_{ST} was not calculated for a SNP that was monomorphic in the groups being compared. Negative values of F_{ST} were set to 0 because negative values are biologically meaningless.
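For illustration, a per-SNP version of the Weir and Cockerham estimator for a biallelic locus might look like the sketch below. It follows the standard variance-component form (components a, b, and c) and applies the two rules stated above (no estimate for monomorphic SNPs, negative estimates set to 0). The function name and input layout are assumptions, and the study's own implementation may differ in detail.

```python
def weir_cockerham_fst(n, p, h):
    """Weir & Cockerham theta for one biallelic SNP.

    n: number of sampled individuals per population
    p: frequency of one allele per population
    h: observed heterozygote frequency per population
    Returns None when the SNP is monomorphic in the compared groups;
    negative estimates are set to 0.
    """
    r = len(n)
    n_total = sum(n)
    n_bar = n_total / r
    n_c = (n_total - sum(x * x for x in n) / n_total) / (r - 1)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / n_total
    if p_bar <= 0 or p_bar >= 1:
        return None  # monomorphic across the groups being compared
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / n_total
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    # Variance components: among populations (a), among individuals
    # within populations (b), and within individuals (c).
    a = n_bar / n_c * (s2 - (core - h_bar / 4) / (n_bar - 1))
    b = n_bar / (n_bar - 1) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2
    total = a + b + c
    if total <= 0:
        return None
    return max(0.0, a / total)
```

A pairwise estimate is obtained by passing two populations, and the global estimate by passing all seven groups.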

To control for population-specific demographic effects on the genome, we compared the F_{ST} in the risk window to a null distribution of F_{ST} values calculated in randomly chosen windows. Specifically, for each risk SNP, we randomly chose 1000 equally sized (in bp) windows along the same chromosome, with similar genic/non-genic content (± 10%), since F_{ST} tends to be slightly higher for genic SNPs [30]. The genic/non-genic classification was performed according to the annotation provided by Sullivan et al. (https://slep.unc.edu/evidence/?tab=Downloads), which classifies a SNP based on whether it is in the transcribed region of a gene. SNP annotations were created using the TAMAL database [31], based chiefly on UCSC genome browser files [32], HapMap [33], and dbSNP [34].
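The matched-window null distribution can be sketched as below. Several details are assumptions made for the example rather than taken from the study: the ± 10% criterion is read as an absolute difference in the fraction of genic SNPs, random centers are drawn uniformly between the first and last SNP on the chromosome, windows overlapping the risk window are rejected, and the percentile rank counts null values less than or equal to the observed one.

```python
import random


def percentile_rank(observed, null_values):
    """Percentage of null values that are <= the observed value."""
    return 100.0 * sum(v <= observed for v in null_values) / len(null_values)


def sample_matched_windows(snp_positions, snp_is_genic, risk_center, window_bp,
                           n_windows=1000, tolerance=0.10, seed=0,
                           max_tries=1_000_000):
    """Draw random window centers with genic content similar to the risk window."""

    def genic_fraction(center):
        half = window_bp / 2
        flags = [g for pos, g in zip(snp_positions, snp_is_genic)
                 if center - half <= pos <= center + half]
        return sum(flags) / len(flags) if flags else None

    target = genic_fraction(risk_center)
    if target is None:
        raise ValueError("risk window contains no SNPs")

    rng = random.Random(seed)
    lo, hi = min(snp_positions), max(snp_positions)
    centers = []
    for _ in range(max_tries):
        if len(centers) == n_windows:
            break
        center = rng.uniform(lo, hi)
        if abs(center - risk_center) < window_bp:
            continue  # would overlap the risk window
        frac = genic_fraction(center)
        if frac is not None and abs(frac - target) <= tolerance:
            centers.append(center)
    return centers
```

The mean F_{ST} of each accepted window then forms the null distribution against which the risk window is ranked with `percentile_rank`.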

For all 21 possible pairwise F_{ST} comparisons, we obtained percentile ranks of the window centered on the risk SNP, compared to the 1000 randomly centered windows across chromosome 6, and used these to compare differentiation of the risk-SNP window across groups. However, in order to determine the significance of the difference between the risk-SNP window F_{ST} between populations A and B and the same risk-SNP window F_{ST} between populations A and C, while considering the difference in null chromosomal background (through the mean F_{ST} of 1000 randomly chosen windows, excluding the risk-SNP window, as described above), we generated 1000 bootstrap resamples from the total sample of 938 individuals. For each of the bootstrap resamples we calculated the risk-SNP window F_{ST} and the mean F_{ST} of the 1000 random (non-risk) windows. We then tested the null hypothesis that the difference in risk-SNP window F_{ST} is the same as the difference for the mean of the random windows, formulated as follows:

H_{0}: θ_{X, A-B} − θ_{X, A-C} = θ_{\bar{X}, A-B} − θ_{\bar{X}, A-C}

where θ_{X, A-B} is the F_{ST} for the risk window (X) between populations A and B, θ_{X, A-C} is the F_{ST} for the risk window (X) between populations A and C, θ_{\bar{X}, A-B} is the F_{ST} at regions other than the risk region X (\bar{X}) between populations A and B, and θ_{\bar{X}, A-C} is the F_{ST} at regions other than the risk region X (\bar{X}) between populations A and C. To do so, we assumed that the sample estimators of the four θ's in the null hypothesis stated above are asymptotically unbiased. Denoting the sample estimators of the population parameters with a circumflex, let:

*T* = (\hat{θ}_{X, A-B} − \hat{θ}_{X, A-C}) − (\hat{θ}_{\bar{X}, A-B} − \hat{θ}_{\bar{X}, A-C})

Under the null hypothesis and the assumption that the estimators are unbiased, E(*T*) = 0. Therefore, testing the null hypothesis that E(*T*) = 0 is a valid test of the null stated above. In an ordinary sample of independent observations (individuals) taken from the populations of interest, one can test E(*T*) = 0 using bootstrap methods. None of the above requires that the \hat{θ}s, or any quantities on which they are based, be independent. The value of *T* in the observed sample, *T*_{obs}, was calculated. One thousand bootstrap samples were taken from the original sample with replacement, and a value of *T* was calculated in each bootstrap sample, denoted *T**_{i} for the *i*th bootstrap sample. A p-value to test the null hypothesis stated above was then calculated in two ways: conservatively as [expression missing], and asymptotically by letting s_{T*} be the ordinary sample standard deviation of the *T**_{i}, and assuming that (*T*_{obs} / s_{T*})^{2} is asymptotically distributed as χ^{2} with 1 df. Due to computational limitations, we restricted this analysis to the rs2395029 SNP.
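A minimal sketch of the asymptotic version of this bootstrap test follows. It assumes that the test statistic is the squared ratio of *T*_{obs} to the bootstrap standard deviation, and that the caller supplies a hypothetical function `compute_T` that recomputes all four F_{ST} estimates from a list of individuals. The conservative p-value is not sketched, since its expression is not given here.

```python
import math
import random
import statistics


def difference_statistic(theta_x_ab, theta_x_ac, theta_bg_ab, theta_bg_ac):
    """T = (risk-window difference) - (background difference)."""
    return (theta_x_ab - theta_x_ac) - (theta_bg_ab - theta_bg_ac)


def bootstrap_chi2_pvalue(individuals, compute_T, n_boot=1000, seed=0):
    """Asymptotic bootstrap test of E(T) = 0.

    Individuals are resampled with replacement; compute_T maps a list of
    individuals to the statistic T. Returns (T_obs, s, p_value).
    """
    rng = random.Random(seed)
    t_obs = compute_T(individuals)
    t_star = [compute_T(rng.choices(individuals, k=len(individuals)))
              for _ in range(n_boot)]
    s = statistics.stdev(t_star)  # ordinary sample standard deviation
    chi2 = (t_obs / s) ** 2
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return t_obs, s, p_value
```

Passing the statistic as a function keeps the resampling logic separate from the F_{ST} calculation, which is the expensive step and the reason the analysis is costly to repeat for many SNPs.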

Given that we are interested in examining the top ten non-HLA variants as a set, as well as each one separately, we averaged the 10 percentiles of all pairwise F_{ST} comparisons. In order to obtain a group-specific F_{ST} for a given group, which we refer to as GSF_{ST}, we averaged the F_{ST} percentile ranks of the six pairwise comparisons that contain the group in question. For example, to obtain an estimate of GSF_{ST} for AFR, we took the average of the percentile F_{ST} of the following six pairwise comparisons: AFR-MID, AFR-SAS, AFR-EUR, AFR-EAS, AFR-OCE, and AFR-AME. Instead of considering many pairwise comparisons for each locus, GSF_{ST} allows us to examine how differentiated a given group is at a given locus compared to all other groups, and relative to the rest of the chromosome.
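The GSF_{ST} calculation reduces to averaging the percentile ranks of the six comparisons that involve a given group. A short sketch, assuming the pairwise percentiles are stored in a dictionary keyed by unordered pairs of group labels (a layout chosen for the example):

```python
from itertools import combinations

GROUPS = ["AFR", "MID", "SAS", "EUR", "EAS", "OCE", "AME"]


def group_specific_fst(pairwise_percentiles, group, groups=GROUPS):
    """Mean F_ST percentile over all pairwise comparisons involving `group`.

    pairwise_percentiles maps frozenset({g1, g2}) -> percentile rank.
    """
    others = [g for g in groups if g != group]
    values = [pairwise_percentiles[frozenset((group, other))] for other in others]
    return sum(values) / len(values)


# Toy input: comparisons involving AFR rank high, all others rank low.
example = {frozenset(pair): (90.0 if "AFR" in pair else 30.0)
           for pair in combinations(GROUPS, 2)}
```

With this toy input, AFR receives a GSF_{ST} of 90 and every other group receives 40, since each of them shares one high-ranking comparison with AFR and five low-ranking ones.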

Because the associated loci in the HLA region were clustered in a relatively small region, we examined the mean F_{ST} of all SNPs in a sliding 400 Kb window with an overlap of 200 Kb, from 29.8 to 32.5 Mb on chromosome 6. We also examined the entire HLA region to determine if the pattern observed in the 29.8 to 32.5 Mb region is characteristic of the HLA region as a whole (28 to 35.6 Mb). To investigate the level of differentiation among all 53 populations, we constructed a matrix of F_{ST} percentiles for all possible pairwise comparisons for the 400 Kb region surrounding rs2395029.
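The sliding-window layout can be generated as in the sketch below. How the final partial window at the end of the region was handled is not stated, so this example keeps only full-length windows, which is an assumption.

```python
def sliding_windows(start_bp, end_bp, window_bp=400_000, step_bp=200_000):
    """Full-length windows of window_bp that advance by step_bp."""
    windows = []
    left = start_bp
    while left + window_bp <= end_bp:
        windows.append((left, left + window_bp))
        left += step_bp
    return windows


# Region examined on chromosome 6: 29.8 to 32.5 Mb.
hla_windows = sliding_windows(29_800_000, 32_500_000)
```

Each consecutive pair of windows shares 200 Kb, so every SNP away from the region edges contributes to two window means.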

### REHH

Extended haplotype homozygosity (EHH) is defined as the probability that two randomly chosen chromosomes carrying the core haplotype of interest are identical by descent, and the relative EHH (REHH) is the factor by which EHH decays on the tested core haplotype compared to that of all other core haplotypes combined [35]. The REHH thus corrects for local variation in recombination rates. We obtained REHH values using Sweep software v1.1 [35] (downloaded from http://www.broadinstitute.org/mpg/sweep). For all associated SNPs, we used the same phased haplotype data as above, and examined REHH at haplotypes containing the HIV-1 risk SNP, as well as at all haplotypes contained in the surrounding 400 Kb region (200 Kb in either direction). Core haplotypes were defined according to the definition of a haplotype block in Gabriel et al. [36], and REHH was measured 300 Kb and 150 Kb away from the core haplotype. For each region and each population group, we compared the REHH in the risk SNP region to that of the entire chromosome on which the region resides, binning by haplotype frequency, to determine whether the candidate region contains haplotypes with exceptionally high REHH. We only considered core haplotypes with a frequency greater than 5%.
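To make the two quantities concrete, the sketch below computes EHH as the proportion of chromosome pairs carrying a core haplotype that are identical across the extended region, and REHH as the ratio of that value to the pooled EHH of all other core haplotypes. This follows the published definition of the statistics and is only an illustration. It does not reproduce the internals of the Sweep software, and the input layout (extended haplotypes as strings grouped by core haplotype) is an assumption.

```python
from collections import Counter
from math import comb


def _identical_pairs(extended_haplotypes):
    """Number of chromosome pairs with identical extended haplotypes."""
    counts = Counter(extended_haplotypes)
    return sum(comb(c, 2) for c in counts.values())


def ehh(extended_haplotypes):
    """EHH among chromosomes that share one core haplotype."""
    n = len(extended_haplotypes)
    if n < 2:
        return None
    return _identical_pairs(extended_haplotypes) / comb(n, 2)


def rehh(by_core, tested_core):
    """EHH of the tested core divided by the pooled EHH of all other cores.

    by_core maps a core haplotype to the list of extended haplotypes
    observed on the chromosomes that carry it.
    """
    ehh_core = ehh(by_core[tested_core])
    others = [haps for core, haps in by_core.items() if core != tested_core]
    total_pairs = sum(comb(len(haps), 2) for haps in others)
    if ehh_core is None or total_pairs == 0:
        return None
    ehh_bar = sum(_identical_pairs(haps) for haps in others) / total_pairs
    return ehh_core / ehh_bar if ehh_bar > 0 else None
```

In practice the extended haplotypes would be truncated at the chosen distance from the core (300 Kb or 150 Kb here) before being passed in.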

We examined REHH in the region surrounding both of the 'top hits' in the EA and AA study, extending 200 Kb outward in either direction, and noted all instances in which a haplotype had an REHH value at or above the 99.9^{th}, 99.5^{th}, 99^{th}, or 95^{th} percentile of the empirical distribution, binning by haplotype frequency. We used Q-Q plots to examine the distribution of observed vs. expected -log_{10} p-values for the REHH values in this region, for each of the population groups. For the wider HLA region that encompasses other GWAS 'top hits', we considered all haplotypes from 29.8 to 32.5 Mb on chromosome 6. We used the entire chromosome 6, as well as the entire HLA region, as empirical distributions. We used a two-sided Fisher's exact test to determine whether there is an excess of instances of extreme REHH in one or more groups.
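The frequency-binned percentile lookup could be sketched as follows. The 5% bin width is an assumption made for the example (the bin boundaries are not stated here), as are the function names. The 5% minimum core haplotype frequency and the four thresholds are taken from the text.

```python
import math


def frequency_bin(freq, bin_width=0.05):
    """Index of the frequency bin; the small offset guards against rounding."""
    return math.floor(freq / bin_width + 1e-9)


def binned_percentile(candidate_rehh, candidate_freq, background,
                      bin_width=0.05, min_freq=0.05):
    """Percentile of a candidate REHH among background haplotypes of similar frequency.

    background is a list of (rehh, frequency) pairs drawn from the empirical
    distribution (for example, all core haplotypes on the chromosome).
    Returns None if no background haplotype falls in the same bin.
    """
    target = frequency_bin(candidate_freq, bin_width)
    same_bin = [r for r, f in background
                if f > min_freq and frequency_bin(f, bin_width) == target]
    if not same_bin:
        return None
    return 100.0 * sum(r <= candidate_rehh for r in same_bin) / len(same_bin)


def exceeded_thresholds(percentile, thresholds=(95.0, 99.0, 99.5, 99.9)):
    """Percentile thresholds that the candidate reaches or exceeds."""
    return [t for t in thresholds if percentile >= t]
```

Counting, per population group, how many haplotypes reach each threshold gives the contingency table to which the Fisher's exact test is applied.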