### Study population and phenotypes

In this study we analysed variation in the F_{3} generation of an intercross between the Large (LG/J) and Small (SM/J) inbred mouse strains [41], which were selected for either large or small body weight at 60 days of age [42]. These strains differ by 6–8 standard deviations in size and growth-related traits [41] and therefore represent an excellent model system for studying imprinting effects arising from genes regulating growth and development. The study population was generated by first mating ten males of the SM/J strain to ten females of the LG/J strain, resulting in an F_{1} population of 52 individuals. These F_{1} individuals were randomly mated to produce 510 F_{2} animals, representing our parental generation. Random mating among F_{2} animals yielded 200 full-sib families of the F_{3} generation with a total of 1632 individuals. Further details of the husbandry are given in Vaughn et al. [23].

The traits analysed in this study are weekly body weights taken in mice 1 to 10 weeks of age. These measures were used to calculate two growth periods: pre-weaning (weeks 1–3) and post-weaning growth (weeks 3–10). Growth was calculated as the difference between weekly weights (i.e., gain in body weight) such that, for example, the growth from week 1 to week 3 is the difference between week 3 weight and week 1 weight. Pups were weighed weekly using a digital scale with an accuracy of 0.1 g. In addition, we analysed heart, kidney, liver, reproductive fatpad and spleen weights. To obtain these organ weights, mice were sacrificed after 70 days of age or after having produced and reared their offspring to weaning (at three weeks of age). At necropsy, all mice were first weighed to obtain an overall measure of body size and then dissected and their organs weighed to the nearest 0.01 gram using digital scales. The effects of age at necropsy, sex, and litter size at birth were removed from the data and the residuals used in the analysis prior to gene mapping [41].
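The growth calculation described above can be sketched as follows. This is a minimal illustration only: the function name `growth` and the example weights are hypothetical and are not taken from the study data.

```python
def growth(weights, start_week, end_week):
    """Gain in body weight (g) between two weekly weighings."""
    return weights[end_week] - weights[start_week]


# Hypothetical weekly weights (g) for a single mouse, keyed by week of age.
example_weights = {1: 4.5, 3: 11.5, 10: 33.0}

# Pre-weaning growth: week 3 weight minus week 1 weight.
pre_weaning = growth(example_weights, 1, 3)

# Post-weaning growth: week 10 weight minus week 3 weight.
post_weaning = growth(example_weights, 3, 10)

print(pre_weaning, post_weaning)
```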

### Genotyping

DNA was extracted from livers of both the F_{2} and F_{3} individuals using Qiagen DNeasy tissue kits, and samples were scored for 384 SNP markers using the Golden Gate Assay (Illumina, San Diego, USA). These markers were previously found to be polymorphic between the two strains (http://www.well.ox.ac.uk/mouse/INBREDS/). Fifteen loci had to be excluded from the analysis because they were not reliably scored. In addition, 16 loci were scored on the X chromosome but were not included in this analysis because the statistical model for the X chromosome is currently unresolved. Thus, in total we analysed 353 loci across the 19 autosomes in this study. A genetic map of these markers in cM was produced using R/QTL and validated against the genome coordinate locations in the Ensembl database (http://www.ensembl.org). The average map distance between markers in the F_{2} generation was 4 cM. Markers were reasonably evenly distributed throughout the genome except for several regions in which the LG/J and SM/J strains have been found to be monomorphic [43].

### Haplotype reconstruction

Both parental and offspring genotypes were used to reconstruct haplotypes for all mice using PedPhase [44], which produced a set of unordered haplotypes for the F_{2} parents and a set of ordered haplotypes (i.e. ordered by parent-of-origin of alleles) for their F_{3} offspring. Thus, we were able to distinguish the four possible genotypes at a given locus, *LL, SL, LS* or *SS* where the first allele refers to the paternally-derived allele and the second to the maternally-derived copy.

### Analysis of parent-of-origin-dependent effects

The four ordered genotypes at the marker loci (*LL, LS, SL* and *SS*) were assigned additive (*a*), dominance (*d*) and parent-of-origin (*i*) genotypic index scores following Mantey et al. [45]. In matrix form, these index scores are given by:

$\left[\begin{array}{c}\overline{LL}\\ \overline{LS}\\ \overline{SL}\\ \overline{SS}\end{array}\right]=\left[\begin{array}{cccc}1& 1& 0& 0\\ 1& 0& 1& 1\\ 1& 0& 1& -1\\ 1& -1& 0& 0\end{array}\right]\left[\begin{array}{c}r\\ a\\ d\\ i\end{array}\right]$

yielding estimates of the parameters:

$\left[\begin{array}{c}r\\ a\\ d\\ i\end{array}\right]=\left[\begin{array}{c}\frac{\overline{LL}}{2}+\frac{\overline{SS}}{2}\\ \frac{\overline{LL}}{2}-\frac{\overline{SS}}{2}\\ \frac{\overline{LS}}{2}+\frac{\overline{SL}}{2}-\frac{\overline{LL}}{2}-\frac{\overline{SS}}{2}\\ \frac{\overline{LS}}{2}-\frac{\overline{SL}}{2}\end{array}\right]$

Here $\overline{LL}$, $\overline{LS}$, $\overline{SL}$ and $\overline{SS}$ are the vectors of genotypic means, *r* is the reference point for the model (the mid-point between homozygotes), *a* is the additive genotypic value (half the difference between homozygotes), *d* is the dominance genotypic value (the difference between the mean of the heterozygotes and the mid-point of the homozygote means), and *i* is the parent-of-origin or imprinting genotypic value (half the difference between heterozygotes) (cf. [45]).
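As a worked illustration of these definitions, the following sketch encodes the index scores for the four ordered genotypes and recovers *r*, *a*, *d* and *i* from a set of genotypic means. The names `INDEX_SCORES`, `estimate_parameters` and `predicted_mean`, and the example means, are hypothetical and chosen only for illustration.

```python
# Index scores (a, d, i) for each ordered genotype; the paternally derived
# allele is written first, as in the text.
INDEX_SCORES = {
    "LL": (1, 0, 0),
    "LS": (0, 1, 1),
    "SL": (0, 1, -1),
    "SS": (-1, 0, 0),
}


def estimate_parameters(means):
    """Return (r, a, d, i) from the four ordered genotypic means."""
    r = (means["LL"] + means["SS"]) / 2      # mid-point between homozygotes
    a = (means["LL"] - means["SS"]) / 2      # half the homozygote difference
    d = (means["LS"] + means["SL"]) / 2 - r  # heterozygote mean minus mid-point
    i = (means["LS"] - means["SL"]) / 2      # half the heterozygote difference
    return r, a, d, i


def predicted_mean(genotype, r, a, d, i):
    """Genotypic mean implied by the model for one ordered genotype."""
    score_a, score_d, score_i = INDEX_SCORES[genotype]
    return r + score_a * a + score_d * d + score_i * i


# Hypothetical genotypic means (e.g. body weight in g), for illustration only.
example_means = {"LL": 30.0, "LS": 29.0, "SL": 25.0, "SS": 22.0}
r, a, d, i = estimate_parameters(example_means)
print(r, a, d, i)
```

Substituting the estimates back into the model reproduces the four input means, which is a quick check that the index scores and the parameter estimates are consistent with each other.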

These index scores were used to build a model for a genome scan for loci showing significant sex by parent-of-origin-dependent interaction effects (i.e. sex by *i* effects) using the MIXED procedure in SAS 9.1 (SAS Institute, Cary, NC, USA) with maximum likelihood (see [46] for details). In this model, growth traits or organ weights were the dependent variables; sex, the additive, dominance and imprinting index scores, and their interactions with sex were the fixed effects; and family was a random effect. The mixed model with the fixed genetic effects and random family effect was used to scan the genome to produce a probability distribution for the overall effect of the locus as well as for the sex by imprinting interaction effect. The probability was generated by comparing the -2 res log likelihood computed by SAS for the model with the fixed genetic effects and their interactions with sex against that of a reduced model that did not include these six genetic effects. The difference in the -2 res log likelihoods of the two models is chi-square distributed with 6 degrees of freedom (reflecting the fact that the two models differ by 6 fixed effects). These probability values were then transformed to a log probability ratio (LPR) in order to make them comparable to the LOD scores commonly seen in QTL analyses (LPR = -log_{10} [probability]).
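The conversion from a difference in -2 log likelihoods to an LPR can be sketched as below. This is not the SAS code used in the study: the function names and the example likelihood values are hypothetical, and the chi-square tail probability is computed with the closed form that holds for even degrees of freedom (here 6) so that only the standard library is needed.

```python
import math


def chi2_sf_6df(x):
    """Upper-tail probability of a chi-square variable with 6 df.

    For 2k degrees of freedom the tail probability is
    exp(-x/2) * sum_{j<k} (x/2)^j / j!, here with k = 3.
    """
    h = max(x, 0.0) / 2.0
    return math.exp(-h) * (1.0 + h + h * h / 2.0)


def lpr(neg2ll_reduced, neg2ll_full):
    """LPR = -log10(p) for the 6-df comparison of reduced and full models."""
    difference = neg2ll_reduced - neg2ll_full
    return -math.log10(chi2_sf_6df(difference))


# Hypothetical -2 log likelihoods for one marker, for illustration only.
print(lpr(neg2ll_reduced=5230.0, neg2ll_full=5205.0))
```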

Parent-of-origin effects can also appear as a result of maternal genetic effects rather than genomic imprinting [47]. Maternal genetic effects occur if the mother's genotype affects the offspring's phenotype beyond the effects of her genetic contribution to offspring phenotypes. Maternal effects can result in the appearance of parent-of-origin-dependent effects because homozygous mothers can produce only one type of heterozygous offspring. By analogy, sex-dependent maternal effects can result in the appearance of sex by *i* effects. Therefore, we tested all loci with a significant sex by *i* interaction effect to confirm that the effect was not caused by a sex-dependent maternal genetic effect (cf. [47]). Using our mixed model framework, we have advanced on our previous approach by testing whether the sex by *i* effect depends on whether individuals are reared by homozygous mothers (where a maternal effect could result in the appearance of a parent-of-origin-dependent effect) or heterozygous mothers (where all four ordered genotypes of offspring come from the same genotype of mother and hence maternal effects cannot result in the appearance of a parent-of-origin-dependent effect). A parent-of-origin-dependent effect was attributed to a maternal effect if the sex by *i* effect depended significantly on the type of mother (i.e., there was a significant three-way interaction between sex, *i* and the type of mother, classed as heterozygote or homozygote).
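The reason homozygous mothers confound maternal and parent-of-origin effects can be seen by enumerating the ordered offspring genotypes each type of mother can produce. The sketch below is purely illustrative; the function names are hypothetical, and the father is assumed to be able to transmit either allele.

```python
from itertools import product


def offspring_genotypes(father, mother):
    """Ordered offspring genotypes (paternal allele first) from one cross."""
    return sorted({p + m for p, m in product(father, mother)})


def heterozygous_offspring(mother):
    """Heterozygous ordered genotypes a mother can produce.

    The father is taken to be able to transmit either L or S.
    """
    return [g for g in offspring_genotypes("LS", mother) if g[0] != g[1]]


for mother in ("LL", "SS", "LS"):
    print(mother, offspring_genotypes("LS", mother), heterozygous_offspring(mother))
```

An *LL* mother can only produce *SL* heterozygotes and an *SS* mother only *LS* heterozygotes, so among offspring of homozygous mothers the contrast between the two heterozygote classes coincides with the contrast between maternal genotypes. A heterozygous mother can produce all four ordered genotypes.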

### Significance testing of QTL

Our significance testing approach needed to take account of the autocorrelation of siblings within the F_{3} families. Therefore, we first calculated chromosome-wise and genome-wise threshold LOD scores for the sex-by-imprinting interaction in separate permutation procedures [48] for each of the traits, ensuring that the specific family structure in this generation was maintained. For each trait, we achieved this by first calculating deviations of each individual from its family mean and randomly permuting these deviations within each family. We then randomly permuted all F_{3} family means and reconstructed new values for each individual by adding its permuted deviation to its new mean. Using these new values for each individual, we ran a canonical correlation analysis and computed the highest LPR score on each chromosome. This procedure was repeated 1000 times, and 5% chromosome-wise threshold values were obtained from the 50^{th} highest values generated for each chromosome. The 5% genome-wise threshold value for each trait was obtained from the 50^{th} highest value among the 1000 highest LPR values across all 19 chromosomes in each permutation run [48].

Initial simulations revealed that the genome- and chromosome-wise thresholds were very similar to those obtained using the effective number of markers method based on the eigenvalues of the marker correlation matrix [49]. Since the latter method allows for direct computation of the thresholds for all traits, whereas the simulation required significant computing time for each trait, we used the thresholds obtained using the effective number of markers. This method calculates the number of independent tests in a genome or chromosome scan and uses the "effective number of markers" in a Bonferroni correction. We used both the conservative genome-wise significance threshold and the chromosome-wise thresholds because this approach has been shown to give the best overall results, increasing the discovery of true positives while avoiding the problems of using the false discovery rate in gene mapping experiments [50].
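The family-structured permutation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the toy data are hypothetical, and the genome scan that would be run on each permuted data set is omitted.

```python
import random
from collections import defaultdict


def permute_phenotypes(values, families, rng):
    """Permute phenotypes while keeping family structure intact.

    Deviations from the family mean are shuffled within each family,
    family means are shuffled among families, and new values are rebuilt
    as permuted family mean plus permuted deviation.
    """
    members = defaultdict(list)
    for index, family in enumerate(families):
        members[family].append(index)
    family_ids = list(members)
    means = {
        f: sum(values[i] for i in members[f]) / len(members[f])
        for f in family_ids
    }

    # Step 1: shuffle deviations from the family mean within each family.
    new_deviation = [0.0] * len(values)
    for f in family_ids:
        deviations = [values[i] - means[f] for i in members[f]]
        rng.shuffle(deviations)
        for i, deviation in zip(members[f], deviations):
            new_deviation[i] = deviation

    # Step 2: shuffle the family means among families.
    shuffled_means = [means[f] for f in family_ids]
    rng.shuffle(shuffled_means)
    new_mean = dict(zip(family_ids, shuffled_means))

    # Step 3: rebuild each individual's value.
    return [new_mean[families[i]] + new_deviation[i] for i in range(len(values))]


def empirical_threshold(max_scores, alpha=0.05):
    """Threshold taken as the (alpha * n)-th highest permutation score."""
    ranked = sorted(max_scores, reverse=True)
    k = max(1, int(round(alpha * len(ranked))))
    return ranked[k - 1]


# Toy data: two families of two sibs each (hypothetical values).
rng = random.Random(1)
print(permute_phenotypes([1.0, 3.0, 10.0, 14.0], ["A", "A", "B", "B"], rng))
```

With 1000 permutation runs and `alpha=0.05`, `empirical_threshold` returns the 50th highest of the recorded maximum scores, matching the rule given in the text.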

After loci were identified using the genome- or chromosome-wide significance thresholds for the interaction model, we used post-hoc tests to characterize the phenotypic patterns caused by genomic imprinting. We included all significant effects of QTL using a protected test in which pleiotropic effects of a QTL are included whenever the effect of the locus on other traits is significant at the pointwise level (*p* < 0.05; LPR > 1.3). Thus, while we apply the stringent genome- or chromosome-wide threshold for QTL detection to minimize type I errors, we characterize the distribution of pleiotropic effects across the entire set of traits to minimize type II errors. The imprinting patterns were determined using the mixed model approach and are given by the relationship between the additive (*a*), dominance (*d*) and parent-of-origin (*i*) genotypic values [36]. We distinguish a total of three different imprinting patterns: parental (paternal or maternal) expression, bipolar dominance and polar dominance (see above; [36]).
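The correspondence between the pointwise threshold and its LPR value, and the form of a Bonferroni-corrected threshold on the LPR scale, can be sketched as follows. The effective number of markers used here is a hypothetical placeholder, not a value from this study, and the calculation of that number from the marker correlation matrix [49] is not shown.

```python
import math


def lpr_from_p(p):
    """Log probability ratio: LPR = -log10(p)."""
    return -math.log10(p)


def bonferroni_lpr_threshold(alpha, effective_markers):
    """LPR threshold after dividing alpha by the number of independent tests."""
    return lpr_from_p(alpha / effective_markers)


# Pointwise threshold: p < 0.05 corresponds to an LPR of about 1.3.
pointwise = lpr_from_p(0.05)

# Hypothetical effective number of markers, for illustration only.
M_EFF_HYPOTHETICAL = 100
corrected = bonferroni_lpr_threshold(0.05, M_EFF_HYPOTHETICAL)

print(round(pointwise, 2), round(corrected, 2))
```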