Supplementary information
In exploratory analyses, evolutionary rates were estimated using a strict or relaxed (uncorrelated lognormal)
clock model, a Bayesian skyline plot or constant population size model, and a partitioned (1st+2nd, 3rd codon positions)
or non-partitioned model (Table S9). All analyses were performed using the HKY nucleotide substitution
model. We also compared the estimates obtained when analyzing each individual separately (unlinked) or
simultaneously as two different progressor groups, as implemented in the recently described HPM
incorporating fixed effects (1). Analyses using a Bayesian skyline plot demographic model did not converge;
thus, only the constant size demographic model was used for further analysis. The relaxed clock nucleotide
models did not converge well for all individuals and were therefore discarded from further analyses. Based on
these results, we proceeded with the strict clock codon and nucleotide models and with the relaxed clock
codon model in subsequent analyses. In the exploratory analyses, we did not find any significant differences
between the estimated evolutionary rates when comparing the unlinked model with the HPM, or between the
partitioned and non-partitioned nucleotide models (data not shown). Thus, subsequent analyses were
performed using the HPM, a constant size model, and both the codon and non-partitioned nucleotide models
(the latter referred to simply as the nucleotide model in the main manuscript).
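
Convergence, the criterion used above to retain or discard models, is often summarised by the effective sample size (ESS) of a sampled parameter. The following is a minimal illustrative sketch in Python of one simple autocorrelation-based ESS estimate. It is not the procedure or software used in this study; the function names, the truncation rule (stop at the first non-positive autocorrelation), and the two simulated traces are assumptions made for illustration only.

```python
import random


def autocorrelation(trace, lag):
    """Sample autocorrelation of `trace` at the given lag."""
    n = len(trace)
    mean = sum(trace) / n
    var = sum((x - mean) ** 2 for x in trace) / n
    if var == 0:
        return 0.0
    cov = sum((trace[i] - mean) * (trace[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(trace):
    """ESS = N / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    a simple heuristic chosen here for illustration.
    """
    n = len(trace)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(trace, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Hypothetical traces: independent draws versus a strongly autocorrelated AR(1) chain.
rng = random.Random(0)
independent = [rng.gauss(0, 1) for _ in range(2000)]
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.95 * sticky[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
```

In this sketch the strongly autocorrelated chain yields a much lower ESS than the independent draws, even though both traces contain the same number of sampled states.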

Codon substitution rates were estimated using both a strict and a relaxed clock, with the relaxed clock
generally estimating a significantly faster codon substitution rate than the strict clock. However, the
differences between the relaxed and strict clocks were similar for all patients, as were the differences
between the groups (Tables S4 and S6).

Convergence with high effective sample sizes (ESSs) was reached for all datasets using the strict clock
codon model, except for individual DL2051, which displayed a bimodal posterior rate distribution. To study whether
this could have influenced the observed differences between the progressor groups, we reanalyzed our data
for all individuals using either (1) only the sample states of the bimodal posterior rate distribution resulting