E Romero-Severson, H Skar, I Bulla, J Albert and T Leitner,
Molecular Biology and Evolution, Sep 2014
Pathogen phylogenies are often used to infer spread among hosts. There is, however, not an exact match between the pathogen phylogeny and the host transmission history. Here, we examine in detail the limitations of this relationship. First, all splits in a pathogen phylogeny of more than one host occur within hosts, not at the moment of transmission, and thus predate the transmission events by an amount described as the pretransmission interval. Second, the order in which nodes in a phylogeny occur may reflect within-host dynamics rather than epidemiologic relationships. To investigate these phenomena, motivated by within-host diversity patterns, we developed a two-phase coalescent model that includes a transmission bottleneck followed by linear outgrowth to a maximum population size, followed by either stabilization or decline of the population. The model predicts that the pretransmission interval shrinks compared with predictions based on constant population size or a simple transmission bottleneck. Because lineages coalesce faster in a small population, the probability that a pathogen phylogeny resembles the transmission history depends on when after infection a donor transmits to a new host. We also show that the probability of inferring the incorrect order of multiple transmissions from the same host is high. Finally, we compare times of HIV-1 infection informed by genetic distances in phylogenies with independent biomarker data and show that the pretransmission interval indeed biases phylogeny-based estimates of when transmissions occurred. We describe situations where caution is needed to avoid misinterpreting which parts of a phylogeny may indicate outbreaks and tight transmission clusters.
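
The effect of a bottleneck followed by linear outgrowth on coalescence can be illustrated with a rough sketch. This is not the authors' two-phase model: it assumes a standard pairwise coalescent in which two lineages coalesce at rate 1/N(t) per generation, it covers only the growth and stabilization phases (no decline), and the function names and parameter values (`n0`, `growth`, `n_max`) are hypothetical choices made purely for illustration.

```python
import math


def pop_size(t, n0=1.0, growth=10.0, n_max=1000.0):
    """Hypothetical effective population size t generations after infection:
    a bottleneck of size n0, linear outgrowth, then a plateau at n_max."""
    return min(n0 + growth * t, n_max)


def cumulative_hazard(t_sample, n0=1.0, growth=10.0, n_max=1000.0):
    """Integral of 1/N(s) over [0, t_sample], i.e. the total pairwise
    coalescent hazard accumulated between infection and sampling."""
    t_peak = (n_max - n0) / growth  # time at which outgrowth reaches n_max
    t_lin = min(t_sample, t_peak)
    # Closed form of the integral of 1/(n0 + growth*s) over the linear phase.
    hazard = math.log((n0 + growth * t_lin) / n0) / growth
    if t_sample > t_peak:
        # Plateau phase contributes at the constant rate 1/n_max.
        hazard += (t_sample - t_peak) / n_max
    return hazard


def prob_coalesce_in_host(t_sample, **params):
    """Probability that two lineages sampled at t_sample coalesce
    after the infection of this host rather than before it."""
    return 1.0 - math.exp(-cumulative_hazard(t_sample, **params))


def prob_coalesce_constant(t_sample, n_max=1000.0):
    """Same probability under a constant population of size n_max."""
    return 1.0 - math.exp(-t_sample / n_max)


if __name__ == "__main__":
    for t in (10, 50, 200, 1000):
        print(t, prob_coalesce_in_host(t), prob_coalesce_constant(t))
```

Under these assumed parameters, the small early population contributes most of the coalescent hazard, so a pair of lineages is more likely to coalesce within the host than under a constant population of the same maximum size. The sketch also makes the dependence on sampling time explicit, which is the quantity that varies with when a donor transmits.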