Philippe Lemey @philippe_Lemey, Andrew Rambaut @arambaut, Trevor Bedford @trvrb, Nuno Faria, Filip Bielejec @Filip_Bielejec, Guy Baele @guy_baele, Colin A. Russell, Derek J. Smith, Oliver G. Pybus, Dirk Brockmann, Marc A. Suchard
Information on global human movement patterns is central to spatial epidemiological models used to predict the behavior of influenza and other infectious diseases. Yet it remains difficult to test which modes of dispersal drive pathogen spread at various geographic scales using standard epidemiological data alone. Evolutionary analyses of pathogen genome sequences increasingly provide insights into the spatial dynamics of influenza viruses, but to date they have largely neglected the wealth of information on human mobility, mainly because no statistical framework exists within which viral gene sequences and empirical data on host movement can be combined. Here, we address this problem by applying a phylogeographic approach to elucidate the global spread of human influenza subtype H3N2 and assess its ability to predict the spatial spread of human influenza A viruses worldwide. Using a framework that estimates the migration history of human influenza while simultaneously testing and quantifying a range of potential predictive variables of spatial spread, we show that the global dynamics of influenza H3N2 are driven by air passenger flows, whereas at more local scales spread is also determined by processes that correlate with geographic distance. Our analyses further confirm a central role for mainland China and Southeast Asia in maintaining a source population for global influenza diversity. By comparing model output with the known pandemic expansion of H1N1 during 2009, we demonstrate that predictions of influenza spatial spread are most accurate when data on human mobility and viral evolution are integrated. In conclusion, the global dynamics of influenza viruses are best explained by combining human mobility data with the spatial information inherent in sampled viral genomes. The integrated approach introduced here offers great potential for epidemiological surveillance through phylogeographic reconstructions and for improving predictive models of disease control.
Methods highlight:
Migration rates between populations in the SIR model are defined according to four scenarios, as follows: (A) equal rates, (B) rates proportional to the amount of air travel occurring between them (in terms of the number of passengers moving from one population to another), (C) rates proportional to Markov jump estimates based on a standard phylogeographic model (undertaken with and without BSSVS to reduce the number of rate parameters) and (D) a GLM that considers only air travel as a predictor. To compare the spread of influenza under these simulated models to recorded H1N1 pandemic spread, we measure the relative correspondence between the mean peak times (across 100 simulations) and the observed peak times for all locations except Mexico (based on World Health Organization data; Text S1). Correspondence was measured using Spearman’s rank correlation coefficient, tested with associated p-values obtained using a permutation test (Text S1), as well as using the mean absolute error (MAE; in days). We consider Spearman’s rank correlation coefficients to be more appropriate for our comparison because they are more robust to outliers, which are clearly present in the observed peaks. Therefore, the scaling of between-population coupling c for the various migration matrices was also adjusted so as to maximize Spearman’s rank correlation.
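The comparison step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the peak days are made-up numbers, the function names are hypothetical, the permutation test is assumed to be one-sided, and MAE is computed as the mean absolute difference in days.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=10000, seed=0):
    """One-sided p-value: share of shuffles with rho >= the observed rho."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spearman(x, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def mean_absolute_error(x, y):
    """Mean absolute difference between paired values (here: days)."""
    return mean(abs(a - b) for a, b in zip(x, y))


# Made-up peak days for six hypothetical locations (NOT data from the paper).
simulated_peaks = [40, 55, 62, 80, 95, 120]
observed_peaks = [35, 60, 58, 90, 150, 110]

rho = spearman(simulated_peaks, observed_peaks)
p_value = permutation_p_value(simulated_peaks, observed_peaks)
mae = mean_absolute_error(simulated_peaks, observed_peaks)
print(f"rho={rho:.3f}  p={p_value:.4f}  MAE={mae:.1f} days")
```

In this toy data the fifth location peaks 55 days later than simulated. That single outlier dominates the MAE but changes the rank correlation only slightly, which is the robustness argument made above.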
A really neat result:
Our analysis reveals that many potential predictors of global influenza virus spread are not associated with viral lineage movement: geographical proximity, demography and economic measures, antigenic divergence, epidemiological synchronicity and seasonality do not yield noticeable support (Fig. 2). Instead, we find consistent and strong evidence that air passenger flow is the dominant driver of the global dissemination of H3N2 influenza viruses. This is reflected in both the estimated size of the effect of this variable (on a log scale) and the statistical support for its inclusion in the model (posterior probability >0.93 and Bayes factor >760).
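To see how a posterior inclusion probability and a Bayes factor relate, recall that the Bayes factor for including a predictor is the posterior odds of inclusion divided by the prior odds. The sketch below only illustrates that identity. The prior inclusion probability of 0.02 is an assumed value chosen for the example, not a figure from the paper, and the function name is hypothetical.

```python
def inclusion_bayes_factor(posterior_prob, prior_prob):
    """Bayes factor for including a predictor: posterior odds / prior odds."""
    posterior_odds = posterior_prob / (1 - posterior_prob)
    prior_odds = prior_prob / (1 - prior_prob)
    return posterior_odds / prior_odds


# Illustrative only: 0.93 is the reported lower bound on the posterior
# probability; the prior of 0.02 is an assumption made for this example.
bf = inclusion_bayes_factor(posterior_prob=0.93, prior_prob=0.02)
print(f"Bayes factor under the assumed prior: {bf:.0f}")
```

The point of the example is that a small prior inclusion probability turns a posterior probability near 0.93 into a Bayes factor in the hundreds, so the two reported numbers are consistent in order of magnitude.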