This content will become publicly available on December 1, 2025

Title: Reconstructing historic and modern potato late blight outbreaks using text analytics
Abstract In 1843, a hitherto unknown plant pathogen entered the US and spread to potato fields in the northeast. By 1845, the pathogen had reached Ireland, leading to devastating famine. Questions arose immediately about the source of the outbreaks and how the disease should be managed. The pathogen, now known as Phytophthora infestans, still continues to threaten food security globally. A wealth of untapped knowledge exists in both archival and modern documents, but is not readily available because the details are hidden in descriptive text. In this work, we (1) used text analytics of unstructured historical reports (1843–1845) to map US late blight outbreaks; (2) characterized theories on the source of the pathogen and remedies for control; and (3) created modern late blight intensity maps using Twitter feeds. The disease spread from 5 to 17 states and provinces in the US and Canada between 1843 and 1845. Crop losses, Andean sources of the pathogen, possible causes and potential treatments were discussed. Modern disease discussion on Twitter included near-global coverage and local disease observations. Topic modeling revealed general disease information, published research, and outbreak locations. The tools described will help researchers explore and map unstructured text to track and visualize pandemics.
Award ID(s):
2200038
PAR ID:
10527140
Editor(s):
NA
Publisher / Repository:
Nature
Journal Name:
Scientific Reports
Volume:
14
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Phytophthora infestans is a major oomycete plant pathogen, responsible for potato late blight, which led to the Irish Potato Famine from 1845–1852. Since then, potatoes resistant to this disease have been bred and deployed worldwide. Their resistance (R) genes recognize pathogen effectors responsible for virulence and then induce a plant response stopping disease progression. However, most deployed R genes are quickly overcome by the pathogen. We use targeted sequencing of effector and R genes on herbarium specimens to examine the joint evolution in both P. infestans and potato from 1845–1954. Currently relevant effectors are historically present in P. infestans, but with alternative alleles compared to modern reference genomes. The historic FAM-1 lineage has the virulent Avr1 allele and the ability to break the R1 resistance gene before breeders deployed it in potato. The FAM-1 lineage is diploid, but later, triploid US-1 lineages appear. We show that pathogen virulence genes and host resistance genes have undergone significant changes since the Famine, from both natural and artificial selection.
  2. Abstract Two mapping populations were developed from crosses of the Asian indica rice (Oryza sativa L.) cultivar ‘Dee Geo Woo Gen’ (DGWG; PI 699210 Parent, PI 699212 Parent) and two weedy rice ecotypes, an early‐flowering straw hull (SH) biotype AR‐2000‐1135‐01 (PI 699209 Parent) collected in Arkansas and a late‐flowering black hull (BHA) biotype MS‐1996‐9 (PI 699211 Parent) collected in Mississippi. The weed and crop‐based rice recombinant inbred line (RIL) mapping populations have been used to identify genomic regions associated with weedy traits as well as resistance to sheath blight and rice blast diseases. The mapping population consists of 185 (DGWG/SH; Reg. no. MP‐9, NSL 541035 MAP) and 234 (BHA/DGWG; Reg. no. MP‐10, NSL 541036 MAP) F8 RILs, of which 175 (DGWG/SH) and 224 (BHA/DGWG) were used to construct two linkage maps using single nucleotide polymorphic markers to identify weedy traits, sheath blight, and blast resistance loci. These mapping populations and related datasets represent a valuable resource for basic rice evolutionary genomic research and applied marker‐assisted breeding efforts in disease resistance.
  3. An important part of infectious disease management is predicting factors that influence disease outbreaks, such as R, the number of secondary infections arising from an infected individual. Estimating R is particularly challenging for environmentally transmitted pathogens given time lags between cases and subsequent infections. Here, we calculated R for Bacillus anthracis infections arising from anthrax carcass sites in Etosha National Park, Namibia. Combining host behavioural data, pathogen concentrations and simulation models, we show that R is spatially and temporally variable, driven by spore concentrations at death, host visitation rates and early preference for foraging at infectious sites. While spores were detected up to a decade after death, most secondary infections occurred within 2 years. Transmission simulations under scenarios combining site infectiousness and host exposure risk under different environmental conditions led to dramatically different outbreak dynamics, from pathogen extinction (R < 1) to explosive outbreaks (R > 10). These transmission heterogeneities may explain variation in anthrax outbreak dynamics observed globally, and more generally, the critical importance of environmental variation underlying host–pathogen interactions. Notably, our approach allowed us to estimate the lethal dose of a highly virulent pathogen non-invasively from observational studies and epidemiological data, useful when experiments on wildlife are undesirable or impractical.
  4. Abstract Emerging infectious diseases can have devastating effects on host communities, causing population collapse and species extinctions. The timing of novel pathogen arrival into naïve species communities can have consequential effects that shape the trajectory of epidemics through populations. Pathogen introductions are often presumed to occur when hosts are highly mobile. However, spread patterns can be influenced by a multitude of other factors including host body condition and infectiousness. White‐nose syndrome (WNS) is a seasonal emerging infectious disease of bats, which is caused by the fungal pathogen Pseudogymnoascus destructans. Within‐site transmission of P. destructans primarily occurs over winter; however, the influence of bat mobility and infectiousness on the seasonal timing of pathogen spread to new populations is unknown. We combined data on host population dynamics and pathogen transmission from 22 bat communities to investigate the timing of pathogen arrival and the consequences of varying pathogen arrival times on disease impacts. We found that midwinter arrival of the fungus predominated spread patterns, suggesting that bats were most likely to spread P. destructans when they are highly infectious, but have reduced mobility. In communities where P. destructans was detected in early winter, one species suffered higher fungal burdens and experienced more severe declines than at sites where the pathogen was detected later in the winter, suggesting that the timing of pathogen introduction had consequential effects for some bat communities. We also found evidence of source–sink population dynamics over winter, suggesting some movement among sites occurs during hibernation, even though bats at northern latitudes were thought to be fairly immobile during this period. Winter emergence behaviour symptomatic of white‐nose syndrome may further exacerbate these winter bat movements to uninfected areas. Our results suggest that low infectiousness during host migration may have reduced the rate of expansion of this deadly pathogen, and that elevated infectiousness during winter plays a key role in seasonal transmission. Furthermore, our results highlight the importance of both accurate estimation of the timing of pathogen spread and the consequences of varying arrival times to prevent and mitigate the effects of infectious diseases.
  5. Abstract Our understanding of ecological processes is built on patterns inferred from data. Applying modern analytical tools such as machine learning to increasingly high dimensional data offers the potential to expand our perspectives on these processes, shedding new light on complex ecological phenomena such as pathogen transmission in wild populations. Here, we propose a novel approach that combines data mining with theoretical models of disease dynamics. Using rodents as an example, we incorporate statistical differences in the life history features of zoonotic reservoir hosts into pathogen transmission models, enabling us to bound the range of dynamical phenomena associated with hosts, based on their traits. We then test for associations between equilibrium prevalence, a key epidemiological metric and data on human outbreaks of rodent‐borne zoonoses, identifying matches between empirical evidence and theoretical predictions of transmission dynamics. We show how this framework can be generalized to other systems through a rubric of disease models and parameters that can be derived from empirical data. By linking life history components directly to their effects on disease dynamics, our mining‐modelling approach integrates machine learning and theoretical models to explore mechanisms in the macroecology of pathogen transmission and their consequences for spillover infection to humans. 