
Title: Using dynamic time warping algorithms and spatiotemporal analyses of swine condemnations for syndromic surveillance
Objective: Slaughterhouse data have recently been used to enhance animal disease surveillance in many countries but remain largely underused for syndromic surveillance in the United States. We characterize spatiotemporal patterns and system dynamics of whole carcass swine condemnations in the US. We illustrate the value of data mining and machine learning approaches for more cost-effectively identifying emerging trends by condemnation reason, areas and time periods with higher than predicted condemnation rates, and regions or time periods with similar trends. Methods: Swine slaughter and condemnation data from 2005-2016 were obtained for slaughterhouses inspected by the Food Safety and Inspection Service (FSIS). Time series of condemnation rates by condemnation reason, type of pig, state and month were generated. Dynamic time warping (DTW) and hierarchical clustering methods were used to identify states with similar patterns in the rate of condemnation cases by cause and type of pig. Spatiotemporal scan statistics were used to identify states and months with a significantly higher number of condemnation cases than expected. Clusters were compared to historic infectious disease outbreaks in the swine industry. Results: Between 2005 and 2016, 1,109,300 whole swine carcasses were condemned. The top causes of condemnation were, in order, abscess/pyemia, septicemia, pneumonia, icterus, and peritonitis. DTW and cluster analysis revealed clear spatiotemporal patterns in the rate of condemnations, many with a strong seasonal component. Several clusters were detected in timeframes where widespread outbreaks had occurred. Conclusions: Timely evaluation of spatiotemporal patterns in swine condemnations may provide critical information for predicting disease outbreaks. Identification of spatiotemporal hot spots can direct investigation of primary on-farm risk factors contributing to condemnation. Risk mitigation through targeted decision-making and improved management practices can minimize carcass condemnations and animal losses, improving the economic efficiency, profitability and sustainability of the US swine industry.
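The DTW and hierarchical clustering step described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which DTW variant, linkage method, or software was used, so the absolute-difference cost, the average linkage, the function names `dtw_distance` and `cluster_series`, and the synthetic monthly rates for four hypothetical states "A" to "D" are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def dtw_distance(a, b):
    """Classic dynamic time warping distance with an absolute-difference cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def cluster_series(series, n_clusters):
    """Group named time series by pairwise DTW distance (average linkage, an assumption)."""
    names = list(series)
    k = len(names)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d = dtw_distance(series[names[i]], series[names[j]])
            dist[i, j] = dist[j, i] = d
    # linkage expects the condensed (upper-triangle) form of the distance matrix.
    tree = linkage(squareform(dist), method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return dict(zip(names, labels))


# Synthetic monthly condemnation rates for four hypothetical states:
# A and B peak in winter, C and D peak in summer; B and D lag by one month.
months = np.arange(24)
winter = 1.0 + 0.5 * np.cos(2 * np.pi * months / 12)
summer = 1.0 + 0.5 * np.cos(2 * np.pi * (months - 6) / 12)
rates = {
    "A": winter,
    "B": np.roll(winter, 1),
    "C": summer,
    "D": np.roll(summer, 1),
}

labels = cluster_series(rates, n_clusters=2)
print(labels)
```

Because DTW tolerates small shifts in timing, the one-month lag between A and B (or C and D) costs little, while series with opposite seasonality stay far apart. This is the property that makes it a plausible fit for grouping states whose condemnation rates follow similar but not perfectly synchronized seasonal patterns.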
Journal Name: 100th Conference of Research Workers in Animal Diseases
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Porcine reproductive and respiratory syndrome virus (PRRSV) remains widely distributed across the U.S. swine industry. Between-farm movements of animals and transportation vehicles, along with local transmission, are the primary routes by which PRRSV is spread. Given the farm-to-farm proximity in high pig production areas, local transmission is an important pathway in the spread of PRRSV; however, there is limited understanding of the role local transmission plays in the dissemination of PRRSV, specifically the distance at which there is increased risk of transmission from infected to susceptible farms. We used a spatial and spatiotemporal kernel density approach to estimate PRRSV relative risk and a Bayesian spatiotemporal hierarchical model to assess the effects of environmental variables, between-farm movement data and on-farm biosecurity features on PRRSV outbreaks. The maximum spatial distance calculated through the kernel density approach was 15.3 km in 2018, 17.6 km in 2019, and 18 km in 2020. Spatiotemporal analysis revealed greater variability throughout the study period, with significant differences between farm types. We found that downstream farms (i.e., finisher and nursery farms) were located in areas of significantly high relative risk of PRRSV. Factors associated with PRRSV outbreaks were a higher number of access points to barns, a higher number of outgoing pig movements, and a higher number of days with temperatures between 4°C and 10°C. Results from this study may be used to guide the reinforcement of biosecurity and surveillance strategies on farms and in areas within the distance threshold of PRRSV-positive farms.
  2. Abstract

    The pork industry is an essential part of the global food system, providing a significant source of protein for people around the world. A major factor restraining productivity and compromising animal wellbeing in the pork industry is disease outbreaks in pigs throughout the production process: widespread outbreaks can lead to losses as high as 10% of the U.S. pig population in extreme years. In this study, we present a machine learning model to predict, on a daily basis, the emergence of infection in swine production systems throughout the production process, a potential precursor to outbreaks whose detection is vital for disease prevention and mitigation. We determine the features that provide the most value in predicting infection, which include nearby farm density, historical test rates, piglet inventory, feed consumption during the gestation period, and wind speed and direction. We use these features to produce a generalizable machine learning model and evaluate its ability to predict outbreaks both seven and 30 days in advance, allowing for early warning of disease infection. We evaluate the model on two swine production systems with different volumes of data and analyze the effects of data availability and data granularity. Our results demonstrate good ability to predict infection in both systems, with a balanced accuracy (average prediction accuracy on positive and negative samples) of 85.3% on any disease in the first system and balanced accuracies of 58.5%, 58.7%, 72.8%, and 74.8% on porcine reproductive and respiratory syndrome, porcine epidemic diarrhea virus, influenza A virus, and Mycoplasma hyopneumoniae in the second system, respectively, using the six most important predictors in all cases. These models provide daily infection probabilities that veterinarians and other stakeholders can use as a benchmark to support more timely preventive and control strategies on farms.

  3. Abstract

    Disease surveillance systems worldwide face increasing pressure to maintain and distribute data in usable formats supplemented with effective visualizations to enable actionable policy and programming responses. Annual reports and interactive portals provide access to surveillance data and visualizations depicting temporal trends and seasonal patterns of diseases. Analyses and visuals are typically limited to reporting the annual time series and the month with the highest number of cases per year. Yet detecting potential disease outbreaks and supporting public health interventions requires detailed comparisons that characterize spatiotemporal patterns of illness across diseases and locations. The Centers for Disease Control and Prevention’s (CDC) FoodNet Fast provides population-based foodborne-disease surveillance records and visualizations for select counties across the US. We offer suggestions on how current FoodNet Fast data organization and visual analytics can be improved to facilitate data interpretation, decision-making, and communication of features related to trend and seasonality. The resulting compilation, or analecta, of 436 visualizations of records and codes is openly available online.

  4. Abstract Objectives

    Respiratory syncytial virus (RSV) is a significant cause of pediatric hospitalizations. This article aims to use multisource data and tensor methods to uncover distinct RSV geographic clusters and develop an accurate RSV prediction model for future seasons.

    Materials and Methods

    This study uses 5-year RSV data from multiple sources, including medical claims, CDC surveillance data, and Google search trends. We conduct spatiotemporal tensor analysis and prediction for pediatric RSV in the United States by designing (i) a nonnegative tensor factorization model for pediatric RSV disease and location clustering and (ii) a recurrent neural network tensor regression model for county-level trend prediction using the disease and location features.


    We identify a clustering hierarchy of pediatric diseases. Three common geographic clusters of RSV outbreaks were identified from independent sources, showing an annual RSV trend shifting across US regions, from the South and Southeast to the Central and Northeast regions and then to the West and Northwest, while precipitation and temperature were found to be correlative factors, each with a coefficient of determination of R² ≈ 0.5. Our regression model accurately predicted the 2022-2023 RSV season at the county level, achieving R² ≈ 0.3, a mean absolute error (MAE) < 0.4, and a Pearson correlation greater than 0.75, significantly outperforming the baselines (P < .05).


    Our proposed framework provides a thorough analysis of RSV disease in the United States, which enables healthcare providers to better prepare for potential outbreaks, anticipate increased demand for services and supplies, and save more lives with timely interventions.

  5. Aboelhadid, Shawky M (Ed.)
    The COVID-19 pandemic has caused over 500 million cases and over six million deaths globally. Of these, over 12 million cases and over 250 thousand deaths had occurred on the African continent as of May 2022. Prevention and surveillance remain the cornerstone of interventions to halt the further spread of COVID-19. Google Health Trends (GHT), a free Internet tool, may be valuable to help anticipate outbreaks, identify disease hotspots, or understand the patterns of disease surveillance. We collected COVID-19 case and death incidence for 54 African countries and calculated averages for four five-month study periods in 2020–2021 to measure disease severity. We used GHT to characterize COVID-19 incidence across Africa, collecting numbers of searches from GHT related to COVID-19 using four terms: ‘coronavirus’, ‘coronavirus symptoms’, ‘COVID19’, and ‘pandemic’. The terms were related to weekly COVID-19 case incidences for the entire study period via multiple linear and weighted linear regression analyses. We also assembled 72 variables for each country, covering Internet accessibility, demographics, economics, health, and others, to summarize potential mechanisms linking GHT searches and COVID-19 incidence. COVID-19 burden in Africa increased steadily during the study period. Important increases in COVID-19 death incidence were observed for Seychelles and Tunisia. Our study demonstrated a weak correlation between GHT and COVID-19 incidence for most African countries. Several variables seemed useful in explaining the pattern of GHT statistics and their relationship to COVID-19, including the log of average weekly cases, the log of cumulative total deaths, and the log of the fixed total number of broadband subscriptions in a country. Apparently, GHT may best be used for surveillance of diseases that are diagnosed more consistently. Overall, GHT-based surveillance showed little applicability in the studied countries. GHT for an ongoing epidemic might be useful in specific situations, such as when countries have significant levels of infection with low variability. Future studies might assess the algorithm in different epidemic contexts.