Title: Prefiltered component‐based greedy (PreCoG) scan method
The spatial distribution of disease cases can provide important insights into disease spread and its potential risk factors. Identifying disease clusters correctly can help us discover new risk factors and inform interventions to control and prevent the spread of disease as quickly as possible. In this study, we propose a novel scan method, the Prefiltered Component‐based Greedy (PreCoG) scan method, which efficiently and accurately detects irregularly shaped clusters using a prefiltered component‐based algorithm. The PreCoG scan method's flexibility allows it to perform well in detecting both regularly and irregularly‐shaped clusters. Additionally, it is fast to apply while providing high power, sensitivity, and positive predictive value for the detected clusters compared to other scan methods. To confirm the effectiveness of the PreCoG method, we compare its performance to many other scan methods. Additionally, we have implemented this method in the smerc R package to make it publicly available to other researchers. Our proposed PreCoG scan method presents a unique and innovative process for detecting disease clusters and can improve the accuracy of disease surveillance systems.
Award ID(s):
1915277
PAR ID:
10535137
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Statistics in Medicine
Volume:
43
Issue:
21
ISSN:
0277-6715
Format(s):
Medium: X Size: p. 4113-4130
Size(s):
p. 4113-4130
Sponsoring Org:
National Science Foundation
More Like this
  1. The detection of disease clusters in spatial data analysis plays a crucial role in public health. While the circular scan method is widely used for this purpose, it struggles to identify non-circular (irregular) clusters accurately, which reduces detection accuracy. To overcome this limitation, various extensions have been proposed to effectively detect arbitrarily shaped clusters. In this paper, we combine the strengths of two well-known methods, the flexible and elliptic scan methods, which are specifically designed for detecting irregularly shaped clusters. We leverage the unique characteristics of these methods to create candidate zones capable of accurately detecting irregularly shaped clusters, along with a modified likelihood ratio test statistic. By inheriting the advantages of the flexible and elliptic methods, our proposed approach represents a practical addition to the existing repertoire of spatial data analysis techniques. 
  2. Correctly and quickly identifying disease patterns and clusters is a vital aspect of public health and epidemiology so that disease outbreaks can be mitigated as effectively as possible. The circular scan method is one of the most commonly used methods for detecting disease outbreaks and clusters in retrospective and prospective disease surveillance. The circular scan method requires a population upper bound in order to construct the set of candidate zones to be scanned, which is usually set to 50% of the total population. The performance of the circular scan method is affected by the choice of the population upper bound, and choosing an upper bound different from the default value can improve the method's performance. Recently, the Gini coefficient based on the Lorenz curve, which was originally used in economics, was proposed to determine a better population upper bound. We present the elbow method, a new method for choosing the population upper bound, which seeks to address some of the limitations of the Gini‐based method while improving the performance of the circular scan method over the default value. To evaluate the proposed approach, we compare the sensitivity and positive predictive value of the circular scan method on publicly‐available benchmark data under the default value, the Gini coefficient method, and the elbow method. 
  3. Objective: Slaughterhouse data has recently been used to enhance animal disease surveillance in many countries, but it has been largely underused for syndromic surveillance in the United States. We characterize spatiotemporal patterns and system dynamics of whole carcass swine condemnations in the US. We illustrate the value of data mining and machine learning approaches to more cost-effectively identify: emerging trends by condemnation reason, areas and time periods with higher than predicted condemnation rates, and regions or time periods with similar trends. Methods: Swine slaughter and condemnation data from 2005-2016 were obtained for slaughterhouses inspected by the Food Safety and Inspection Service (FSIS). Time series of condemnation rates by condemnation reason, type of pig, state and month were generated. Dynamic time warping (DTW) and hierarchical clustering methods were used to identify states with similar patterns in the rate of condemnation cases by cause and type of pig. Spatiotemporal scan statistics were used to identify states and months with a significantly higher number of condemnation cases than expected. Clusters were compared to historic infectious disease outbreaks in the swine industry. Results: Between 2005 and 2016, 1,109,300 whole swine carcasses were condemned. The top causes for condemnation were abscess/pyemia, septicemia, pneumonia, icterus, and peritonitis, respectively. DTW and cluster analysis revealed clear spatiotemporal patterns in the rate of condemnations, many with a strong seasonal component. Several clusters were detected in timeframes where widespread outbreaks had occurred. Conclusions: Timely evaluation of spatiotemporal patterns in swine condemnations may provide critical information in predicting disease outbreaks. Identification of spatiotemporal hot spots can direct investigation of primary on-farm risk factors contributing to condemnation. Risk mitigation through targeted decision-making and improved management practices can minimize carcass condemnations and animal losses, improving economic efficiency, profitability and sustainability of the US swine industry. 
  4. Introduction: Detecting water contamination in community housing is crucial for protecting public health. Early detection enables timely action to prevent waterborne diseases and ensures equitable access to safe drinking water. Traditional methods recommended by the Environmental Protection Agency (EPA) rely on collecting water samples and conducting lab tests, which can be both time-consuming and costly. Methods: To address these limitations, this study introduces a Graph Attention Network (GAT) to predict lead contamination in drinking water. The GAT model leverages publicly available municipal records and housing information to model interactions between homes and identify contamination patterns. Each house is represented as a node, and relationships between nodes are analyzed to provide a clearer understanding of contamination risks within the community. Results: Using data from Flint, Michigan, the model demonstrated higher performance compared to traditional methods. Specifically, the GAT achieved an accuracy of 0.80, precision of 0.71, and recall of 0.93, outperforming XGBoost, a classical machine learning algorithm, which had an accuracy of 0.70, precision of 0.66, and recall of 0.67. Discussion: In addition to its predictive capabilities, the GAT model identifies key factors contributing to lead contamination, enabling more precise targeting of at-risk areas. This approach offers a practical tool for policymakers and public health officials to assess and mitigate contamination risks, ultimately improving community health and safety. 
  5. Species distribution modeling (SDM) has become an increasingly common approach to explore questions about ecology, geography, outbreak risk, and global change as they relate to infectious disease vectors. Here, we conducted a systematic review of the scientific literature, screening 563 abstracts and identifying 204 studies that used SDMs to produce distribution estimates for mosquito species. While the number of studies employing SDM methods has increased markedly over the past decade, the overwhelming majority used a single method (maximum entropy modeling; MaxEnt) and focused on human infectious disease vectors or their close relatives. The majority of regional models were developed for areas in Africa and Asia, while more localized modeling efforts were most common for North America and Europe. Findings from this study highlight gaps in taxonomic, geographic, and methodological foci of current SDM literature for mosquitoes that can guide future efforts to study the geography of mosquito-borne disease risk. 
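
Item 2 above notes that the circular scan method builds its set of candidate zones subject to a population upper bound, usually 50% of the total population. The sketch below illustrates one simple way that construction can work. It is not the implementation used in any of the listed papers or in the smerc package. The function name `circular_candidate_zones`, the argument `ubpop`, the use of planar centroid coordinates, and the tie-breaking by region index are all assumptions made for illustration.

```python
import math


def circular_candidate_zones(coords, pop, ubpop=0.5):
    """Return candidate zones as sorted tuples of region indices.

    coords: list of (x, y) region centroids (hypothetical planar coordinates).
    pop:    list of region populations, same length as coords.
    ubpop:  population upper bound as a fraction of the total population.
    """
    limit = ubpop * sum(pop)
    zones = set()
    for xi, yi in coords:
        # Order all regions by distance from this centroid; ties are broken
        # by region index (an assumption, real software may treat ties
        # differently).
        order = sorted(
            range(len(coords)),
            key=lambda j: (math.hypot(coords[j][0] - xi, coords[j][1] - yi), j),
        )
        zone, zone_pop = [], 0
        for j in order:
            # Stop growing the circle once the next region would push the
            # zone's population above the upper bound.
            if zone_pop + pop[j] > limit:
                break
            zone.append(j)
            zone_pop += pop[j]
            zones.add(tuple(sorted(zone)))
    return sorted(zones)


# Four equally populated regions on a line.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
pop = [10, 10, 10, 10]
zones_default = circular_candidate_zones(coords, pop)             # 50% bound
zones_small = circular_candidate_zones(coords, pop, ubpop=0.25)   # 25% bound
```

In this toy example, lowering the bound from 50% to 25% shrinks the candidate set from seven zones to the four single-region zones, which shows how the choice of upper bound controls what the scan can detect.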