Title: Estimating the optimal population upper bound for scan methods in retrospective disease surveillance
Abstract: Correctly and quickly identifying disease patterns and clusters is a vital aspect of public health and epidemiology so that disease outbreaks can be mitigated as effectively as possible. The circular scan method is one of the most commonly used methods for detecting disease outbreaks and clusters in retrospective and prospective disease surveillance. The circular scan method requires a population upper bound in order to construct the set of candidate zones to be scanned, and this bound is usually set to 50% of the total population. The performance of the circular scan method is affected by the choice of the population upper bound, and choosing an upper bound different from the default value can improve the method's performance. Recently, the Gini coefficient based on the Lorenz curve, which was originally used in economics, was proposed to determine a better population upper bound. We present the elbow method, a new method for choosing the population upper bound, which seeks to address some of the limitations of the Gini-based method while improving the performance of the circular scan method over the default value. To assess the proposed approach, we compare the sensitivity and positive predictive value of the circular scan method on publicly available benchmark data under the default value, the Gini coefficient method, and the elbow method.
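The abstract names sensitivity and positive predictive value (PPV) as the evaluation metrics but does not say how they are computed. The sketch below shows one common way to compute them for a single detected cluster against a known true cluster. It is an illustration under stated assumptions, not the paper's evaluation code: the function name `sensitivity_ppv`, the region identifiers, and the optional population weighting are all hypothetical choices made for this example.

```python
from typing import Hashable, Iterable, Mapping, Optional, Tuple


def sensitivity_ppv(
    detected: Iterable[Hashable],
    true: Iterable[Hashable],
    population: Optional[Mapping[Hashable, float]] = None,
) -> Tuple[float, float]:
    """Return (sensitivity, PPV) for a detected cluster vs. a true cluster.

    Assumption: both clusters are given as collections of region IDs.
    If `population` is supplied, each region is weighted by its population;
    otherwise every region counts equally.
    """
    detected_set, true_set = set(detected), set(true)

    def total(regions) -> float:
        # Sum of weights over a set of regions (1.0 per region if unweighted).
        if population is None:
            return float(len(regions))
        return float(sum(population[r] for r in regions))

    overlap = total(detected_set & true_set)
    true_total = total(true_set)
    detected_total = total(detected_set)

    # Sensitivity: share of the true cluster that was detected.
    sens = overlap / true_total if true_total > 0 else float("nan")
    # PPV: share of the detected cluster that is truly in the cluster.
    ppv = overlap / detected_total if detected_total > 0 else float("nan")
    return sens, ppv


if __name__ == "__main__":
    # Hypothetical region IDs and populations, for illustration only.
    pop = {1: 100, 2: 100, 3: 300, 4: 100, 5: 100}
    print(sensitivity_ppv({1, 2, 3, 4}, {3, 4, 5}))
    print(sensitivity_ppv({1, 2, 3, 4}, {3, 4, 5}, population=pop))
```

Under these definitions, a detected cluster that is too large tends to have high sensitivity and low PPV, which is one way an overly generous population upper bound can show up in an evaluation.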
Award ID(s):
1915277
PAR ID:
10446475
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Biometrical Journal
Volume:
63
Issue:
8
ISSN:
0323-3847
Page Range / eLocation ID:
p. 1633-1651
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The spatial distribution of disease cases can provide important insights into disease spread and its potential risk factors. Identifying disease clusters correctly can help us discover new risk factors and inform interventions to control and prevent the spread of disease as quickly as possible. In this study, we propose a novel scan method, the Prefiltered Component‐based Greedy (PreCoG) scan method, which efficiently and accurately detects irregularly shaped clusters using a prefiltered component‐based algorithm. The PreCoG scan method's flexibility allows it to perform well in detecting both regularly and irregularly shaped clusters. Additionally, it is fast to apply while providing high power, sensitivity, and positive predictive value for the detected clusters compared to other scan methods. To confirm the effectiveness of the PreCoG method, we compare its performance to many other scan methods. Additionally, we have implemented this method in the smerc R package to make it publicly available to other researchers. Our proposed PreCoG scan method presents a unique and innovative process for detecting disease clusters and can improve the accuracy of disease surveillance systems.
  2. The detection of disease clusters in spatial data analysis plays a crucial role in public health. While the circular scan method is widely utilized for this purpose, accurately identifying non-circular (irregular) clusters remains challenging and reduces detection accuracy. To overcome this limitation, various extensions have been proposed to effectively detect arbitrarily shaped clusters. In this paper, we combine the strengths of two well-known methods, the flexible and elliptic scan methods, which are specifically designed for detecting irregularly shaped clusters. We leverage the unique characteristics of these methods to create candidate zones capable of accurately detecting irregularly shaped clusters, along with a modified likelihood ratio test statistic. By inheriting the advantages of the flexible and elliptic methods, our proposed approach represents a practical addition to the existing repertoire of spatial data analysis techniques.
  3. We explore unsupervised machine learning for galaxy morphology analyses using a combination of feature extraction with a vector-quantized variational autoencoder (VQ-VAE) and hierarchical clustering (HC). We propose a new methodology that includes: (1) consideration of the clustering performance simultaneously when learning features from images; (2) allowing for various distance thresholds within the HC algorithm; (3) using the galaxy orientation to determine the number of clusters. This set-up provides 27 clusters created with this unsupervised learning that we show are well separated based on galaxy shape and structure (e.g. Sérsic index, concentration, asymmetry, Gini coefficient). These resulting clusters also correlate well with physical properties such as the colour–magnitude diagram, and span the range of scaling relations such as mass versus size amongst the different machine-defined clusters. When we merge these multiple clusters into two large preliminary clusters to provide a binary classification, an accuracy of $\sim 87$ per cent is reached using an imbalanced data set, matching real galaxy distributions, which includes 22.7 per cent early-type galaxies and 77.3 per cent late-type galaxies. Comparing the given clusters with classic Hubble types (ellipticals, lenticulars, early spirals, late spirals, and irregulars), we show that there is an intrinsic vagueness in visual classification systems, in particular galaxies with transitional features such as lenticulars and early spirals. Based on this, the main result in this work is not how well our unsupervised method matches visual classifications and physical properties, but that the method provides an independent classification that may be more physically meaningful than any visually based ones.
  4. Patients with neuromuscular disease fail to produce the necessary muscle force and have trouble maintaining the joint moments required to perform activities of daily living. Measuring muscle force values in patients with neuromuscular disease is important but challenging. Electromyography (EMG) can be used to obtain muscle activation values, which can be converted to muscle forces and joint torques. Surface electrodes can measure activations of superficial muscles, but fine-wire electrodes are needed for deep muscles, although they are invasive and require skilled personnel and preparation time. EMG-driven modeling with surface electrodes alone could underestimate the net torque. In this research, the authors propose a methodology to predict muscle activations from deeper muscles of the upper extremity. This method finds missing muscle activations one at a time by combining an EMG-driven musculoskeletal model and muscle synergies. This method tracks inverse dynamics joint moments to determine synergy vector weights and predict muscle activation of selected shoulder and elbow muscles of a healthy subject. In addition, muscle-tendon parameter values (optimal fiber length, tendon slack length, and maximum isometric force) have been personalized to the experimental subject. The methodology is tested for a wide range of rehabilitation tasks of the upper extremity across multiple healthy subjects. Results show this methodology can determine a single unmeasured muscle activation with a Pearson's correlation coefficient (R) of up to 0.99 (root mean squared error, RMSE = 0.001) and 0.92 (RMSE = 0.13) for the elbow and shoulder muscles, respectively, for one degree-of-freedom (DoF) tasks. For more complicated five-DoF tasks, activation prediction accuracy can reach up to R = 0.71 (RMSE = 0.29).
  5. Objective: Slaughterhouse data has recently been used to enhance animal disease surveillance in many countries, but has been largely underused for syndromic surveillance in the United States. We characterize spatiotemporal patterns and system dynamics of whole carcass swine condemnations in the US. We illustrate the value of data mining and machine learning approaches to more cost-effectively identify: emerging trends by condemnation reason, areas and time periods with higher than predicted condemnation rates, and regions or time periods with similar trends. Methods: Swine slaughter and condemnation data from 2005 to 2016 were obtained for slaughterhouses inspected by the Food Safety and Inspection Service (FSIS). Time series of condemnation rates by condemnation reason, type of pig, state and month were generated. Dynamic time warping (DTW) and hierarchical clustering methods were used to identify states with similar patterns in the rate of condemnation cases by cause and type of pig. Spatiotemporal scan statistics were used to identify states and months with a significantly higher number of condemnation cases than expected. Clusters were compared to historic infectious disease outbreaks in the swine industry. Results: Between 2005 and 2016, 1,109,300 whole swine carcasses were condemned. The top causes for condemnation were abscess/pyemia, septicemia, pneumonia, icterus, and peritonitis, respectively. DTW and cluster analysis revealed clear spatiotemporal patterns in the rate of condemnations, many with a strong seasonal component. Several clusters were detected in timeframes where widespread outbreaks had occurred. Conclusions: Timely evaluation of spatiotemporal patterns in swine condemnations may provide critical information in predicting disease outbreaks. Identification of spatiotemporal hot spots can direct investigation of primary on-farm risk factors contributing to condemnation. Risk mitigation through targeted decision-making and improved management practices can minimize carcass condemnations and animal losses, improving economic efficiency, profitability and sustainability of the US swine industry.