

Title: Combining data sources to elucidate spatial patterns in recreational catch and effort: fisheries-dependent data and local ecological knowledge applied to the South Florida bonefish fishery
Award ID(s):
1832229 1237517
NSF-PAR ID:
10122377
Journal Name:
Environmental Biology of Fishes
Volume:
102
Issue:
2
ISSN:
0378-1909
Page Range / eLocation ID:
299 to 317
Sponsoring Org:
National Science Foundation
More Like this
  1. Two important considerations in clinical research studies are proper evaluations of internal and external validity. While randomized clinical trials can overcome several threats to internal validity, they may be prone to poor external validity. Conversely, large prospective observational studies sampled from a broadly generalizable population may be externally valid, yet susceptible to threats to internal validity, particularly confounding. Thus, methods that address confounding and enhance transportability of study results across populations are essential for internally and externally valid causal inference, respectively. These issues persist for another problem closely related to transportability known as data‐fusion. We develop a calibration method to generate balancing weights that address confounding and sampling bias, thereby enabling valid estimation of the target population average treatment effect. We compare the calibration approach to two additional doubly robust methods that estimate the effect of an intervention on an outcome within a second, possibly unrelated target population. The proposed methodologies can be extended to resolve data‐fusion problems that seek to evaluate the effects of an intervention using data from two related studies sampled from different populations. A simulation study is conducted to demonstrate the advantages and similarities of the different techniques. We also test the performance of the calibration approach in a motivating real data example comparing whether the effect of biguanides vs sulfonylureas—the two most common oral diabetes medication classes for initial treatment—on all‐cause mortality described in a historical cohort applies to a contemporary cohort of US Veterans with diabetes.

     
  2. This study presents an initial framework describing factors that could affect respondents’ decisions to link their survey data with their public Twitter data. It also investigates two types of factors, those related to the individual and to the design of the consent request. Individual-level factors include respondents’ attitudes towards helpful behaviours, privacy concerns and social media engagement patterns. The design factor focuses on the position of the consent request within the interview. These investigations were conducted using data that was collected from a web survey on a sample of Twitter users selected from an adult online probability panel in the United States. The sample was randomly divided into two groups, those who received the consent to link request at the beginning of the survey, and others who received the request towards the end of the survey. Privacy concerns, measures of social media engagement and consent request placement were all found to be related to consent to link. The findings have important implications for designing future studies that aim at linking social media data with survey data.

     
  3. Pooling and sharing data increases and distributes its value. But since data cannot be revoked once shared, scenarios that require controlled release of data for regulatory, privacy, and legal reasons default to not sharing. Because selectively controlling what data to release is difficult, the few data-sharing consortia that exist are often built around data-sharing agreements resulting from long and tedious one-off negotiations. We introduce Data Station, a data escrow designed to enable the formation of data-sharing consortia. Data owners share data with the escrow knowing it will not be released without their consent. Data users delegate their computation to the escrow. The data escrow relies on delegated computation to execute queries without releasing the data first. Data Station leverages hardware enclaves to generate trust among participants, and exploits the centralization of data and computation to generate an audit log. We evaluate Data Station on machine learning and data-sharing applications while running on an untrusted intermediary. In addition to important qualitative advantages, we show that Data Station: i) outperforms federated learning baselines in accuracy and runtime for the machine learning application; ii) is orders of magnitude faster than alternative secure data-sharing frameworks; and iii) introduces small overhead on the critical path. 
  4. Viegas, Domingos Xavier (Ed.)
    The data likelihood of fire detection is the probability of the observed detection outcome given the state of the fire spread model. We derive the fire detection likelihood of satellite data as a function of the fire arrival time on the model grid. The data likelihood is constructed by combining the burn model, a logistic regression of active fire detections, and a Gaussian distribution of the geolocation error. The use of the data likelihood is then demonstrated by estimating the ignition point of a wildland fire, maximizing the likelihood of MODIS and VIIRS data over multiple possible ignition points.