Title: SNAPS: Sensor Analytics Point Solutions for Detection and Decision Support Systems
In this review, we discuss the role of sensor analytics point solutions (SNAPS), a reduced-complexity, machine-assisted decision support tool. We summarize the approaches used for mobile phone-based chemical/biological sensors, including general hardware and software requirements for signal transduction and acquisition. We introduce SNAPS, part of a platform approach to converge sensor data and analytics. The platform is designed to consist of a portfolio of modular tools which may lend itself to dynamic composability by enabling context-specific selection of relevant units, resulting in case-based working modules. SNAPS is an element of this platform where data analytics, statistical characterization and algorithms may be delivered to the data either via embedded systems in devices, or sourced, in near real-time, from mist, fog or cloud computing resources. Convergence of the physical systems with the cyber components paves the path for SNAPS to progress to higher levels of artificial reasoning tools (ART) and emerge as data-informed decision support, as a service for general societal needs. Proof-of-concept examples of SNAPS are demonstrated both for quantitative data and qualitative data, each operated using a mobile device (smartphone or tablet) for data acquisition and analytics. We discuss the challenges and opportunities for SNAPS, centered on the value to users/stakeholders and the key performance indicators they may find helpful for these types of machine-assisted tools.
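As a rough illustration of the dynamic composability described in the abstract, the sketch below chains modular analytic units into a case-specific pipeline. The module names, parameter values and example reading are hypothetical and are not drawn from the published SNAPS platform.

```python
# Hypothetical sketch of SNAPS-style dynamic composability: small analytic
# modules are selected per use case and chained into a working pipeline.
# Module names, parameters and the example data are illustrative only.
from typing import Callable, Dict, List

Module = Callable[[float], float]

def baseline_correct(signal: float, baseline: float = 0.05) -> float:
    """Subtract an assumed blank/baseline reading."""
    return max(signal - baseline, 0.0)

def linear_calibration(signal: float, slope: float = 2.0, intercept: float = 0.1) -> float:
    """Convert a corrected signal to a concentration via an assumed calibration curve."""
    return slope * signal + intercept

def threshold_decision(value: float, cutoff: float = 0.5) -> float:
    """Qualitative decision support: 1.0 if above an assumed cutoff, else 0.0."""
    return 1.0 if value >= cutoff else 0.0

# Portfolio of modular tools; the use-case context selects which units to chain.
PORTFOLIO: Dict[str, List[Module]] = {
    "quantitative": [baseline_correct, linear_calibration],
    "qualitative": [baseline_correct, threshold_decision],
}

def run_snaps(raw_signal: float, context: str) -> float:
    """Run the case-specific pipeline on a raw phone-acquired sensor signal."""
    value = raw_signal
    for module in PORTFOLIO[context]:
        value = module(value)
    return value

if __name__ == "__main__":
    reading = 0.42  # e.g., normalized intensity from a smartphone camera image
    print("Estimated concentration:", run_snaps(reading, "quantitative"))
    print("Detection decision:", run_snaps(reading, "qualitative"))
```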
Award ID(s): 1805512, 1511953
NSF-PAR ID: 10177453
Journal Name: Sensors
Volume: 19
Issue: 22
ISSN: 1424-8220
Page Range / eLocation ID: 4935
Sponsoring Org: National Science Foundation
More Like this
  1. In this manuscript, we discuss relevant socioeconomic factors for developing and implementing sensor analytic point solutions (SNAPS) as point-of-care tools to serve impoverished communities. The distinct economic, environmental, cultural, and ethical paradigms that affect economically disadvantaged users add complexity to the process of technology development and deployment beyond the science and engineering issues. We begin by contextualizing the environmental burden of disease in select low-income regions around the world, including environmental hazards at work, home, and the broader community environment, where SNAPS may be helpful in the prevention and mitigation of human exposure to harmful biological vectors and chemical agents. We offer examples of SNAPS designed for economically disadvantaged users, specifically for supporting decision-making in cases of tuberculosis (TB) infection and mercury exposure. We follow up by discussing the economic challenges that are involved in the phased implementation of diagnostic tools in low-income markets and describe a micropayment-based systems-as-a-service approach (pay-a-penny-per-use, PAPPU), which may be catalytic for the adoption of low-end, low-margin, low-research-and-development SNAPS. Finally, we provide some insights into the social and ethical considerations for the assimilation of SNAPS to improve health outcomes in marginalized communities.
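For illustration only, a minimal sketch of how a pay-a-penny-per-use ledger might meter diagnostic tests is shown below; the flat fee, user identifier and ledger structure are invented assumptions, not the PAPPU system discussed in the manuscript.

```python
# Hypothetical per-use metering ledger for a PAPPU-style service:
# each diagnostic test charges a small fixed fee and is logged for audit.
from dataclasses import dataclass, field
from typing import Dict, List

PENNY_PER_USE = 0.01  # assumed flat micro-fee per test, in USD

@dataclass
class UsageLedger:
    user_id: str
    entries: List[Dict[str, object]] = field(default_factory=list)

    def record_test(self, test_name: str) -> float:
        """Log one diagnostic use and return the fee charged."""
        self.entries.append({"test": test_name, "fee": PENNY_PER_USE})
        return PENNY_PER_USE

    def balance_due(self) -> float:
        """Total owed across all recorded uses."""
        return sum(entry["fee"] for entry in self.entries)

if __name__ == "__main__":
    ledger = UsageLedger(user_id="clinic-001")
    ledger.record_test("TB screen")
    ledger.record_test("mercury exposure")
    print(f"Amount due: ${ledger.balance_due():.2f}")
```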
  2. Abstract

    Quantifying movement and demographic events of free‐ranging animals is fundamental to studying their ecology, evolution and conservation. Technological advances have led to an explosion in sensor‐based methods for remotely observing these phenomena. This transition to big data creates new challenges for data management, analysis and collaboration.

    We present the Movebank ecosystem of tools used by thousands of researchers to collect, manage, share, visualize, analyse and archive their animal tracking and other animal-borne sensor data. Users add sensor data through file uploads or live data streams and further organize and complete quality control within the Movebank system. All data are harmonized to a data model and vocabulary. The public can discover, view and download data to which they have been granted access through the website, the Animal Tracker mobile app or the API. Advanced analysis tools are available through the EnvDATA System, the MoveApps platform and a variety of user-developed applications. Data owners can share studies with select users or the public, with options for embargos, licenses and formal archiving in a data repository.
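As a sketch of the programmatic access mentioned above, the snippet below requests the study list from the publicly documented Movebank REST endpoint; the endpoint, parameters and placeholder credentials should be treated as illustrative assumptions rather than a guaranteed interface.

```python
# Sketch of programmatic access to Movebank; the endpoint and parameter follow
# the public Movebank REST API documentation, but are shown here as an
# illustration only (credentials are placeholders).
import requests

BASE_URL = "https://www.movebank.org/movebank/service/direct-read"

def list_studies(username: str, password: str) -> str:
    """Return a CSV listing of studies the account is permitted to see."""
    resp = requests.get(
        BASE_URL,
        params={"entity_type": "study"},
        auth=(username, password),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.text

if __name__ == "__main__":
    csv_text = list_studies("your_username", "your_password")
    print(csv_text.splitlines()[0])  # header row of the study list
```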

    Movebank is used by over 3,100 data owners globally, who manage over 6 billion animal location and sensor measurements across more than 6,500 studies, with thousands of active tags sending over 3 million new data records daily. These data underlie >700 published papers and reports. We present a case study demonstrating the use of Movebank to assess life‐history events and demography, and engage with citizen scientists to identify mortalities and causes of death for a migratory bird.

    A growing number of researchers, government agencies and conservation organizations use Movebank to manage research and conservation projects and to meet legislative requirements. The combination of historic and new data with collaboration tools enables broad comparative analyses and data acquisition and mapping efforts. Movebank offers an integrated system for real‐time monitoring of animals at a global scale and represents a digital museum of animal movement and behaviour. Resources and coordination across countries and organizations are needed to ensure that these data, including those that cannot be made public, remain accessible to future generations.

     
  3. It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation (curators interpret/convert a phenotype differently from each other) in manual curation, and keyword-to-ontology-concept translation in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vagueness in original phenotype descriptions and difficulties in using standardized vocabularies (ontologies). We argue that the authors describing characters are the key to the solution. Given the right tools and appropriate attribution, the authors should be in charge of developing a project's semantics and ontology. This will speed up ontology development and improve the semantic clarity of the descriptions from the moment of publication. In this presentation, we will introduce the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists, which consists of three components: a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produces RDF graph data, collects terms added by authors, detects potential conflicts between terms, dispatches conflicts to the third component and updates the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. Fig. 1 shows the system diagram of the platform.
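To make the RDF-producing service concrete, here is a minimal rdflib sketch that encodes a single Entity-Quality style character observation; the namespace, term URIs and property names are hypothetical placeholders, not the vocabulary actually used by Character Recorder.

```python
# Minimal sketch of emitting RDF for one Entity-Quality style character
# statement; the namespace, term URIs, and property names are hypothetical
# placeholders, not the platform's actual vocabulary.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/carex/")

g = Graph()
g.bind("ex", EX)

observation = URIRef(EX["obs-001"])
g.add((observation, RDF.type, EX.CharacterObservation))
g.add((observation, EX.entity, EX.perigynium_beak))   # Entity
g.add((observation, EX.quality, EX.length))           # Quality
g.add((observation, EX.valueInMillimeters, Literal(1.2, datatype=XSD.decimal)))
g.add((observation, EX.describedSpecies, Literal("Carex exampleensis")))

print(g.serialize(format="turtle"))
```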
The presentation will consist of: a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken, and how a custom color palette of RGB values obtained from real specimens or high-quality specimen images can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges). The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates. The software modules currently incorporated in Character Recorder and Conflict Resolver have undergone formal usability studies. We are actively recruiting Carex experts to participate in a 3-day usability study of the entire system of the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists. Participants will use the platform to record 100 characters about one Carex species. In addition to usability data, we will collect the terms that participants submit to the underlying ontology and the data related to conflict resolution. Such data allow us to examine the types and the quantities of logical conflicts that may result from the terms added by the users and to use Discrete Event Simulation models to understand if and how term additions and conflict resolutions converge. We look forward to a discussion on how the tools described in our presentation (Character Recorder is online at http://shark.sbs.arizona.edu/chrecorder/public) can contribute to producing and publishing FAIR data in taxonomic studies.
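The standardized color palette idea can be illustrated with a simple nearest-neighbour lookup from a measured RGB value to a small set of color terms; the palette entries below are invented examples, not the project's curated Carex palette.

```python
# Illustrative nearest-color lookup: map a measured RGB value from a specimen
# image to the closest entry in a standardized palette. Palette values are
# made-up examples, not the curated palette described above.
from math import dist

PALETTE = {
    "pale brown": (181, 152, 112),
    "dark brown": (92, 64, 51),
    "straw yellow": (228, 217, 111),
    "green": (80, 125, 42),
}

def nearest_color(rgb: tuple) -> str:
    """Return the palette term whose RGB value is closest in Euclidean distance."""
    return min(PALETTE, key=lambda name: dist(PALETTE[name], rgb))

if __name__ == "__main__":
    measured = (175, 148, 120)  # e.g., sampled from a specimen photograph
    print(nearest_color(measured))  # -> 'pale brown'
```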
  4.
    A quiet revolution is afoot in the field of law. Technical systems employing algorithms are shaping and displacing professional decision making, and they are disrupting and restructuring relationships between law firms, lawyers, and clients. Decision-support systems marketed to legal professionals to support e-discovery—generally referred to as "technology-assisted review" (TAR)—increasingly rely on "predictive coding": machine-learning techniques to classify and predict which of the voluminous electronic documents subject to litigation should be withheld or produced to the opposing side. These systems and the companies offering them are reshaping relationships between lawyers and clients, introducing new kinds of professionals into legal practice, altering the discovery process, and shaping how lawyers construct knowledge about their cases and professional obligations. In the midst of these shifting relationships—and the ways in which these systems are shaping the construction and presentation of knowledge—lawyers are grappling with their professional obligations, ethical duties, and what it means for the future of legal practice. Through in-depth, semi-structured interviews of experts in the e-discovery technology space—the technology company representatives who develop and sell such systems to law firms and the legal professionals who decide whether and how to use them in practice—we shed light on the organizational structures, professional rules and norms, and technical system properties that are shaping and being reshaped by predictive coding systems. Our findings show that AI-supported decision systems such as these are reconfiguring professional work practices. In particular, they highlight concerns about potential loss of professional agency and skill, limited understanding and thereby both over- and under-reliance on decision-support systems, and confusion about responsibility and accountability as new kinds of technical professionals and technologies are brought into legal practice. The introduction of predictive coding systems and the new professional and organizational arrangements they are ushering into legal practice compound general concerns over the opacity of technical systems with specific concerns about encroachments on the construction of expert knowledge, liability frameworks, and the potential (mis)alignment of machine reasoning with professional logic and ethics. Based on our findings, we conclude that predictive coding tools—and likely other algorithmic systems lawyers use to construct knowledge and reason about legal practice—challenge the current model for evaluating whether and how tools are appropriate for legal practice. As tools become both more complex and more consequential, it is unreasonable to rely solely on legal professionals—judges, law firms, and lawyers—to determine which technologies are appropriate for use. The legal professionals we interviewed report relying on the evaluation and judgment of a range of new technical experts within law firms and, increasingly, third-party vendors and their technical experts. This system for choosing technical systems upon which lawyers rely to make professional decisions—e.g., whether documents are responsive, or whether the standard of proportionality has been met—is no longer sufficient.
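For readers unfamiliar with the computational step behind predictive coding, the sketch below trains a generic text classifier on a small lawyer-labeled seed set and ranks unreviewed documents by predicted responsiveness; it is a toy illustration, not any vendor's actual TAR system.

```python
# Generic sketch of the machine-learning step behind predictive coding (TAR):
# learn from a seed set of lawyer-labeled documents, then score the remaining
# corpus by predicted responsiveness. Documents and labels are toy examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

seed_docs = [
    "Email discussing the disputed contract terms and payment schedule.",
    "Quarterly cafeteria menu and parking reminders.",
    "Memo on breach of the supply agreement and resulting damages.",
    "Invitation to the annual holiday party.",
]
seed_labels = [1, 0, 1, 0]  # 1 = responsive, 0 = not responsive (lawyer-reviewed)

unreviewed_docs = [
    "Draft amendment to the supply agreement payment terms.",
    "IT notice about scheduled server maintenance.",
]

vectorizer = TfidfVectorizer()
X_seed = vectorizer.fit_transform(seed_docs)
model = LogisticRegression().fit(X_seed, seed_labels)

X_new = vectorizer.transform(unreviewed_docs)
scores = model.predict_proba(X_new)[:, 1]  # predicted probability of responsiveness
for doc, score in sorted(zip(unreviewed_docs, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {doc}")
```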
Just as the tools of medicine are reviewed by appropriate experts before they are put out for consideration and adoption by medical professionals, we argue that the legal profession must develop new processes for determining which algorithmic tools are fit to support lawyers' decision making. Relatedly, because predictive coding systems are used to produce lawyers' professional judgment, we argue they must be designed for contestability—providing greater transparency, interaction, and configurability around embedded choices to ensure decisions about how to embed core professional judgments, such as relevance and proportionality, remain salient and demand engagement from lawyers, not just their technical experts.
  5. Abstract. Advances in ambient environmental monitoring technologies are enabling concerned communities and citizens to collect data to better understand their local environment and potential exposures. These mobile, low-cost tools make it possible to collect data with increased temporal and spatial resolution, providing data on a large scale with unprecedented levels of detail. This type of data has the potential to empower people to make personal decisions about their exposure and support the development of local strategies for reducing pollution and improving health outcomes. However, calibration of these low-cost instruments has been a challenge. Often, a sensor package is calibrated via field calibration. This involves colocating the sensor package with a high-quality reference instrument for an extended period and then applying machine learning or another model-fitting technique, such as multiple linear regression, to develop a calibration model for converting raw sensor signals to pollutant concentrations. Although this method helps to correct for the effects of ambient conditions (e.g., temperature) and cross-sensitivities with non-target pollutants, there is a growing body of evidence that calibration models can overfit to a given location or set of environmental conditions on account of the incidental correlation between pollutant levels and environmental conditions, including diurnal cycles. As a result, a sensor package trained at a field site may provide less reliable data when moved, or transferred, to a different location. This is a potential concern for applications seeking to perform monitoring away from regulatory monitoring sites, such as personal mobile monitoring or high-resolution monitoring of a neighborhood. We performed experiments confirming that transferability is indeed a problem and show that it can be improved by collecting data from multiple regulatory sites and building a calibration model that leverages data from a more diverse data set. We deployed three sensor packages to each of three sites with reference monitors (nine packages total) and then rotated the sensor packages through the sites over time. Two sites were in San Diego, CA, with a third outside of Bakersfield, CA, offering varying environmental conditions, general air quality composition, and pollutant concentrations. When compared to prior single-site calibration, the multisite approach exhibits better model transferability for a range of modeling approaches. Our experiments also reveal that random forest is especially prone to overfitting and confirm prior results that transfer is a significant source of both bias and standard error. Linear regression, on the other hand, although it exhibits relatively high error, does not degrade much in transfer. Bias dominated in our experiments, suggesting that transferability might be easily increased by detecting and correcting for bias. Also, given that many monitoring applications involve the deployment of many sensor packages based on the same sensing technology, there is an opportunity to leverage the availability of multiple sensors at multiple sites during calibration to lower the cost of training and better tolerate transfer. We contribute a new neural network architecture, termed split-NN, that splits the model into two stages, in which the first stage corrects for sensor-to-sensor variation and the second stage uses the combined data of all the sensors to build a model for a single sensor package.
The split-NN modeling approach outperforms multiple linear regression, traditional two- and four-layer neural networks, and random forest models. Depending on the training configuration, the split-NN method reduced error by 0%–11% for NO2 and by 6%–13% for O3 compared to random forest.
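A minimal sketch of a split-NN style architecture, assuming a per-sensor linear correction stage followed by a shared regression stage, is shown below; the framework choice (PyTorch), layer sizes and toy inputs are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a split-NN style architecture as described above: a per-sensor
# first stage corrects sensor-to-sensor variation, then a shared second stage
# maps corrected signals to a pollutant concentration. Sizes are illustrative.
import torch
import torch.nn as nn

class SplitNN(nn.Module):
    def __init__(self, n_sensors: int, n_features: int, hidden: int = 16):
        super().__init__()
        # Stage 1: one small linear correction per physical sensor package.
        self.per_sensor = nn.ModuleList(
            [nn.Linear(n_features, n_features) for _ in range(n_sensors)]
        )
        # Stage 2: a single regression model shared across all sensor packages.
        self.shared = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x: torch.Tensor, sensor_id: torch.Tensor) -> torch.Tensor:
        # Apply the matching correction layer to each sample, then the shared model.
        corrected = torch.stack(
            [self.per_sensor[int(s)](row) for row, s in zip(x, sensor_id)]
        )
        return self.shared(corrected).squeeze(-1)

if __name__ == "__main__":
    # Toy batch: 4 samples of [raw gas signal, temperature, humidity]
    x = torch.randn(4, 3)
    sensor_id = torch.tensor([0, 1, 2, 0])
    model = SplitNN(n_sensors=3, n_features=3)
    print(model(x, sensor_id))  # predicted concentrations (untrained)
```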