Title: Preserving Addressability Upon GC-Triggered Data Movements on Non-Volatile Memory
This article points out an important threat that application-level Garbage Collection (GC) poses to the use of non-volatile memory (NVM). Data movements incurred by GC may invalidate pointers to objects on NVM and, hence, harm the reusability of persistent data across executions. The article proposes the concept of movement-oblivious addressing (MOA), and develops and compares three novel solutions that materialize the concept to solve the addressability problem. It evaluates the designs on five benchmarks and a real-world application. The results demonstrate the promise of the proposed solutions, especially the hardware-supported Multi-Level GPointer, in addressing the problem in a space- and time-efficient manner.
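The core hazard is easy to picture: a persistent reference stores the raw NVM address of an object, and a moving collector silently relocates that object. As a purely illustrative software analogue of movement-oblivious addressing (not the article's hardware-supported Multi-Level GPointer design), the sketch below routes every dereference through a relocation table keyed by stable handles, so a GC move updates one table entry instead of invalidating every stored pointer; all names here are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: persistent data stores stable handles rather
// than raw NVM addresses. When the collector moves an object, it updates
// the relocation table, so handles stay valid across moves and executions.
final class HandleSpace {
    private final Map<Long, Long> handleToAddress = new HashMap<>();
    private long nextHandle = 1;

    // Allocate a stable handle for an object currently at 'address'.
    long register(long address) {
        long h = nextHandle++;
        handleToAddress.put(h, address);
        return h;
    }

    // Called by the collector after it moves an object on NVM.
    void onGcMove(long handle, long newAddress) {
        handleToAddress.put(handle, newAddress);
    }

    // Dereference: resolve a handle to the object's current location.
    long resolve(long handle) {
        Long addr = handleToAddress.get(handle);
        if (addr == null) throw new IllegalStateException("dangling handle");
        return addr;
    }
}
```

The extra lookup on every dereference is exactly the kind of time and space overhead that the article's hardware-supported design aims to avoid.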
Award ID(s):
1717425 2107068
NSF-PAR ID:
10358551
Journal Name:
ACM Transactions on Architecture and Code Optimization
Volume:
19
Issue:
2
ISSN:
1544-3566
Page Range / eLocation ID:
1 to 26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rationale

    Silicone wristbands have emerged as valuable passive samplers for monitoring personal exposure to environmental contaminants in the rapidly developing field of exposomics. Once deployed, silicone wristbands collect and hold a wealth of chemical information that can be interrogated using high‐resolution mass spectrometry (HRMS) to provide broad coverage of chemical mixtures.

    Methods

    Gas chromatography coupled to Orbitrap™ mass spectrometry (GC/Orbitrap™ MS) was used to simultaneously perform suspect screening (using an in‐house database) and unknown screening (using vendor databases) of extracts from wristbands worn by volunteers. The goal of this study was to optimize a workflow that allows detection of low levels of priority pollutants with high reliability. To this end, a data processing workflow for GC/Orbitrap™ MS was developed using a mixture of 123 environmentally relevant standards consisting of pesticides, flame retardants, organophosphate esters, and polycyclic aromatic hydrocarbons as test compounds.

    Results

    The optimized unknown screening workflow, using a search index threshold of 750, resulted in positive identification of 70 analytes in validation samples and a reduction in the number of false positives by over 50%. An average of 26 compounds were identified with high confidence (7 at confidence level 1 and 19 at level 2) in worn wristbands. The data were further analyzed via suspect screening and retrospective suspect screening to identify an additional 36 compounds.
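To make the thresholding step concrete, here is a minimal Java sketch of the filtering logic described above; the record and field names are invented for illustration and are not the vendor software's API:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical representation of a spectral library-search hit.
record LibraryHit(String compound, int searchIndex) {}

final class UnknownScreening {
    // Keep only hits whose search index meets the optimized cutoff (750),
    // the step the study credits with cutting false positives by over 50%.
    static List<LibraryHit> filterByThreshold(List<LibraryHit> hits, int threshold) {
        return hits.stream()
                   .filter(h -> h.searchIndex() >= threshold)
                   .collect(Collectors.toList());
    }
}
```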

    Conclusions

    This study provides three important findings: (1) clear evidence of the importance of sample cleanup in addressing complex sample matrices for unknown analysis, (2) a valuable workflow for the identification of unknown contaminants in silicone wristband samplers using electron ionization HRMS data, and (3) a novel application of GC/Orbitrap™ MS for the unknown analysis of organic contaminants that can be used in exposomics studies.

     
  2. To process real-world datasets, modern data-parallel systems often require extremely large amounts of memory, which are both costly and energy-inefficient. Emerging non-volatile memory (NVM) technologies offer high capacity compared to DRAM and low energy compared to SSDs. Hence, NVMs have the potential to fundamentally change the dichotomy between DRAM and durable storage in Big Data processing. However, most Big Data applications are written in managed languages and executed on top of a managed runtime that already performs various dimensions of memory management. Supporting hybrid physical memories adds a new dimension, creating unique challenges in data replacement. This article proposes Panthera, a semantics-aware, fully automated memory management technique for Big Data processing over hybrid memories. Panthera analyzes user programs on a Big Data system to infer their coarse-grained access patterns, which are then passed to the Panthera runtime for efficient data placement and migration. For Big Data applications, the coarse-grained data division information is accurate enough to guide the GC for data layout, incurring little overhead for data monitoring and movement. We implemented Panthera in OpenJDK and Apache Spark. Based on Big Data applications' memory access patterns, we also implemented a new profiling-guided optimization strategy, which is transparent to applications. With this optimization, our extensive evaluation demonstrates that Panthera reduces energy by 32–53% at less than 1% time overhead on average. To show Panthera's applicability, we extended it to QuickCached, a pure Java implementation of Memcached. Our evaluation results show that Panthera reduces energy by 28.7% at 5.2% time overhead on average.
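Panthera's real logic lives inside OpenJDK's garbage collector and Spark's runtime; the following loose Java sketch only illustrates the flavor of a coarse-grained, semantics-driven placement decision, with all names and rules hypothetical:

```java
// Illustrative sketch in the spirit of Panthera's coarse-grained placement;
// the actual system works inside the GC, not at this level.
enum Tier { DRAM, NVM }

final class CoarsePlacement {
    // Inputs stand in for access patterns inferred by analyzing the user
    // program (e.g., a frequently scanned cached dataset vs. a
    // write-once archival dataset).
    static Tier placeFor(boolean hotLoopAccess, boolean writeIntensive) {
        // Hot or write-intensive data stays in DRAM; cold, read-mostly
        // data goes to high-capacity, low-energy NVM.
        if (hotLoopAccess || writeIntensive) return Tier.DRAM;
        return Tier.NVM;
    }
}
```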
  3. Much of environmental law and policy rests on an unspoken premise that accomplishing environmental goals may not require addressing root causes of environmental problems. For example, rather than regulating risks directly, society may adopt warnings that merely avoid risk, and rather than limiting plastic use and reducing plastic waste, society may adopt recycling programs. Such approaches may be well-intended and may come at a relatively low economic or political cost. However, they often prove ineffective or even harmful, and they may mislead society into believing that further responses are unnecessary. This Article proposes the concept of “too-easy solutions” to describe these approaches. Too-easy solutions can be classified into three subcategories: fig leaves—policy approaches that appear to do something about a problem without necessarily solving it; pipe dreams—policy approaches that are adopted with the good faith expectation of solving the problem but are inherently flawed; and myopic solutions—approaches that address part of the problem but may impede its overall resolution. Too-easy solutions analysis can serve as a powerful mechanism for evaluating policies and improving decisionmaking in the environmental arena and other areas as well.
  4. It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation (curators interpret/convert a phenotype differently from each other) in manual curation, and keyword-to-ontology-concept translation in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vagueness in original phenotype descriptions and difficulties in using standardized vocabularies (ontologies). We argue that the authors describing characters are the key to the solution. Given the right tools and appropriate attribution, the authors should be in charge of developing a project's semantics and ontology. This will speed up ontology development and improve the semantic clarity of the descriptions from the moment of publication. In this presentation, we will introduce the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists, which consists of three components: a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produces RDF graph data, collects terms added by authors, detects potential conflicts between terms, dispatches conflicts to the third component, and updates the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. Fig. 1 shows the system diagram of the platform.
The presentation will consist of: a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken, and how a custom color palette of RGB values, obtained from real specimens or high-quality specimen images, can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver can work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges). The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates. The software modules currently incorporated in Character Recorder and Conflict Resolver have undergone formal usability studies. We are actively recruiting Carex experts to participate in a 3-day usability study of the entire system of the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists. Participants will use the platform to record 100 characters about one Carex species. In addition to usability data, we will collect the terms that participants submit to the underlying ontology and the data related to conflict resolution. Such data allow us to examine the types and the quantities of logical conflicts that may result from the terms added by the users and to use Discrete Event Simulation models to understand if and how term additions and conflict resolutions converge. We look forward to a discussion on how the tools (Character Recorder is online at http://shark.sbs.arizona.edu/chrecorder/public) described in our presentation can contribute to producing and publishing FAIR data in taxonomic studies.
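As a rough illustration of the kind of output an RDF-producing service like the platform's might emit (the URIs and property names below are invented placeholders, not the platform's actual vocabulary), here is a minimal sketch using Apache Jena, assuming Jena is on the classpath:

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

// Minimal sketch, not the platform's actual schema: record one quantitative
// character (length of perigynium beak) for one specimen as an RDF triple
// and serialize the graph as Turtle. All URIs are hypothetical placeholders.
public final class CharacterTriple {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        Resource specimen =
            model.createResource("http://example.org/specimen/carex-001");
        Property beakLengthMm =
            model.createProperty("http://example.org/character/", "perigyniumBeakLengthMm");
        // The measurement value; where and how it was taken would be
        // captured by further properties in the real vocabulary.
        specimen.addProperty(beakLengthMm, model.createTypedLiteral(0.8));
        model.write(System.out, "TURTLE");
    }
}
```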
  5. Climate change is generating sufficient risk for nation‐states and citizens throughout the Arctic to warrant potentially radical geoengineering solutions. Currently, geoengineering solutions such as surface albedo modification or aerosol deployment are in the early stages of testing and development. Due to the scale of deployments necessary to enact change, and their preliminary nature, these methods are likely to result in unforeseen consequences. These consequences may range in severity from local ecosystem impacts to large-scale changes in available solar energy. The Arctic is an area that is experiencing rapid change, increased development, and exploratory interest, and proposed solutions have the potential to produce new risks to both natural and human systems. This article examines potential security and ethical considerations of geoengineering solutions in the Arctic from the perspectives of securitization, consequentialism, and risk governance approaches, and argues that proactive and preemptive frameworks at the international level, and especially the application of risk governance approaches, will be needed to prevent or limit negative consequences resulting from geoengineering efforts. Utilizing the unique structures already present in Arctic governance provides novel options for addressing these concerns from both the perspective of inclusive governance and through advancing the understanding of uncertainty analysis and precautionary principles.