skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Verifying Visibility-Based Weak Consistency
Multithreaded programs generally leverage efficient and thread-safe concurrent objects like sets, key-value maps, and queues. While some concurrent-object operations are designed to behave atomically, each witnessing the atomic effects of predecessors in a linearization order, others forego such strong consistency to avoid complex control and synchronization bottlenecks. For example, contains (value) methods of key-value maps may iterate through key-value entries without blocking concurrent updates, to avoid unwanted performance bottlenecks, and consequently overlook the effects of some linearization-order predecessors. While such weakly-consistent operations may not be atomic, they still offer guarantees, e.g., only observing values that have been present. In this work we develop a methodology for proving that concurrent object implementations adhere to weak-consistency specifications. In particular, we consider (forward) simulation-based proofs of implementations against relaxed-visibility specifications, which allow designated operations to overlook some of their linearization-order predecessors, i.e., behaving as if they never occurred. Besides annotating implementation code to identify linearization points, i.e., points at which operations’ logical effects occur, we also annotate code to identify visible operations, i.e., operations whose effects are observed; in practice this annotation can be done automatically by tracking the writers to each accessed memory location. We formalize our methodology over a general notion of transition systems, agnostic to any particular programming language or memory model, and demonstrate its application, using automated theorem provers, by verifying models of Java concurrent object implementations.  more » « less
Award ID(s):
1816936
PAR ID:
10195725
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Lecture notes in computer science
Volume:
12075
ISSN:
0302-9743
Page Range / eLocation ID:
280-307
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The Transactional Data Structure Library (TDSL) methodology improves the programmability and performance of concurrent software by making it possible for programmers to compose multiple concurrent data structure operations into coarse-grained transactions. Like transactional memory, TDSL enables arbitrarily many operations on arbitrarily many data structures to appear to other threads as a single atomic, isolated transaction. Like concurrent data structures, the individual operations on a TDSL data structure are optimized to avoid artificial contention. We introduce techniques for reducing false conflicts in TDSL implementations. Our approach allows expressing the postconditions of operations entirely via semantic properties, instead of through low-level structural properties. Our design is general enough to support lists, deques, ordered and unordered maps, and vectors. It supports richer programming interfaces than are available in existing TDSL implementations. It is also capable of precise memory management, which is necessary in low-level languages like C++. 
    more » « less
  2. Large-scale software verification relies critically on the use of compositional languages, semantic models, specifications, and verification techniques. Recent work on certified abstraction layers synthesizes game semantics, the refinement calculus, and algebraic effects to enable the composition of heterogeneous components into larger certified systems. However, in existing models of certified abstraction layers, compositionality is restricted by the lack of encapsulation of state. In this paper, we present a novel game model for certified abstraction layers where the semantics of layer interfaces and implementations are defined solely based on their observable behaviors. Our key idea is to leverage Reddy's pioneer work on modeling the semantics of imperative languages not as functions on global states but as objects with their observable behaviors. We show that a layer interface can be modeled as an object type (i.e., a layer signature) plus an object strategy. A layer implementation is then essentially a regular map, in the sense of Reddy, from an object with the underlay signature to that with the overlay signature. A layer implementation is certified when its composition with the underlay object strategy implements the overlay object strategy. We also describe an extension that allows for non-determinism in layer interfaces. After formulating layer implementations as regular maps between object spaces, we move to concurrency and design a notion of concurrent object space, where sequential traces may be identified modulo permutation of independent operations. We show how to express protected shared object concurrency, and a ticket lock implementation, in a simple model based on regular maps between concurrent object spaces. 
    more » « less
  3. Recent work has shown how to augment any CAS-based concurrent data structure to support taking a snapshot of the current memory state. Taking the snapshot, as well as loads and CAS (Compare and Swap) operations, take constant time. Importantly, such snapshotting can be used to easily implement linearizable queries, such as range queries, over any part of a data structure. In this paper, we make two significant improvements over this approach. The first improvement removes a subtle and hard to reason about restriction that was needed to avoid a level of indirection on pointers. We introduce an approach, which we refer to as indirection-on-need, that removes the restriction, but yet almost always avoids indirection. The second improvement is to efficiently support snapshotting with lock-free locks. This requires supporting an idempotent CAS. We show a particularly simple solution to the problem that leverages the data structures used for snapshotting. Based on these ideas we implemented an easy-to-use C++ library, verlib, centered around a versioned pointer type. The library works with lock (standard or lock-free) and CAS based algorithms, or any combination. Converting existing concurrent data-structures to use the library takes minimal effort. We present results for experiments that use verlib to convert state-of-the-art data structures for ordered maps (a B-tree), radix-ordered maps (an ART-tree), and unordered maps (an optimized hash table) to be snapshottable. The snapshottable versions perform almost as well as the original versions and far outperform any previous implementations that support atomic range queries. 
    more » « less
  4. Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm “randUTV” computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures. 
    more » « less
  5. Every numerical general relativistic investigation starts from the solution of the initial value equations at a given time. Astrophysically relevant initial values for different systems lead to distinct sets of equations that obey specific assumptions tied to the particular problem. Therefore, a robust and efficient solver for a variety of strongly gravitating sources is needed. In this work, we present the OpenMP version of the Compact Object CALculator (COCAL) on shared memory processors. We performed extensive profiling of the core COCAL modules in order to identify bottlenecks in efficiency, which we addressed. Using modest resources, the new parallel code achieves speedups of approximately one order of magnitude relative to the original serial COCAL code, which is crucial for parameter studies of computationally expensive systems such as magnetized neutron stars, as well as its further development towards more realistic scenarios. As a novel example of our new code, we compute a binary quark system where each companion has a dimensionless spin of 0.43 aligned with the orbital angular momentum. 
    more » « less