skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Ordering operations for generic replicated data types using version trees
Data replication facilitates availability and recovery in a distributed environment. However, concurrent updates to multiple replicas result in divergence of data. Conflict-Free Replicated Data Types (CRDTs) are abstract data types that provide a principled approach to asynchronously reconcile this divergence. We propose a different perspective on the divergence of data, whereby we treat data divergences as versions of the data. That is, instead of treating it only as a problem that needs to be solved, we consider it also to be a feature that provides a way to track versioning and evolution of data. Versioning information is helpful in multiple scenarios, such as provenance tracking and system debugging. Doing so allows us to leverage concepts such as the version tree found in the literature for persistent (versioned) data structures. We show that many techniques used in CRDTs to order elements can be derived from version trees, which predates CRDTs by more than 20 years. Using version trees for maintaining order and append-only logs for storage, we propose a method to ensure convergence of arbitrary data types, while maintaining information related to the evolution of data.  more » « less
Award ID(s):
1703560 2107101 2027977
PAR ID:
10334287
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Workshop on Principles and Practice of Consistency for Distributed Data
Page Range / eLocation ID:
39 to 46
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we investigate extensions for Conflict-Free Replicated Data Types (CRDTs) that permit their use in failure-prone, heterogeneous, resource-constrained, distributed, multi-tier (cloud/edge/device) cloud deployments such as the Internet-of-Things (IoT), while addressing multiple CRDT limitations. Specifically, we employ distributed logging to implement robust, strong eventual consistency of replicas. Our approach also enables uniform reversal of operations and precludes the requirement of exactly-once delivery and idempotence imposed by operation-based CRDTs. Moreover, it exposes CRDT versions for use in debugging and history-based programming. We evaluate our approach for commonly used CRDTs and show that it enables higher operation throughput (up to 1.8x) versus conventional CRDTs for the workloads we consider. 
    more » « less
  2. In this paper, we investigate how to automatically persist versioned data structures in distributed settings (e.g. cloud + edge) using append-only storage. By doing so, we facilitate resiliency by enabling program state to survive program activations and termination, and program-level data structures and their version information to be accessed programmatically by multiple clients (for replay, provenance tracking, debugging, and coordination avoidance, and more). These features are useful in distributed, failure-prone contexts such as those for heterogeneous and pervasive Internet of Things (IoT) deployments. We prototype our approach within an open-source, distributed operating system for IoT. Our results show that it is possible to achieve algorithmic complexities similar to those of in-memory versioning but in a distributed setting. 
    more » « less
  3. Baldan, Paolo; de_Paiva, Valeria (Ed.)
    We describe ongoing work that models conflict-free replicated data types (CRDTs) from a coalgebraic point of view. CRDTs are data structures designed for replication across multiple physical locations in a distributed system. We show how to model a CRDT at the local replica level using a novel coalgebraic semantics for CRDTs. We believe this is the first step towards presenting a unified theory for specifying and verifying CRDTs and replicated state machines. As a case study, we consider emulation of CRDTs in terms of coalgebra. 
    more » « less
  4. Relational parametricity was first introduced by Reynolds for System F. Although System F provides a strong model for the type systems at the core of modern functional programming languages, it lacks features of daily programming practice such as complex data types. In order to reason parametrically about such objects, Reynolds’ seminal ideas need to be generalized to extensions of System F. Here, we explore such a generalization for the extension of System F by Generalized Algebraic Data Types (GADTs) as found in Haskell. Although GADTs generalize Algebraic Data Types (ADTs) — i.e., simple recursive types such as lists, trees, etc. — we show that naively extending the parametric treatment of these recursive types is not enough to tackle GADTs. We propose a tentative workaround for this issue, borrowing ideas from the categorical semantics of GADTs known as “functorial completion”. We discuss some applications, as well as some limitations, of this solution. 
    more » « less
  5. One of the most fundamental goals of modern biology is to achieve a deep understanding of the origin and maintenance of biodiversity. It has been observed that in some mixed-species animal societies, there appears to be a drive towards some degree of phenotypic trait matching, such as similar coloration or patterning. Here we build on these observations and hypothesize that selection in mixed-species animal societies, such as mixed-species bird flocks, may drive diversification, potentially leading to speciation. We review evidence for possible convergent evolution and even outright mimicry in flocks from southwestern China, where we have observed several cases in which species and subspecies differ from their closest relatives in traits that match particular flock types. However, understanding whether this is phenotypic matching driven by convergence, and whether this divergence has promoted biodiversity, requires testing multiple facets of this hypothesis. We propose a series of steps that can be used to tease apart alternative hypotheses to build our understanding of the potential role of convergence in diversification in participants of mixed-species societies. Even if our social convergence/divergence hypothesis is not supported, the testing at each step should help highlight alternative processes that may affect mixed-species flocks, trait evolution and possible convergence. This article is part of the theme issue ‘Mixed-species groups and aggregations: shaping ecological and behavioural patterns and processes’. 
    more » « less