Title: Modular quantitative monitoring
In real-time decision making and runtime monitoring applications, declarative languages are commonly used as they facilitate modular high-level specifications with the compiler guaranteeing evaluation over data streams in an efficient and incremental manner. We introduce the model of Data Transducers to allow modular compilation of queries over streaming data. A data transducer maintains a finite set of data variables and processes a sequence of tagged data values by updating its variables using an allowed set of operations. The model allows unambiguous nondeterminism, exponentially succinct control, and combining values from parallel threads of computation. The semantics of the model immediately suggests an efficient streaming algorithm for evaluation. The expressiveness of data transducers coincides with streamable regular transductions, a robust and streamable class of functions characterized by MSO-definable string-to-DAG transformations with no backward edges. We show that the novel features of data transducers, unlike previously studied transducers, make them as succinct as traditional imperative code for processing data streams, but the structuring of the transition function permits modular compilation. In particular, we show that operations such as parallel composition, union, prefix-sum, and quantitative analogs of combinators for unambiguous parsing can be implemented by natural and succinct constructions on data transducers. To illustrate the benefits of such modularity in compilation, we define a new language for quantitative monitoring, QRE-Past, that integrates features of past-time temporal logic and quantitative regular expressions. While this combination allows a natural specification of a cardiac arrhythmia detection algorithm in QRE-Past, compilation of QRE-Past specifications into efficient monitors comes for free thanks to succinct constructions on data transducers.
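The evaluation loop sketched in the abstract (a finite set of data variables updated once per tagged input item) can be conveyed with a small example. The Python below is an illustration only, not the paper's formal construction; the names DataTransducer and on_event are hypothetical.

```python
# Minimal illustrative sketch (not the paper's formal model): a machine that
# keeps a finite set of data variables and, on each tagged input value,
# updates them with a fixed set of allowed operations.

class DataTransducer:
    def __init__(self, registers, updates, output):
        self.registers = dict(registers)   # finite set of data variables
        self.updates = updates             # tag -> function(registers, value) -> new registers
        self.output = output               # registers -> current output value

    def on_event(self, tag, value):
        # One incremental step: constant work per item if each update is constant-time.
        if tag in self.updates:
            self.registers = self.updates[tag](self.registers, value)
        return self.output(self.registers)

# Example: running sum of "cost" events, reset on "reset" events.
prefix_sum = DataTransducer(
    registers={"acc": 0},
    updates={
        "cost": lambda r, v: {"acc": r["acc"] + v},
        "reset": lambda r, v: {"acc": 0},
    },
    output=lambda r: r["acc"],
)

for tag, val in [("cost", 3), ("cost", 4), ("reset", None), ("cost", 5)]:
    print(prefix_sum.on_event(tag, val))   # 3, 7, 0, 5
```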
Award ID(s):
1763514
PAR ID:
10602889
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Association for Computing Machinery (ACM)
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
3
Issue:
POPL
ISSN:
2475-1421
Format(s):
Medium: X
Size(s):
p. 1-31
Sponsoring Org:
National Science Foundation
More Like this
  1. Real-time decision making in emerging IoT applications typically relies on computing quantitative summaries of large data streams in an efficient and incremental manner. To simplify the task of programming the desired logic, we propose StreamQRE, which provides natural and high-level constructs for processing streaming data. Our language has a novel integration of linguistic constructs from two distinct programming paradigms: streaming extensions of relational query languages and quantitative extensions of regular expressions. The former allows the programmer to employ relational constructs to partition the input data by keys and to integrate data streams from different sources, while the latter can be used to exploit the logical hierarchy in the input stream for modular specifications. We first present the core language with a small set of combinators, formal semantics, and a decidable type system. We then show how to express a number of common patterns with illustrative examples. Our compilation algorithm translates the high-level query into a streaming algorithm with precise complexity bounds on per-item processing time and total memory footprint. We also show how to integrate approximation algorithms into our framework. We report on an implementation in Java, and evaluate it with respect to existing high-performance engines for processing streaming data. Our experimental evaluation shows that (1) StreamQRE allows more natural and succinct specification of queries compared to existing frameworks, (2) the throughput of our implementation is higher than comparable systems (for example, two-to-four times greater than RxJava), and (3) the approximation algorithms supported by our implementation can lead to substantial memory savings. 
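As a rough illustration of the combinator style described above (relational key-based partitioning combined with quantitative aggregation), here is a hypothetical Python sketch; the actual StreamQRE library is a Java API, and the names atom, key_by, and fold below are invented for this example.

```python
# Hypothetical rendering of the combinator style only; not the StreamQRE API.

def atom(pred, proj):
    """Match items satisfying pred; output proj(item) for each."""
    def run(stream):
        return [proj(x) for x in stream if pred(x)]   # simplified: filter + project
    return run

def key_by(key, inner):
    """Partition the stream by key and apply inner per partition."""
    def run(stream):
        groups = {}
        for x in stream:
            groups.setdefault(key(x), []).append(x)
        return {k: inner(v) for k, v in groups.items()}
    return run

def fold(agg, init, inner):
    """Aggregate the outputs of inner with agg, starting from init."""
    def run(stream):
        acc = init
        for out in inner(stream):
            acc = agg(acc, out)
        return acc
    return run

# Total cost per user over a stream of (user, cost) events.
events = [("alice", 3), ("bob", 5), ("alice", 7)]
per_user_total = key_by(lambda e: e[0],
                        fold(lambda a, b: a + b, 0,
                             atom(lambda e: True, lambda e: e[1])))
print(per_user_total(events))  # {'alice': 10, 'bob': 5}
```

A faithful implementation would additionally enforce the regular-expression structure (unambiguous parsing) and evaluate incrementally per item with bounded per-item cost, which this sketch omits.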
  2. Modular verification tools allow programmers to compositionally specify and prove function specifications. When using a modular verifier, proving a specification about a function f requires additional specifications for the functions called by f. With existing state-of-the-art tools, programmers must manually write the specifications for callee functions. We present a counterexample-guided algorithm to automatically infer these specifications. The algorithm is parameterized over a verifier, counterexample generator, and constraint-guided synthesizer. We show that if each of these three components is sound and complete over a finite set of possible specifications, our algorithm is sound and complete as well. Additionally, we introduce size-bounded synthesis functions, which extend our completeness result to an infinite set of possible specifications. In particular, we describe a size-bounded synthesis function for linear integer arithmetic constraints. We conclude with an evaluation demonstrating our technique on a variety of benchmarks.
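The overall loop (parameterized over a verifier, a counterexample generator, and a synthesizer) can be sketched schematically. The Python below is an assumption-laden outline; verify, get_counterexample, and synthesize are placeholder interfaces, not any tool's actual API.

```python
# Schematic counterexample-guided inference of callee specifications.
# The three components are passed in as parameters, mirroring the abstract.

def infer_callee_specs(verify, get_counterexample, synthesize, goal_spec,
                       candidate_specs, max_iters=100):
    """Search for callee specifications under which goal_spec verifies."""
    blocked = []                      # counterexamples ruling out candidates
    for _ in range(max_iters):
        candidate = synthesize(candidate_specs, blocked)
        if candidate is None:
            return None               # no candidate is consistent with the counterexamples
        if verify(goal_spec, candidate):
            return candidate          # goal_spec proved assuming these callee specs
        blocked.append(get_counterexample(goal_spec, candidate))
    return None                       # budget exhausted
```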
  3. We investigate online monitoring algorithms over dense-time and continuous-time signals for properties written in metric temporal logic (MTL). We consider an abstract algebraic semantics based on complete lattices. This semantics includes as special cases the standard Boolean (qualitative) semantics and the widely-used real-valued robustness (quantitative) semantics. Our semantics also extends to truth values that are partially ordered and allows the modeling of uncertainty in satisfaction. We propose a compositional approach for the construction of online monitors that transform exact representations of piecewise constant (dense-time and continuous-time) signals. These monitors are based on a class of infinite-state deterministic signal transducers that (1) are allowed to produce the output signal with some bounded delay relative to the input signal, and (2) do not introduce unbounded variability in the output signal. A key ingredient of our monitoring framework is an efficient algorithm for sliding-window aggregation over dense-time signals. We have implemented and experimentally evaluated our monitoring framework by comparing it to the recently proposed online monitoring tools Reelay and RTAMT. 
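For flavor, here is a Lemire-style monotone-deque sketch of a sliding-window maximum over a sequence of (start_time, value) breakpoints of a piecewise-constant signal. This is only a loose, discrete analogue of the dense-time sliding-window aggregation mentioned above: it maximizes over breakpoint values inside the window and ignores segments that begin before the window but still overlap it, a case the paper's algorithm must handle.

```python
from collections import deque

def sliding_max(segments, window):
    """Yield, at each breakpoint t, the max of breakpoint values in [t - window, t]."""
    dq = deque()                       # holds (time, value) pairs with decreasing values
    for t, v in segments:
        # Drop entries that have slid out of the window.
        while dq and dq[0][0] < t - window:
            dq.popleft()
        # Drop dominated entries: older and not larger than the new value.
        while dq and dq[-1][1] <= v:
            dq.pop()
        dq.append((t, v))
        yield t, dq[0][1]

signal = [(0.0, 1.0), (0.5, 3.0), (1.2, 2.0), (2.0, 0.5)]
print(list(sliding_max(signal, window=1.0)))
```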
  4. Stream clustering is an important data mining technique to capture the evolving patterns in real-time data streams. Today's data streams, e.g., IoT events and Web clicks, are usually high-speed and contain dynamically changing patterns. Existing stream clustering algorithms usually follow an online-offline paradigm with a one-record-at-a-time update model, which was designed for running on a single machine. These stream clustering algorithms, with this sequential update model, cannot be efficiently parallelized and fail to deliver the required high throughput for stream clustering. In this paper, we present DistStream, a distributed framework that can effectively scale out online-offline stream clustering algorithms. To parallelize these algorithms for high throughput, we develop a mini-batch update model with efficient parallelization approaches. To maintain high clustering quality, DistStream's mini-batch update model preserves the update order in all the computation steps during parallel execution, which can reflect the recent changes in dynamically changing streaming data. We implement DistStream atop Spark Streaming, as well as four representative stream clustering algorithms based on DistStream. Our evaluation on three real-world datasets shows that DistStream-based stream clustering algorithms can achieve sublinear throughput gains and clustering quality comparable (99%) to that of their single-machine counterparts.
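The mini-batch update model can be conveyed with a toy example: assignments for a whole batch are computed independently (and hence are easy to parallelize), while the state updates are applied in the batch's original order so recency is preserved. The Python below is purely conceptual and is not DistStream's implementation; real stream clustering algorithms maintain micro-cluster summaries rather than plain means.

```python
# Conceptual mini-batch update: batched assignment, then in-order state updates.
# Centers are plain floats to keep the example minimal.

def nearest(centers, x):
    return min(range(len(centers)), key=lambda i: abs(centers[i] - x))

def minibatch_update(centers, counts, batch):
    # Assignment phase: independent per record, hence parallelizable.
    assignments = [nearest(centers, x) for x in batch]
    # Update phase: applied in stream order to reflect the most recent data.
    for x, i in zip(batch, assignments):
        counts[i] += 1
        centers[i] += (x - centers[i]) / counts[i]   # incremental mean
    return centers, counts

centers, counts = [0.0, 10.0], [1, 1]
centers, counts = minibatch_update(centers, counts, [1.0, 9.0, 2.0, 11.0])
print(centers, counts)   # [1.0, 10.0] [3, 3]
```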
  5. While mixed integer linear programming (MILP) solvers are routinely used to solve a wide range of important science and engineering problems, it remains a challenging task for end users to write correct and efficient MILP constraints, especially for problems specified using the inherently non-linear Boolean logic operations. To overcome this challenge, we propose a syntax guided synthesis (SyGuS) method capable of generating high-quality MILP constraints from the specifications expressed using arbitrary combinations of Boolean logic operations. At the center of our method is an extensible domain specification language (DSL) whose expressiveness may be improved by adding new integer variables as decision variables, together with an iterative procedure for synthesizing linear constraints from non-linear Boolean logic operations using these integer variables. To make the synthesis method efficient, we also propose an over-approximation technique for soundly proving the correctness of the synthesized linear constraints, and an under-approximation technique for safely pruning away the incorrect constraints. We have implemented and evaluated the method on a wide range of benchmark specifications from statistics, machine learning, and data science applications. The experimental results show that the method is efficient in handling these benchmarks, and the quality of the synthesized MILP constraints is close to, or higher than, that of manually-written constraints in terms of both compactness and solving time. 
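As a concrete instance of the kind of constraints being synthesized, the standard hand-written linearizations of Boolean conjunction and disjunction over 0/1 variables are shown below. These encodings are textbook material and are given only for context; they are not output of the proposed method, and the helper functions are invented for this illustration.

```python
# Standard linearizations of Boolean AND/OR over binary (0/1) variables,
# emitted here as plain strings for readability.

def and_constraints(z, x, y):
    # z = x AND y over binaries: z <= x, z <= y, z >= x + y - 1
    return [f"{z} <= {x}", f"{z} <= {y}", f"{z} >= {x} + {y} - 1"]

def or_constraints(z, x, y):
    # z = x OR y over binaries: z >= x, z >= y, z <= x + y
    return [f"{z} >= {x}", f"{z} >= {y}", f"{z} <= {x} + {y}"]

# Encode w = (a AND b) OR c with an auxiliary binary t = a AND b.
constraints = and_constraints("t", "a", "b") + or_constraints("w", "t", "c")
print("\n".join(constraints))
```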