Space-efficient Query Evaluation over Probabilistic Event Streams
Real-time decision making in IoT applications relies upon space-efficient evaluation of queries over streaming data. To model the uncertainty in the classification of data being processed, we consider the model of probabilistic strings --- sequences of discrete probability distributions over a finite set of events, and initiate the study of space complexity of streaming computation for different classes of queries over such probabilistic strings. We first consider the problem of computing the probability that a word, sampled from the distribution defined by the probabilistic string read so far, is accepted by a given deterministic finite automaton. We show that this regular pattern matching problem can be solved using space that is only poly-logarithmic in the string length (and polynomial in the size of the DFA) if we are allowed a multiplicative approximation error. Then we show how to generalize this result to quantitative queries specified by additive cost register automata --- these are automata that map strings to numerical values using finite control and registers that get updated using linear transformations. Finally, we consider the case when updates in such an automaton involve tests, and in particular, when there is a counter variable that can be either incremented or decremented but more »
Authors:
; ; ;
Award ID(s):
Publication Date:
NSF-PAR ID:
10163320
Journal Name:
LICS'20: 35th Annual ACM/IEEE Symposium on Logic in Computer Science
Page Range or eLocation-ID:
74 to 87
5. Motivated by applications in wireless networks and the Internet of Things, we consider a model of n nodes trying to reach consensus with high probability on their majority bit. Each node i is assigned a bit at time 0 and is a finite automaton with m bits of memory (i.e.,$2m$states) and a Poisson clock. When the clock of i rings, i can choose to communicate and is then matched to a uniformly chosen node j. The nodes j and i may update their states based on the state of the other node. Previous work has focused on minimizing the time to consensus and the probability of error, while our goal is minimizing the number of communications. We show that, when$m>3⁡log⁡log⁡log(n)$, consensus can be reached with linear communication cost, but this is impossible if$m. A key step is to distinguish when nodes can become aware of knowing the majority bit and stop communicating. We show that this is impossible if their memory is too low.