skip to main content

Title: On the complexity of bidirected interleaved Dyck-reachability
Many program analyses need to reason about pairs of matching actions, such as call/return, lock/unlock, or set-field/get-field. The family of Dyck languages { D k }, where D k has k kinds of parenthesis pairs, can be used to model matching actions as balanced parentheses. Consequently, many program-analysis problems can be formulated as Dyck-reachability problems on edge-labeled digraphs. Interleaved Dyck-reachability (InterDyck-reachability), denoted by D k ⊙ D k -reachability, is a natural extension of Dyck-reachability that allows one to formulate program-analysis problems that involve multiple kinds of matching-action pairs. Unfortunately, the general InterDyck-reachability problem is undecidable. In this paper, we study variants of InterDyck-reachability on bidirected graphs , where for each edge ⟨ p , q ⟩ labeled by an open parenthesis ”( a ”, there is an edge ⟨ q , p ⟩ labeled by the corresponding close parenthesis ”) a ”, and vice versa . Language-reachability on a bidirected graph has proven to be useful both (1) in its own right, as a way to formalize many program-analysis problems, such as pointer analysis, and (2) as a relaxation method that uses a fast algorithm to over-approximate language-reachability on a directed graph. However, unlike its directed counterpart, the complexity more » of bidirected InterDyck-reachability still remains open. We establish the first decidable variant (i.e., D 1 ⊙ D 1 -reachability) of bidirected InterDyck-reachability. In D 1 ⊙ D 1 -reachability, each of the two Dyck languages is restricted to have only a single kind of parenthesis pair. In particular, we show that the bidirected D 1 ⊙ D 1 problem is in PTIME. We also show that when one extends each Dyck language to involve k different kinds of parentheses (i.e., D k ⊙ D k -reachability with k ≥ 2), the problem is NP-hard (and therefore much harder). We have implemented the polynomial-time algorithm for bidirected D 1 ⊙ D 1 -reachability. D k ⊙ D k -reachability provides a new over-approximation method for bidirected D k ⊙ D k -reachability in the sense that D k ⊙ D k -reachability can first be relaxed to bidirected D 1 ⊙ D 1 -reachability, and then the resulting bidirected D 1 ⊙ D 1 -reachability problem is solved precisely. We compare this D 1 ⊙ D 1 -reachability-based approach against another known over-approximating D k ⊙ D k -reachability algorithm. Surprisingly, we found that the over-approximation approach based on bidirected D 1 ⊙ D 1 -reachability computes more precise solutions, even though the D 1 ⊙ D 1 formalism is inherently less expressive than the D k ⊙ D k formalism. « less
Authors:
; ;
Award ID(s):
1917924
Publication Date:
NSF-PAR ID:
10283406
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
5
Issue:
POPL
Page Range or eLocation-ID:
1 to 28
ISSN:
2475-1421
Sponsoring Org:
National Science Foundation
More Like this
  1. Motivated by an attempt to understand the formation and development of (human) language, we introduce a "distributed compression" problem. In our problem a sequence of pairs of players from a set of K players are chosen and tasked to communicate messages drawn from an unknown distribution Q. Arguably languages are created and evolve to compress frequently occurring messages, and we focus on this aspect. The only knowledge that players have about the distribution Q is from previously drawn samples, but these samples differ from player to player. The only common knowledge between the players is restricted to a common priormore »distribution P and some constant number of bits of information (such as a learning algorithm). Letting T_eps denote the number of iterations it would take for a typical player to obtain an eps-approximation to Q in total variation distance, we ask whether T_eps iterations suffice to compress the messages down roughly to their entropy and give a partial positive answer. We show that a natural uniform algorithm can compress the communication down to an average cost per message of O(H(Q) + log (D(P || Q) + O(1)) in $\tilde{O}(T_eps)$ iterations while allowing for O(eps)-error, where D(. || .) denotes the KL-divergence between distributions. For large divergences this compares favorably with the static algorithm that ignores all samples and compresses down to H(Q) + D(P || Q) bits, while not requiring (T_eps . K) iterations that it would take players to develop optimal but separate compressions for each pair of players. Along the way we introduce a "data-structural" view of the task of communicating with a natural language and show that our natural algorithm can also be implemented by an efficient data structure, whose storage is comparable to the storage requirements of Q and whose query complexity is comparable to the lengths of the message to be compressed. Our results give a plausible mathematical analogy to the mechanisms by which human languages get created and evolve, and in particular highlights the possibility of coordination towards a joint task (agreeing on a language) while engaging in distributed learning.« less
  2. Consider an algorithm performing a computation on a huge random object (for example a random graph or a "long" random walk). Is it necessary to generate the entire object prior to the computation, or is it possible to provide query access to the object and sample it incrementally "on-the-fly" (as requested by the algorithm)? Such an implementation should emulate the random object by answering queries in a manner consistent with an instance of the random object sampled from the true distribution (or close to it). This paradigm is useful when the algorithm is sub-linear and thus, sampling the entire objectmore »up front would ruin its efficiency. Our first set of results focus on undirected graphs with independent edge probabilities, i.e. each edge is chosen as an independent Bernoulli random variable. We provide a general implementation for this model under certain assumptions. Then, we use this to obtain the first efficient local implementations for the Erdös-Rényi G(n,p) model for all values of p, and the Stochastic Block model. As in previous local-access implementations for random graphs, we support Vertex-Pair and Next-Neighbor queries. In addition, we introduce a new Random-Neighbor query. Next, we give the first local-access implementation for All-Neighbors queries in the (sparse and directed) Kleinberg’s Small-World model. Our implementations require no pre-processing time, and answer each query using O(poly(log n)) time, random bits, and additional space. Next, we show how to implement random Catalan objects, specifically focusing on Dyck paths (balanced random walks on the integer line that are always non-negative). Here, we support Height queries to find the location of the walk, and First-Return queries to find the time when the walk returns to a specified location. This in turn can be used to implement Next-Neighbor queries on random rooted ordered trees, and Matching-Bracket queries on random well bracketed expressions (the Dyck language). Finally, we introduce two features to define a new model that: (1) allows multiple independent (and even simultaneous) instantiations of the same implementation, to be consistent with each other without the need for communication, (2) allows us to generate a richer class of random objects that do not have a succinct description. Specifically, we study uniformly random valid q-colorings of an input graph G with maximum degree Δ. This is in contrast to prior work in the area, where the relevant random objects are defined as a distribution with O(1) parameters (for example, n and p in the G(n,p) model). The distribution over valid colorings is instead specified via a "huge" input (the underlying graph G), that is far too large to be read by a sub-linear time algorithm. Instead, our implementation accesses G through local neighborhood probes, and is able to answer queries to the color of any given vertex in sub-linear time for q ≥ 9Δ, in a manner that is consistent with a specific random valid coloring of G. Furthermore, the implementation is memory-less, and can maintain consistency with non-communicating copies of itself.« less
  3. We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Although exact algorithms for MVM are faster than exact algorithms for the maximum edge-weighted matching problem, there are graphs on which these exact algorithms could take hundreds of hours. For a natural number k, we design a k/(k + 1)approximation algorithm for MVM on non-bipartite graphs that updates the matching along certain short paths in the graph: either augmenting paths of length at mostmore »2k + 1 or weight-increasing paths of length at most 2k. The choice of k = 2 leads to a 2/3-approximation algorithm that computes nearly optimal weights fast. This algorithm could be initialized with a 2/3-approximate maximum cardinality matching to reduce its runtime in practice. A 1/2-approximation algorithm may be obtained using k = 1, which is faster than the 2/3-approximation algorithm but it computes lower weights. The 2/3-approximation algorithm has time complexity O(Δ2m) while the time complexity of the 1/2-approximation algorithm is O(Δm), where m is the number of edges and Δ is the maximum degree of a vertex. Results from our serial implementations show that on average the 1/2-approximation algorithm runs faster than the Suitor algorithm (currently the fastest 1/2-approximation algorithm) while the 2/3-approximation algorithm runs as fast as the Suitor algorithm but obtains higher weights for the matching. One advantage of the proposed algorithms is that they are well-suited for parallel implementation since they can process vertices to match in any order. The 1/2- and 2/3-approximation algorithms have been implemented on a shared memory parallel computer, and both approximation algorithms obtain good speedups, while the former algorithm runs faster on average than the parallel Suitor algorithm. Care is needed to design the parallel algorithm to avoid cyclic waits, and we prove that it is live-lock free.« less
  4. Background: Because of prior experience solving well-structured problems that have single, correct answers, students often struggle to direct their own design work and may not understand the need to frame ill-structured design problems. Purpose: Framing agency—defined as making decisions that are consequential to framing design problems and learning through this process—sheds light on students’ treatment of design problems; by framing, we mean the various actions designers take to understand, define, and bound the problem. Using the construct framing agency, we sought to characterize design team discourse to detect whether students treated design problems as ill- or well-structured and examine themore »consequences of this treatment. Method: Data were collected through extended participant observation of a capstone design course in a biomedical engineering program at a large research university. Data included audio and video records of design team meetings over the course of framing and solving industry-sponsored problems. For this paper, we analyzed three cases using sociolinguistic content analysis to characterize framing agency and compared the cases to illuminate the nuances of framing agency. Results: All teams faced impasses; one team navigated the impasse by framing the problem, whereas the others treated the problem as given. We identified markers of agency in students’ discourse, including tentative language, personal pronouns, and sharing ownership. Conclusions: Framing agency clarifies the kinds of learning experiences students need in order to overcome past experiences dominated by solving archetypical well-structured problems with predetermined solutions.« less
  5. Motivated by the use of high speed circuit switches in large scale data centers, we consider the problem of circuit switch scheduling. In this problem we are given demands between pairs of servers and the goal is to schedule at every time step a matching between the servers while maximizing the total satisfied demand over time. The crux of this scheduling problem is that once one shifts from one matching to a different one a fixed delay delta is incurred during which no data can be transmitted. For the offline version of the problem we present a (1-(1/e)-epsilon) approximation ratiomore »(for any constant epsilon >0). Since the natural linear programming relaxation for the problem has an unbounded integrality gap, we adopt a hybrid approach that combines the combinatorial greedy with randomized rounding of a different suitable linear program. For the online version of the problem we present a (bi-criteria) ((e-1)/(2e-1)-epsilon)-competitive ratio (for any constant epsilon >0 ) that exceeds time by an additive factor of O(delta/epsilon). We note that no uni-criteria online algorithm is possible. Surprisingly, we obtain the result by reducing the online version to the offline one.« less