NSF PAR Search | NSF Public Access Repository


Incremental topological ordering and cycle detection with predictions

McCauley, Samuel; Moseley, Benjamin; Niaparast, Aidin; Singh, Shikha (January 2025, PMLR)
Salakhutdinov, Ruslan; Kolter, Zico; Heller, Katherine; Weller, Adrian; Oliver, Nuria; Scarlett, Jonathan; Berkenkamp, Felix (Ed.)
This paper leverages the framework of algorithms-with-predictions to design data structures for two fundamental dynamic graph problems: incremental topological ordering and cycle detection. In these problems, the input is a directed graph on n nodes, and the m edges arrive one by one. The data structure must maintain a topological ordering of the vertices at all times and detect if the newly inserted edge creates a cycle. The theoretically best worst-case algorithms for these problems have high update cost (polynomial in n and m). In practice, greedy heuristics (that recompute the solution from scratch each time) perform well but can have high update cost in the worst case. In this paper, we bridge this gap by leveraging predictions to design a new learned data structure for the problems. Our data structure guarantees consistency, robustness, and smoothness with respect to predictions: it has the best possible running time under perfect predictions, never performs worse than the best-known worst-case methods, and its running time degrades smoothly with the prediction error. Moreover, we demonstrate empirically that predictions, learned from a very small training dataset, are sufficient to provide significant speed-ups on real datasets.
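To make the problem statement concrete, the following is a minimal sketch of the from-scratch baseline the abstract mentions, not the paper's learned data structure. It assumes nodes are numbered 0 to n-1 and recomputes the ordering with Kahn's algorithm after every insertion; the class and method names are hypothetical.

```python
from collections import deque


class NaiveIncrementalTopo:
    """Baseline sketch: recompute a topological order after each edge insertion."""

    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]  # adjacency lists
        self.order = list(range(n))        # any order is valid with no edges

    def _kahn(self):
        """Return a topological order, or None if the graph has a cycle."""
        indeg = [0] * self.n
        for u in range(self.n):
            for v in self.adj[u]:
                indeg[v] += 1
        queue = deque(u for u in range(self.n) if indeg[u] == 0)
        order = []
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in self.adj[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
        return order if len(order) == self.n else None

    def insert_edge(self, u, v):
        """Insert edge u -> v. Return False (and reject it) if it closes a cycle."""
        self.adj[u].append(v)
        order = self._kahn()
        if order is None:
            self.adj[u].pop()  # undo the insertion that created the cycle
            return False
        self.order = order
        return True


g = NaiveIncrementalTopo(3)
g.insert_edge(0, 1)
g.insert_edge(1, 2)
created_cycle = not g.insert_edge(2, 0)
```

Each insertion here costs time linear in the current graph size, which is the kind of per-update cost that the work described above aims to reduce.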
Free, publicly-accessible full text available January 3, 2026
Verifiable Crowd Computing: Coping with bounded rationality

https://doi.org/10.1016/j.tcs.2024.114631

Dong, Lu; Mosteiro, Miguel A; Singh, Shikha (July 2024, Theoretical Computer Science)

Full Text Available
Online List Labeling with Predictions

McCauley, Samuel; Moseley, Benjamin; Niaparast, Aidin; Singh, Shikha (May 2024, Curran Associates Inc.)

A growing line of work shows how learned predictions can be used to break through worst-case barriers to improve the running time of an algorithm. However, incorporating predictions into data structures with strong theoretical guarantees remains underdeveloped. This paper takes a step in this direction by showing that predictions can be leveraged in the fundamental online list labeling problem. In the problem, n items arrive over time and must be stored in sorted order in an array of size Θ(n). The array slot of an element is its label and the goal is to maintain sorted order while minimizing the total number of elements moved (i.e., relabeled). We design a new list labeling data structure and bound its performance in two models. In the worst-case learning-augmented model, we give guarantees in terms of the error in the predictions. Our data structure provides strong theoretical guarantees: it is optimal for any prediction error and guarantees the best-known worst-case bound even when the predictions are entirely erroneous. We also consider a stochastic error model and bound the performance in terms of the expectation and variance of the error. Finally, the theoretical results are demonstrated empirically. In particular, we show that our data structure performs well on numerous real datasets, including temporal datasets where predictions are constructed from elements that arrived in the past (as is typically done in a practical use case).
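As an illustration of the list labeling problem itself (not of the data structure in the paper, and using no predictions), the sketch below stores items in sorted order in a fixed number of slots and counts how many existing items are relabeled. It assumes distinct keys and uses a naive strategy: place a new item in the middle of the free gap if one exists, otherwise spread all items evenly. All names are hypothetical.

```python
import bisect


class NaiveListLabeling:
    """Toy list labeling: sorted items in `capacity` slots, counting relabels."""

    def __init__(self, capacity):
        self.m = capacity   # number of array slots
        self.items = []     # items in sorted order
        self.labels = []    # labels[i] is the slot holding items[i]
        self.moves = 0      # total number of relabeled (moved) items

    def insert(self, x):
        if len(self.items) >= self.m:
            raise ValueError("array is full")
        i = bisect.bisect_left(self.items, x)
        lo = self.labels[i - 1] if i > 0 else -1
        hi = self.labels[i] if i < len(self.labels) else self.m
        self.items.insert(i, x)
        if hi - lo > 1:
            # A free slot exists between the neighbors: no item has to move.
            self.labels.insert(i, (lo + hi) // 2)
            return
        # No free slot: spread all items evenly and count the ones that moved.
        k = len(self.items)
        new_labels = [(j * self.m) // k for j in range(k)]
        old_labels = self.labels[:i] + [None] + self.labels[i:]
        self.moves += sum(
            1 for old, new in zip(old_labels, new_labels)
            if old is not None and old != new
        )
        self.labels = new_labels


ll = NaiveListLabeling(8)
for key in [5, 1, 9, 3, 7, 4]:
    ll.insert(key)
```

The quantity `moves` is the cost measure described in the abstract; a good predicted position for each item would let a data structure leave gaps where insertions are expected and avoid most of these relabels.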
Full Text Available
Timely Reporting of Heavy Hitters Using External Memory

https://doi.org/10.1145/3472392

Singh, Shikha; Pandey, Prashant; Bender, Michael A.; Berry, Jonathan W.; Farach-Colton, Martín; Johnson, Rob; Kroeger, Thomas M.; Phillips, Cynthia A. (December 2021, ACM Transactions on Database Systems)

Given an input stream S of size N, a ɸ-heavy hitter is an item that occurs at least ɸN times in S. The problem of finding heavy hitters is extensively studied in the database literature. We study a real-time heavy-hitters variant in which an element must be reported shortly after we see its T = ɸN-th occurrence (and hence it becomes a heavy hitter). We call this the Timely Event Detection (TED) Problem. The TED problem models the needs of many real-world monitoring systems, which demand accurate (i.e., no false negatives) and timely reporting of all events from large, high-speed streams with a low reporting threshold (high sensitivity). Like the classic heavy-hitters problem, solving the TED problem without false positives requires large space (Ω(N) words). Thus in-RAM heavy-hitters algorithms typically sacrifice accuracy (i.e., allow false positives), sensitivity, or timeliness (i.e., use multiple passes). We show how to adapt heavy-hitters algorithms to external memory to solve the TED problem on large high-speed streams while guaranteeing accuracy, sensitivity, and timeliness. Our data structures are limited only by I/O bandwidth (not latency) and support a tunable tradeoff between reporting delay and I/O overhead. With a small bounded reporting delay, our algorithms incur only a logarithmic I/O overhead. We implement and validate our data structures empirically using the Firehose streaming benchmark. Multi-threaded versions of our structures can scale to process 11M observations per second before becoming CPU bound. In comparison, a naive adaptation of the standard heavy-hitters algorithm to external memory would be limited by the storage device’s random I/O throughput, i.e., ≈100K observations per second.
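The TED problem definition can be illustrated with a minimal in-RAM sketch that keeps an exact count per item and reports an item at the moment of its T-th occurrence. This is only the straightforward exact-counting approach, which needs space linear in the number of distinct items; it is not the external-memory data structure from the paper. The function name is hypothetical, and T is assumed to be ɸN rounded up, with the stream length known in advance.

```python
import math
from collections import Counter


def timely_heavy_hitters(stream, phi):
    """Report (position, item) when an item reaches its T-th occurrence.

    Assumption: T = ceil(phi * N), where N = len(stream) is known up front.
    """
    threshold = max(1, math.ceil(phi * len(stream)))
    counts = Counter()
    reports = []
    for pos, item in enumerate(stream):
        counts[item] += 1
        if counts[item] == threshold:
            # Zero reporting delay and no false positives, at the price of
            # one counter per distinct item.
            reports.append((pos, item))
    return reports


events = ["a", "b", "a", "c", "a", "b", "d", "e"]
reports = timely_heavy_hitters(events, 0.25)
```

With ɸ = 0.25 and N = 8 the threshold is T = 2, so each item is reported at the stream position of its second occurrence.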
Full Text Available
Telescoping Filter: A Practical Adaptive Filter

https://doi.org/10.4230/LIPIcs.ESA.2021.60

Lee, David J.; McCauley, Samuel; Singh, Shikha; Stein, Max (August 2021, Leibniz International Proceedings in Informatics)
Mutzel, Petra; Pagh, Rasmus; Herman, Grzegorz (Ed.)
Full Text Available
Microteaching: Semantics, Definition of a Computer, Running Times, Fractal Trees, Classes as Encapsulation, and P vs NP

https://doi.org/10.1145/3408877.3432582

Lewis, Colleen M.; Fisler, Kathi; Hinz, Jenny; Malan, David J.; Paley, Joshua E.; Pérez-Quiñones, Manuel A.; Singh, Shikha (March 2021, SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education)
SIGCSE is packed with teaching insights and inspiration. However, we get these insights and inspiration from hearing our colleagues talk about their teaching. Why not just watch them teach? This session does exactly that. Six exceptional educators will present their favorite piece of innovative lecture content just as they would to their students. The moderator, Colleen Lewis, will describe the central pedagogical move within the innovation and how this connects to education research. The goal of the session is to inspire SIGCSE attendees by highlighting innovative instruction by exceptional educators. The specific content of the innovative instruction may be applicable for some attendees, and the discussion of the underlying pedagogical move within each innovation can be applied across the attendees' teaching.
Full Text Available
