Search for: All records

Creators/Authors contains: "Berry, Jonathan"

Automatic HBM Management: Models and Algorithms

https://doi.org/10.1145/3490148.3538570

DeLayo, Daniel ; Zhang, Kenny ; Agrawal, Kunal ; Bender, Michael A. ; Berry, Jonathan W. ; Das, Rathish ; Moseley, Benjamin ; Phillips, Cynthia A. ( July 2022 , Proc. 34th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA))

Full Text Available
Timely Reporting of Heavy Hitters Using External Memory

https://doi.org/10.1145/3472392

Singh, Shikha ; Pandey, Prashant ; Bender, Michael A. ; Berry, Jonathan W. ; Farach-Colton, Martín ; Johnson, Rob ; Kroeger, Thomas M. ; Phillips, Cynthia A. ( December 2021 , ACM Transactions on Database Systems)

Given an input stream S of size N, a ɸ-heavy hitter is an item that occurs at least ɸN times in S. The problem of finding heavy hitters is extensively studied in the database literature. We study a real-time heavy-hitters variant in which an element must be reported shortly after we see its T = ɸN-th occurrence (and hence it becomes a heavy hitter). We call this the Timely Event Detection (TED) Problem. The TED problem models the needs of many real-world monitoring systems, which demand accurate (i.e., no false negatives) and timely reporting of all events from large, high-speed streams with a low reporting threshold (high sensitivity). Like the classic heavy-hitters problem, solving the TED problem without false positives requires large space (Ω(N) words). Thus in-RAM heavy-hitters algorithms typically sacrifice accuracy (i.e., allow false positives), sensitivity, or timeliness (i.e., use multiple passes). We show how to adapt heavy-hitters algorithms to external memory to solve the TED problem on large high-speed streams while guaranteeing accuracy, sensitivity, and timeliness. Our data structures are limited only by I/O-bandwidth (not latency) and support a tunable tradeoff between reporting delay and I/O overhead. With a small bounded reporting delay, our algorithms incur only a logarithmic I/O overhead. We implement and validate our data structures empirically using the Firehose streaming benchmark. Multi-threaded versions of our structures can scale to process 11M observations per second before becoming CPU bound. In comparison, a naive adaptation of the standard heavy-hitters algorithm to external memory would be limited by the storage device’s random I/O throughput, i.e., ≈100K observations per second.
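To make the reporting condition in this abstract concrete, here is a minimal Python sketch of the exact, in-memory baseline: it keeps one counter per distinct item and reports an item at the moment of its T-th occurrence. This is not the external-memory data structure the paper describes; it is the space-hungry approach (up to Ω(N) words) that the abstract contrasts against. The function name `timely_heavy_hitters` and the sample stream are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, Iterator, Tuple


def timely_heavy_hitters(
    stream: Iterable[Hashable], threshold: int
) -> Iterator[Tuple[int, Hashable]]:
    """Yield (position, item) when an item reaches its threshold-th occurrence.

    Exact in-RAM baseline: one counter per distinct item, so there are no
    false positives or false negatives, at the cost of linear space.
    """
    counts: Counter = Counter()
    for position, item in enumerate(stream, start=1):
        counts[item] += 1
        # Report exactly once, at the T-th occurrence (zero reporting delay).
        if counts[item] == threshold:
            yield position, item


# Hypothetical stream with N = 8 and T = 3 (i.e., phi = 3/8).
stream = ["a", "b", "a", "c", "a", "b", "a", "b"]
reports = list(timely_heavy_hitters(stream, threshold=3))
print(reports)  # [(5, 'a'), (8, 'b')]
```

In this sketch "a" is reported at position 5 and "b" at position 8, each immediately on its third occurrence; "c" never reaches the threshold and is never reported.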
Full Text Available
Timely Reporting of Heavy Hitters using External Memory

Singh, Shikha ; Pandey, Prashant ; Bender, Michael A. ; Berry, Jonathan W. ; Farach-Colton, Martín ; Johnson, Rob ; Kroeger, Thomas M. ; Phillips, Cynthia A. ( January 2021 , ACM Transactions on Database Systems)
Full Text Available
How to Manage High-Bandwidth Memory Automatically

https://doi.org/10.1145/3350755.3400233

Das, Rathish ; Agrawal, Kunal ; Bender, Michael A. ; Berry, Jonathan ; Moseley, Benjamin ; Phillips, Cynthia A. ( July 2020 , Symposium on Parallelism in Algorithms and Architectures)

Full Text Available
Optimizing for KNL Usage Modes When Data Doesn’t Fit in MCDRAM

https://doi.org/10.1145/3225058.3225116

Butcher, Neil ; Olivier, Stephen L. ; Berry, Jonathan ; Hammond, Simon D. ; Kogge, Peter M. ( August 2018 , International Conference on Parallel Processing)

Technologies such as Multi-Channel DRAM (MCDRAM) or High Bandwidth Memory (HBM) provide significantly more bandwidth than conventional memory. This trend has raised questions about how applications should manage data transfers between levels. This paper focuses on evaluating different usage modes of the MCDRAM in Intel Knights Landing (KNL) manycore processors. We evaluate these usage modes with a sorting kernel and a sorting-based streaming benchmark. We develop a performance model for the benchmark and use experimental evidence to demonstrate the correctness of the model. The model projects near-optimal numbers of copy threads for memory bandwidth bound computations. We demonstrate on KNL up to a 1.9X speedup for sort when the problem does not fit in MCDRAM over an OpenMP GNU sort that does not use MCDRAM.
Full Text Available
Making social networks more human: A topological approach

https://doi.org/10.1002/sam.11420

Berry, Jonathan W. ; Phillips, Cynthia A. ; Saia, Jared ( July 2019 , Statistical Analysis and Data Mining: The ASA Data Science Journal)

Abstract
A key problem in social network analysis is to identify nonhuman interactions. State‐of‐the‐art bot‐detection systems like Botometer train machine‐learning models on user‐specific data. Unfortunately, these methods do not work on data sets in which only topological information is available. In this paper, we propose a new, purely topological approach. Our method removes edges that connect nodes exhibiting strong evidence of non‐human activity from publicly available electronic‐social‐network datasets, including, for example, those in the Stanford Network Analysis Project repository (SNAP). Our methodology is inspired by classic work in evolutionary psychology by Dunbar that posits upper bounds on the total strength of the set of social connections in which a single human can be engaged. We model edge strength with Easley and Kleinberg's topological estimate; label nodes as “violators” if the sum of these edge strengths exceeds a Dunbar‐inspired bound; and then remove the violator‐to‐violator edges. We run our algorithm on multiple social networks and show that our Dunbar‐inspired bound appears to hold for social networks, but not for nonsocial networks. Our cleaning process classifies 0.04% of the nodes of the Twitter‐2010 followers graph as violators, and we find that more than 80% of these violator nodes have Botometer scores of 0.5 or greater. Furthermore, after we remove the roughly 15 million violator‐violator edges from the 1.2‐billion‐edge Twitter‐2010 follower graph, 34% of the violator nodes experience a factor‐of‐two decrease in PageRank. PageRank is a key component of many graph algorithms such as node/edge ranking and graph sparsification. Thus, this artificial inflation would bias algorithmic output, and result in some incorrect decisions based on this output.
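The three steps named in this abstract (estimate edge strength, label violators, remove violator-to-violator edges) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes the topological estimate is the neighborhood overlap of an edge's endpoints (shared neighbors divided by all neighbors of either endpoint, excluding the endpoints themselves), and the `bound` parameter and the toy graph are hypothetical values picked for the example, not the Dunbar-inspired bound used in the paper.

```python
from typing import Dict, Hashable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def neighborhood_overlap(adj: Dict[Hashable, Set], u: Hashable, v: Hashable) -> float:
    """Assumed edge-strength estimate: shared neighbors over combined neighbors."""
    common = adj[u] & adj[v]
    union = (adj[u] | adj[v]) - {u, v}
    return len(common) / len(union) if union else 0.0


def remove_violator_edges(edges: List[Edge], bound: float) -> Tuple[Set, List[Edge]]:
    """Label nodes whose total edge strength exceeds `bound`, then drop
    every edge whose two endpoints are both labeled."""
    adj: Dict[Hashable, Set] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Sum the estimated strength of every edge incident to each node.
    total = {
        node: sum(neighborhood_overlap(adj, node, nbr) for nbr in nbrs)
        for node, nbrs in adj.items()
    }
    violators = {node for node, s in total.items() if s > bound}
    kept = [(u, v) for u, v in edges if not (u in violators and v in violators)]
    return violators, kept


# Toy graph: a 4-clique on a, b, c, d plus a pendant node e attached to a.
edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "c"), ("b", "d"), ("c", "d"), ("a", "e")]
violators, kept = remove_violator_edges(edges, bound=2.5)  # bound is hypothetical
print(sorted(violators))  # ['b', 'c', 'd']
print(kept)  # [('a', 'b'), ('a', 'c'), ('a', 'd'), ('a', 'e')]
```

In the toy graph, nodes b, c, and d each accumulate a strength of about 2.67 and are labeled violators, while a (total 2.0) and e (total 0) are not, so only the three edges among b, c, and d are removed.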