Title: Analyzing Scientific Data Sharing Patterns for In-network Data Caching
The volume of data moving through a network increases with new scientific experiments and simulations. Network bandwidth requirements also increase proportionally to deliver data within a certain time frame. We observe that a significant portion of popular datasets is transferred multiple times to different users, as well as to the same user, for various reasons. In-network data caching for the shared data has been shown to reduce redundant data transfers and consequently save network traffic volume. In addition, overall application performance is expected to improve with in-network caching because access to locally cached data results in lower latency. This paper shows how much data was shared over the study period, how much network traffic volume was consequently saved, and how much temporary in-network caching improved scientific application performance. It also analyzes data access patterns in applications and the impact of caching nodes on the regional data repository. From the results, we observed that the network bandwidth demand was reduced by nearly a factor of 3 over the study period.
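As a rough illustration of the kind of transfer-log analysis the abstract describes, the sketch below counts repeated transfers of the same dataset and estimates how much wide-area traffic an in-network cache could have absorbed. The log format, dataset names, and the assumption that every repeat transfer is a cache hit are made up for the example; this is not the paper's actual methodology.

```python
from collections import Counter

def estimate_cache_savings(transfers):
    """Estimate traffic a shared in-network cache could absorb.

    `transfers` is an iterable of (dataset_id, size_gb) records, e.g. parsed
    from transfer logs. We assume (optimistically) that only the first
    transfer of each dataset must cross the wide-area network; every repeat
    could be served from the cache.
    """
    counts = Counter()
    total_gb = 0.0
    wan_gb = 0.0  # traffic that still crosses the WAN with a cache in place
    for dataset_id, size_gb in transfers:
        total_gb += size_gb
        if counts[dataset_id] == 0:      # first access: cache miss
            wan_gb += size_gb
        counts[dataset_id] += 1
    return total_gb, wan_gb, total_gb - wan_gb

# Hypothetical log: dataset "A" is requested three times, "B" once.
log = [("A", 120.0), ("B", 40.0), ("A", 120.0), ("A", 120.0)]
total, wan, saved = estimate_cache_savings(log)
print(f"total={total} GB, WAN with cache={wan} GB, saved={saved} GB "
      f"(reduction factor {total / wan:.1f}x)")
```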
Award ID(s):
2030508, 1836650, 1148698, 1541349
PAR ID:
10296564
Author(s) / Creator(s):
Date Published:
Journal Name:
SNTA '21: Proceedings of the 2021 Workshop on Systems and Network Telemetry and Analytics
Page Range / eLocation ID:
9 to 16
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Much of today's traffic flows between datacenters over private networks. The operators of those networks have access to detailed traffic profiles with performance goals that need to be met as efficiently as possible, e.g., realizing latency guarantees with minimal network bandwidth. Of particular interest is the extent to which traffic (re)shaping can be of benefit. The paper focuses on the most basic network configuration, namely a single-link network, with extensions to more general, multi-node networks discussed in a companion paper. The main results are in the form of optimal solutions for different types of schedulers of varying complexity. They demonstrate how judicious traffic shaping can help lower-complexity schedulers perform nearly as well as more complex ones.
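For readers unfamiliar with traffic (re)shaping, the sketch below implements a classic token-bucket shaper (rate r, burst b), one common way to smooth a flow before it reaches a scheduler. It is offered only as background; the parameters and the shaper itself are illustrative assumptions, not the schedulers analyzed in the paper.

```python
def token_bucket_release_times(arrivals, rate_bytes_per_s, burst_bytes):
    """Compute when a token-bucket shaper releases each packet.

    arrivals: list of (arrival_time_s, packet_size_bytes), sorted by time.
    rate_bytes_per_s: long-term shaping rate.
    burst_bytes: bucket depth, i.e. the largest burst passed unshaped.
    Packets are released in FIFO order as soon as the bucket holds enough
    tokens to cover their size.
    """
    tokens = float(burst_bytes)
    clock = 0.0                      # time at which `tokens` is valid
    releases = []
    for t, size in arrivals:
        now = max(t, clock)          # never move time backwards past a release
        tokens = min(burst_bytes, tokens + (now - clock) * rate_bytes_per_s)
        clock = now
        if size > tokens:
            wait = (size - tokens) / rate_bytes_per_s
            clock += wait            # wait for the token deficit to refill
            tokens = float(size)
        tokens -= size
        releases.append(clock)
    return releases

# Three back-to-back 1500-byte packets shaped to 1 MB/s with a 2 KB bucket:
# the first passes immediately, the rest are delayed by the shaper.
print(token_bucket_release_times([(0.0, 1500), (0.0, 1500), (0.0, 1500)],
                                 rate_bytes_per_s=1_000_000,
                                 burst_bytes=2_000))
```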
  2. In-network caching constitutes a promising approach to reducing traffic loads and alleviating congestion in both wired and wireless networks. In this paper, we study the joint caching and routing problem in congestible networks of arbitrary topology (JoCRAT) as a generalization of previous efforts in this particular field. We show that JoCRAT extends many previous problems in the caching literature that are intractable even with specific topologies and/or assumed unlimited bandwidth of communication. To handle this significant but challenging problem, we develop a novel approximation algorithm with a guaranteed performance bound based on a randomized rounding technique. Evaluation results demonstrate that our proposed algorithm achieves near-optimal performance over a broad array of synthetic and real networks, while significantly outperforming state-of-the-art methods.
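The abstract mentions an approximation algorithm based on randomized rounding; the generic pattern is sketched below on a toy caching instance. The fractional placement values, unit item sizes, and the capacity-repair step are assumptions for illustration, not the paper's LP formulation or rounding scheme.

```python
import random

def round_placement(fractional, capacity, rng=random.Random(0)):
    """Randomly round a fractional cache placement into a 0/1 placement.

    `fractional` maps item -> x_i in [0, 1], e.g. taken from an LP relaxation
    where x_i is the fraction of item i stored at a cache node. Each item is
    kept independently with probability x_i; if the rounded set overflows the
    node's capacity (items assumed unit-size here), we keep the items with the
    largest fractional values. This is the generic rounding pattern only.
    """
    chosen = [i for i, x in fractional.items() if rng.random() < x]
    if len(chosen) > capacity:
        chosen.sort(key=lambda i: fractional[i], reverse=True)
        chosen = chosen[:capacity]
    return set(chosen)

# Toy fractional solution for four items on a cache that can hold two of them.
x = {"videoA": 0.9, "videoB": 0.6, "videoC": 0.4, "videoD": 0.1}
print(round_placement(x, capacity=2))
```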
  3. Recent work shows that programmable switches can effectively detect attack traffic, such as denial-of-service attacks, in the midst of high-volume network traffic. However, these techniques primarily rely on sampling or sketch-based data structures, which can only approximate the characteristics of dominant flows in the network. As a result, such techniques are unable to effectively detect low-volume attacks that stealthily add only a few packets to the network. Our work explores how the combination of programmable switches, smart network interface cards (SmartNICs), and hosts can enable fine-grained analysis of every flow in a network, even those with only a small number of packets. We focus on analyzing packets at the start of each flow, as those packets often help indicate whether a flow is benign or suspicious. We propose a unified architecture that spans the full programmable dataplane to take advantage of the strengths of each type of device. We are developing new filter data structures to efficiently track flows on the switch, dataplane-based communication protocols to quickly coordinate between devices, and caching approaches on the SmartNIC that help minimize the traffic load reaching the host. Our preliminary prototype can handle the full pipe bandwidth of 1.4 Tbps of traffic entering the Tofino switch, forward only 20 Gbps to the SmartNIC, and limit the traffic reaching the host to 5 Gbps thanks to our efficient flow filter, packet batching, and SmartNIC-based cache.
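As an illustration of per-flow, first-packets filtering in software (not the authors' switch or SmartNIC data structures), the sketch below keeps only the first few packets of each 5-tuple flow in a bounded table. Flow keys, thresholds, and the eviction policy are assumptions made for the example.

```python
from collections import OrderedDict

class FirstPacketsFilter:
    """Keep up to `per_flow` packets per flow, bounded to `max_flows` flows.

    Flows are identified by a 5-tuple; once a flow has yielded `per_flow`
    packets, later packets of that flow are skipped. When the table is full,
    the least recently seen flow is evicted -- a simple software stand-in for
    the hardware-friendly filters discussed above.
    """
    def __init__(self, per_flow=4, max_flows=10_000):
        self.per_flow = per_flow
        self.max_flows = max_flows
        self.flows = OrderedDict()   # 5-tuple -> list of captured packets

    def observe(self, five_tuple, packet):
        pkts = self.flows.get(five_tuple)
        if pkts is None:
            if len(self.flows) >= self.max_flows:
                self.flows.popitem(last=False)   # evict least recently seen
            pkts = []
            self.flows[five_tuple] = pkts
        else:
            self.flows.move_to_end(five_tuple)
        if len(pkts) < self.per_flow:
            pkts.append(packet)
            return True     # packet forwarded to the analysis stage
        return False        # flow already profiled; skip this packet

# Example: only the first 2 packets of the flow reach the analyzer.
f = FirstPacketsFilter(per_flow=2, max_flows=4)
flow = ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp")
print([f.observe(flow, f"pkt{i}") for i in range(4)])   # [True, True, False, False]
```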
  4. A continuing trend in many scientific disciplines is the growth in the volume of data collected by scientific instruments and the desire to rapidly and efficiently distribute this data to the scientific community. Transferring these large data sets to a geographically distributed research community consumes significant network bandwidth. As both the data volume and the number of subscribers grow, reliable network multicast is a promising approach to reduce the rate of growth of the bandwidth needed to support efficient data distribution. In prior work, we identified a need for reliable network multicast: scientists engaged in atmospheric research subscribing to meteorological file-streams. Specifically, the University Corporation for Atmospheric Research (UCAR) uses the Local Data Manager (LDM) to disseminate data. This work describes a trial deployment of a multicast-enabled LDM, in which eight university campuses are connected via their corresponding regional Research-and-Education Networks (RENs) and Internet2. Using this deployment, we evaluated the new version of LDM, LDM7, which uses network multicast with a reliable transport protocol and leverages Layer-2 (L2) multipoint Virtual LAN (VLAN)/MPLS. A performance monitoring system was deployed to collect real-time performance of LDM7, which showed that our proof-of-concept prototype worked significantly better than the current production LDM, LDM6, in two ways: (i) LDM7 can distribute file streams faster than LDM6; with six subscribers, an almost 22-fold improvement was observed with LDM7 at 100 Mbps. And (ii) LDM7 needs significantly less bandwidth to achieve similar performance; it reduced the bandwidth requirement by about 90% relative to LDM6 when delivering 20 Mbps average throughput across four subscribers.
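A back-of-the-envelope comparison of why multicast limits bandwidth growth as subscribers are added, under the simplifying assumption that unicast sends one copy of the file-stream per subscriber while an idealized reliable multicast sends a single shared copy; the numbers are illustrative, not measurements from the LDM7 deployment.

```python
def sender_bandwidth_mbps(stream_mbps, subscribers, multicast):
    """Approximate bandwidth needed at the data source.

    Unicast pushes one copy of the file-stream per subscriber; an idealized
    reliable multicast pushes a single copy shared by all subscribers
    (retransmission and protocol overhead ignored).
    """
    return stream_mbps if multicast else stream_mbps * subscribers

for n in (1, 4, 8, 16):
    uni = sender_bandwidth_mbps(20, n, multicast=False)
    multi = sender_bandwidth_mbps(20, n, multicast=True)
    print(f"{n:2d} subscribers: unicast {uni:3d} Mbps vs multicast {multi:3d} Mbps")
```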
  5. We present BurstZ, a bandwidth-efficient accelerator platform for scientific computing. While accelerators such as GPUs and FPGAs provide enormous computing capabilities, their effectiveness quickly deteriorates once the working set becomes larger than the on-board memory capacity, causing performance to become bottlenecked by the communication bandwidth between the host and the accelerator. Compression has not been very useful in solving this issue because of the difficulty of efficiently compressing floating-point numbers, which scientific data often consists of: most compression algorithms are either ineffective with floating-point numbers or have high performance overhead. BurstZ is an FPGA-based accelerator platform that addresses the bandwidth issue via a novel hardware-optimized floating-point compression algorithm, which we call sZFP. We demonstrate that BurstZ can completely remove the communication bottleneck for accelerators, using a 3D stencil-code accelerator implemented on a prototype BurstZ implementation. Evaluated against hand-optimized implementations of stencil-code accelerators of the same architecture, our BurstZ prototype outperformed an accelerator without compression by almost 4X, and even an accelerator with enough memory for the entire dataset by over 2X. BurstZ improved communication efficiency so much that our prototype was even able to outperform the projected upper-limit performance of an optimized stencil core with ideal memory access characteristics by over 2X.
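sZFP itself is a hardware-oriented derivative of ZFP and is not reproduced here. Purely to illustrate why floating-point streams defeat byte-oriented compressors and how exploiting numerical smoothness helps, the sketch below XORs consecutive IEEE-754 doubles (as codecs such as Gorilla and FPC do), so that smooth data produces long runs of zero bytes that a generic compressor can then squeeze. It is an assumption-laden illustration, not BurstZ's algorithm.

```python
import struct
import zlib

def xor_delta_bytes(values):
    """XOR each IEEE-754 double with its predecessor before compression.

    Neighboring values in smooth scientific data share sign, exponent and
    high mantissa bits, so the XOR stream contains many zero bytes and
    compresses far better than the raw doubles.
    """
    prev = 0
    out = bytearray()
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        out += struct.pack("<Q", bits ^ prev)
        prev = bits
    return bytes(out)

# A smooth "stencil-like" field: raw doubles vs XOR-delta'd doubles, both zlib'd.
data = [1.0 + 1e-6 * i for i in range(10_000)]
raw = struct.pack(f"<{len(data)}d", *data)
print("raw zlib:      ", len(zlib.compress(raw)))
print("xor-delta zlib:", len(zlib.compress(xor_delta_bytes(data))))
```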