This content will become publicly available on November 18, 2025

Title: Shape-shifting Elephants: Multi-modal Transport for Integrated Research Infrastructure
Data Acquisition (DAQ) workloads form an important class of scientific network traffic that by its nature (1) flows across different research infrastructure, including remote instruments and supercomputer clusters, (2) has ever-increasing throughput demands, and (3) has ever-increasing integration demands; for example, observations at one instrument could trigger a reconfiguration of another instrument. Today’s DAQ transfers rely on UDP and (heavily tuned) TCP, but this is driven by convenience rather than suitability. The mismatch between Internet transport protocols and scientific workloads becomes more stark with the steady increase in link capacities, data generation, and integration across research infrastructure. This position paper argues for the importance of developing specialized transport protocols for DAQ workloads. It proposes a new transport feature for this kind of elephant flow: multi-modality, in which the network actively configures the transport protocol to change how DAQ flows are processed across the different underlying networks that connect scientific research infrastructure. Multi-modality is a layering violation that is proposed as a pragmatic technique for DAQ transport protocol design. It takes advantage of programmable network hardware that is increasingly being deployed in scientific research infrastructure. The paper presents an initial evaluation through a pilot study that includes a Tofino2 switch and Alveo FPGA cards and uses data from a particle detector.
Award ID(s):
2346499
PAR ID:
10601200
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400712722
Page Range / eLocation ID:
308 to 317
Format(s):
Medium: X
Location:
Irvine CA USA
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents the motivation and design of MTP, a new offload-friendly message transport protocol. Existing transport protocols like TCP, MPTCP, and UDP/QUIC all have key limitations when used in a network that may potentially offload computation from end-servers into NICs, switches, and other network devices. To enable important new in-network computing use cases and correct congestion control in the face of ever-changing network paths and application replicas, MTP introduces a new message transport protocol design and pathlet congestion control, a new approach where end-hosts explicitly communicate messaging information to network devices and network devices explicitly communicate network path and congestion information back to end-hosts. 
  2. In the past decade, GPUs have become an important resource for compute-intensive, general-purpose GPU applications such as machine learning, big data analysis, and large-scale simulations. In the future, with the explosion of machine learning and big data, application demands will keep increasing, resulting in more data and computation being pushed to GPUs. However, due to the slowing of Moore’s Law and rising manufacturing costs, it is becoming more and more challenging to add compute resources into a single GPU device to improve its throughput. As a result, spreading work across multiple GPUs is popular in data-centric and scientific applications. For example, Facebook uses 8 GPUs per server in their recent machine learning platform. However, research infrastructure has not kept pace with this trend: most GPU hardware simulators, including gem5, only support a single GPU. Thus, it is hard to study interference between GPUs, communication between GPUs, or work scheduling across GPUs. Our research group has been working to address this shortcoming by adding multi-GPU support to gem5. Here, we discuss the changes that were needed, which included updating the emulated driver, GPU components, and coherence protocol. 
  3. Science is being conducted in an era of information abundance. The rate at which science data is generated is increasing, both in volume and variety. This phenomenon is transforming how science is thought of and practiced. This transformation is being shaped by new scientific instruments that are being designed and deployed that will dramatically increase the need for large, real-time data transfers among scientists throughout the world. One such instrument is the Square Kilometer Array (SKA) being built in South Africa that will transmit approximately 160 Gbps of data from each radio dish to a central processor. This paper describes a collaborative effort to respond to the demands of big data scientific instruments through the development of an international software-defined exchange point (SDX) that will meet the network provisioning needs for science applications. This paper discusses the challenges of end-to-end path provisioning across multiple research and education networks using OpenFlow/SDN technologies. Furthermore, it refers to the AtlanticWave-SDX, a project at Florida International University and the Georgia Institute of Technology, funded by the US National Science Foundation (NSF), along with support from Brazil’s NREN, Rede Nacional de Ensino e Pesquisa (RNP), and the Academic Network of Sao Paulo (ANSP). Future work explores the feasibility of establishing an SDX in West Africa, in collaboration with regional African RENs, based on the planned availability of submarine cable spectrum for use by research and education communities. 
  4. Recent advances in cyber-infrastructure have enabled digital data sharing and ubiquitous network connectivity between scientific instruments and cloud-based storage infrastructure for uploading, storing, curating, and correlating large amounts of materials and semiconductor fabrication data and metadata. However, there is still a significant number of scientific instruments running on old operating systems that are taken offline and cannot connect to the cloud infrastructure because of security and network performance concerns. In this paper, we propose BRACELET, an edge-cloud infrastructure that augments the existing cloud-based infrastructure with edge devices and helps to tackle the unique performance and security challenges that scientific instruments face when they are connected to the cloud through a public network. With BRACELET, we put a networked edge device, called a cloudlet, in between the scientific instruments and the cloud as the middle tier of a three-tier hierarchy. The cloudlet will shape and protect the data traffic from scientific instruments to the cloud, and will play a foundational role in keeping the instruments connected throughout their lifetime and in continuously providing the otherwise missing performance and security features for each instrument as its operating system ages. 
  5. Future wireless networks need to support the increasing demands for high data rates and improved coverage. One promising solution is sectorization, where an infrastructure node (e.g., a base station) is equipped with multiple sectors employing directional communication. Although the concept of sectorization is not new, it is critical to fully understand the potential of sectorized networks, such as the rate gain achieved when multiple sectors can be simultaneously activated. In this paper, we focus on sectorized wireless networks, where sectorized infrastructure nodes with beam-steering capabilities form a multi-hop mesh network for data forwarding and routing. We present a sectorized node model and characterize the capacity region of these sectorized networks. We define the flow extension ratio and the corresponding sectorization gain, which quantitatively measure the performance gain introduced by node sectorization as a function of the network flow. Our objective is to find the optimal sectorization of each node that achieves the maximum flow extension ratio, and thus the sectorization gain. Towards this goal, we formulate the corresponding optimization problem and develop an efficient distributed algorithm that obtains the node sectorization under a given network flow with an approximation ratio of 2/3. Through extensive simulations, we evaluate the sectorization gain and the performance of the proposed algorithm in various network scenarios with varying network flows. The simulation results show that the approximate sectorization gain increases sublinearly as a function of the number of sectors per node. 