Title: A Comprehensive Tutorial on Science DMZ
Science and engineering applications are now generating data at an unprecedented rate. From large facilities such as the Large Hadron Collider to portable DNA sequencing devices, these instruments can produce hundreds of terabytes in short periods of time. Researchers and other professionals rely on networks to transfer data between sensing locations, instruments, data storage devices, and computing systems. While general-purpose networks, also referred to as enterprise networks, are capable of transporting basic data, such as e-mails and Web content, they face numerous challenges when transferring terabyte- and petabyte-scale data. At best, transfers of science data on these networks may last days or even weeks. In response to this challenge, the Science Demilitarized Zone (Science DMZ) has been proposed. The Science DMZ is a network or a portion of a network designed to facilitate the transfer of big science data. The main elements of the Science DMZ include: 1) specialized end devices, referred to as data transfer nodes (DTNs), built for sending/receiving data at a high speed over wide area networks; 2) high-throughput, friction-free paths connecting DTNs, instruments, storage devices, and computing systems; 3) performance measurement devices to monitor end-to-end paths over multiple domains; and 4) security policies and enforcement mechanisms tailored for high-performance environments. Despite the increasingly important role of Science DMZs, the literature is still missing a guideline to provide researchers and other professionals with the knowledge to broaden the understanding and development of Science DMZs. This paper addresses this gap by presenting a comprehensive tutorial on Science DMZs. The tutorial reviews fundamental network concepts that have a large impact on Science DMZs, such as router architecture, TCP attributes, and operational security. 
Then, the tutorial delves into protocols and devices at different layers, from the physical cyberinfrastructure to application-layer tools and security appliances, that must be carefully considered for the optimal operation of Science DMZs. This paper also contrasts Science DMZs with general-purpose networks, and presents empirical results and use cases applicable to current and future Science DMZs.
Award ID(s):
1829698
NSF-PAR ID:
10119181
Journal Name:
IEEE Communications surveys and tutorials
Volume:
21
Issue:
2
ISSN:
1553-877X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Science DMZs are specialized networks that enable large-scale distributed scientific research, providing efficient and guaranteed performance while transferring large amounts of data at high rates. The high-speed performance of a Science DMZ is made viable via data transfer nodes (DTNs), which are therefore a critical point of failure. DTNs are usually monitored with network intrusion detection systems (NIDS). However, NIDS do not consider system performance data, such as network I/O interrupts and context switches, which can also be useful in revealing anomalous system performance potentially arising from external network-based attacks or insider attacks. In this paper, we demonstrate how system performance metrics can be applied towards securing a DTN in a Science DMZ network. Specifically, we evaluate the effectiveness of system performance data in detecting TCP-SYN flood attacks on a DTN using DBSCAN (a density-based clustering algorithm) for anomaly detection. Our results demonstrate that system interrupts and context switches can be used to successfully detect TCP-SYN floods, suggesting that system performance data could be effective in detecting a variety of attacks not easily detected through network monitoring alone.
  2. This paper describes the deployment of a private cloud and the development of virtual laboratories and companion material to teach and train engineering students and Information Technology (IT) professionals in high-throughput networks and cybersecurity. The material and platform, deployed at the University of South Carolina, are also used by other institutions to support regular academic courses, self-paced training of professional IT staff, and workshops across the country. The private cloud is used to deploy scenarios consisting of high-speed networks (up to 50 Gbps), multi-domain environments emulating internetworks, and infrastructures under cyber-attacks using live traffic. For regular academic courses, the virtual laboratories have been adopted by institutions in different states to supplement theoretical material with hands-on activities in IT, electrical engineering, and computer science programs. Topics include Local Area Networks (LANs), congestion-control algorithms, performance tools used to emulate wide area networks (WANs) and their attributes (packet loss, reordering, corruption, latency, jitter, etc.), data transfer applications for high-speed networks, queueing delay and buffer size in routers and switches, active monitoring of multi-domain systems, high-performance cybersecurity tools such as Zeek’s intrusion detection systems, and others. The training platform has also been used by IT professionals from more than 30 states for self-paced training. The material provides training on topics beyond general-purpose networks, which are usually overlooked by practitioners and researchers. The virtual laboratories and companion material have also been used in workshops organized across the country. Workshops are co-organized with organizations that operate large backbone networks connecting research centers and national laboratories, and colleges and universities conducting teaching and research activities.
  3. HPC networks and campus networks are beginning to leverage various levels of network programmability ranging from programmable network configuration (e.g., NETCONF/YANG, SNMP, OF-CONFIG) to software-based controllers (e.g., OpenFlow Controllers) to dynamic function placement via network function virtualization (NFV). While programmable networks offer new capabilities, they also make the network more difficult to debug. When applications experience unexpected network behavior, there is no established method to investigate the cause in a programmable network, and many conventional troubleshooting tools (e.g., ping and traceroute) can turn out to be useless. This absence of troubleshooting tools that support programmability is a serious challenge for researchers trying to understand the root cause of their networking problems. This paper explores the challenges of debugging an all-campus science DMZ network that leverages SDN-based network paths for high-performance flows. We propose Flow Tracer, a lightweight, data-plane-based debugging tool for SDN-enabled networks that allows end users to dynamically discover how the network is handling their packets. In particular, we focus on solving the problem of identifying an SDN path by using actual packets from the flow being analyzed, as opposed to existing expensive approaches where either probe packets are injected into the network or actual packets are duplicated for tracing purposes. Our simulation experiments show that Flow Tracer has negligible impact on the performance of monitored flows. Moreover, our tool can be extended to obtain further information about the actual switch behavior, topology, and other flow information without privileged access to the SDN control plane.
  4. The Internet of Things (IoT) is a network of sensors that helps collect data 24/7 without human intervention. However, the network may suffer from problems such as low battery, heterogeneity, and connectivity issues due to the lack of standards. Even though these problems can cause several performance hiccups, security issues need immediate attention because hackers can access vital personal and financial information and then misuse it. These security issues can allow hackers to hijack IoT devices and then use them to establish a botnet to launch a Distributed Denial of Service (DDoS) attack. Blockchain technology can provide security to IoT devices by providing secure authentication using public keys. Similarly, Smart Contracts (SCs) can improve the performance of the IoT–blockchain network through automation. However, surveyed work shows that the blockchain and SCs do not provide foolproof security; sometimes, attackers defeat these security mechanisms and initiate DDoS attacks. Thus, developers and security software engineers must be aware of different techniques to detect DDoS attacks. In this survey paper, we highlight different techniques to detect DDoS attacks. The novelty of our work is to classify the DDoS detection techniques according to blockchain technology. As a result, researchers can enhance their systems by using blockchain-based support for detecting threats. In addition, we provide general information about the studied systems and their workings. However, we cannot neglect the recent surveys. To that end, we compare the state-of-the-art DDoS surveys based on their data collection techniques and the discussed DDoS attacks on the IoT subsystems. The study of different IoT subsystems tells us that DDoS attacks also impact other computing systems, such as SCs, networking devices, and power grids. Hence, our work briefly describes DDoS attacks and their impacts on the above subsystems and IoT.
For instance, due to DDoS attacks, the targeted computing systems suffer delays which cause tremendous financial and utility losses to the subscribers. Hence, we discuss the impacts of DDoS attacks in the context of associated systems. Finally, we discuss Machine-Learning algorithms, performance metrics, and the underlying technology of IoT systems so that the readers can grasp the detection techniques and the attack vectors. Moreover, associated systems such as Software-Defined Networking (SDN) and Field-Programmable Gate Arrays (FPGA) are a source of good security enhancement for IoT Networks. Thus, we include a detailed discussion of future development encompassing all major IoT subsystems. 
  5. The Science DMZ is a specialized network model developed to guarantee secure and efficient transfer of data for large-scale distributed research. To enable a high level of performance, the Science DMZ includes dedicated data transfer nodes (DTNs). Protecting these DTNs is crucial to maintaining the overall security of the network and the data, and insider attacks are a major threat. Although some limited network intrusion detection systems (NIDS) are deployed to monitor DTNs, this alone is not sufficient to detect insider threats. Monitoring for abnormal system behavior, such as unusual sequences of system calls, is one way to detect insider threats. However, the relatively predictable behavior of the DTN suggests that we can also detect unusual activity through monitoring system performance, such as CPU and disk usage, along with network activity. In this paper, we introduce a potential insider attack scenario, and show how readily available system performance metrics can be employed to detect data tampering within DTNs, using DBSCAN clustering to actively monitor for unexpected behavior.