skip to main content


Title: Dynamically Creating Custom SDN High-Speed Network Paths for Big Data Science Flows
Existing campus network infrastructure is not designed to effectively handle the transmission of big data sets. Performance degradation in these networks is often caused by middleboxes -- appliances that enforce campus-wide policies by deeply inspecting all traffic going through the network (including big data transmissions). We are developing a Software-Defined Networking (SDN) solution for our campus network that grants privilege to science flows by dynamically calculating routes that bypass certain middleboxes to avoid the bottlenecks they create. Using the global network information provided by an SDN controller, we are developing graph databases approaches to compute custom paths that not only bypass middleboxes to achieve certain requirements (e.g., latency, bandwidth, hop-count) but also insert rules that modify packets hop-by-hop to create the illusion of standard routing/forward despite the fact that packets are being rerouted. In some cases, additional functionality needs to be added to the path using network function virtualization (NFV) techniques (e.g., NAT). To ensure that path computations are run on an up-to-date snapshot of the topology, we introduce a versioning mechanism that allows for lazy topology updates that occur only when "important" network changes take place and are requested by big data flows.  more » « less
Award ID(s):
1642134 1541426 1541380
NSF-PAR ID:
10039957
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Practice & Experience in Advanced Research Computing Conference (PEARC 2017)
Page Range / eLocation ID:
1 to 4
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. HPC networks and campus networks are beginning to leverage various levels of network programmability ranging from programmable network configuration (e.g., NETCONF/YANG, SNMP, OF-CONFIG) to software-based controllers (e.g., OpenFlow Controllers) to dynamic function placement via network function virtualization (NFV). While programmable networks offer new capabilities, they also make the network more difficult to debug. When applications experience unexpected network behavior, there is no established method to investigate the cause in a programmable network and many of the conventional troubleshooting debugging tools (e.g., ping and traceroute) can turn out to be completely useless. This absence of troubleshooting tools that support programmability is a serious challenge for researchers trying to understand the root cause of their networking problems. This paper explores the challenges of debugging an all-campus science DMZ network that leverages SDN-based network paths for high-performance flows. We propose Flow Tracer, a light-weight, data-plane-based debugging tool for SDN-enabled networks that allows end users to dynamically discover how the network is handling their packets. In particular, we focus on solving the problem of identifying an SDN path by using actual packets from the flow being analyzed as opposed to existing expensive approaches where either probe packets are injected into the network or actual packets are duplicated for tracing purposes. Our simulation experiments show that Flow Tracer has negligible impact on the performance of monitored flows. Moreover, our tool can be extended to obtain further information about the actual switch behavior, topology, and other flow information without privileged access to the SDN control plane. 
    more » « less
  2. A key concept of software-defined networking (SDN) is separation of the control and data plane. This idea provides several benefits, including fine-grained network control and monitoring, and the ability to deploy new services in a limited scope. Unfortunately, it is often cost-prohibitive for enterprises (and universities in particular) to upgrade their existing networks to wholly SDN-capable networks all at once. A compromise solution is to deploy SDN capabilities incrementally in the network. The challenge then is to take full advantage of SDN-based services throughout the network, in an integrated fashion rather than in a few "islands" of SDN support. At the University of Kentucky, SDN has been integrated into the campus network for several years. In this paper, we describe two aspects of this challenge, along with our solution approaches. One is the general reluctance of campus network administrations to allow novel or experimental (SDN-based) services in the production network. The other is how to extend such services throughout the legacy part of the network. For the former, we lay out a set of principles designed to ensure that the production service is not harmed. For the latter, we use policy based routing and a graph database to extend our previously-described VIP Lanes service. Our simulation results in a campus-like topology testbed show that we can provide a host with custom path service even if it is connected to a legacy router. 
    more » « less
  3. Calandrino, Joseph A. ; Troncoso, Carmela (Ed.)
    The arms race between Internet freedom advocates and censors has catalyzed the emergence of sophisticated blocking techniques and directed significant research emphasis toward the development of automated censorship measurement and evasion tools based on packet manipulation. However, we observe that the probing process of censorship middleboxes using state-of-the-art evasion tools can be easily fingerprinted by censors, necessitating detection-resilient probing techniques. We validate our hypothesis by developing a real-time detection approach that utilizes Machine Learning (ML) to detect flow-level packet-manipulation and an algorithm for IP-level detection based on Threshold Random Walk (TRW). We then take the first steps toward detection-resilient censorship evasion by presenting DeResistor, a system that facilitates detection-resilient probing for packet-manipulation-based censorship-evasion. DeResistor aims to defuse detection logic employed by censors by performing detection-guided pausing of censorship evasion attempts and interleaving them with normal user-driven network activity. We evaluate our techniques by leveraging Geneva, a state-of-the-art evasion strategy generator, and validate them against 11 simulated censors supplied by Geneva, while also testing them against real-world censors (i.e., China’s Great Firewall (GFW), India and Kazakhstan). From an adversarial perspective, our proposed real-time detection method can quickly detect clients that attempt to probe censorship middle-boxes with manipulated packets after inspecting only two probing flows. From a defense perspective, DeResistor is effective at shielding Geneva training from detection while enabling it to narrow the search space to produce less detectable traffic. Importantly, censorship evasion strategies generated using DeResistor can attain a high success rate from different vantage points against the GFW (up to 98%) and 100% in India and Kazakhstan. Finally, we discuss detection countermeasures and extensibility of our approach to other censor-probing-based tools. 
    more » « less
  4. The emergence of big data has created new challenges for researchers transmitting big data sets across campus networks to local (HPC) cloud resources, or over wide area networks to public cloud services. Unlike conventional HPC systems where the network is carefully architected (e.g., a high speed local interconnect, or a wide area connection between Data Transfer Nodes), today's big data communication often occurs over shared network infrastructures with many external and uncontrolled factors influencing performance. This paper describes our efforts to understand and characterize the performance of various big data transfer tools such as rclone, cyberduck, and other provider-specific CLI tools when moving data to/from public and private cloud resources. We analyze the various parameter settings available on each of these tools and their impact on performance. Our experimental results give insights into the performance of cloud providers and transfer tools, and provide guidance for parameter settings when using cloud transfer tools. We also explore performance when coming from HPC DTN nodes as well as researcher machines located deep in the campus network, and show that emerging SDN approaches such as the VIP Lanes system can deliver excellent performance even from researchers' machines. 
    more » « less
  5. As a tool to infer the internal state of a network that cannot be measured directly (e.g., the Internet and all-optical networks), network tomography has been extensively studied under the assumption that the measurements truthfully reflect the end-to-end performance of measurement paths, which makes the resulting solutions vulnerable to manipulated measurements. In this work, we investigate the impact of manipulated measurements via a recently proposed attack model called the \emph{stealthy DeGrading of Service (DGoS) attack}, which aims at maximally degrading path performances without exposing the manipulated links to network tomography. While existing studies on this attack assume that network tomography only measures the paths actively used for data transfer (by passively recording the performance of data packets), our model allows network tomography to measure a larger set of paths, e.g., by sending probes on some paths not carrying data flows. By developing and analyzing the optimal attack strategy, we quantify the maximum damage of such an attack and shed light on possible defenses. 
    more » « less