In industrial applications, Machine Learning (ML) services are often deployed on cloud infrastructure and require a transfer of the input data over a network, which is susceptible to Quality of Service (QoS) degradation. In this paper we investigate the robustness of industrial ML classifiers towards varying Data Quality (DQ) due to degradation in network QoS. We define the robustness of an ML model as the ability to maintain a certain level of performance under variable levels of DQ at its input. We employ the classification accuracy as the performance metric for the ML classifiers studied. The POWDER testbed is utilized to create an experimental setup consisting of a real-world wireless network connecting two nodes. We transfer multiple video and image files between the two nodes under varying degrees of packet loss and varying buffer sizes to create degraded data. We then evaluate the performance of AWS Rekognition, a commercial ML tool for on-demand object detection, on corrupted video and image data. We also evaluate the performance of YOLOv7 to compare the performance of a commercial and an open-source model. As a result we demonstrate that even a slight degree of packet loss, 1% for images and 0.2% for videos, can have a drastic impact on the classification performance of the system. We discuss the possible ways to make industrial ML systems more robust to network QoS degradation.
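The degradation step described above can be approximated in software. The following is a minimal sketch, not the authors' POWDER setup: it assumes a file is split into fixed-size packets and that lost packets are simply absent from the received byte stream (no retransmission). The function name `drop_packets`, the 1400-byte packet size, and the seeded random generator are all illustrative assumptions.

```python
import random


def drop_packets(data: bytes, loss_rate: float,
                 packet_size: int = 1400, seed: int = 0) -> tuple[bytes, int]:
    """Return the bytes that survive random packet loss and the drop count.

    Assumption: each packet is dropped independently with probability
    `loss_rate`; real wireless loss is often bursty, which this ignores.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated
    kept = []
    dropped = 0
    for start in range(0, len(data), packet_size):
        if rng.random() < loss_rate:
            dropped += 1  # packet never arrives
        else:
            kept.append(data[start:start + packet_size])
    return b"".join(kept), dropped
```

Feeding the surviving bytes of an image or video file to a classifier at several loss rates would give a rough, simulated analogue of the accuracy-versus-loss measurements the abstract reports.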
Judicious QoS using Cloud Overlays
We revisit the long-standing problem of providing network QoS to applications, and propose the concept of judicious QoS -- combining the cheaper, best effort IP service with the cloud, which offers a highly reliable infrastructure and the ability to add in-network services, albeit at higher cost. Our proposed J-QoS framework offers a range of reliability services with different cost vs. delay trade-offs, including: i) a forwarding service that forwards packets over the cloud overlay, ii) a caching service, which stores packets inside the cloud and allows them to be pulled in case of packet loss or disruption on the Internet, and iii) a novel coding service that provides the least expensive packet recovery option by combining packets of multiple application streams and sending a small number of coded packets across the more expensive cloud paths. We demonstrate the feasibility of these services using measurements from RIPE Atlas and a live deployment on PlanetLab. We also consider case studies on how J-QoS works with services up and down the network stack, including Skype video conferencing, TCP-based web transfers and cellular access networks.
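The idea behind the coding service, combining packets from several streams into a few coded packets, can be illustrated with a simple XOR parity code. This is a sketch under stated assumptions: the abstract does not say which code J-QoS uses, and the names `encode` and `recover` are hypothetical. Here one coded packet travels over the cloud path and can repair the loss of any single packet in the group.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a.ljust(n, b"\x00")
    b = b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets: list[bytes]) -> bytes:
    """Build one coded packet from one packet of each application stream."""
    return reduce(xor_bytes, packets)


def recover(received: dict[int, bytes], coded: bytes, missing_len: int) -> bytes:
    """Rebuild the single missing packet from the survivors and the coded packet.

    `missing_len` is the original length of the lost packet, assumed to be
    known to the receiver (for example, carried in a header).
    """
    return reduce(xor_bytes, received.values(), coded)[:missing_len]
```

For example, with three streams, only the one coded packet crosses the more expensive cloud path, while the three original packets use the best-effort Internet path; a single loss among them is repaired without retransmission.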
- Award ID(s):
- 1815016
- PAR ID:
- 10198309
- Date Published:
- Journal Name:
- Computer communication review
- ISSN:
- 0146-4833
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- A majority of today's cloud services are independently operated by individual cloud service providers. In this approach, the locations of cloud resources are strictly constrained by the distribution of cloud service providers' sites. As the popularity and scale of cloud services increase, we believe this traditional paradigm is about to change toward more federated services, a.k.a. multi-cloud, due to improved performance, reduced cost of compute, storage, and network resources, and increased user demand. In this paper, we present COMET, a lightweight, distributed storage system for managing metadata on large-scale, federated cloud infrastructure providers, end users, and their applications (e.g., an HTCondor cluster or a Hadoop cluster). We showcase use cases from NSF's Chameleon, ExoGENI, and JetStream research cloud testbeds to show the effectiveness of the COMET design and deployment.
- Cloud virtualization and multi-tenant networking give Infrastructure as a Service (IaaS) providers a new way to offer on-demand services to their customers, such as easy provisioning of new applications and better resource efficiency and scalability. However, existing data-intensive intelligent applications require more powerful processors, higher bandwidth, and lower-latency networking. To boost the performance of computing and networking services and reduce the overhead of software virtualization, we propose a new data center network design based on OpenStack. Specifically, we map the OpenStack networking services to the hardware switch and use hardware-accelerated L2 switching and L3 routing to overcome the software limitations while achieving software-like scalability and flexibility. We build our prototype system on the Arista Software-Defined-Networking (SDN) switch and provide an automatic script that abstracts the service layer, decoupling OpenStack from the physical network infrastructure and thereby providing vendor independence. We have evaluated the performance improvement in terms of bandwidth, delay, and system resource utilization using various tools and under various Quality-of-Service (QoS) constraints. Our solution demonstrates improved cloud scaling and network efficiency via only one touch point to control all vendors' devices in the data center.
- With the advent of 5G, supporting high-quality game streaming applications on edge devices has become a reality. This is evidenced by a recent surge in cloud gaming applications on mobile devices. In contrast to video streaming applications, interactive games require much more compute power to support improved rendering (such as 4K streaming) within the stipulated frames-per-second (FPS) constraints. This in turn consumes more battery power in a power-constrained mobile device. Thus, state-of-the-art gaming applications suffer from lower video quality (QoS) and/or energy efficiency. While there has been a plethora of recent work on optimizing game streaming applications, to our knowledge, there is no study that systematically investigates the design pairs on the end-to-end game streaming pipeline across the cloud, network, and edge devices to understand the individual contributions of the different stages of the pipeline to the overall QoS and energy efficiency. In this context, this paper presents a comprehensive performance and power analysis of the entire game streaming pipeline, consisting of the server/cloud side, network, and edge. Using a high-end workstation mimicking the cloud end, an open-source platform (Moonlight-GameStreaming) emulating the edge device/mobile platform, and two network settings (WiFi and 5G), we conduct a detailed measurement-based study with seven representative games with different characteristics. We characterize the performance in terms of frame latency, QoS, bitrate, and energy consumption for different stages of the gaming pipeline. Our study shows that the rendering stage and the encoding stage at the cloud end are the bottlenecks in supporting 4K streaming. While 5G is certainly more suitable for supporting enhanced video quality with 4K streaming, it is more expensive in terms of power consumption than WiFi. Further, fluctuations in 5G network quality can lead to huge frame drops, thus affecting QoS, which needs to be addressed by a coordinated design between the edge device and the server. Finally, the network interface and the decoder units in a mobile platform need more energy-efficient designs to support high-quality games at a lower cost. These observations should help in designing more cost-effective future cloud gaming platforms.
- Traditionally, network monitoring and analytics systems rely on aggregation (e.g., flow records) or sampling to cope with high packet rates. The downside is that we lose data granularity and accuracy and, in general, limit the network analytics we can perform. Recent proposals leveraging software-defined networking or programmable hardware provide more fine-grained, per-packet monitoring but are still based on the fundamental principle of data reduction in the network, before analytics. In this paper, we provide a first step towards a cloud-scale, packet-level monitoring and analytics system based on stream processing entirely in software. Software provides virtually unlimited programmability and makes modern (e.g., machine-learning) network analytics applications possible. We identify unique features of network analytics applications which enable the specialization of stream processing systems. As a result, an evaluation with our preliminary implementation shows that we can scale up to several million packets per second per core and, together with load balancing and further optimizations, the vision of cloud-scale per-packet network analytics is possible.