Title: COMET: Distributed Metadata Service for Multi-cloud Experiments
A majority of today's cloud services are independently operated by individual cloud service providers. In this approach, the locations of cloud resources are strictly constrained by the distribution of each provider's sites. As the popularity and scale of cloud services increase, we believe this traditional paradigm is about to shift toward federated services, a.k.a. multi-cloud, driven by improved performance, reduced cost of compute, storage, and network resources, and increasing user demand. In this paper, we present COMET, a lightweight, distributed storage system for managing metadata for large-scale, federated cloud infrastructure providers, end users, and their applications (e.g., an HTCondor or Hadoop cluster). We showcase use cases from NSF's Chameleon, ExoGENI, and JetStream research cloud testbeds to demonstrate the effectiveness of COMET's design and deployment.
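As a rough illustration of the kind of interface such a metadata service could expose, the sketch below shows a hypothetical Python client that reads and writes scoped key-value metadata shared between cloud providers and an experiment's nodes. The class name, method names, endpoint layout, and token scheme are illustrative assumptions, not the actual COMET API.

```python
# Hypothetical sketch of a scoped metadata client for a COMET-like service.
# Class name, endpoint layout, and payload format are assumptions for illustration.
import json
import urllib.request


class CometClient:
    def __init__(self, base_url, token):
        self.base_url = base_url.rstrip("/")
        self.token = token  # shared secret scoping access to one experiment context

    def write_entry(self, context, key, value):
        """Store a metadata value under (context, key), e.g. a node's address."""
        payload = json.dumps({"value": value}).encode()
        req = urllib.request.Request(
            f"{self.base_url}/contexts/{context}/keys/{key}?token={self.token}",
            data=payload,
            method="PUT",
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status == 200

    def read_entry(self, context, key):
        """Fetch a metadata value, e.g. so new HTCondor workers can find the master."""
        req = urllib.request.Request(
            f"{self.base_url}/contexts/{context}/keys/{key}?token={self.token}"
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["value"]


# Example: a master node on one cloud publishes its address; workers on another read it.
# client = CometClient("https://comet.example.org", token="experiment-42-secret")
# client.write_entry("experiment-42", "condor-master", "10.0.0.5")
# print(client.read_entry("experiment-42", "condor-master"))
```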
Award ID(s):
1826997
NSF-PAR ID:
10158250
Author(s) / Creator(s):
Date Published:
Journal Name:
2019 IEEE 27th International Conference on Network Protocols (ICNP)
Page Range / eLocation ID:
1 to 2
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Developments in large-scale computing environments have led to the design of workflows that rely on containers and analytics platforms that are well supported by the commercial cloud. The National Science Foundation also envisions a future in science and engineering that includes commercial cloud service providers (CSPs) such as Amazon Web Services, Azure, and Google Cloud. These twin forces have led researchers to consider the commercial cloud as an alternative to current high performance computing (HPC) environments. Training and knowledge on how to migrate workflows, cost control, data management, and system administration remain some of the most commonly listed concerns with the adoption of cloud computing. In an effort to ameliorate this situation, CSPs have developed online and in-person training platforms to help address these concerns. Scalability, the ability to impart knowledge, evaluation of knowledge gained, and accreditation are the core concepts that have driven this approach. Here, we present a review of our experience using Google's Qwiklabs online platform for remote and in-person training from the perspective of an HPC user. For this study, we completed over 50 online courses, earned five badges, and attended a one-day session. We identify the strengths of the approach, identify avenues to refine it, and consider means to further community engagement. We also evaluate the readiness of these resources for a cloud-curious researcher who is familiar with HPC. Finally, we present recommendations on how the large-scale computing community can leverage these opportunities to work with CSPs to assist researchers nationally and at their home institutions.
  2. While permissioned blockchains enable a family of data center applications, existing systems suffer from imbalanced loads across compute and memory, exacerbating the underutilization of cloud resources. This paper presents FlexChain, a novel permissioned blockchain system that addresses this challenge by physically disaggregating CPUs, DRAM, and storage devices to process different blockchain workloads efficiently. Disaggregation allows blockchain service providers to upgrade and expand hardware resources independently to support a wide range of smart contracts with diverse CPU and memory demands. Moreover, it ensures efficient resource utilization and hence prevents resource fragmentation in a data center. We have explored the design of XOV blockchain systems in a disaggregated fashion and developed a tiered key-value store that can elastically scale its memory and storage. Our design significantly speeds up the execution stage. We have also leveraged several techniques to parallelize the validation stage in FlexChain to further improve overall blockchain performance. Our evaluation results show that FlexChain can provide independent compute and memory scalability while incurring at most 12.8% disaggregation overhead. FlexChain achieves almost identical throughput to state-of-the-art distributed approaches, with significantly lower memory and CPU consumption for compute-intensive and memory-intensive workloads, respectively.
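To make the idea of an elastically tiered key-value store more concrete, the following is a minimal sketch, under assumed names and a simple LRU eviction policy, of a store that keeps a bounded number of hot entries in memory and spills the rest to a slower storage tier. It illustrates the general technique only and is not FlexChain's implementation.

```python
# Minimal two-tier key-value store sketch: bounded in-memory (DRAM) tier with
# LRU eviction to an on-disk storage tier. Names and policy are illustrative.
import os
import pickle
from collections import OrderedDict


class TieredKVStore:
    def __init__(self, memory_capacity, storage_dir):
        self.memory_capacity = memory_capacity  # max entries kept in the memory tier
        self.memory_tier = OrderedDict()        # recency order: oldest entries first
        self.storage_dir = storage_dir          # larger, slower storage tier
        os.makedirs(storage_dir, exist_ok=True)

    def _spill_path(self, key):
        return os.path.join(self.storage_dir, f"{key}.bin")

    def put(self, key, value):
        self.memory_tier[key] = value
        self.memory_tier.move_to_end(key)
        # Evict least-recently-used entries to the storage tier when over budget.
        while len(self.memory_tier) > self.memory_capacity:
            old_key, old_value = self.memory_tier.popitem(last=False)
            with open(self._spill_path(old_key), "wb") as f:
                pickle.dump(old_value, f)

    def get(self, key):
        if key in self.memory_tier:
            self.memory_tier.move_to_end(key)
            return self.memory_tier[key]
        # Memory-tier miss: promote the entry from the storage tier if present.
        path = self._spill_path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                value = pickle.load(f)
            self.put(key, value)
            return value
        raise KeyError(key)


# store = TieredKVStore(memory_capacity=1024, storage_dir="/tmp/tiered_kv")
# store.put("account-alice", {"balance": 100})
# print(store.get("account-alice"))
```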
  3. In the last few years, cloud computing technology has benefited many organizations that have embraced it as a basis for revamping their IT infrastructure. Cloud computing uses the Internet to provide access to shared computing resources. Amazon Web Services (AWS) is one of the most widely used cloud providers, offering access to the extensive computing capabilities that cloud technology makes available. AWS is continuously evolving to offer a variety of services, including, but not limited to, infrastructure as a service (IaaS), platform as a service (PaaS), and packaged software as a service. Among the other important services offered by AWS is Video Surveillance as a Service (VSaaS), a hosted, cloud-based video surveillance service. Even though this technology is complex and widely used, security experts have pointed out that some of its vulnerabilities can be exploited to launch attacks against cloud technologies. In this paper, we present a holistic security analysis of cloud-based video surveillance systems by examining the vulnerabilities, threats, and attacks that these technologies are susceptible to. We illustrate our findings by implementing several of these attacks on a testbed representing an AWS-based video surveillance system. The main contributions of our paper are: (1) we provide a holistic view of the security model of cloud-based video surveillance, summarizing the underlying threats, vulnerabilities, and mitigation techniques; (2) we propose a novel taxonomy of attacks targeting such systems; and (3) we implement several related attacks on a cloud-based video surveillance system in an AWS test environment and provide guidelines for attack mitigation. The outcome of the conducted experiments showed that vulnerabilities in the Internet Protocol (IP) and other protocols granted unauthorized access to VSaaS files. We hope that our work on the security of cloud-based video surveillance systems will serve as a reference for cybersecurity researchers and practitioners who aim to conduct research in this field.
  4. Abstract

    In current infrastructure-as-a-service (IaaS) cloud offerings, customers are charged for the usage of computing and storage resources only, but not for network resources. The difficulty lies in the fact that it is nontrivial to allocate network resources to individual customers effectively, especially for short-lived flows, in terms of both performance and cost, due to the highly dynamic environment created by the flows of all customers. To tackle this challenge, in this paper we propose an end-to-end Price-Aware Congestion Control Protocol (PACCP) for cloud services. PACCP is a network utility maximization (NUM) based optimal congestion control protocol. It supports three different classes of service (CoS), i.e., best effort (BE), differentiated service (DS), and minimum rate guaranteed (MRG) service. In PACCP, the desired CoS or rate allocation for a given flow is enabled by properly setting a pair of control parameters, i.e., a minimum guaranteed rate and a utility weight, which in turn determine the price paid by the user of the flow. Two pricing models are proposed: a coarse-grained VM-Based Pricing model (VBP) and a fine-grained Flow-Based Pricing model (FBP). The optimality of PACCP is verified by both large-scale simulations and a small testbed implementation. The price-performance consistency of PACCP is evaluated using real datacenter workloads. The results demonstrate that PACCP provides minimum rate guarantees, high bandwidth utilization, and fair rate allocation, commensurate with the pricing models.
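For reference, a generic NUM formulation with per-flow utility weights and minimum rate guarantees, of the kind PACCP builds on, can be written as below. The notation (weights w_i, minimum rates m_i, link capacities c_l, flow paths P_i) is a standard textbook form used here for illustration, not a quotation of PACCP's exact objective.

```latex
% Generic weighted NUM with minimum-rate constraints (illustrative, not PACCP's exact form)
\begin{aligned}
\max_{\{x_i\}} \quad & \sum_{i} w_i \log x_i
  && \text{(weighted proportional-fair utility)} \\
\text{s.t.} \quad & \sum_{i:\, l \in \mathcal{P}_i} x_i \le c_l
  && \forall \text{ links } l \quad \text{(link capacity)} \\
 & x_i \ge m_i
  && \forall \text{ flows } i \quad \text{(minimum guaranteed rate)}
\end{aligned}
```

In such a formulation, setting m_i = 0 with a baseline weight roughly corresponds to a best-effort flow, a larger w_i to a differentiated-service flow, and m_i > 0 to a minimum-rate-guaranteed flow; the chosen (m_i, w_i) pair is then what a pricing model would translate into a price.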

     
  5.
    Internet of Things (IoT) devices are becoming increasingly prevalent in our environment, yet the process of programming these devices and processing the data they produce remains difficult. Typically, data is processed on-device, involving arduous work in low-level languages, or data is moved to the cloud, where abundant resources are available for Functions as a Service (FaaS) or other handlers. FaaS is an emerging category of flexible computing services in which developers deploy self-contained functions to be run in portable and secure containerized environments; at the moment, however, these functions are limited to running in the cloud or, in some cases, at the "edge" of the network on resource-rich, Linux-based systems. In this work, we investigate NanoLambda, a portable platform that brings FaaS, high-level language programming, and familiar cloud service APIs to non-Linux, microcontroller-based IoT devices. To enable this, NanoLambda couples a new, minimal Python runtime system that we have designed for the least capable end of the IoT device spectrum with API compatibility for AWS Lambda and S3. NanoLambda transfers functions between IoT devices (sensors, edge, cloud), providing power and latency savings while retaining the programmer productivity benefits of high-level languages and FaaS. A key feature of NanoLambda is a scheduler that intelligently places function executions across multi-scale IoT deployments according to resource availability and power constraints. We evaluate a range of applications that use NanoLambda to run on devices as small as the ESP8266 with 64 KB of RAM and 512 KB of flash storage.
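Since NanoLambda advertises API compatibility with AWS Lambda, a function targeted at such a runtime would presumably look like an ordinary Python Lambda handler. The sketch below is an assumed example of this style (the event fields and threshold logic are made up for illustration); in standard AWS Lambda for Python, the entry point is a module-level handler(event, context) callable.

```python
# Hypothetical handler in the style of an AWS Lambda Python function, the kind of
# small, self-contained code a NanoLambda-like runtime aims to run on IoT devices.
# The event fields and threshold below are illustrative assumptions.

TEMP_THRESHOLD_C = 30.0


def handler(event, context):
    """Classify a sensor reading; small enough for a microcontroller-class runtime."""
    reading = float(event.get("temperature_c", 0.0))
    alert = reading > TEMP_THRESHOLD_C
    return {
        "statusCode": 200,
        "body": {"temperature_c": reading, "alert": alert},
    }


# Local test invocation (no cloud or device needed):
if __name__ == "__main__":
    print(handler({"temperature_c": 31.5}, context=None))
```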