Disaggregated memory is being proposed as a way to provide efficient memory scaling for data-intensive applications. High-performance interconnect technologies, such as CXL, make disaggregated, fabric-attached memory (FAM) a viable secondary tier of memory. Previous work on remote memory relies on extending kernel-level paging to utilize FAM as an additional storage tier after local memory. These approaches have the advantage of exposing remote memory in application-transparent ways that do not require code changes, but they incur large overheads due to the mismatch between the abstraction of a flat virtual address space and the reality of the tiered nature of FAM. In this paper, we present an alternative approach to remote memory based on application-specific objects. We design FAM-Graph, a semi-external graph processing system that leverages application-level properties, such as read-only edge data, to efficiently tier data between local and remote memory and to prefetch remote data for local computation. Using several graph algorithms and datasets, we demonstrate that FAM-Graph achieves end-to-end performance within a factor of 1–6× of Galois, the state-of-the-art shared-memory graph processing system, while using up to 20× less local memory. When Galois is used in conjunction with an OS-level FAM solution, we show that FAM-Graph achieves up to 9× better end-to-end performance when both systems are configured with the same amount of local memory.
Sinfonia: Cross-tier orchestration for edge-native applications
The convergence of 5G wireless networks and edge computing enables new edge-native applications that are simultaneously bandwidth-hungry, latency-sensitive, and compute-intensive. Examples include deeply immersive augmented reality, wearable cognitive assistance, privacy-preserving video analytics, edge-triggered serendipity, and autonomous swarms of featherweight drones. Such edge-native applications require network-aware and load-aware orchestration of resources across the cloud (Tier-1), cloudlets (Tier-2), and device (Tier-3). This paper describes the architecture of Sinfonia, an open-source system for such cross-tier orchestration. Key attributes of Sinfonia include: support for multiple vendor-specific Tier-1 roots of orchestration, providing end-to-end runtime control that spans technical and non-technical criteria; use of third-party Kubernetes clusters as cloudlets, with unified treatment of telco-managed, hyperconverged, and just-in-time variants of cloudlets; and masking of orchestration complexity from applications, thus lowering the barrier to creation of new edge-native applications. We describe an initial release of Sinfonia (https://github.com/cmusatyalab/sinfonia) and share our thoughts on evolving it in the future.
- Award ID(s): 2106862
- PAR ID: 10409699
- Date Published:
- Journal Name: Frontiers in the Internet of Things
- Volume: 1
- ISSN: 2813-3110
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Upcoming AI-based and 5G applications demand new network management approaches that can cope with unprecedented levels of flexibility, scalability, and energy efficiency. To make these use cases tangible and feasible, network management solutions aim to rely on multi-domain, multi-tier architectures that permit complex end-to-end orchestration of network resources. However, current research on scheduling functions and task-offloading algorithms often focuses on a single domain, and the exploration of large-scale interoperable solutions becomes a challenge. Fortunately for the networking research community, a number of available testing facilities deployed at different geographical locations around the world can be integrated and used as a single joint multi-domain infrastructure. In this demo paper, we present a hands-off experience of how to integrate different high-performance testbeds, located in the USA, Belgium, and the Netherlands, in order to enable multi-domain large-scale experimentation. We demonstrate end-to-end performance characteristics of the testbed integration, and we describe the main takeaways and lessons learned to guide researchers toward successful deployments in such an end-to-end global infrastructure.
-
Network slicing is a promising technology that allows mobile network operators to efficiently serve various emerging use cases in 5G. It is challenging to optimize the utilization of network infrastructures while guaranteeing the performance of network slices according to service level agreements (SLAs). To solve this problem, we propose SafeSlicing, which introduces a new constraint-aware deep reinforcement learning (CaDRL) algorithm to learn the optimal resource orchestration policy in two steps, i.e., offline training in a simulated environment and online learning with the real network system. To optimize the resource orchestration, we incorporate the constraints on the statistical performance of slices in the reward function using Lagrangian multipliers and solve the Lagrangian relaxed problem via a policy network. To satisfy the constraints on the system capacity, we design a constraint network to map the latent actions generated from the policy network to the orchestration actions such that the total resources allocated to network slices do not exceed the system capacity. We prototype SafeSlicing on an end-to-end testbed developed using OpenAirInterface LTE, OpenDayLight-based SDN, and a CUDA GPU computing platform. The experimental results show that SafeSlicing reduces resource usage by more than 20% while meeting the SLAs of network slices, as compared with other solutions.
-
Data-intensive augmented information (AgI) services (e.g., metaverse applications such as virtual/augmented reality), designed to deliver highly interactive experiences resulting from the real-time combination of live data streams and pre-stored digital content, are accelerating the need for distributed compute platforms with unprecedented storage, computation, and communication requirements. To this end, the integrated evolution of next-generation networks (5G/6G) and distributed cloud technologies (mobile/edge/cloud computing) has emerged as a promising paradigm to address the interaction- and resource-intensive nature of data-intensive AgI services. In this paper, we focus on the design of control policies for the joint orchestration of compute, caching, and communication (3C) resources in next-generation 3C networks for the delivery of data-intensive AgI services. We design the first throughput-optimal control policy that coordinates joint decisions around (i) routing paths and processing locations for live data streams, with (ii) cache selection and distribution paths for associated data objects. We then extend the proposed solution to include a max-throughput data placement policy and two efficient replacement policies. Numerical results demonstrate the superior performance obtained via the novel multi-pipeline flow control and 3C resource orchestration mechanisms of the proposed policy, compared with state-of-the-art algorithms that lack full 3C integrated control.
-
The applicability of the microservice architecture has extended beyond traditional web services, making steady inroads into the domains of IoT and edge computing. Due to dissimilar contexts in different execution environments and inherent mobility, edge and IoT applications suffer from low execution reliability. Replication, traditionally used to increase service reliability and scalability, is inapplicable in these resource-scarce environments. Alternatively, programmers can orchestrate the parallel or sequential execution of equivalent microservices, that is, microservices that provide the same functionality by different means. Unfortunately, the resulting orchestrations rely on parallelization, synchronization, and failure handling, all tedious and error-prone to implement. Although automated orchestration shifts the burden of generating workflows from the programmer to the compiler, existing programming models lack both syntactic and semantic support for equivalence. In this paper, we enhance compiler-generated execution orchestration with equivalence to efficiently increase reliability. We introduce a dataflow-based domain-specific language, whose dataflow specifications include the implicit declarations of equivalent microservices and their execution patterns. To automatically generate reliable workflows and execute them efficiently, we introduce new equivalence workflow constructs. Our evaluation results indicate that our solution can effectively and efficiently increase the reliability of microservice-based applications.

