Title: GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing
Careful placement of a distributed computational application within a target device cluster is critical for achieving low application completion time. The problem is challenging due to its NP-hardness and combinatorial nature. In recent years, learning-based approaches have been proposed to learn a placement policy that can be applied to unseen applications, motivated by the problem of placing a neural network across cloud servers. These approaches, however, generally assume the device cluster is fixed, which is not the case in mobile or edge computing settings, where heterogeneous devices move in and out of range for a particular application. To address the challenge of scaling to different-sized device clusters and adapting to the addition of new devices, we propose a new learning approach called GiPH, which learns policies that generalize to dynamic device clusters via 1) a novel graph representation gpNet that efficiently encodes the information needed for choosing a good placement, and 2) a scalable graph neural network (GNN) that learns a summary of the gpNet information. GiPH turns the placement problem into that of finding a sequence of placement improvements, learning a policy for selecting this sequence that scales to problems of arbitrary size. We evaluate GiPH with a wide range of task graphs and device clusters and show that our learned policy rapidly finds good placements for new problem instances. GiPH finds placements that achieve up to 30.5% better makespan, searching up to 3× faster than other search-based placement policies.
Award ID(s):
1645578
NSF-PAR ID:
10492333
Author(s) / Creator(s):
Publisher / Repository:
MLSys
Date Published:
Journal Name:
MLSys
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Model predictive control (MPC) provides a useful means for controlling systems with constraints, but suffers from the computational burden of repeatedly solving an optimization problem in real time. Offline (explicit) solutions for MPC attempt to alleviate real-time computational challenges using either multiparametric programming or machine learning. The multiparametric approaches are typically applied to linear or quadratic MPC problems, while learning-based approaches can be more flexible and are less memory-intensive. Existing learning-based approaches offer significant speedups, but the challenge becomes ensuring constraint satisfaction while maintaining good performance. In this paper, we provide a neural network parameterization of MPC policies that explicitly encodes the constraints of the problem. By exploring the interior of the MPC feasible set in an unsupervised learning paradigm, the neural network finds better policies faster than projection-based methods and exhibits substantially shorter solve times. We use the proposed policy to solve a robust MPC problem, and demonstrate the performance and computational gains on a standard test system.
  2. models, it is difficult to fit and train a complete copy of the model on a single computational device with limited capability. Therefore, large neural networks are usually trained on a mixture of devices, including multiple CPUs and GPUs, whose computational speed and efficiency are drastically affected by how these models are partitioned and placed on the devices. In this paper, we propose Mars, a novel design to find efficient placements for large models. Mars leverages a self-supervised graph neural network pre-training framework to generate node representations for operations that capture the topological properties of the computational graph. Then, a sequence-to-sequence neural network is applied to split large models into small segments so that Mars can predict the placements sequentially. Novel optimizations have been applied in the placer design to achieve the best possible performance in terms of the time needed to train the agent for placing very large models. We deployed and evaluated Mars on benchmarks involving Inception-V3, GNMT, and BERT models. Extensive experimental results show that Mars can achieve speedups of up to 27.2% and 2.7% in per-step training time over the state of the art for GNMT and BERT models, respectively. We also show that with self-supervised graph neural network pre-training, our design achieves the fastest speed in discovering the optimal placement for Inception-V3.
  3. A crucial challenge for data-parallel clusters is achieving high application-level communication efficiency for structured traffic flows (a.k.a. Coflows) from distributed data processing applications. A range of recent works focus on designing network scheduling algorithms with predetermined Coflow placement, i.e., the endpoints of subflows within a Coflow are preset. However, the underlying Coflow placement problem and its decisive impact on scheduling efficiency have long been overlooked. It is hard to find good placements for Coflows. At the intra-Coflow level, constituent flows are related and therefore their placement decisions are dependent. Thus, strategies extended from flow-by-flow placement are suboptimal because they neglect the inter-flow relationship in a Coflow. At the inter-Coflow level, placing a new Coflow may introduce contention with existing Coflows, which changes communication efficiency. This paper is the first to study the Coflow placement problem with careful consideration of the inter-flow relationship in Coflows. We formulate the Coflow placement problem and propose a Coflow placement algorithm. Under realistic traffic in various settings, our algorithm reduces the average completion time for Coflows by up to 26%.
  4. Wallach, H (Ed.)
    We study the problem of programmatic reinforcement learning, in which policies are represented as short programs in a symbolic language. Programmatic policies can be more interpretable, generalizable, and amenable to formal verification than neural policies; however, designing rigorous learning approaches for such policies remains a challenge. Our approach to this challenge, a meta-algorithm called PROPEL, is based on three insights. First, we view our learning task as optimization in policy space, modulo the constraint that the desired policy has a programmatic representation, and solve this optimization problem using a form of mirror descent that takes a gradient step into the unconstrained policy space and then projects back onto the constrained space. Second, we view the unconstrained policy space as mixing neural and programmatic representations, which enables employing state-of-the-art deep policy gradient approaches. Third, we cast the projection step as program synthesis via imitation learning, and exploit contemporary combinatorial methods for this task. We present theoretical convergence results for PROPEL and empirically evaluate the approach in three continuous control domains. The experiments show that PROPEL can significantly outperform state-of-the-art approaches for learning programmatic policies.
  5. We consider the problem of jammer placement to partition a wireless network, where the network nodes and jammers are located in the real plane. In previous research, we found optimal and suboptimal jammer placements by reducing the search space for the jammers to the locations of the network nodes. In this paper, we develop techniques to find optimal jammer placements over all possible jammer placements in the real plane. Our approach finds a set of candidate jammer locations (CJLs) such that a jammer-placement solution using the CJLs achieves the minimum possible cardinality among all possible jammer placements in the real plane. The CJLs can be used directly with the optimal and fast, suboptimal algorithms for jammer placement from our previous work. 