skip to main content


Title: FT-VMP: Fault-Tolerant Virtual Machine Placement in Cloud Data Centers
Virtual machine (VM) replication is an effective technique in cloud data centers to achieve fault-tolerance, load-balance, and quick-responsiveness to user requests. In this paper we study a new fault-tolerant VM placement problem referred to as FT-VMP. Given that different VM has different fault-tolerance requirement (i.e., different VM requires different number of replica copies) and compatibility requirement (i.e., some VMs and their replicas cannot be placed into some physical machines (PMs) due to software or platform incompatibility), FT-VMP studies how to place VM replica copies inside cloud data centers in order to minimize the number of PMs storing VM replicas, under the constraints that i) for fault-tolerant purpose, replica copies of the same VM cannot be placed inside the same PM and ii) each PM has a limited amount of storage capacity. We first prove that FT-VMP is NP-hard. We then design an integer linear programming (ILP)-based algorithm to solve it optimally. As ILP takes time to compute thus is not suitable for large scale cloud data centers, we design a suite of efficient and scalable heuristic fault-tolerant VM placement algorithms. We show that a) ILP-based algorithm outperforms the state-of-the-art VM replica placement in a wide range of network dynamics and b) that all our fault-tolerant VM placement algorithms are able to turn off significant number of PMs to save energy in cloud data centers. In particular, we show that our algorithms can consolidate (i.e., turn off) around 100 PMs in a small data center of 256 PMs and 700 PMs in a large data center of 1028PMs.  more » « less
Award ID(s):
1911191
NSF-PAR ID:
10174515
Author(s) / Creator(s):
;
Date Published:
Journal Name:
International Conference on Computer Communications and Networks (ICCCN 2020)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Fault-tolerant virtual machine (VM) placement refers to the process of placing multiple copies of the same VM cloud application inside cloud data centers. The challenge is how to place required number of VM replicas while minimizing the number of physical machines (PMs) that store them, in order to save energy consumption of cloud data centers. We refer to it as fault-tolerant VM placement problem. In our previous work, we have proposed a greedy algorithm to solve this problem. In this paper, we compare it with an existing research that is based on well-known Welsh Powell Graph-Coloring Algorithm to place items into bins while considering the conflicts between items and items and items and bins. Via extensive simulations, we show that our greedy algorithm can turn off 40-50% more PMs than existing work and can place upto four times as many VM replicas as existing work, achieving much stronger fault-tolerance with less energy consumption. We also compare both algorithms with the optimal integer lin- ear programming (ILP)-based algorithm, which serves as the benchmark of the comparison. 
    more » « less
  2. Service function chaining (SFC), consisting of a sequence of virtual network functions (VNFs) (i.e., firewalls and load balancers), is an effective service provision technique in modern data center networks. By requiring cloud user traffic to traverse the VNFs in order, SFC im- proves the security and performance of the cloud user applications. In this paper, we study how to place an SFC inside a data center to mini- mize the network traffic of the virtual machine (VM) communication. We take a cooperative multi-agent reinforcement learning approach, wherein multiple agents collaboratively figure out the traffic-efficient route for the VM communication. Underlying the SFC placement is a fundamental graph-theoretical prob- lem called the k-stroll problem. Given a weighted graph G(V, E), two nodes s, t ∈ V , and an integer k, the k-stroll problem is to find the shortest path from s to t that visits at least k other nodes in the graph. Our work is the first to take a multi-agent learning approach to solve k- stroll problem. We compare our learning algorithm with an optimal and exhaustive algorithm and an existing dynamic programming(DP)-based heuristic algorithm. We show that our learning algorithm, although lack- ing the complete knowledge of the network assumed by existing research, delivers comparable or even better VM communication time while taking two orders of magnitude of less execution time. 
    more » « less
  3. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in the this manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.


    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    Minimization
    -------------
    - namd/min_*.0.namd

    Equilibration
    -------------
    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    -----------------------------------
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    ---------
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.

    Simulation Output
    -----------------
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format. Files with the extension .dcd give the trajectory of the atomic coorinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files having the prefix stride50_ including only every 50 frames. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory is also included for the NPT runs are output/eq_*.xst.

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.


    CONTENTS
    ========

    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 seperate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("doRestraintEnergyError.sh") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.

     

    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 
    more » « less
  4. We propose a new algorithmic framework for traffic-optimal virtual network function (VNF) placement and migration for policy-preserving data centers (PPDCs). As dy- namic virtual machine (VM) traffic must traverse a sequence of VNFs in PPDCs, it generates more network traffic, consumes higher bandwidth, and causes additional traffic delays than a traditional data center. We design optimal, approximation, and heuristic traffic-aware VNF placement and migration algorithms to minimize the total network traffic in the PPDC. In particular, we propose the first traffic-aware constant-factor approximation algorithm for VNF placement, a Pareto-optimal solution for VNF migration, and a suite of efficient dynamic-programming (DP)-based heuristics that further improves the approximation solution. At the core of our framework are two new graph- theoretical problems that have not been studied. Using flow characteristics found in production data centers and realistic traffic patterns, we show that a) our VNF migration techniques are effective in mitigating dynamic traffic in PPDCs, reducing the total traffic cost by up to 73%, b) our VNF placement algorithms yield traffic costs 56% to 64% smaller than those by existing techniques, and c) our VNF migration algorithms outperform the state-of-the-art VM migration algorithms by up to 63% in reducing dynamic network traffic. 
    more » « less
  5. null (Ed.)
    Abstract Lipid structures affect membrane biophysical properties such as thickness, stability, permeability, curvature, fluidity, asymmetry, and interdigitation, contributing to membrane function. Sphingolipids are abundant in plant endomembranes and plasma membranes (PMs) and comprise four classes: ceramides, hydroxyceramides, glucosylceramides, and glycosylinositolphosphoceramides (GIPCs). They constitute an array of chemical structures whose distribution in plant membranes is unknown. With the aim of describing the hydrophobic portion of sphingolipids, 18 preparations from microsomal (MIC), vacuolar (VM), PM, and detergent-resistant membranes (DRM) were isolated from Arabidopsis (Arabidopsis thaliana) leaves. Sphingolipid species, encompassing pairing of long-chain bases and fatty acids, were identified and quantified in these membranes. Sphingolipid concentrations were compared using univariate and multivariate analysis to assess sphingolipid diversity, abundance, and predominance across membranes. The four sphingolipid classes were present at different levels in each membrane: VM was enriched in glucosylceramides, hydroxyceramides, and GIPCs; PM in GIPCs, in agreement with their key role in signal recognition and sensing; and DRM in GIPCs, as reported by their function in nanodomain formation. While a total of 84 sphingolipid species was identified in MIC, VM, PM, and DRM, only 34 were selectively distributed in the four membrane types. Conversely, every membrane contained a different number of predominant species (11 in VM, 6 in PM, and 17 in DRM). This study reveals that MIC, VM, PM, and DRM contain the same set of sphingolipid species but every membrane source contains its own specific assortment based on the proportion of sphingolipid classes and on the predominance of individual species. 
    more » « less